A Sensitivity Study of High ‐ Resolution Climate Simulations for Greece

: In the present study, the ability of the Advanced Weather Research and Forecasting numerical model (WRF ‐ ARW) to perform climate regionalization studies in the topographically complex region of Greece, was examined in order to explore the possibility of a more reliable selection of physical schemes for the simulation of historical and future high resolution (5 km) climate model experiments to investigate the impact of climate change. This work is directly linked to a previous study investigating the performance of seven different model setups for one year, from which the need was derived for further examination of four different simulations to investigate the model sensitivity on the representation of surface variables statistics during a 5 ‐ year period. The results have been compared with observational data for maximum and minimum air temperature and daily precipitation through statistical analysis. Clear similarities were found in precipitation patterns among simulations and observations, yielding smoothly its inter ‐ annual variability, especially during the wettest months and summer periods, with the lowest positive percentage BIAS calculated at about 19% for the selected combination of physics parameterizations (PP3). Regarding the maximum and minimum temperature, statistical analysis showed a high correlation above 0.9, and negative bias around 1 − 1.5 °C, and positive bias near 2 °C, respectively.


Introduction
It is well known that the General Circulation Models (GCMs) are the main numerical tools that can be used by climate scientists to study the physical mechanisms of climate, and how these mechanisms are modified due to the increased emissions of greenhouse gases and pollutants, and other changes of the Earth system like changes in land use, causing the observed climate changes. The GCMs are also used extensively for the simulation of future climate projections in view of global climate change for the application of mitigation and adaptation measures. According to El-Samra et al. [1] however, a number of physical mechanisms (e.g., convection, clouds and precipitation, heterogeneity of surface fluxes and planetary boundary layer (PBL) turbulence) is not accurately represented in the GCMs for the simulation of physically consistent regional and local circulations, particularly for the regions characterized by complex topographical features, due to the rather coarse spatial resolution of approximately 80 to 300 km. To overcome such obstacles, the dynamical downscaling technique has been developed using Regional Climate Models (RCMs), and applied successfully in a number of climate modeling studies so far, such as those of [2][3][4][5][6][7][8][9][10][11].
Recent research by Sorland et al. [12] on the biases of the GCMs and RCMs has indicated that RCMs can reduce systematically the biases of the driving GCMs. The impact on the climate modeling studies of the evolution of RCMs has been reported in, e.g., [13] while in [14], a review about the current development of coupled Regional Earth System Models (RESMs) is provided. Ensemble climate simulations at 0.11 degrees spatial resolution within Euro-CORDEX, e.g., [15][16][17][18][19][20], and Med-CORDEX, e.g., [21,22] initiatives have significantly contributed towards our understanding of regional climate processes and their response to climate change.
RCMs of improved representations of climate information with high spatial resolution can explicitly resolve local features of complex terrain, areas with land-sea interactions and various land cover/land use types [3,[23][24][25]. Simulations of high spatial resolution can be used for climate impact assessments, as they can better resolve physical processes of regional, mesoscale and local scale circulation effects (surface fluxes, breezes, convection and heavy precipitation) [26]. In addition, such high spatial resolution simulations are imperative when the topography of the region is rather complex with mountainous features and rough coastlines, because of the improved representation of surface characteristics and their spatial variability [27].
Several studies reported an improvement in simulated climate variables through the use of high-resolution modeling, such as, rainfall amount [4,[28][29][30][31] and precipitation intensity [32,33]. Olsson et al. [34] found that using a high resolution (6 km) RCM (RCA3), low-frequency sub-daily extremes were in good agreement with the values found in point observations in Sweden. A similar study for Denmark revealed that RCM simulations at higher spatial resolutions (8 km and 12 km) represent extreme precipitation events better, and future projections depend on the combination of GCM-RCM, the spatial resolution and the temporal aggregation [35]. Nevertheless, for local scales where climate is mainly controlled by large-scale external features, the increase in spatial resolution of the RCMs does not improve the simulated results [13,24].
Recent research has focused on the performance of the Advanced Weather Research and Forecasting numerical model (WRF-ARW) to reproduce climate with high spatial resolutions at several time scales, as this model allows researchers to easily choose among a large number of physical parameterizations used to represent phenomena, including microphysics, cumulus, PBL, radiation, etc. In addition, Pieri et al. [36] pointed out the importance of investigating the sensitivity of model performance when applying different spatial resolutions, as well as convective and microphysics parameterization schemes in climate change regional impact assessment studies. A number of studies refer to climate modeling over Europe [2,[37][38][39] and other on country level domains [3,4,31,40], in order to demonstrate the ability of WRF to represent extreme events of temperature or precipitation, climate indicators and drought variability, moreover aiming to generally reproduce climate features spatially and temporally with improved results, due to efficiently capturing local processes over complex topography. Many of these studies were realized in the framework of CORDEX and other projects, with spatial resolution increased from 50 km to 5 km [41]. Simulations of future climate projections by applying downscaling can be performed following the evaluation of the model for historic periods, using the same set-up [1].
For the study area that focuses on Greece, studies of forecast predictability and applications with the WRF model have been carried out, with a number of different parameterization combinations. WRF has been used to perform analysis and operational and seasonal forecasting [42,43] by the group of EREL, NCSR "Demokritos" (https://forecast.ipta.demokritos.gr/). Other studies of the area of Greece have focused on WRF sensitivity to the boundary layer parameterizations for the simulation of intense precipitation and to different physical parameterizations (microphysics, boundary layer and convective schemes) [44]. Previous sensitivity research included studies for the characteristics of convective activity over Greece, predicting lightning activity investigation [45,46], studying the urban heat island over Athens [47], or the structure of PBL over the Aegean Sea. Sensitivity studies were also carried out to estimate model performance on extreme weather events over Greece [48], and model a tornado event [49].
An extensive research was assessed by Sindosi et al. [50], remarking that for the Epirus region (north-western Greece), the inclusion of the high-resolution convective parameterization scheme improved substantially the results, apart from mountainous areas, where the results were comparable with or without the use of convective schemes. Similar conclusions were adopted through the studying of forecasts of summer thunderstorms over Athens and convective parameterization [51,52].
The geomorphological complexity of Greece enhances the need of high resolution climatology studies due to: 1) the orographic chain along the central part together with the moisturized air masses coming from the central Mediterranean Sea, 2) the extended coastal zones and numerous islands, creating inhomogeneous geographical distribution of climatic variables (rainfall, temperature, etc.). The aim of this study is to establish high resolution dynamical downscaling over the complex topography of Greece. The initial work [26], which is linked to the current, included a number of simulations that investigated the sensitivity of the WRF-ARW surface outputs to different combinations of model settings and physical processes, including horizontal resolution, PBL, convective or not, microphysics schemes, and the performance of model simulations with the optimal combination through reanalysis data. In the current work, we performed 5 years of high resolution dynamical downscaling experiments, from 2000 to 2004 inclusive, with the use of the WRF model and 5 km spatial resolution. We downscale the European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-Interim) data [53] to the region of Greece. The objective of this work is to validate the high resolution downscaling simulations with the use of the available observed values of minimum and maximum daily temperature and daily precipitation, in order to select among different physics parameterizations (e.g., cloud microphysics, boundary layer and cumulus), the sufficient setup to investigate the overall model's performance over the area of interest for subsequent high resolution historic and future climate model experiments. Section 2 focuses on the description of model setup with the different physical configurations, the observational data used for the study, along with the metrics of statistical analysis for the validation. In Section 3, the results of different simulations compared to the station data through the statistical errors and diagrams are illustrated and discussed, while the final section includes the summary and conclusions about the main findings of the study.

Model Setup
In the present study, the WRF version 3.6.1 [54] was used to perform the simulations of dynamical downscaling. WRF was setup with two nested grids ( Figure 1); the outer domain, centered in the Mediterranean basin, has been configured using 265 × 200 grid cells with 20 km horizontal resolution (D01 -Europe), and the innermost domain that uses 185 x 185 grid cells of 5 km horizontal grid spacing covering the area of Greece (D02 -Greece). This setup has used 40 vertical levels arranged according to terrain-following hydrostatic pressure vertical coordinates, and one-way nesting has been applied, to avoid possible noise during feedback from the inner domain to the coarse domain. As the simulation evolves, the internal solution computed by the RCM drifts away from the driving analysis, thus spectral nudging is applied in order to reduce the effects of domain position and geometry, and to prevent any inconsistencies along boundaries over an open system [2,55,56]. The spectral nudging was applied in this study for temperature, winds and geopotential height, but not for humidity. The period for the evaluation of all the WRF simulations spans from 1 January, 2000 until 31 December, 2004. This time period was selected based on the satisfactory number of high quality available observational data. For each simulation, the last four days of the previous month were regarded as model spin-up for the following month and were discarded, thus the model was re-initialized every month. The simulations have run independently of each other with parallel integration in order to decrease the total time needed to complete the 5-year climatology. The frequent re-initialization of the runs retains the model solution from diverging from long-term forcing, outperforms the continuous simulation runs, and distinguishes model errors that develop quickly from those over a long period [57][58][59]. In addition, similarity between modeled soil moisture suggests that monthly re-initialization does not affect the simulated surface temperature and precipitation fields. We have realized two extra experiments [60] to verify this, that have not been included in this work: with seasonal re-initialization and one continuous simulation for one year, starting December 1st of the previous year of simulation, and compared the model output to the control integration. Contingency tables and probability density function (pdf) analysis showed that the results were almost indistinguishable between the three simulations.
As already mentioned, the ERA-Interim reanalysis fields of 0.75° × 0.75° horizontal resolution were used for the initial and boundary conditions. The lateral boundary conditions and sea surface temperature were both updated every 6 hours from ERA-Interim. According to Dulière et al. [61], reanalysis data can be used for the evaluation of regional models, as they represent adequately the large-scale forcing necessary for the models to simulate the physical processes and surface interactions.
In the current study, four WRF simulations have been performed, using different combinations of physics parameterizations, in order to investigate their effects on temperature and precipitation fields in the inner domain of Greece. These different experiments were finally selected considering previous results for the same area, employing a larger set of WRF simulations covering 1-year period and more variables, such as relative humidity and surface wind speed [26]. For convenience and link to this previous research, we kept the names of the four best selected combinations of physics parameterizations (as PP2, PP3, PP5, PP7).  [62] and Mellor-Yamada-Janjic (MYJ); [63] were employed, associated with the corresponding surface layers (SLP) schemes, which provide the surface fluxes of momentum, moisture and heat to the PBL scheme. The MYJ scheme is a local closure model which applies a local approach to determine eddy diffusion coefficients, based on the local turbulent kinetic energy (TKE) equation. No information from lower or higher levels directly influences these terms. In this scheme, the entrainment develops only from local mixing. In contrast, the YSU scheme is a non-local closure scheme, where the critical Richardson number that describes the top of the PBL is set to 0.25 over land for enhancing mixing in the stable boundary layer. In this case, entrainment is explicitly treated.
Regarding cloud microphysics, three schemes were used in total: the WRF single-moment six-class (WSM-6) containing ice, snow and graupel processes [64], the Ferrier (FE) and the Thompson (THOM), which includes six classes of moisture for ice as prognostic variables [65]. For the cumulus parameterization, the Grell-3D (G3) [66] and Betts-Miller-Janjic (BMJ) [67] schemes were chosen, taking into consideration the gray zone between 5 and 10 km for cumulus option.
The radiation scheme was set to the newer version of the Rapid Radiative Transfer Model, RRTMG; [68] for both longwave and shortwave radiation. Only the Noah LSM was employed as the land surface model (LSM), as it is widely adopted for climate studies [69][70][71]. According to Cavan et al. [72], the scheme allows the simulation of soil and land surface temperature, snow depth and snow water equivalent, both water and energy fluxes, among others e.g. [69,73,74]. The Noah Scheme has four distinct soil layers (0.1, 0.3, 0.6 and 1.0 m) that reach a total depth of 2 m, and one vegetation canopy layer. For the estimation of potential evapotranspiration (PET), the Penman equation is used, while 16 soil and vegetation parameters are utilized for the estimation of soil temperature, soil moisture, snow cover and atmospheric feedbacks [75].

Observational Datasets
The observational data were provided by the Hellenic National Meteorological Service (HNMS) of Greece in the ECA and D project. The availability of continuous observations covering the selected period of 5 years was limited, as complete daily data existed only from a limited number of stations representing various geoclimatic regions of Greece. In total, 28 stations provided daily minimum and maximum temperatures, and 23 stations daily precipitation during the 5-year study period. It is emphasized that there is not any additional high-resolution gridded dataset covering the domain of interest. Furthermore, gridded data, such as the E-OBS dataset, are constructed with a relatively coarse network density for certain areas (including the Greek territory), with a small number of stations, implying over-smoothing across extreme percentiles of temperature and precipitation values in the resulting interpolation [76].

Analysis and Data Handling
The analysis did involve comparisons of the WRF model simulations with available measurements from the various stations of the inner (nested) domain. To evaluate the WRF downscaling results and model performance, daily maximum and minimum statistics are derived from the 6-h data simulations. Four statistical indices were estimated, the root mean square error (RMSE), which gives an overview of the accuracy of the simulations, the BIAS that indicates whether the model over-or under-predicts the given meteorological value, the mean absolute error (MAE), a measure of the absolute values of the model errors and the Pearson's correlation coefficient (COR), a measure of linear correlation between observations and simulation data. The model error is calculated as the difference between the modeled and observed value. Further analysis of meteorological variables was done on a daily basis based on percentiles and probability density function plots.
In addition, Taylor diagrams were used to provide a comprehensive statistical and graphical assessment of how well the simulated and observed values compare with each other in terms of their correlation and normalized standard deviation, depending on the choice of the physical parameterizations [77]. Taylor diagrams and statistical metrics were calculated for seasonal and yearly time periods for the examined variables of daily maximum temperature (TX) maximum temperature (TX) and precipitation (PR). The white circle represents the standard deviation of the observational data. The formula of the calculated statistical metrics is given in Table 2: Table 2. Summary of statistical formulas calculated for model evaluation in this study.

Parameter
Formula Range Ideal value Where is the value of the observational data, is the simulated data, and "n" is the total number of gridded locations that the model data are compared against observations.
The simulated values are taken at the center of the grid cell for temperature and precipitation. The evaluation of daily variables was made using the nearest grid point of the model to the corresponding location of the observed station, in order to find the statistical metric at that grid point [1,3,78]. The total RMSE, for example, for the Greek area, was then found by integrating the area averages (WRF model grid cell) and stations points. Because of the limited and easy to handle number of Greek stations, results are also examined and presented for each station in order to spatially evaluate the WRF model simulations skills in detail.
Height differences between model topography and stations are observed because of the complexity of the topography and the coastlines of the area. Thus, before the statistical analysis for temperature, a constant lapse-rate elevation correction of 6 °C/km had to be applied [3,23,79,80]. The accuracy of the simulated precipitation was also determined by statistical scores of a contingency table (Table 3): probability of detection (POD), critical success index (CSI) and false alarm ratio (FAR). In this study we use for four distinct threshold values of precipitation for low rainfall (>1mm), medium rainfall (>2.5 mm), heavy rainfall (>10 mm) and extremely heavy rainfall days (>20 mm) [81][82][83][84], to evaluate small and large rainfall events, for the location of each station separately.

Results and Discussion
In this section, the results of the WRF sensitivity experiments are presented and discussed, as emerged through the comparison with weather station datasets. The analysis is divided per daily precipitation and maximum and minimum temperatures, which are the most important for the definition of climate indices, related to future impact studies, such as heat and cold waves and drought events.

Precipitation
Monthly precipitation time-series are illustrated in Figure 2. The seasonal pattern of monthly precipitation is well captured by the majority of the schemes, also observed in the study of García-Díez et al. [57], remarking the highest precipitation during winter and lowest during summer. The black, dashed line illustrates the seasonal variability of monthly precipitation derived from the average values of the available stations, which is noticed on the models results. There are obvious similarities in the precipitation patterns among all experiments and observed data for the 5 years of comparison, yielding smoothly the precipitation's inter-annual variability, especially during the wettest months and summer periods characterized of limited rainfall. Some differences show that in general, there is an overestimation of precipitation compared to ground data for all setups, and this probably is caused by excessive wintertime precipitation [85]. On the other hand, ERA-Interim appears to underestimate winter precipitation from November to January. Some cases of precipitation's underestimation are related to the PP3 setup during the study period, and as this simulation appears to have the less overestimation among setups. This fact is also confirmed with the 5-yearly estimation of statistical errors based on daily precipitation values in Table 4, where PP3 shows the best performance, with positive percentage BIAS of about 19%, while the other models have values of over 40%. As being observed, RMSE, MAE and COR results as well show slightly better performance for PP3 compared to the other simulations. It is worthy to note that PP3 bias (19%) is significantly smaller than the ±50% reported in the Third Assessment Report of the Intergovernmental Panel on Climate Change by Giorgi et al. [86] for RCMs, and while not small, it is close to the range of the best performing RCMs shown in several studies [3,23,71,87]. PP3 uses the Betts-Miller-Janjic cumulus parameterization, WSM6 for microphysics, the Mellor-Yamada-Janjic as PBL scheme and Monin-Obukhov similarity theory (Table 1)  The study of the spatial distribution of 5-annual mean precipitation in WRF simulations and observations shows a clear topographical dependency ( Figure S1). The analysis of the amount of precipitation yields large differences between plain areas and higher elevations, with maximum values of annual precipitation found in the western part of the country, related to fronts passages with increased vertical lift due to orographic enhancement in mountainous locations. All simulations depict similarly the spatial pattern of precipitation, with excessive rain being observed only in mountainous locations; however, there are no representative stations to validate such precipitation amounts. The same behavior is observed for all model setups, although PP7 seems to produce higher amounts of precipitation during the examined 5-year period than PP2 and PP3. This fact could be related to the interaction of the microphysical scheme with the association of PBL (MYJ) scheme, which is in line with other studies with higher precipitation totals and more convective precipitation [88,89]. Additionally, for precipitation over areas of complex topography, wet bias is particularly found to be common to several RCMs [90,91], and is probably caused by an overestimation of orographic precipitation enhancement [92], and/or to an inaccurate PBL simulation [7,11,93]. The Taylor diagrams shown in Figure 3 provide the comparative assessment of the four different model experiments to the choice of the physical parameterizations, to simulate the seasonal spatial pattern of daily precipitation during the examined period. The simulated results are compared to all observational data from 23 stations. The best simulation is marked by the largest correlation, smaller CRMSE and being closer to the observed standard deviation. It is found that the highest correlation in the range of 0.5-0.6 is observed during the winter and spring seasons, with the poorest correlation in summer, resulting though in the lowest centered RMSE values (~3.5 mm). It is well-known that the satisfied representation of summer precipitation is a demanding field for any model, and as the convective processes prevail, it is not easy to determine confidently the appropriate cumulus scheme. The poor performance of model setups is observed during winter and autumn, where the highest CRMSE values are displayed. All models' simulations seem to have similar performance; however, PP3 appears to have slightly lowest errors during all seasons, thus yielding a better performance compared to the rest of the setups.
Because of the heterogeneous spatial pattern of precipitation in Greece, which is strongly associated with the orography, the extended coastline (see Figure 4) and the limited number of stations for comparison, statistical errors are also displayed per station in detail in Figure S4:  This analysis allows also exploring which of the simulations outperforms by station, which of the stations statistically is better validated from the WRF model, and finally which setup represents outmost the majority of the stations in Greece.
It is evident that the prevailing simulation with the lowest errors among stations is represented by PP3, showing a significant improvement regarding the others. In general, regional models still misrepresent daily precipitation and precipitation extremes because of resolution or/and parameterization deficiencies [3]. The precipitation analysis was extended on studying the intensity of daily precipitation; thus, the 80th, 90th, 95th and 99th higher ranking percentiles were calculated. This analysis is very important for climate change assessment related to extreme weather events, drought and flooding events, that have significant socioeconomic impacts on the global community. The results are shown in Figure 5, where the WRF percentiles distribution is plotted versus the observational data for the domain of Greece and several indicative stations distributed all over Greece during these 5 years. The straight (blue) line depicts a perfect performance, indicating the over-or under-estimation of the simulated values compared to the observations.
The percentiles, obtained for the area of Greece, show that all WRF simulations follow very well the observational percentiles. The PP3 model setup outperforms remarkably well the extreme percentile 99%, while the other setups tend to slightly overestimate rainfall for 90% and 95% extremes. A general inspection of the percentiles results by station shows that WRF dynamical downscaling simulations overestimate extreme precipitation events, with few exceptions regarding the 99% percentile, and also that PP3 model simulation tends to reproduce rainfall extremes better than the other setups. In Table 5, precipitation verification statistics are presented in the contingency  table for four distinct threshold values of precipitation compared to the 5-year observational data.  This table depicts the results by station, only for the best performed simulation PP3. The forecasts show reasonable skills for both low and medium intensity rainfall days, as the model runs show POD values of (0.7-0.86) and (0.6-0.85), respectively. Regarding extreme rainfall events, the majority of the stations indicate POD values close to 0.5-0.6, followed by FAR values of (0.5-0.7), meaning that a very low percentage of these rain events (observed and/or predicted) were correctly forecast. Probably the rare episode of convective precipitation is often missed or underestimated by the model and the convective scheme.

Maximum Temperature/Minimum Temperature
Annual and seasonal changes in the daily minimum (TN) and maximum (TX) temperatures for the selected period have been analyzed. In general, it was found that physics parameterizations appear to have less noticeable effect on temperature than on precipitation [56].  In Figure 6, the inter-annual cycle of daily-average minimum and maximum values of temperature by month is displayed for the total number of grid stations' points for the whole region of the study. The colored lines show the results of the simulations, and the dashed, black lines indicate observational data. In general, the observational seasonality is precisely captured during 2000-2004, while the summer/winter peaks are clearly identified as well. Similar representation and behavior of temperatures are observed regarding all physical schemes. Both temperature measures are in agreement over the study period, but WRF TX results are consistently colder, while TN is much warmer than observational data for all physical schemes. This bias appears to result mainly from the summertime over-prediction of daily-minimum temperature and summertime under-prediction of TX; daily-maximum biases tend to be smaller in magnitude and seasonally invariant, while the warm bias is mainly confined to the maximum temperatures. Additionally, the spatial distribution ( Figure S2) of the simulated 5-years mean that the daily TX is characterized normally by a warm decreasing gradient from the coasts and low altitudes regions to mountainous chains, verified by the weather station values in spite of the limited number of observational data. Similar results are observed for minimum temperature as well (not shown). ERA-interim also performed well the inter-annual cycle for both temperatures, indicating lower values than the observed and modeled values during summer maximum values in the case of maximum temperature. Table 6 presents the statistical metrics calculated for daily values. A high correlation coefficient of 0.96 is observed between the station and the simulated daily maximum temperature TX, with an overall negative BIAS from −1.1 to −1.4 °C, indicating a slightly better performance of the model for the PP3 (−1.06 °C) simulation. The RMSE and MAE errors have values close to 2.6 °C and 2 °C, respectively, with similar values found for all simulations. Regarding the daily TN, a high correlation coefficient of 0.92 is observed between observations and model data, with a consistent positive bias of around 2 °C, ~3.5 °C RMSE and ~ 2.7 °C MAE values. These findings are in good agreement with high resolution climate analysis for temperature by Berg et al. [4] for Germany, and little higher values regarding RMSE/MAE values, (especially in the case of TX BIAS, which is found negligible −0.4 °C) in a similar study of Soares et al. [3] for Portugal. The results showed an improvement in maximum temperature with respect to the ERA-Interim dataset, and higher bias regarding minimum temperature, but without significant discrepancies on the other statistical errors.
An overall performance of the simulations is illustrated in Figure 7 by Taylor plots for seasonal periods (winter, spring, summer and autumn -7a) and annual (7b) during 2000-2004. The diagrams for maximum temperature showed a good match between model results and observations at seasonal time scales. The lowest performance was obtained during the summer months, with correlations close to 0.75 that increase to 0.8 for PP5 simulation results. In addition, in the representation of metrics by station in Table 5, high RMSE values are observed, of about 3-3.5 °C for several stations. It was found in the initial study [26] that the cell of the certain model points that correspond to the location of the observed stations is characterized by the sea dominant land use category (e.g., Corfu, Heraklion, Mitilini, Argostoli etc.), and consequently during summer period could affect the results, with higher differences in temperature leading to stronger sea-land interaction in combination with the appearance of more intense thermal instability.
The correlations in the other seasons are much higher in the range of 0.9-0.95, and lower errors are observed with very similar values for all models. The Taylor diagram in a yearly time scale showed a very good agreement of models performance, with no distinct differences among them during the 5-year period. The correlation coefficient results showed a good match with values above 0.9 arising in the climatological study of Marta-Almeida et al. [94] for seasonal time scales in Spain.
Minimum temperatures showed a slightly lower performance than TX, with correlation values around 0.85 during spring, autumn and winter regarding all simulations results, while during summer months, a correlation lower than 0.8 is observed with no significant changes with respect to errors. A good agreement with observations is also found, and no significant statistical differences are yielded among PPs models for TN, as illustrated in the Taylor plot of Figure 7b with respect to the yearly timescale. The values of the correlations are higher than 0.9, as an increasing number of days were averaged. It is noticed that for all simulations the correlation coefficients of TN yield lower values (0.92-0.93) for high resolution domains, which is in agreement with other works [3,71]. It could be deduced that the model performs better for TN with the PP2 scheme with positive BIAS near 2 °C. Table 7 depicts the best setup based on the daily values of statistical metrics BIAS (not shown), RMSE, MAE and the correlation coefficient (not shown) for the daily maximum temperature separately for each of the 28 stations, during the 5 years and the four different experiments. These same calculations were derived as well as for TN (not shown). Some exceptions in Table 5 concern 6-7 stations, that their BIAS is in the range of 2-3.5 °C, and probably is related to the selection of the nearest model point to station that is not located in land cell, or displays a significant height difference. From this analysis, it is evident that the setup that statistically outperforms with the lowest errors among stations is PP3, showing a significant improvement regarding the others, and thus representing the majority of the stations in Greece. The range of errors appears to have values of 1.2 to 4 °C. It is also observed that the underestimation of the model is consistent for the entire area, and is close to −1.2 °C for the majority of the observational stations used in this comparison. The comparison of model results and observations in terms of probability distribution (see Figure 8) for some indicative stations (for all stations see Figure S3) shows satisfied agreement for the majority of the stations for the 5-year period daily TX, as all WRF simulations follow the pattern distribution of observed data without having distinct differences among setups.
In a few stations, a lower model observations correspondence is found. More specifically, the Argostoli, Corfu, Florina, Heraklion, Kithira, Methoni, Milos and Mitilini stations illustrate a large shift towards colder values in the medium temperature range with higher density values. In Chania station all simulations appear to have higher density values for hotter temperatures, while in Ierapetra, the opposite behavior is observed. On the other hand, the TN probability distributions of WRF simulations (not shown), appear to have a large shift towards hotter values in the temperature range corresponding to either higher or lower density. This behavior justifies the consistent model's overestimation, especially during the summer period.
Percentiles of TX and TN (the 1st, 5th, 25th, 75th, 95th and 99th) of daily values for the 28 stations, as well as for the total region of Greece were calculated, in order to focus on the examination of extremes description by the different simulations. Percentiles for the WRF simulations versus observational percentiles are shown indicatively in Figure 9 and Figure 10, for TX and TN for several stations, and their average through the domain of Greece. As in the case of precipitation, the over-or underestimation of the simulations is indicated by the blue line, which represents the perfect description. It is evident that the maximum temperature is very well reproduced by WRF, with no significant differences between the different simulations, and with slight underestimation mostly for percentiles higher than 50%. There are some stations, e.g., Argostoli or Mitilini, that appear to have larger deviations in the extreme percentiles, and others like Naxos that show very good performances in predicting the extremes. In accordance, regarding the minimum temperature, all models' setups indicate no significant differences in simulating percentiles as well, however in what concerns their behavior an almost systematic overestimation, is observed overall. Probably, as Pérez et al. [56] point out in the case study of the Canary Islands, these deviations could be due to the insufficient temperature correction on the representation of mountainous areas, because the altitude difference between the model and stations points has strong influence. From the set of the simulations carried out in this study, obvious similarities were found in the precipitation patterns among simulations and observations during the 5-yr period, verifying smoothly the precipitation's inter-annual variability, especially during the wettest months and summer periods. Additionally, the lowest positive percentage BIAS of about 19% was calculated for the selected combination of physics parameterizations PP3, while for the rest of the setups, values of over 40% were obtained, highlighting as the best configuration for the simulation of precipitation in the PP3 setup. Furthermore, all simulations regarding the maximum temperature demonstrated improved statistical metrics with respect to ERA-Interim data, and not significant differences in the case of the minimum temperature. Although some verification results among ERA-Interim and the WRF results indicated some loss of model performance for daily precipitation, the results demonstrated the necessity to use a high resolution RCM model for climate studies in Greece, due to the difficult geomorphology of the region (complex orography, irregular coastlines and regions with heterogeneous land cover).
Statistical analysis of the maximum and minimum temperatures for all tested simulations showed satisfactory results, characterized by a good match between modeled and observed data, with high correlation above 0.9, negative bias around 1-1.5 °C, and a positive bias of around 2 °C, respectively. Moreover, good performance was deduced with regards to the examination of extreme percentiles for temperatures and precipitation, with some deviation due to complex topography. PP3 showed a slightly better performance for the maximum temperature in the majority of the stations, while PP2 for the minimum temperature.
Given the small discrepancies between the results of these two setups in temperatures, and taking into account the noticeable difference in the results of PP3 for precipitation, we recommend PP3 as a good choice for the upcoming climate simulations. Several studies have supported the PP3 WRF setup (MYJ, WSM6, RRTMG, NOAH, BMJ) for climate or forecasting applications as overall, more balanced behavior is displayed for both surface variables; the annual precipitation cycle is captured adequately, and closer agreement with the observational datasets is found regarding temperatures and their extreme values [2,[37][38][39]44,88,94,95].
It should be mentioned that this study was based on previous research that has already examined a combination of physics parameterizations, and performed sensitivity tests for the area of interest, analyzing the effect of the chosen schemes; therefore an in-depth analysis of physical Greece scheme inter-comparison was not in the scope of the current work. The use of RCMs for the simulation of historical, current and future climate, particularly in view of the warming climate, is continuously increasing, as it is considered important for studying regional climate changes. It is considered important to emphasize that the current study aimed at identifying an appropriate WRF model set up in order to perform in the future high-resolution historical climatology simulation, by downscaling ERA-interim reanalysis to the domain of Greece. For the realization of the future work related to the hindcast evaluation of the WRF model, different interpolation methods will be considered for the determination of the nearest model points to the stations to decrease statistical errors, particularly for those located within coastal and mountainous areas. In the future analysis, we will also consider the application of bias correction methods to reproduce more reliable results regarding precipitation. Additionally, according to Emery [96], if the mean error for temperature exceeds the ±0.5 threshold, it is suggested to adjust bias correction methodologies, before the results are applied to other studies.

Conclusion
In this study, we performed a preliminary investigation on high resolution (5 km × 5 km) regional climatology over the domain of Greece, by downscaling 5-years ERA-Interim reanalysis data using the WRF model. The dynamical downscaling was performed using in total four different setups to investigate the model performance for precipitation and surface maximum and minimum temperature variables that are relevant to climate impact studies. The results were compared with the observational data of HNMS through detailed statistical analysis. As there were no available high resolution observational gridded datasets, the comparison of the simulations was realized through the closest model grid point of the inner domain to the station. In general, clear similarities were found in the precipitation patterns among simulations and observations, yielding the lowest positive percentage BIAS calculated at about 19% for the PP3 setup. Overall, this PP3 model simulation reproduced rainfall extremes better than the other setups, even though the model overestimated extreme precipitation events with few exceptions regarding the 99% percentile. The same conclusion was drawn for maximum and minimum temperatures, as PP3 simulations yielded the lowest statistical errors for the majority of the stations.
Within its limitations, the study advocates that the PP3 model setup, which corresponds to the combination of physics schemes, including WSM6, MYJ, and BMJ, is suitable for high resolution climate modeling studies (hindcast and future climate scenarios runs) for the domain of Greece, as the specific parameterization schemes simulate better the temperature and precipitation fields compared to the rest of the investigated setups.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, nor interpretation of data; in the writing of the manuscript, nor in the decision to publish the results.