VHR-REA_IT Dataset: Very High Resolution Dynamical Downscaling of ERA5 Reanalysis over Italy by COSMO-CLM

This work presents a new dataset for recent climate developed within the Highlander project by dynamically downscaling ERA5 reanalysis, originally available at ≃31 km horizontal resolution, to ≃2.2 km resolution (i.e., convection permitting scale). Dynamical downscaling was conducted through the COSMO Regional Climate Model (RCM). The temporal resolution of output is hourly (like for ERA5). Runs cover the whole Italian territory (and neighboring areas according to the necessary computation boundary) to provide a very detailed (in terms of space–time resolution) and comprehensive (in terms of meteorological fields) dataset of climatological data for at least the last 30 years (01/1989-12/2020). These types of datasets can be used for (applied) research and downstream services (e.g., for decision support systems).


Summary
The development of convection-permitting regional climate models (CP-RCMs, spatial resolution <4 km) is delivering a step change in the ability to understand past climate and future climate change at local scales, supporting the characterization of the extreme weather events that most impact society [1]. Some European initiatives (e.g., H2020 EUCP, CORDEX-FPS convection), as well as an increasing number of scientific works [2][3][4][5][6][7][8][9][10][11], have provided encouraging evidence that CP-RCMs improve the representation of hourly precipitation characteristics (i.e., diurnal cycle, spatial structure of precipitation, intensity distribution, and extremes [12]), bringing the simulated dynamics closer to reality. In addition, they have brought out the capacity of CP-RCMs to resolve surface heterogeneities (e.g., mountains, coastal regions, and urban areas [2,7,13]) and to better represent land-atmosphere feedback [14], which is necessary to preserve or amplify other extremes such as droughts or summer heat waves [15]. These improvements can also have a knock-on effect on other variables (e.g., energy fluxes such as latent heat and sensible heat, and soil moisture), which are usually scarcely monitored but of considerable interest for different types of applications. This is one of the major strengths of developing climate analyses on limited areas (e.g., specific countries) to support existing monitoring networks.
In this view, climate reanalysis represents a solution to ensure homogeneity and continuity of data for past and current climate, providing a comprehensive set of variables (in addition to traditional precipitation and temperature). Climate reanalysis "delivers a complete and consistent picture of the past weather" (https://climate.copernicus.eu/climate-reanalysis, accessed on 4 August 2021), relying on a numerical weather prediction model to assimilate historical observations (e.g., from satellite, in situ, multiple variables) that are not homogeneously distributed around the globe. Recently, the European Centre for Medium-Range Weather Forecasts (ECMWF) released the ERA5 reanalysis, currently representing the most plausible description of the climate [16]. It has global coverage with a native spatial resolution of 0.28° (≃31 km) and provides outputs at an hourly scale from 1950 to the present (with a latency of five days). Such features make ERA5 suitable for a wide range of applications, e.g., monitoring climate change, research, education, policy making and business, and in sectors such as renewable energy and agriculture [17].
A step forward is to bring the potential of ERA5 to the convection-permitting (CP) scale, with the aim of synergistically exploiting CP-RCM features and ERA5 reliability and creating a new very high resolution (VHR) climate dataset for past climate, using a model setup specific to the areas of interest. This is the rationale adopted in the framework of the HIGHLANDER project (https://highlanderproject.eu/, accessed on 4 August 2021) to create a new additional gridded dataset over Italy, labelled VHR-REA_IT (Very High Resolution REAnalysis for ITaly), derived from the dynamical downscaling of the ERA5 reanalysis from its native resolution (≃31 km) to a resolution of ≃2.2 km for the period 1989-2020.
The downscaling activity was performed by the Centro euro-Mediterraneo sui Cambiamenti Climatici (CMCC) Foundation exploiting the Consorzio Interuniversitario del Nord-Est per il Calcolo Automatico (CINECA) supercomputer cluster GALILEO. The outputs have been stored as NetCDF [18] files at CMCC Supercomputing Center facilities and they have been integrated into the CMCC data delivery system (DDS) (http://dds.cmcc.it, accessed on 4 August 2021). Through the DDS web user interface (UI), users can easily build queries related to the VHR-REA_IT dataset, choosing from a list of available variables and selecting the geographical area of interest or a location and/or the time period; then, according to the selected criteria, users can retrieve data by using the DDS Python client (https://anaconda.org/Fondazione-CMCC/ddsapi, accessed on 4 August 2021). Data are also available on the Highlander platform (https://highlanderproject.eu/data, accessed on 4 August 2021) and can be accessed in a similar manner.
Data 2021, 6, 88
The present work reports a general description of the VHR-REA_IT dataset, explaining the data production steps and the outputs included in the dataset, and providing some insights from a comparison with state-of-the-art datasets available over Italy (i.e., E-OBS gridded observations [19] and the ERA5 parent reanalysis) to give a reference for potential users.

Data Production
Data are produced by dynamical downscaling of ERA5 reanalysis at the convection-permitting scale (horizontal grid spacing 0.02°, ≃2.2 km) over the domain covering the Italian Peninsula (Lon = 5° W-20° E; Lat = 36° N-48° N) for the period from January 1989 to December 2020; the simulation starts in 1988, which is treated as a spin-up year. The downscaling is performed with the regional climate model COSMO in CLimate Mode (COSMO-CLM) [20], switching on the TERRA-URB module [21] to account for urban parameterizations.
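As a quick plausibility check on the resolutions quoted above, the grid spacings in degrees can be converted to approximate distances. The sketch below assumes a spherical Earth with a mean radius of 6371 km and treats each spacing as a great-circle arc; the function name `degrees_to_km` is illustrative and not part of any COSMO-CLM tooling.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def degrees_to_km(spacing_deg: float) -> float:
    """Length in km of a great-circle arc spanning `spacing_deg` degrees."""
    return math.radians(spacing_deg) * EARTH_RADIUS_KM


era5_km = degrees_to_km(0.28)  # ERA5 native spacing quoted in the text
vhr_km = degrees_to_km(0.02)   # VHR-REA_IT spacing quoted in the text

print(f"ERA5: {era5_km:.1f} km, VHR-REA_IT: {vhr_km:.1f} km")
print(f"refinement factor: {era5_km / vhr_km:.0f}")
```

Under these assumptions the conversion reproduces the ≃31 km and ≃2.2 km figures and gives a refinement factor of about 14 between the driving reanalysis and the downscaled product.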
COSMO-CLM [20] is a non-hydrostatic, limited-area model designed for climate simulations at different horizontal resolutions varying from the meso-β scale (~20-200 km) to the meso-γ one (~2-20 km). Such a model exploits finite difference methods to solve the fully compressible governing equations of fluid dynamics on a structured grid. TERRA-URB [21] is a bulk scheme tailored for properly parameterizing urban physics in COSMO-CLM. Such a scheme makes use of a tile approach to discern for each grid cell between urban canopy and natural land cover and computes adjusted soil and water fluxes considering urban environment features. Figure 1 displays the computational domain used for the downscaling activity while Table 1 reports the main features of the experimental configuration.
The configuration derives from the COSMO-DE setup used by the Deutscher Wetterdienst (DWD) for numerical weather prediction applications. It has also been adopted by several institutes acting in the Climate Limited-area Modelling Community as a reference for climate-mode experiments in the frame of the Coordinated Regional Climate Downscaling Experiment (CORDEX) [28,29] of the World Climate Research Programme (WCRP) for the Flagship Pilot Study (FPS) on convection [4]. This FPS focuses on the investigation of convective-scale events in a few key regions of Europe and the Mediterranean basin with convection-permitting regional climate models. To produce the VHR-REA_IT dataset, the configuration was tuned through a series of sensitivity tests.
Formally, the default COSMO convective parameterization is the Tiedtke mass-flux scheme with moisture convergence closure [23]. Such a scheme distinguishes between shallow, deep, and midlevel convection. In the convection-resolving setup (i.e., that used for ERA5@2km), only the shallow convection part of the scheme is active, while for deeper clouds the scheme is turned off.
The setup was borrowed from the ERA5 evaluation downscaling experiments performed by Raffa et al. [10] over part of central Europe, including urban areas such as Cologne (Germany) and Paris (France). These experiments were performed to identify the most reliable nesting strategy for localizing the ERA5 climate signal at a convection-permitting scale (≃2.2 km) with COSMO-CLM. These sensitivity tests highlighted the advantages of direct nesting into ERA5 over the adoption of intermediate simulations.

Computing Resources
The climate run was performed by CMCC on the GALILEO supercomputer of CINECA, the Italian computing centre. CINECA, coordinator of the Highlander project, designed, set up, and made available all the necessary HPC and cloud infrastructure. The HPC system is equipped with 1022 36-core compute nodes, each containing two 18-core Intel Xeon E5-2697 v4 (Broadwell) processors at 2.30 GHz and 128 GB of memory.
The long-term run used 60 nodes, corresponding to 2160 cores, and took about 61 h to complete a 1-year simulation. It produced a large amount of data: 8 TB of output data and more than 70 TB of forcing data, including the three-dimensional boundary data needed for the downscaling.
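The figures above allow a rough estimate of the overall computational cost. The sketch below is only back-of-the-envelope arithmetic: it assumes that the 61 h refers to elapsed (wall-clock) time, that every simulated year costs the same, that the years were run one after another, and that the run spans 1988-2020 including the spin-up year. None of these assumptions is stated explicitly in the text.

```python
NODES = 60
CORES_PER_NODE = 36                 # two 18-core processors per node
HOURS_PER_SIM_YEAR = 61             # assumption: wall-clock hours per simulated year
FIRST_YEAR, LAST_YEAR = 1988, 2020  # assumption: spin-up year included

cores = NODES * CORES_PER_NODE
core_hours_per_year = cores * HOURS_PER_SIM_YEAR
n_years = LAST_YEAR - FIRST_YEAR + 1
total_core_hours = core_hours_per_year * n_years
# Elapsed time if all years were run strictly in sequence (assumption).
total_wallclock_days = HOURS_PER_SIM_YEAR * n_years / 24

print(f"{cores} cores, {core_hours_per_year} core-hours per simulated year")
print(f"{n_years} years -> {total_core_hours} core-hours, "
      f"about {total_wallclock_days:.0f} days of elapsed time")
```

Under these assumptions the whole run costs on the order of four million core-hours, which gives potential users a sense of why such a dataset is not easily regenerated.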

Data Records
Hourly data from the downscaling of ERA5 reanalysis over Italy at very fine resolution are on a rotated grid (≃2.2 km, irregular/rotated pole grid). Their temporal coverage is 01/01/1989 00:00 to 31/12/2020 23:00. These data are provided in NetCDF format (dimensions = time, longitude, latitude, single vertical level) and generally on single levels (i.e., 2 or 10 m from surface depending on the selected variables) except soil moisture, which is available at seven soil levels (i.e., depth = 1, 3, 9, 27, 81, 243, or 729 cm from surface). The reference coordinate system is WGS84 (EPSG 4326). This information is summarized in Table 2. In addition, Table 3 lists the variables provided by this dataset with a short description to support the users in interpreting the conventions of the meteorological fields. Such variables were selected for specific use cases to be conducted in the HIGHLANDER project, in which the time series of climate data will be further post-processed (e.g., spatio-temporal aggregation, combination into indices) and used, for instance, as input to indicators and/or impact models concerning animal and human wellbeing, crop water requirements, land suitability for forests and crops, surface water availability and variability, and soil erosion (advancing what was done in [30]).
Table 3 (only the following entries are recoverable here):
• [variable name not recoverable]: Thermal radiation (also known as longwave or terrestrial radiation) refers to radiation emitted by the atmosphere, clouds and the surface. This parameter is the difference between downward and upward thermal radiation at the surface.
• Surface snow amount (W_SNOW; units: m): liquid water equivalent thickness of surface snow amount.
• Soil (multi levels) water content (W_SO; units: m): liquid water equivalent thickness of moisture content of soil layer.
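Two properties of the data records can be cross-checked with a few lines of code: the length of the hourly time axis implied by the stated coverage, and the regular spacing of the soil levels, whose depths are successive powers of three. The sketch below assumes a standard Gregorian calendar and no missing time steps; it does not read the actual NetCDF files.

```python
from datetime import datetime, timedelta

# Temporal coverage as stated in the text.
start = datetime(1989, 1, 1, 0)
end = datetime(2020, 12, 31, 23)

# Number of hourly steps, both endpoints included.
n_hours = (end - start) // timedelta(hours=1) + 1

# The seven soil depths listed in the text are 3**k cm for k = 0..6.
soil_depths_cm = [3 ** k for k in range(7)]

print(n_hours)
print(soil_depths_cm)
```

A user who downloads the full period for one variable and one grid point can therefore expect a series of 280,512 hourly values, provided the assumptions above hold.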

Notes on the Comparison of VHR-REA_IT against Gridded Observations and Other Reanalysis
In this section, the VHR-REA_IT dataset is compared with a reference gridded observational dataset available over Europe, known as E-OBS [19,31], and with the parent ERA5 reanalysis [16]. The comparison is conducted in terms of precipitation and temperature at a daily scale for the period 1989-2020, for mean tendencies and extremes.
In brief, E-OBS represents a daily gridded European land-only observational dataset at a horizontal resolution of 0.1° (~11 km) relying on the "blended" time series from the station network of the European Climate Assessment & Dataset (ECA&D) project. It contains data for precipitation amount, mean/maximum/minimum temperature, relative humidity, sea level pressure, and surface shortwave downwelling radiation. Its latest version (Volume 23), delivered by Copernicus Climate Data Store, covers the period 1950-2020.
While E-OBS represents a valuable resource for climate research in Europe, the dataset has some limitations, and taking it as a reference does not mean that it represents reality but rather that it is a useful independent product for comparison purposes [32]. Indeed, E-OBS data are affected by the same constraints and limitations [9,33] that are typical of observational gridded datasets. Concerning precipitation, an averaging effect in precipitation magnitude (i.e., the lower the spatial resolution, the larger the smoothing effect) could occur, as well as underestimations at high elevation due to the precipitation lapse rate not being properly accounted for or induced by station sparseness. For E-OBS, these limitations are recognized as underestimation (typically 10-20%) at high intensities (smoothing effect) and overestimation at low intensities (moist extension into dry areas), while systematic errors are more substantial for convective rainfall [33]. Moreover, by inspecting the spatial distribution of stations for Italy as reported in Cornes et al. [19], we can note how northern Italy and the Po Valley are well covered by stations for precipitation, while the rest of the territory is characterized by a scarcity of measurements; this scarcity is even more pronounced for temperature stations.
As a spatial reference, the Nomenclature of Territorial Units for Statistics (NUTS) classification is used to subdivide Italy into specific areas. Two levels of NUTS are considered: the former represents the whole Italian territory; the latter divides the Italian territory into five sub-areas (i.e., Northwest Italy, Northeast Italy, Central Italy, South Italy, and Insular Italy), identified in Figure 1b.
Specifically, the following issues are analyzed:
• 2 m temperature and total precipitation; this provides a general overview of the reliability of the newly produced data in terms of mean patterns;
• a set of climate indicators related to extremes derived from a core set of extreme indices for temperature and precipitation provided by the experts of the CCl/CLIVAR/JCOMM Team on Climate Change Detection and Indices (ETCCDI), along with some relevant percentiles (see Table 4 for indicators related to precipitation and Table 5 for indicators related to temperature); these indicators turn data produced by climate models into significant information for impact studies, highlighting various characteristics of extremes, including frequency, amplitude, and persistence.
Operatively, 2 m temperature and total precipitation are investigated at a seasonal scale, while the ETCCDI indices are calculated on a yearly basis. For these variables, data are first computed on a yearly basis and then averaged to obtain a climatological mean. Conversely, percentiles are obtained from the distribution representing the whole period.
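The two aggregation paths described above (a yearly index averaged into a climatological mean, versus a percentile taken from the whole-period distribution) can be illustrated with a minimal sketch. It uses Rx1day, the annual maximum 1-day precipitation, which is a standard ETCCDI index; whether it is among the indicators of Table 4 is not shown here, and the daily values and the choice of the 90th percentile are hypothetical.

```python
from statistics import mean, quantiles

# Hypothetical daily precipitation (mm/day) for one grid cell, keyed by year.
daily_precip = {
    1989: [0.0, 2.0, 10.0, 0.0, 4.0],
    1990: [1.0, 0.0, 20.0, 3.0, 0.0],
    1991: [0.0, 0.0, 6.0, 12.0, 0.0],
}

# Path 1: compute the index year by year, then average over the years.
rx1day_by_year = {year: max(values) for year, values in daily_precip.items()}
rx1day_climatology = mean(rx1day_by_year.values())

# Path 2: pool all days of the whole period, then take the percentile.
pooled = [v for values in daily_precip.values() for v in values]
# quantiles(..., n=100) returns the 99 percentile cut points; index 89 is the 90th.
p90 = quantiles(pooled, n=100, method="inclusive")[89]

print(rx1day_by_year, rx1day_climatology, p90)
```

The distinction matters because a percentile of the pooled distribution is not, in general, equal to the average of yearly percentiles.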
Temperature
Figure 2 shows the seasonal spatial distribution of 2 m temperature computed for the period 1989-2020 with E-OBS, ERA5, and VHR-REA_IT (the first three columns) on their respective native grids. The same figure also depicts the seasonal spatial distribution of the 2 m temperature bias of ERA5 and VHR-REA_IT (the last two columns), assuming E-OBS as a reference. For bias representation, data have been interpolated onto a coarser grid (i.e., the one of ERA5) considering a constant lapse rate of −6.5 K·km⁻¹ to account for differences in the elevation of grid cells.
To provide a more comprehensive overview, Table 6 summarizes the results for different areas, indicating how the spatial patterns of simulated data are related to observations. Similarity between observed and simulated spatial fields is summarized by assessing the model's ability to reproduce the spatial mean value (overall bias) and the spatial variability (ratio between the standard deviations of simulated data and observations, σmod/σobs).
Table 6. Seasonal 2 m temperature analysis for the period 1989-2020 provided by E-OBS, ERA5, and VHR-REA_IT. Data are aggregated over Italy and over the five subareas identified in Figure 1b. For E-OBS, reference values are reported in italic and bold (units = °C); for ERA5 and VHR-REA_IT, bias (mod-obs) and ratio between the standard deviations (σmod/σobs) are reported. The colors are used to classify differences.
From a spatial viewpoint, the distribution of temperatures in all seasons shows a relatively cool climate for the inland northern areas of Italy and a typical Mediterranean regime for the other areas. Specifically, mean observed temperatures across Italy are 5.3 °C for DJF, 11.7 °C for MAM, 21.4 °C for JJA, and 13.9 °C for SON. VHR-REA_IT amplifies these values in spring, summer, and autumn, especially in some specific hotspot areas of Italy such as the Po Valley (see Figure 2 during summer).
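The elevation adjustment applied before computing the temperature bias can be sketched as follows. This is a minimal illustration, assuming the correction is simply the constant lapse rate multiplied by the elevation difference between the source cell and the target cell; the exact order of interpolation and correction used for Figure 2 is not detailed in the text, and the function name and sample values are hypothetical.

```python
LAPSE_RATE_K_PER_KM = -6.5  # constant lapse rate quoted in the text


def adjust_to_elevation(temp_c: float, source_elev_m: float,
                        target_elev_m: float) -> float:
    """Shift a 2 m temperature from the source elevation to the target one."""
    dz_km = (target_elev_m - source_elev_m) / 1000.0
    # Moving downhill (dz < 0) warms the air, moving uphill cools it.
    return temp_c + LAPSE_RATE_K_PER_KM * dz_km


# Hypothetical example: a fine-grid cell at 1500 m compared with a
# coarse-grid cell whose mean elevation is 1000 m.
adjusted = adjust_to_elevation(4.0, source_elev_m=1500.0, target_elev_m=1000.0)
print(adjusted)
```

Without such an adjustment, part of the apparent bias over mountains would simply reflect the different orography of the grids rather than a model error.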

By looking at the summarized statistics reported in Table 6, ERA5 returns a slight bias against observations except for the winter season, where a cold bias (−0.5 °C) arises. It is interesting to note how this cold bias is emphasized in areas characterized by complex orography (e.g., Northwest Italy and Northeast Italy, where the Alpine region represents most of the area). VHR-REA_IT also shows a cold bias (−0.7 °C) during winter but, in contrast to ERA5, this underestimation is negligible in north-eastern Italy (−0.1 °C) and most relevant in other areas, in particular in southern Italy and the insular area (−1 °C). On the other hand, VHR-REA_IT amplifies temperature during summer, resulting in a warm bias (+1.9 °C), with a maximum in Northeast Italy (+2.1 °C) and a minimum in Insular Italy (+1.2 °C). The transitional seasons (autumn and spring) show a slight warm bias, which is driven by temperatures during early autumn and late spring.
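The two summary statistics reported in Table 6, the overall bias (mod-obs) and the ratio σmod/σobs, can be computed as in the sketch below. It assumes that both fields are already on the same grid, that all cells of an area carry equal weight, and that the population standard deviation is used; the text does not specify any area weighting, and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def spatial_bias(mod: list[float], obs: list[float]) -> float:
    """Difference between the spatial means of model and observations."""
    return mean(mod) - mean(obs)


def std_ratio(mod: list[float], obs: list[float]) -> float:
    """Ratio of spatial standard deviations, sigma_mod / sigma_obs."""
    return pstdev(mod) / pstdev(obs)


# Hypothetical seasonal-mean temperatures (degrees C) over four cells of one area.
obs = [4.0, 5.0, 6.0, 7.0]
mod = [3.0, 4.5, 5.5, 7.0]

bias = spatial_bias(mod, obs)
ratio = std_ratio(mod, obs)
print(f"bias = {bias:+.2f} C, sigma ratio = {ratio:.2f}")
```

A ratio close to 1 indicates that the model reproduces the observed spatial variability, while values above or below 1 point to an over- or underestimated spatial contrast.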
In general, VHR-REA_IT benefits, with respect to ERA5, from the use of CPS, mainly through the refinement of the orography (as is noticeable from an analysis of the temperature in northern Italy, especially in the northwest or in winter). Further investigations will be performed in this sense. Finally, in terms of spatial variability, the analysis of standard deviation returns a good agreement with observations. This is expected, as temperature is quite a homogeneous field.
Precipitation
Figure 3 depicts the seasonal spatial distribution of daily precipitation computed for the period 1989-2020 with E-OBS, ERA5, and VHR-REA_IT (the first three columns) on their respective native grids. The same figure also shows the seasonal spatial distribution of the daily precipitation bias of ERA5 and VHR-REA_IT (the last two columns), assuming E-OBS as a reference. In this case, no correction for elevation has been applied.
As for temperature, to support the validation activity and provide a more comprehensive overview, a summary of results for different areas is reported in Table 7.
Table 7. Seasonal mean precipitation analysis for the period 1989-2020 provided by E-OBS, ERA5, and VHR-REA_IT. Data are aggregated over Italy and over the five subareas identified in Figure 1b. For E-OBS, reference values are reported in italic and bold (units = mm/day); for ERA5 and VHR-REA_IT, percent bias (100 * (mod-obs)/obs) and ratio between the standard deviations (σmod/σobs) are reported. The colors are used to classify differences.

[Figure 3 caption, partially recovered] ... daily precipitation bias for the period 1989-2020 provided by ERA5 and VHR-REA_IT (in column), assuming E-OBS as a reference.
Mean observed precipitation is generally low during summer (1.44 mm/day) and high during autumn (3.02 mm/day). As shown in Table 7, these values differ among the various areas; specifically, the northern part also exhibits high precipitation during summer, while the southern part and the insular area are rather dry. In comparison with observations, ERA5 tends to overestimate precipitation across Italy, especially during spring and summer; conversely, the spatial refinement of ERA5 at 2.2 km reduces this wet bias, providing values in line with observations, with some exceptions in central, southern, and insular Italy. It is of interest to note how VHR-REA_IT increases summer precipitation against ERA5, as expected for CP models, except in Northwest Italy and Central Italy. This is especially evident in Southern Italy (bias = +86%). Moreover, in terms of spatial variability, the ratio between the standard deviations of simulated data and observations increases moving from ERA5 to VHR-REA_IT, as the refinement of spatial resolution reduces the smoothing of precipitation.
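The link between resolution and spatial variability can be made concrete with a toy example. The sketch below block-averages a hypothetical fine-resolution precipitation transect, which mimics what a coarser grid sees, and then evaluates the percent bias, 100 * (mod-obs)/obs, and the standard deviation ratio of the coarse field against the fine one. The one-dimensional transect, the values, and the block size are illustrative assumptions and do not reproduce the actual regridding of any dataset.

```python
from statistics import mean, pstdev


def block_average(field: list[float], block: int) -> list[float]:
    """Replace each run of `block` cells by its mean (a crude coarse grid)."""
    out: list[float] = []
    for i in range(0, len(field), block):
        chunk = field[i:i + block]
        out.extend([mean(chunk)] * len(chunk))
    return out


# Hypothetical fine-resolution precipitation (mm/day) along a transect.
fine = [0.0, 8.0, 1.0, 3.0, 0.0, 12.0, 2.0, 2.0]
coarse = block_average(fine, block=4)

percent_bias = 100 * (mean(coarse) - mean(fine)) / mean(fine)
sigma_ratio = pstdev(coarse) / pstdev(fine)
print(f"percent bias = {percent_bias:.1f}%, sigma ratio = {sigma_ratio:.3f}")
```

In this example the area mean is preserved, so the percent bias is zero, while the spatial standard deviation collapses: averaging over larger cells removes the local peaks, which is the smoothing effect referred to above.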

Climate Indicators Evaluation
This section provides an overview of extremes based on the climate indicators listed in Tables 4 and 5. Specifically, these indicators are computed with E-OBS, ERA5, and VHR-REA_IT for each grid cell, considering the datasets at their native resolution, and then aggregated over different spatial units (i.e., over Italy and over the five subareas reported in Figure 1b). The results are reported in Table 8 for indicators derived from temperature and Table 9 for indicators derived from precipitation. In general, the analysis of extreme climate indicators reflects the tendency of VHR-REA_IT to amplify climate dynamics due to the spatial resolution refinement.

User Notes
The dataset VHR-REA_IT aims to provide a set of unprecedented high-quality and very high-resolution historical climate data. It allows users to assess recent trends in average and extreme climatic conditions that led to numerous cascading hazards over the land surface and connected sectors. Typical use of this dataset is research and downstream services, e.g., for decision support systems in different sectors highly affected by changes in climate trends, variability, and extreme events, as in the case of Italy. For example, starting from the time series of climate data, process-based hydrological modelling can be applied to simulate water cycle components, like evapotranspiration, soil moisture, and runoff up to discharge, and to produce indicators of meteorological-hydrological-agricultural drought attributes, i.e., frequency, magnitude, duration, and timing. The same time series can drive crop or forest growth models assessing vegetation productivity through reproduction of carbon, water, and energy exchanges, as well as feed fire hazard indicators and fire behavior simulations. Last but not least, diurnal, seasonal to interannual variability of extreme conditions can help in assessing changes to dangerous conditions for people and animals, allowing us to discriminate between rural and urban environments thanks to the high spatial resolution implemented.
The VHR-REA_IT dataset features some limitations that must be properly considered for correct use. Although it is obtained by dynamically downscaling a reanalysis (i.e., ERA5), some biases may be noticed due to the absence of a data assimilation procedure, which, considering the resolution of this new dataset, is hard to include over all domains with the same characteristic.
In this sense, it is also important to stress how the use of an urban parameterization such as TERRA-URB correctly leads to an increase in temperature over urban centers. Such an increase is hard to detect in the ERA5 reanalysis or in E-OBS observations because of their resolution (at least five times coarser for E-OBS and 15 times coarser for ERA5). It could be evaluated against observations provided by urban meteorological stations. However, these measurements are hard to retrieve, as synoptic stations are often used.
To sum up, some biases may be encountered; they should be evaluated against point (station) observations and appropriately removed through bias correction procedures before the data are used to feed impact models.
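As an illustration of what a bias correction step may look like, the sketch below applies a simple additive mean-bias removal: the mean difference between model and observations over a calibration period is subtracted from the model values. This is only one of the simplest possible procedures and is not prescribed by the text; it corrects the mean but not the variance or the extremes, for which distribution-based methods such as quantile mapping are commonly used. All names and values are hypothetical.

```python
from statistics import mean


def mean_bias(model_calib: list[float], obs_calib: list[float]) -> float:
    """Mean model-minus-observation difference over a calibration period."""
    return mean(model_calib) - mean(obs_calib)


def remove_bias(model_values: list[float], bias: float) -> list[float]:
    """Subtract a constant bias from every model value."""
    return [value - bias for value in model_values]


# Hypothetical summer temperatures (degrees C) at one station location.
obs_calib = [20.0, 22.0, 24.0]
model_calib = [22.0, 23.5, 26.5]

bias = mean_bias(model_calib, obs_calib)
corrected = remove_bias(model_calib, bias)
print(bias, corrected)
```

After the correction the model mean over the calibration period matches the observed mean by construction; an independent validation period would be needed to judge how well the correction transfers.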

Data Availability Statement:
The study produced an extensive dataset that can be accessed at https://doi.org/10.25424/cmcc/era5-2km_italy.