Dynamical Downscaling of ERA5 Data on the North-Western Mediterranean Sea: From Atmosphere to High-Resolution Coastal Wave Climate

: A 29-year wind/wave hindcast is produced over the Mediterranean Sea for the period 1990–2018. The dataset is obtained by downscaling the ERA5 global atmospheric reanalyses, which provide the initial and boundary conditions for a numerical chain based on limited-area weather and wave models: the BOLAM, MOLOCH and WaveWatch III (WW3) models. In the WW3 computational domain, an unstructured mesh is used. The variable resolutions reach up to 500 m along the coasts of the Ligurian and Tyrrhenian seas (Italy), the main objects of the study. The wind/wave hindcast is validated using observations from coastal weather stations and buoys. The wind validation provides velocity correlations between 0.45 and 0.76, while signiﬁcant wave height correlations are much higher—between 0.89 and 0.96. The results are also compared to the original low-resolution ERA5 dataset, based on assimilated models. The comparison shows that the downscaling improves the hindcast reliability, particularly in the coastal regions, and especially with regard to wind and wave directions.


Introduction
Accurate and reliable knowledge of nearshore wave climate is of paramount importance. The increasing urbanization and the exposure to flooding of coastal areas produce significant socio-economic consequences, both at local and regional scales [1,2].
A number of coastal applications, including extreme value analysis [3], coastal vulnerability assessment [4,5] and engineering design [6,7], are based on long-term time series of wave climate, which can be produced by wave hindcasts when measurements are missing or not sufficient. Furthermore, the evaluation of the spatial and temporal availability of wave energy, aimed at the exploitation of the resource, both inshore and offshore, is carried out using large databases produced by wave hindcasts. The same source of information can be used as a basis to study long-term morphodynamic processes [8] and/or to force nearshore hydrodynamic models for several purposes, including sediment transport and water quality applications [9].
The main contribution of wave hindcasts is a reliable and uniform dataset of wave climate values. This is particularly true for the coastal seas, where observations are much scarcer than those available for the atmosphere. Several wave climate databases are currently available [10][11][12][13][14]. Such datasets are forced using wind data obtained from different atmospheric reanalyses.
In most cases, winds from ERA-40 [15], ERA-Interim [16] and Climate Forecast System reanalysis (CFSR) [17] data are employed to force wave models; the former is produced by the European Centre for Medium-Range Weather Forecast (ECMWF), the latter by the National Centers for Environmental Predictions (NCEP). Stopa and Cheung [11] found that the ERA-Interim dataset is more homogeneous and has better error metrics than the CFSR one; however, CFSR data are preferable when dealing with extremes, both for wave height and wind speed. Lavidas et al. [13] and Stopa [14], who revised an even larger number of available reanalyses, drew similar conclusions. Recently, the novel fifth generation atmospheric reanalysis ERA5 dataset [18,19], produced by ECMWF and distributed through the Copernicus Service, is raising the standard of reanalysis products. Indeed, it performs better than the Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA-2) data [20] with respect to wind power assessments [21]. It also increases the accuracy of wind and wave data, especially in coastal areas, and more accurately represents extreme events such as tropical cyclones with respect to ERA-Interim [22][23][24]. This is because the advances in model formulations and technological capabilities allow a considerable increase in spatial resolution (from 80 km of the ERA-Interim to 31 km), hourly high-resolution temporal outputs and 3-hourly uncertainty information, obtained from the underlying ten-member ensemble system [19].
Regional scale datasets are built by downscaling wind reanalysis, by means of multiple nested grids [29,32,[35][36][37] implemented on the Wavewatch III (hereinafter WW3) [38] and/or Simulating WAves Nearshore (SWAN) [39] models, or employing unstructured grids with increasing resolution where needed, using the TOMAWAC model [30,34,40]. The range of the maximum resolutions adopted by these wave model settings spanned between 10 km to 500 m, depending on the spatial extent covered by the domain and the degree of complexity of the involved morphology. However, authors who employed grids coarser than 2 km suggest the use of unstructured grids to accurately resolve coastal features, principally in presence of islands due to their sheltering effect [26,27,32]. Furthermore, multiple nested structured grids may simplify the analysis at the expense of focusing exclusively on specific and restricted areas and the occurrence of potential errors due to internal boundary conditions [41]. Covering a whole coastline at a regional scale O(10 2 ) km (e.g., Tuscany region) would require a multiple nest configuration, considerably increasing the computational effort. On the other hand, an unstructured grid allows one to consider large domains with high-resolution coastal areas, saving computational time and avoiding boundary condition errors [41].
When tackling long-term wave hindcasts, it is challenging to find the trade-off between adequate spatial resolutions and the available computational resources, even for a relatively small area such as the Mediterranean Sea. In fact, the Mediterranean basin is a marginal sea characterized by a complex morphology and climate-i.e., very irregular coastlines, extremely variable bathymetry and restricted fetches. These conditions require wave models able to describe wave formation and propagation in both deep and shallow water and atmospheric models capable of reproducing mesoscale patterns to acquire reliable wind inputs. Tiberi-Wadier et al. [34] made an effort to fulfil these requirements building the ANEMOC-2 database, covering the period from 1979 to 2010 for the Atlantic Ocean and the Mediterranean Sea. The latter is modelled by means of an unstructured grid with resolution ranging from 8 km to 800 m toward the coast and wind forcing at a resolution of 0.312 • x 0.312 • , provided hourly by the CFSR reanalysis.
In the present work, we push forward the development of wave hindcast by performing a two-level nested atmospheric dynamical downscaling [42] of the ERA5 reanalyses, which force an unstructured grid wave model for the Mediterranean Sea. In the following, we describe the modelling chain to downscale the ERA5 atmospheric dataset to the coastal wave climate. To produce the wind forcing, we implemented a nested configuration using the BOLAM [43] and MOLOCH [44] atmospheric models. The state-of-the-art third-generation WW3 unstructured grid model, version 5.16 [45], is implemented on a high-resolution mesh, up to 500 m in the coastal area of Tuscany Archipelago and surrounding areas (North-Western Mediterranean Sea). The wave hindcast calibration is carried out with buoy data for several extreme events and the wind/wave hindcast validation is focused on the North-Western Mediterranean Sea. Results are compared to ERA5 wind and wave products to show the improvement gained through the downscaling procedure.

Atmospheric Forcing
To provide atmospheric forcing data to the wave model, a dynamical downscaling of the ERA5 reanalysis data was implemented through a nested domain configuration based on the BOLAM and MOLOCH models, which are limited-area numerical weather models, developed at the Institute of Atmospheric Sciences and Climate of the Italian National Research Council (CNR). They were initially developed for research purposes, but today are being used operationally by various regional meteorological services both in Italy and abroad. Davolio et al. [46] provides a list of the several applications over which the BOLAM and MOLOCH models are implemented.
BOLAM is a primitive equations hydrostatic model with parameterized convection (using a modified version of the scheme proposed in Kain [47]). In our work, it was employed with a grid spacing of approximately 7 km to provide lateral boundary conditions to MOLOCH every hour. MOLOCH is a nonhydrostatic, fully compressible model that uses a hybrid terrain-following coordinate, relaxing smoothly to horizontal surfaces. The microphysical scheme is an upgrade of the parameterization proposed by Drofa and Malguzzi [48], which describes the interactions of cloud water, cloud ice, rain, snow and graupel. In this study, the grid spacing of the MOLOCH model is about 2.5 km and the model was set to allow the explicit treatment of convective processes. We used the model version released in late 2017. Some basic settings about the model implementations and references to the adopted schemes are summarized in Table 1. Further and more in-depth descriptions about the physics of the BOLAM and MOLOCH models are found in the study of Buzzi et al. [43]. Daily data of the high-resolution atmospheric hindcast were produced as follows: every day at 18 UTC, a BOLAM simulation was performed using ERA5 data of initial conditions, while boundary conditions were provided every 6 h for the following 30 h. The domain of integration is shown in Figure 1 (outer rectangle) and it approximately covers the Med-CORDEX domain [53]. Hourly outputs from the BOLAM simulation provide the initial and boundary conditions to the MOLOCH simulation, which starts each day at 21 UTC and has a forecast length equal to 27 h. The MOLOCH model produces outputs every hour over the domain of integration shown in Figure 1 (inner rectangle).  The daily data of the BOLAM/MOLOCH hindcast were built using the last 24 h of the two model simulations, while the first six and three hours of integration of the BOLAM and MOLOCH models, respectively, were considered as spin-up times and thus discarded. The numerical simulations were carried out for the period 1990-2018. A scheme of the numerical chain is shown in Figure 2.
The wind hourly results of the atmospheric downscaling were used to force the WW3 model. To obtain a single gridded forcing field for the unstructured wave model on the whole Mediterranean Sea, at the best possible resolution, the data from BOLAM and MO-LOCH were merged together. More precisely, a 2.5 km grid was built over the entire Mediterranean domain, then it was filled with MOLOCH data in the inner domain and with interpolated BOLAM data outside. Furthermore, to achieve a smooth transition between the high-and low-resolution winds, the two datasets were averaged using linear weights, within an appropriate buffer zone about 150 km wide, around the boundary between internal (high resolution) and external (low resolution) domains. Data at 24 UTC on day 0 and 00 UTC on day +1 were averaged to obtain a smooth temporal resolution. The daily data of the BOLAM/MOLOCH hindcast were built using the last 24 h of the two model simulations, while the first six and three hours of integration of the BOLAM and MOLOCH models, respectively, were considered as spin-up times and thus discarded. The numerical simulations were carried out for the period 1990-2018. A scheme of the numerical chain is shown in Figure 2.

Wave Model Setup
The extent of the computational domain of the wave model includes the entire Med- The wind hourly results of the atmospheric downscaling were used to force the WW3 model. To obtain a single gridded forcing field for the unstructured wave model on the whole Mediterranean Sea, at the best possible resolution, the data from BOLAM and MOLOCH were merged together. More precisely, a 2.5 km grid was built over the entire Mediterranean domain, then it was filled with MOLOCH data in the inner domain and with interpolated BOLAM data outside. Furthermore, to achieve a smooth transition between the high-and low-resolution winds, the two datasets were averaged using linear weights, within an appropriate buffer zone about 150 km wide, around the boundary between internal (high resolution) and external (low resolution) domains. Data at 24 UTC on day 0 and 00 UTC on day +1 were averaged to obtain a smooth temporal resolution.

Wave Model Setup
The extent of the computational domain of the wave model includes the entire Mediterranean Basin and an area 150 km West of the Strait of Gibraltar (Figure 3). This domain has been discretized by an unstructured mesh with a variable resolution up to 500 m along the coasts on the North-Western Mediterranean Sea. The highest coastal resolution is dedicated to the coasts of Tuscany and the Tuscan Archipelago, Eastern Liguria (La Spezia-Levanto area) and the Straits of Bonifacio and Messina. Along the coasts of Sardinia and Corsica, the resolution is about 1 km; along the other Tyrrhenian coasts and on the Straits of Gibraltar, it is about 3 km; while for the remaining Mediterranean coasts, it is roughly 6 km. The minimum resolution in deep offshore areas reaches 30 km.  The output of the wave model was recorded hourly at all grid points for the integrated quantities: significant wave height (Hs), mean wave period (Tm), peak wave period (Tp), mean wave direction (Dirm) and peak wave direction (Dirp). The output was also recorded hourly in 2048 points for both mean and spectral wave parameters (Figure 4). Part of these points matched with the locations of buoys, whereas the others were distributed in the Mediterranean Sea with a resolution of about 111 km (~1°). In the North-Western area the resolution was increased to about 55 km (~0.5°). Finally, additional points were added at a resolution of 1 km along the Tuscany coast, 4 km around Sardinia and Corsica, variable between 3.5 and 9.0 km along the Liguria coast and between 9 and 17 km along the French coast up to Marseille. The mean wave parameters at the points corresponding to the locations of buoys were used for model calibration and validation. The EMODnet bathymetry version 2018 [54] was employed for the whole domain. In the Tuscany and Ligurian areas, data from available bathymetric surveys and nautical charts replaced the EMODnet bathymetry for average depths lower than 100 m. A minimum water depth of 4 m was set offshore and a constant water depth of 2 m was set in the wet grid points along the coastline in order to avoid numerical instabilities.
The spectral domain spans 36 directions and 30 frequencies, which are not equally spaced, ranging from 0.0418 to 1.1181 Hz with a 12% increment. Both the maximum global time step and the maximum Courant-Friedrichs-Lewy (CFL) time step for x-y and k-theta were set equal to 60 s (see Section 2.4 Wave model calibration procedure), whereas the minimum source term time step was set equal to 5 s. Nonlinear wavewave interactions were modelled using the discrete interaction approximation [55]. The wind-wave interaction term (Sin) and the dissipation due to wave breaking (Sds) were implemented with the source term package ST4 described by Ardhuin et al. [56] and updated by Leckler et al. [57]. The numerical scheme used is the explicit N scheme (1st order), as this scheme is considered the most efficient although it suffers for an excessive diffusion [45]. However, this diffusion can compensate for the possible garden sprinkler effect when, for example, the waves travel around islands surrounded by deep water [45].
The output of the wave model was recorded hourly at all grid points for the integrated quantities: significant wave height (Hs), mean wave period (Tm), peak wave period (Tp), mean wave direction (Dirm) and peak wave direction (Dirp). The output was also recorded hourly in 2048 points for both mean and spectral wave parameters (Figure 4). Part of these points matched with the locations of buoys, whereas the others were distributed in the Mediterranean Sea with a resolution of about 111 km (~1 • ). In the North-Western area the resolution was increased to about 55 km (~0.5 • ). Finally, additional points were added at a resolution of 1 km along the Tuscany coast, 4 km around Sardinia and Corsica, variable between 3.5 and 9.0 km along the Liguria coast and between 9 and 17 km along the French coast up to Marseille. The mean wave parameters at the points corresponding to the locations of buoys were used for model calibration and validation. The output of the wave model was recorded hourly at all grid points for the integrated quantities: significant wave height (Hs), mean wave period (Tm), peak wave period (Tp), mean wave direction (Dirm) and peak wave direction (Dirp). The output was also recorded hourly in 2048 points for both mean and spectral wave parameters (Figure 4). Part of these points matched with the locations of buoys, whereas the others were distributed in the Mediterranean Sea with a resolution of about 111 km (~1°). In the North-Western area the resolution was increased to about 55 km (~0.5°). Finally, additional points were added at a resolution of 1 km along the Tuscany coast, 4 km around Sardinia and Corsica, variable between 3.5 and 9.0 km along the Liguria coast and between 9 and 17 km along the French coast up to Marseille. The mean wave parameters at the points corresponding to the locations of buoys were used for model calibration and validation.

Observed Data
To validate wind/wave model outputs, observations were collected from different in-situ measurement stations located in the North-Western Mediterranean Sea ( Figure 5). Eleven wind stations were selected among those available along the Ligurian and Tuscany coasts. Such wind stations were evaluated to be representative of the wind climate over the sea, because of their proximity to the coast and relatively long historical time series (at least 3 years).

Observed Data
To validate wind/wave model outputs, observations were collected from different insitu measurement stations located in the North-Western Mediterranean Sea ( Figure 5). Eleven wind stations were selected among those available along the Ligurian and Tuscany coasts. Such wind stations were evaluated to be representative of the wind climate over the sea, because of their proximity to the coast and relatively long historical time series (at least 3 years). Figure 5. Location of wind stations (orange points) and wave buoys (light green points) used to calibrate and validate the wind/wave hindcast. The length of the time series period is shown after the underscore ("_") symbol following the name of the wind station or wave buoy.
The Regional Functional Center of Tuscany Region (CFR [58]) database, the LaMMA database [59], the PianosaLAB project [60] and the National Mareographic Network (RMN [61]) provided wind observations along the Tuscany coast (nos. 1-2, 5-9 and 11 in Table 2). Other observed wind data (nos. 3, 4 and 10 in Table 2) were obtained by the Ondametric Network Liguria (ROL [62]) and the Hydrological Weather Observatory of Liguria Region (OMIRL [63]). The time series periods range from 3 to 13 years. The percentage of missing data and the 33rd percentile for the mean wind speeds are reported in Table 2.
Wave observations for the Italian coasts were obtained from the Italian National Institute for Environmental Protection and Research (ISPRA [64], nos. 1 and 12 in Table 3), ROL (no. 5 in Table 3), CFR (nos. 6 and 8-10 in Table 3) and LaMMA database (no. 13 in Table 3). The French buoys along Corsica and the PACA Region (nos. 2-4, 11, 14 in Table  3), were made available through the Centre d'Archivage National de Donnés de Houle in Situ (CANDHIS [65]) project.
The length of the time series varied between 2 and 23 years. The buoy dataset includes the main wave parameters: Hs, Tm, Tp, Dirm or Dirp. In Table 3, we report the percentage of missing data for the significant wave height. The sampling rates varied between 30 min and 3 h and differed among buoys and time periods for the same buoy. In fact, for the buoys of the RON network (nos. 1 and 12 in Table 3) the dataset was divided in two parts: in the first, the sampling interval is 3 h and reduces to 30 min during the storm events; in the second, the sampling interval is always 30 min. For the Cap Corse buoy (no. 4 in Table 3), the dataset has a sampling interval of 3 h, and 30 min for Hs values above 2 meters, except for two months (April 2008 and May 2008) where the sampling rate is 30 min. Since for this buoy the time period with different sampling rates is short, the dataset was not divided in multiple parts. Figure 5. Location of wind stations (orange points) and wave buoys (light green points) used to calibrate and validate the wind/wave hindcast. The length of the time series period is shown after the underscore ("_") symbol following the name of the wind station or wave buoy.
The Regional Functional Center of Tuscany Region (CFR [58]) database, the LaMMA database [59], the PianosaLAB project [60] and the National Mareographic Network (RMN [61]) provided wind observations along the Tuscany coast (nos. 1-2, 5-9 and 11 in Table 2). Other observed wind data (nos. 3, 4 and 10 in Table 2) were obtained by the Ondametric Network Liguria (ROL [62]) and the Hydrological Weather Observatory of Liguria Region (OMIRL [63]). The time series periods range from 3 to 13 years. The percentage of missing data and the 33rd percentile for the mean wind speeds are reported in Table 2.  Table 3), ROL (no. 5 in Table 3), CFR (nos. 6 and 8-10 in Table 3) and LaMMA database (no. 13 in Table 3). The French buoys along Corsica and the PACA Region (nos. 2-4, 11, 14 in Table 3), were made available through the Centre d'Archivage National de Donnés de Houle in Situ (CANDHIS [65]) project. Table 3. Buoy stations with locations, water depths, time periods used to validate model data and percentages of gaps in the significant wave height (Hs) records. The wave direction was not available for buoy number 4.

No.
Station The length of the time series varied between 2 and 23 years. The buoy dataset includes the main wave parameters: Hs, Tm, Tp, Dirm or Dirp. In Table 3, we report the percentage of missing data for the significant wave height. The sampling rates varied between 30 min and 3 h and differed among buoys and time periods for the same buoy. In fact, for the buoys of the RON network (nos. 1 and 12 in Table 3) the dataset was divided in two parts: in the first, the sampling interval is 3 h and reduces to 30 min during the storm events; in the second, the sampling interval is always 30 min. For the Cap Corse buoy (no. 4 in Table 3), the dataset has a sampling interval of 3 h, and 30 min for Hs values above 2 meters, except for two months (April 2008 and May 2008) where the sampling rate is 30 min. Since for this buoy the time period with different sampling rates is short, the dataset was not divided in multiple parts.
To homogenize the time series, all data have been interpolated at 1 h resolutions, and time steps with missing data have been assigned a no-data code.
The comparison between model results and observations was performed for the following parameters: mean wind speed (V) and mean wind direction (Dir) for the atmospheric models; significant wave height (Hs), mean wave period (Tm) and mean wave direction (Dirm) for the wave model.

Wave Model Calibration Procedure
The calibration was carried out in three subsequent phases by comparison of statistics from simulated and observed wave climates for twelve case studies, including both calm and severe weather conditions. Each phase corresponds to the calibration of a specific parameter/setup, namely: (i) time step duration, (ii) numerical scheme and (iii) physical parameterization. For each of them, the calibrated parameter/setup was chosen on the basis of the agreement between modelled and observed data, reported in the Taylor diagrams [66], for the variables Hs and Tm. In these diagrams the similarity between two datasets is quantified in terms of their correlation (r), centered Root Mean Squared Error (cRMSE) and standard deviations. The model output that agrees better with observations is the one closest to the point on the x-axis.
In addition, several statistical parameters were determined for each calibration phase for Hs and Tm: Mean Bias Error (MBE), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) [67]. The correlation between two directional variables (i.e., mean wind direction and mean wave direction) was assessed by computing the circular version of the Pearson's product moment correlation coefficient (circ-r) [68].
In each case, the maximum Hs was greater than 4 m at the Gorgona and/or Giannutri buoy, and it gives a reliable representation of the wave conditions in the area of interest shown in Figure 5. In Table 4, we reported the start and end dates of each case study, and the corresponding observed maximum Hs and Tm. Table 4. Time period of the case studies considered for calibration analysis. In the last two columns we also report the observed maximum significant wave height and corresponding mean period.

No.
Case The calibration of the time step was carried out in the first phase. Simulations with two different time step configurations-i.e., maximum global time step and maximum CFL time step for x-y and k-theta, all equal to 120 s or 60 s, with the minimum source term time step always equal to 5 s-were performed with the ST4 source term parameterization, and the explicit N scheme. In Table 5, the statistical indicators for Hs, Tm, and Dirp or Dirm (in the case of Capo Mele buoy) were computed for seven Italian buoys (nos. 1, 5-6, 8-10 and 12 in Table 3). The statistical indicators in Table 5, and the Taylor diagrams for Hs and Tm (Figure 6a,b) show small differences (i.e., mean RMSE around 4% for Hs and 1% for Tm) between the two configurations-in particular for Hs at coastal buoys, such as Castiglione (no. 6 in Table 3), and Gombo (no. 9 in Table 3). In these Taylor diagrams, cRMSE was not reported.
For Tm, the 60 s time step gives slightly better results than 120 s for every buoy (Figure 6b). As a precautionary measure, the 60 s time step was chosen for all subsequent simulations.
The second phase of calibration concerns the three explicit numerical schemes available for the triangle-based grids. Tested schemes are: N scheme (1st order), PSI scheme (2nd order) and FCT scheme (2nd order in space and time [43]). In these simulations, the ST4 source term parameterization was used. Table 5. Statistical indicators for Hs, mean wave period (Tm) and peak wave direction (Dirp) (or mean wave direction (Dirm)), evaluated for time steps equal to 120 and 60 s with ST4 source term parameterization and explicit N scheme.

Buoy
Time Step (   The second phase of calibration concerns the three explicit numerical schemes available for the triangle-based grids. Tested schemes are: N scheme (1st order), PSI scheme (2nd order) and FCT scheme (2nd order in space and time [43]). In these simulations, the ST4 source term parameterization was used.
The statistical indicators for Hs, Tm and Dirp (Table 6) and the Taylor diagrams for Hs and Tm (Figure 7) show small differences among the three numerical schemes, especially for Hs. The N scheme gave the better results, providing, on average, an RMSE value for Hs of approximately 0.37 and for Tm around 1.09 and a circular correlation coefficient of approximately 0.59 for Dirp.
Since the N scheme proved to be the less computational-demanding scheme and provided negligible differences with respect to other schemes, it was used for all subsequent simulations. Moreover, it has the advantage that its major diffusion can compensate the garden sprinkler effect [69], which can be remarkable in some cases (e.g., around some The statistical indicators for Hs, Tm and Dirp (Table 6) and the Taylor diagrams for Hs and Tm (Figure 7) show small differences among the three numerical schemes, especially for Hs. The N scheme gave the better results, providing, on average, an RMSE value for Hs of approximately 0.37 and for Tm around 1.09 and a circular correlation coefficient of approximately 0.59 for Dirp.    In the third calibration phase, three different physical parameterizations were compared: ST2 [70], ST3 [71] and ST4 [56], leaving the 60 s time step and the N scheme unchanged. For the ST3 parameterization, the STAB3 switch was activated to take into account a stronger gustiness in unstable atmospheric conditions [45]. Statistical indicators  Table 3. Results are relative to Hs (a) and Tm (b) are reported. The plus (+), circle ( ) and triangle (   In the third calibration phase, three different physical parameterizations were compared: ST2 [70], ST3 [71] and ST4 [56], leaving the 60 s time step and the N scheme unchanged. For the ST3 parameterization, the STAB3 switch was activated to take into account a stronger gustiness in unstable atmospheric conditions [45]. Statistical indicators ) symbols refer to the N scheme, PSI scheme and FCT scheme, respectively. Each buoy is represented by a different color: Alghero (A) in red, Capo Mele (Cm) in blue, Castiglione (C) in pink, Giannutri (Gi) in purple, Gombo (Go) in green, Gorgona (G) in orange and La Spezia (letter S) in black.
Since the N scheme proved to be the less computational-demanding scheme and provided negligible differences with respect to other schemes, it was used for all subsequent simulations. Moreover, it has the advantage that its major diffusion can compensate the garden sprinkler effect [69], which can be remarkable in some cases (e.g., around some islands in the Tuscany Archipelago).
In the third calibration phase, three different physical parameterizations were compared: ST2 [70], ST3 [71] and ST4 [56], leaving the 60 s time step and the N scheme unchanged. For the ST3 parameterization, the STAB3 switch was activated to take into account a stronger gustiness in unstable atmospheric conditions [45]. Statistical indica-tors for Hs, Tm and Dirp (or Dirm) and Taylor Diagrams for Hs and Tm are reported in Figures 7 and 8   In the third calibration phase, three different physical parameterizations were compared: ST2 [70], ST3 [71] and ST4 [56], leaving the 60 s time step and the N scheme unchanged. For the ST3 parameterization, the STAB3 switch was activated to take into account a stronger gustiness in unstable atmospheric conditions [45]. Statistical indicators  Table 3). In fact, ST3 and ST4 provide, on average, RMSEs for Hs that differ less than 6%. For Tm (Table 7 and Figure 8b), ST4 is the best parameterization. The circular correlation coefficients for Dirp (Table 7) also give better results for the ST4 parameterization, with the exception of the Gombo Buoy (no. 9 in Table 3).  Table 8 reports the settings evaluated in the calibration process-those kept constant at each phase and those employed for the hindcast: 60 s time step, explicit N scheme, and ST4 source term parameterization. During the last phase of the calibration process, a visual comparison of the time series of Hs was also performed (see Section 1 in Supplementary Materials).

Results
Based on the outcome of the wave calibration procedure (Section 2.4), we produced a 29-year (1990-2018) wind/wave hindcast.

Wind Validation
The BOLAM and MOLOCH wind hindcasts were validated using the entire set of records from the 11 wind stations listed in Table 2, located along the Liguria and Tuscany coasts.
To highlight the possible improvements obtained with the high-resolution simulation, a comparison was made with the mean wind speed (V) and wind direction (Dir), extracted from the ERA5 wind dataset [72], with a horizontal resolution of 0.25 • × 0.25 • and 1 h in time. The gridded mean wind parameters were interpolated at the wind station positions by means of bilinear interpolation.
For the wind speed validation, we used values above the 33rd percentile of the cumulative distribution of measurements [73], computed for each wind station. The selected thresholds are reported in the last column of Table 2 as "P33 (m/s)". The statistical indicators for mean wind speed and direction are reported in Table 9.  The wind speed correlations are similar among the three atmospheric models, with the exception of Capalbio station (no. 2), where the MOLOCH values show a 20-24% improvement over those of the other models (see also the normalized Taylor diagram in Figure 9b). The higher wind speed correlations (greater than 0.7) were obtained for Capo Mele (no. 3), Livorno Offshore (no. 6) and Vada (no. 11) wind stations. Figure 9 shows the normalized Taylor diagrams for the data from the BOLAM, MOLOCH and ERA5 datasets, which are compared with measurements recorded at the Bocca d'Arno (no. 1), Capalbio (no. 2), Livorno Offshore (no. 6) and Vada (no. 11) wind stations. By looking at the position of each symbol and its relative distance to the point lying on the X-axis, we note that the MOLOCH data provide the best overall performance as regards the Bocca d'Arno (no. 1) and Capalbio (no. 2) stations (Figure 9a,b). For the Livorno Offshore (no. 6) station, the ERA5 data instead show a standard deviation close to that of observed data (in the normalised Taylor diagram equal to 1), the highest correlation coefficient (approximately 3-7% higher) and the lowest cRMSE (see Table 9). Regarding the Vada station (no. 11) the high-resolution models provide standard deviations closer to the observed data than ERA5 data, but also higher errors and slightly lower correlation coefficients (around 3-5%).
The wind direction correlations (circ-r) are quite similar among the three models, with some exception: the BOLAM model at the Bocca d'Arno station (no. 1) has a circ-r value of 0.19 compared to the 0.64 and 0.67 of the others; the ERA5 dataset at the San Vincenzo (no. 9) and Vada stations (no. 11) has a circ-r values of 0.59 and 0.46, respectively, compared to values higher than 0.68 for the BOLAM and MOLOCH models. The highest values (all the models above 0.70) were obtained for Genova P. Vagno (no. 4), Livorno Offshore (no. 6) and Pianosa (no. 8) wind stations. (a) (b) (c) (d)      Table 3. Results are relative to Hs (a) and Tm (b) are reported. The plus (+), circle (◯) and triangle (⧍) symbols refer to the N scheme, PSI scheme and FCT scheme, respectively. Each buoy is represented by a different color: Alghero (A) in red, Capo Mele (Cm) in blue, Castiglione (C) in pink, Giannutri (Gi) in purple, Gombo (Go) in green, Gorgona (G) in orange and La Spezia (letter S) in black.
In the third calibration phase, three different physical parameterizations were compared: ST2 [70], ST3 [71] and ST4 [56], leaving the 60 s time step and the N scheme unchanged. For the ST3 parameterization, the STAB3 switch was activated to take into account a stronger gustiness in unstable atmospheric conditions [45]. Statistical indicators ) symbols refer to the BOLAM, MOLOCH and ERA5 data, respectively.
The wind roses in Figure 10 show a good agreement between the wind stations and the BOLAM/MOLOCH models, especially for the Livorno Offshore station (Figure 10i-l). A slight northward rotation of the main directions for the BOLAM model was observed with respect to the Bocca d'Arno wind rose (Figure 10b). This is in agreement with the results of the circ-r value in Table 9. Instead, in the ERA5 roses dataset a quite homogeneous distribution of the mean wind directions was observed. In particular, in Figure 10d (Bocca d'Arno) and in Figure 10l (Livorno offshore), the ERA5 wind roses show a slight predominant direction from NE-E, whereas the observed wind roses show a predominant direction from SE-E (Figure 10a) and from E (Figure 10i), respectively. At the Capalbio station the ERA5 wind rose (Figure 10h) shows a predominant direction from NE, quite similar to the MOLOCH data (Figure 10g), even with a lower percentage of occurrence (around 10%, instead of around 12%). For Capalbio, the wind rose direction interval is 22.5 • instead of 10 • , which is coherent with the observed data direction range.

Observations
BOLAM data MOLOCH data ERA5 data  The quantile-quantile (hereinafter Q-Q) plots [67] of the wind speed ( Figure 11) were obtained considering all the measurement data. Results show that the MOLOCH model is the most reliable-i.e., data lie closer to the diagonal line, even though in Figure 11b,h a slight underestimation of wind velocity above 10 m/s is observed at Bocca D'Arno and Vada wind stations, and a small overestimation bias is detectable at the Capalbio wind station (Figure 11e). For the Livorno Offshore station (Figure 11g-

Wave Validation
The wave hindcast was validated using data from 14 buoys (Table 3) located in the North-Western Mediterranean Sea. The synthetic wave parameters were computed directly by the WW3 point output at the buoy locations.
To highlight the possible improvements obtained with the high-resolution simulation, a further comparison was performed with the Hs of combined wind waves and swell, and Tm and Dirm, from the ERA5 wave dataset [72], which have horizontal resolutions of 0.5 • × 0.5 • and temporal resolutions of 1 h. In this case, the gridded wave parameters were interpolated at the buoy positions by bilinear interpolation. For coastal buoys where the bilinear interpolation reported a missing value, a distance-weighted average remapping interpolation was used.
The statistical indicators for Hs, Tm and Dirm (or Dirp for the observed data that do not provide the Dirm) were reported in Table 10 for both offshore and coastal (i.e., located at water depths around 15 m) buoys. The correlation coefficients r between observed and simulated Hs values are around 0.9 or higher. Moreover, it can be observed that the performances of the wave hindcast and the ERA5 dataset in terms of statistical indicators for Hs are rather similar at the offshore points, but in the coastal areas the wave hindcast performs better than ERA5. In fact, on average, the correlations for Hs differ less than 2.5% for the offshore buoys, and stay between 8 and 29% for the coastal ones. This is also clear by observing the Hs normalized Taylor diagrams for both the offshore (Figure 12c,g) and coastal (Figure 12a,e) buoys. Only the La Revellata buoy Taylor diagram (see Figure  S6m in Supplementary Materials) shows slightly better results for the ERA5 dataset: the Hs correlation is around 2% higher (Table 10) and the Hs cRMSE is around 20% lower (Table 10). The correlations for Tm are around 0.7 or higher with the exception of the Bastia coastal buoy (no. 3), for both models. The statistical indicators of Tm for the wave hindcast and ERA5 are very similar for all buoys, but the normalized Taylor diagrams (Figure 12b,d,f,h) show, in general, better results for the ERA5 dataset. The Tm correlations differ, in general, less than 4%, with the exception of Capo Mele (no. 5, about 10%), Monaco (no. 14, about 14%) and Bastia (no. 3, about 29%).
Finally, the circular correlations between observed and simulated Dirm values (or Dirp) are higher than 0.5, with the exceptions of two coastal buoys: Castiglione and Gombo (nos. 6 and 9, respectively). This reduced value of circ-r for the coastal area is probably due to the intrinsic difficulty of modelling complex wave patterns at variable depths and with jagged coastlines even in the presence of reliable bathymetric data.
As regards the comparison with the ERA5 data, the wave hindcast circular correlation coefficients of the wave directions are always higher (differences greater than 2.5%), with the sole exception of the Monaco buoy. We underline the fact that the comparison with ERA5 data was only possible for the five buoys (nos. 2, 3, 5, 11, 14 in Table 3) that were able to record the mean wave direction; the other buoys measured only the peak wave direction, excluding the Cap Corse buoy (no. 10), which did not record any wave direction.
A noteworthy result was obtained for the Bastia coastal buoy: the circ-r is equal to 0.65 for the wave hindcast and negative for ERA5 (see Table 10). This behavior is also confirmed by the wave roses reported in Figure 13a-c. The buoy (Figure 13a) and the wave hindcast (Figure 13b) wave roses show that the main mean wave directions are from 120 • N e 130 • N, respectively, with a percentage that is higher than 14%. Furthermore, they both detect waves from 50 • N, albeit with slightly different percentages (lower than 10% for the buoy and lower than 6% for the wave hindcast). Instead, the ERA5 wave rose (Figure 13c) shows mean wave directions from SSW-SW, not observed at the coastal buoy. For the offshore buoy of Capo Mele (Figure 13d-  The Q-Q plots for Hs (Figure 14) show that the results of the high-resolution wave hindcast are reliable, apart from a slight underestimation of the values above 5 m for the Capo Mele buoy and above 1.5-2.0 m for the Bastia (Figure 14a) and Castiglione (Figure 14e) coastal buoys. In the case of the Giannutri buoy, the wave hindcast shows an excellent agreement between observed and measured data, yielding points very close to the diagonal line in the Q-Q plots (Figure 14g). The wave hindcast Q-Q plots (Figure 14,left column) show an improvement in model performance with respect to the ERA5 Q-Q plots ( Figure 14, right column). The latter model gives a more marked underestimation of Hs for the two offshore buoys and an overestimation of the Hs for the coastal buoys.

Discussion and Conclusions
In this work, we presented the results of a dynamical downscaling of ERA5 reanalysis dataset [19], focused on the wind and wave climates of the Ligurian and Tyrrhenian seas (North-Western Mediterranean Sea). In the wave hindcast, we used an unstructured grid, which allows one to consider large domains with high resolutions in coastal areas, saving computational time and avoiding boundary condition errors [41].
A wave model calibration procedure was carried out sequentially, first by setting the time step, then the numerical scheme, and, finally, the source term parameterization. We acknowledge that a more rigorous approach would have explored all the possible combinations in the parameters space. However, to reduce computational efforts, and considering the fact that the first two adjusted parameters (time step and numerical scheme) do not strongly affect the results, we opted for the proposed sequential calibration procedure. Indeed, the source term parameterization, which may more significantly affect model output [32], was adjusted in the end.
The wind validation procedure shows generally good results for all the three weather datasets: BOLAM, MOLOCH and ERA5. The wind speed correlation values range between 0.45 and 0.76, and the wind direction circ-r values are higher than 0.50, with only a few exceptions. The BOLAM and MOLOCH models are able to identify the main directions, as shown in the wind roses ( Figure 10 and Figure S4 in the Supplementary Materials). For some wind stations, the ERA5 wave roses show an almost omnidirectional trend, only slightly highlighting the main directions (see in particular Figure 10d,l,p). Despite a generally good circ-r, the ERA5 dataset does not show the main wind directions. We argue that this is due to the lower resolutions with respect to the BOLAM and MOLOCH models. In general, all three models tend to underestimate the wind speed above 10-15 m/s for most of the analyzed stations ( Figure 11 and Figure S5 in the Supplementary Materials), while for Capalbio (Figure 11d-f) station, an overestimation is shown. However, as expected, the nested high-resolution MOLOCH model generally shows better results compared to the lower resolution models, with the exceptions of Pianosa and Savona (Figure S3e Our results confirm that the increase in the resolution on a delimited portion of the domain of interest has a two effects. On the one hand, we acquire a more accurate description of the processes involved, resolving smaller scales of motion, and obtaining more informative (and generally complex) physical patterns. On the other hand, the risk of encountering errors due to the misrepresentation of small scale features not resolved in the parent model, or due to the spatial and temporal shifts of accurately resolved smaller scales, is generally present. The latter, known as the "double penalty effect" [74] may also affect both wind and wave models via the increased resolution of the wind field [10,32,75].
We stress the fact that in the case of the Livorno Offshore, the only open sea wind station among those analyzed, the BOLAM and MOLOCH models show excellent agreement between observed and measured data, yielding points very close to the diagonal line in the Q-Q plots (see Figure 11g-i). In the wind analysis, the orography of the station is fundamental, as it contributes to the energy distribution. In the mesoscale atmospheric models, the turbulent energy motions are entirely parameterized and the effective resolution remains around 4-6 ∆x for horizontal grid spacing between 2.5 km and 250 m [76]. Homogeneous wind conditions for around 15 km can only be found in open sea.
Despite the tendency of the atmospheric models analyzed herein to underestimate wind speeds, the statistics obtained for Hs, show a good capability of the wave hindcast to reproduce observed wave climate (Table 10). Furthermore, the downscaled values of Hs better agree with the analyzed buoys, especially for the coastal ones (Bastia, Castiglione and Gombo, Figure 12), with respect to ERA5 data. The Hs correlation coefficients stayed between 0.89 and 0.96. Moreover, to assess the convenience of the downscaling procedure, it is reasonable to focus also on the improvements made to the wave direction, which is less directly affected by wind intensity and mostly influenced by the accuracy of the bathymetric constraints, especially toward the coast. Both wave roses and circ-r values at the Bastia coastal buoy show that the wave hindcast outperforms the ERA5 wave dataset. The lack of accuracy of ERA5 is likely due to the low resolution of the reanalysis dataset. With regard to the Tm, the results show correlation coefficients between 0.48 and 0.89 and a general overestimation. Additionally, Amrutha et al. [37] found a large overestimation of the mean wave period in the deep water, using the ST4 source term parameterization. The normalized Taylor diagrams (Figure 12 and Figure S6 in Supplementary Materials) show the higher ability of ERA5 to reproduce the mean wave period Tm. This higher performance may be due to the production of the ERA5 dataset using the 4D-Var data assimilation in ECMWF's Integrated Forecast System (IFS) Cycle 41r2 [18]. A more thorough examination might consider different sea states to identify which conditions produce such results.
The work we presented demonstrated a good reliability of the downscaling process aimed at the improvement of high-quality reanalysis products such as ERA5 data. The hindcast will be further expanded to cover 40 years in order to provide a dataset for different scopes such as climatological studies and extreme value analysis.