Assimilating X- and S-band Radar Data for a Heavy Precipitation Event in Italy

During the night between 9 and 10 September 2017, multiple flash floods associated to a heavy-precipitation event affected the town of Livorno, located in Tuscany, Italy. Accumulated precipitation exceeding 200 mm in two hours, associated with a return period higher than 200 years, caused all the largest streams of the Livorno municipality to flood several areas of the town. We used the limited-area Weather Research and Forecasting (WRF) model, in a convection-permitting setup, to reconstruct the extreme event leading to the flash floods. We evaluated possible forecasting improvements emerging from the assimilation of local ground stations and X- and S-band radar data into the WRF, using the configuration operational at the meteorological center of Tuscany region (LaMMA) at the time of the event. Simulations were verified against weather station observations, through an innovative method aimed at disentangling the positioning and intensity errors of precipitation forecasts. By providing more accurate descriptions of the low-level flow and a better assessment of the atmospheric water vapour, the results demonstrate that assimilating radar data improved the quantitative precipitation forecasts.

Short duration HPEs often assume the form of V-shaped, quasi-stationary convective systems [37,38]. These storms are characterized by a back-building dynamics, where convective cells developing upstream of the affected area continuously replace dissipating cells, resulting in pulsating heavy rain [18,31]. This continuous cell replacement is frequently initiated offshore [34]. This structure becomes quasi-stationary due to the perduration of a favorable synoptic and mesoscale environment [68]. Important factors for persistence and evolution of these precipitation phenomena can be found at the surface (e.g. moist and conditionally unstable air associated to warm sea surface temperatures [17,33]), within the boundary layer (e.g. convergence associated to a frontal system [84], induced by orography [10,33,37] or land-sea differences [17,29]), and throughout the troposphere (e.g. coupling between low and upper level jet streaks The paper is organized as follows: in Section 2 we describe the meteorological context of the Livorno case and the modelling setup implemented, in Section 3 we give the details about the method used to evaluate model outputs, in Section 4 we present the main results, and in Section 5 we discuss implications of the results, limitations of this study and we explain our vision of the path forward in the last Section.

Synoptic conditions
During the first hours of the 9th of September, a large trough deepening over the Eastern Atlantic Ocean approached the Mediterranean Sea. At 0000 UTC of the 10th of September 2017, the axis of the trough was oriented from the Scandinavian Peninsula to the Western Mediterranean Sea (see Figure 1a). The trough slowly moved eastward causing the deepening of a low pressure area (see Figure 1b) over the Ligurian sea and lee of the Italian Alps [9,12]. As a result the pre-existing warm and humid air masses over the Tyrrhenian Sea interacted with the dry and relatively cold flow from France (see Figure 1b and Figure  7b in [36]), making the environment conducive for persistent precipitation systems. Furthermore, upper level divergence, sustained convective available potential energy values in the range 500-1000 J· kg −1 and strong wind shear (plots not shown) favoured convective motions. A well-defined line of convergence of low-level winds over the Tyrrhenian Sea acted as a trigger to overcome the convective inhibition energy (see blue wind vectors in Figure 1b). : mean sea level pressure in hPa and 10-metre wind vectors in blue color at the same time. In both panels, the red rectangle indicates the WRF domain of integration and the red triangle (" ") indicates the location of the Livorno town. ERA5 data [42] were used to plot the maps.
Following the classification of [58], the Livorno case belongs to the HPEs having a short duration (less than 12 hours) and a spatial extent smaller than 50 × 50 km 2 .

Observed precipitation measurements
The Livorno case can be described by using the data obtained from the weather stations managed by the Hydrological Service of Tuscany (www.sir.toscana.it, link accessed in April 2021). The network is composed by approximately 400 rain-gauges, located over an area of about 23000 km 2 , reporting conditions every 15 minutes [16]. In Figure 2a we report the precipitation map of the accumulated precipitation in the 6-hour period ending on 0300 UTC of 10 September 2017. One precipitation maxima, located in the coastal area, and exceeding a cumulative rainfall amount of about 240 mm can be noticed, to the South of the Livorno town (indicated with the red triangle in Figure 2a). The extreme is located in a widespread area of total precipitation exceeding 100 mm, which covers the entire central coast of Tuscany.  [25] is used to estimate rainfall amounts (shown with the shaded colours) when no rain-gauge data are available. The Livorno town is indicated with the red triangle (" "). (b): accumulated precipitation registered by four rain-gauges located close to the Livorno town. Two of them are located to the North of the city (Metato and Bocca d'Arno) and two to the South (Valle Benedetta and Quercianella).
The variability of precipitation duration and intensity across the territory played a major role in the downstream hydrological responses. The sector South of the Livorno town, which was responsible for the flash floods in the urban areas, produced a precipitation that was limited in time: 210.2 mm were registered in the 2-hour period between 10 September, 0045 UTC and 0245 UTC (see Figure 2b reporting the data collected at the station of Valle Benedetta). This corresponds to 86% of the total event precipitation registered by this station. Another weather station, Quercianella, still located in the Southern sector, measured similar values: 188.6 mm in 2 hours, corresponding to 89% of the total precipitation during the event. Analogies in the maximum hourly rain rate are even stronger, since the first station measured 120.8 mm in 1 hour, and the second 1 mm more. For both rain-gauges, [79] estimated a return period > 200 years for the 1 and 3 hours duration rainfall events. The sector North of the town of Livorno, represented in Figure 2b by the stations of Bocca d'Arno and Metato, was characterized by a peak rain rate between 50 and 70 mm in 1 hour and a total event duration of 3 to 4 hours. This precipitation was mostly concentrated between 2000/2100 UTC of the 9th of September and 0000 UTC of the 10th. In the rest of the region (plots not shown), precipitation was either intense and brief (further South: Castellina Marittima rain-gauge, return period > 200 years for 1 and 3 hours duration rainfall events [79]), or long lasting and less intense (North of Livorno: Coltano and Stagno rain-gauges, return period 90 years for 12 hours duration rainfall events [79]).

Radar data
Several weather radars were operational during the rainfall event, belonging to different networks: the Italian national [81,82], the French national [76] and the Tuscany regional [2,26] ones. Among these available radars, two have been selected for this work: the Aleria and the Livorno radars, whose locations are sketched in Figure 5 with the red triangle. The former is a Doppler S-band (the signal frequency is 2802 MHz) system located in Aleria in the central Eastern Corsica and was detecting the severe weather system from a distance of approximately 70-80 km. The latter is a non-Doppler single polarization X-band (frequency is 9410 MHz) radar located in the Livorno harbor, and monitored the dynamics of the meteorological event moving towards the radar site. These systems have been designed and produced by different manufacturers and have different technical characteristics, with particular reference to the operational frequencies, system dimensions and architectures [2]. The Aleria radar (S-band) is characterized by low attenuation impacts on radio signals due to the atmosphere and rainfall and consequently can reliably monitor rainfall fields up to a distance of approximately 100-150 km. The Livorno radar system (X-band) is more sensitive to smaller rainfall drops and can provide a finer resolution, but with a limited range due to the strong effects of atmosphere and rainfall attenuation on radio signals. Moreover the effects of path and wet radome attenuation due to the rain falling directly over the radar system are increased in this case by the strong precipitation occurring at the Livorno site.  These two radar systems detected the rainfall event since its origin over sea between North Corsica and Ligurian Gulf, while approaching the Tuscany coasts. The Livorno radar localized the weather event and followed its dynamics with high spatial detail, as shown in Figure 3a. Then the longitudinal shape of the strong precipitation front, well detected when approaching the Tuscany coasts (Figure 3c), underestimated the reflectivity intensity of the storm, at least during its beginning and early development on the sea (Figures 3b and d). Later, when the storm reached Livorno, the Aleria radar detected the strong rainfall occurrence and persistence over the area (Figure 3f), while the Livorno radar system underestimated the intensity of precipitation, presumably due to a significant path and radome attenuation (Figure 3e). These two radars, in association with the Italian weather radar network [81,82], were used by [26] to characterize the Livorno event and were found to provide valuable information for the quantitative precipitation estimation.

Satellite and lighting data
The deep convection of the Livorno event was detected by the Rapid Scan High Rate (5-minute) mode of the Spinning Enhanced Visible and Infrared Imager on board the Meteosat Second Generation satellite [69]. Satellite images (maps not shown) indicate a cloud top brightness temperature below -65 • C, a temperature that can be found at approximately 12.5 km height, according to the atmospheric sounding measured at 0000 UTC on the 10th of September at Pratica di Mare (data not shown), a sounding station located approximately 250 km South of the area of interest. Further and in-depth analyses of the Livorno case using satellite-based observations are available in [66].
The deep convection of the Livorno event produced (see Figure 4a) an average of four lightning strikes per second over Northern Tuscany in the six hours between 2150 UTC of the 9th of September and 0350 UTC of the 10th of September; this number is approximately 10% of the global flash rate [22]. For those rain-gauges that recorded the greatest amount of precipitation (Valle Benedetta and Quercianella), corresponding lightning time series graphs are shown in Figures 4b and 4c as histograms of number of strokes per hour within ±0.1 degree of the two weather stations centres, from 2200 UTC of the 9th September to 0400 UTC of the 10th of September 2017. A total number of 9015 and 7177 strokes were detected during the 6-hour period for Quercianella and Valle Benedetta stations, respectively. The highest number of flashes was recorded at 0100 UTC of the 10th of September for both Quercianella and Valle Benedetta (3053 and 2528, respectively).

Modelling setup
The numerical model used in this work to simulate the 9-10 September Livorno case is the WRF model [72]. The Mesoscale and Microscale Meteorology (MMM) Laboratory at the National Center for Atmospheric Research (NCAR) has led the development of the WRF model since its inception in the late 1990s. It is a fully compressible, Eulerian, non-hydrostatic mesoscale model, designed to provide accurate numerical weather forecasts both for research activities and operations. In this work, we implemented the Advanced Research WRF (ARW) version of the model updated to version 4.0 (June 2018). The model dynamics, equations and numerical schemes implemented in the WRF-ARW core are fully described in [72] and [47], while the model physics, including the different options available, is described in [21]. A summary of the model settings used in this study is given in Table 1, while the domain of integration is indicated in Figure 1 with the red rectangle. All the WRF simulations (with and without assimilation) performed in this study share the same basic set of physical parameterisations listed in Table 1. This configuration mirrors the one that was operational at the weather service of the Tuscany regional (LaMMA, see www.lamma.toscana.it, link accessed in April 2021) at the time of the Livorno event, although in this study we used a more updated WRF model version (4.0 instead of 3.9).  [20] To start the WRF simulations, initial and boundary conditions are derived from the ECMWF-IFS global model data operational at the time of the event (model cycle 43r3). The spectral resolution is TCO1279, (where CO means that a cubic-octahedral grid is used), which corresponds to a grid spacing of approximately 9 km; the number of vertical levels is 137 and the top of atmosphere is set to 0.01 hPa.
The control forecast is initialized at 1200 UTC on the 9th of September 2017 and the forecast length is set to 15 hours (ending time is 0300 UTC of the 10th of September). To test the impact of assimilating conventional and radar data, we utilised the WRF data assimilation (WRFDA) system [3]. The WRFDA software was used to perform a 3-hour cycling 3D-Var data assimilation using the rapid update cycle approach, similar to what is found in [70]. We implemented an assimilation step every 3 hours starting from 1200 UTC on the 9th of September up to 1800 UTC of the same day with an assimilation window of ± 30 minutes, i.e., all the observations between t − 30 minutes and t + 30 minutes are assumed to be valid at the analysis time t. The 3D-Var implementation of the WRFDA package relies on previous developments designed for the fifth-generation Penn State/NCAR Mesoscale Model [4]. Although detailed descriptions of the 3D-Var method can be found in [83,86,87], the technique can be summarised as follows: the basic idea is to estimate the optimal state of the atmosphere (in the model space) by using the observations available, a short-term forecast (often referred as first guess or background) and information about errors statistics on both the observations and the model. Let t ∈ R be the time of the analysis and let x = x(t) ∈ R n the model analysis at time t. Using an iterative process, the 3D-Var method looks for the minimum value of the cost function J(x), defined as: where, following the notations of [24], x b ∈ R n is the background model state at time t, B ∈ R n×n is the covariance matrix of the background errors, y o ∈ R p is the vector of observations (p < n) at time t, H : R n → R p is the observation operator that transforms the variables from the model state to the observation space and R ∈ R p×p is the covariance matrix of observation errors.
Accurate estimates of the B and R matrices determine the quality of the analysis. As regards the R matrix, default values provided by the WRFDA software (see User's Guide [72]) were used for diagonal elements, whereas off-diagonal's elements are set to zeros. In fact, correlations between different instruments are usually assumed as null. Although some studies demonstrated that including inter-correlations in the R matrix may provide better analyses [51], this mainly holds when satellite radiance data are considered [5,6], thus we claim that the assumption on the R's structure is valid in our study. We computed the background error correlation matrix B by means of the National Meteorological Center (NMC) method [61], which estimates the value of the elements of the matrix statistically, by averaging the differences between two short-term forecasts valid at the same time but initialised one shortly after the other (we set 12 hours later). The B matrix we used is the result of the NMC method applied for ten months (from December 2018 to October 2019).
Following [74], the observation operator H for the radial velocity V r is defined by the relationship: where (u, v, w) are the modelled wind components, (x i , y i , z i ) are the coordinates of the radar antenna, (x, y, z) are the coordinates of the radar observation, r i is the distance between (x, y, z) and (x i , y i , z i ) and v T is the mass-weighted terminal velocity of the precipitation. To calculate v T , the default formula [75] was used, that is: where p 0 is the surface pressure,p is the base-state pressure and q r (unit g· kg −1 ) is the model predicted rainwater mixing ratio. Let Z be the reflectivity data expressed in dBZ, then the nonlinear Z − q r relationship is defined by [74]: where c 1 = 43.1 and c 2 = 17.5 are constants and ρ (unit kg· m −3 ) is the air density. Following the performances achieved by [48], radar reflectivity data were assimilated using the indirect technique proposed by [83], that is by inverting the Z − q r relation given in Equation 4 and assimilating observed rainwater mixing ratio estimated from reflectivity values. Before assimilation, radar reflectivity data undergo to a thinning procedure. This consists in eliminating ground/sea clutter and georeferencing radar data onto the model grid; then for each model grid point the procedure selects the maximum reflectivity value (and the corresponding radial velocity) among radar observations belonging to that grid point. Finally, to reduce the spin-up in the first hours of integration, we implemented the digital diabatic filter initialization [62] at each starting time of the WRF simulationsw.
To assess the impact of assimilating different types of observations, three experimental forecasts have been deployed and their results are compared with those obtained with the control forecast (named with the code C), which doesn't assimilate any observational dataset. The first experimental forecast (code S) assimilates data from ground weather stations, the second (code S + R) assimilates radar data in addition to ground weather station data. Conventional data assimilated in both experiments are: pressure, temperature and relative humidity at 2-metre above ground level, wind speed and direction at 10-metre above ground level. The total number of weather stations reporting valid data is 1036 for the assimilation step performed at 1500 UTC on the 9th of September and 1051 for the 1800 UTC assimilation. Remote sensed data assimilated are reflectivity data from the X-and S-band radars and radial velocity data from the Aleria S-band radar. The locations of weather stations and radars are indicated with the blue points and red triangles, respectively, in Figure 5. To investigate the role of sea surface temperatures (SSTs) in the Livorno case, we performed a further numerical experiment (code S + R + M), which shares the same settings and design of the S + R experiment, but includes a simplified ocean model that modifies its status (SSTs and ocean mixed layer depth) thanks to the interface heat, momentum and mass fluxes. In fact, the interaction between atmosphere and ocean can play a key role in the formation and intensification of extreme atmospheric events [39] through the energy fluxes, which can impact the generation of HPEs by modifying the structure of the boundary layer, the distribution of wind fields and therefore the position of the convergence line [19,60]. Historically, numerical weather models implemented the SSTs as a static boundary, taken from the global models or from low-resolution satellite datasets. However, following this approach it is difficult to reproduce complex energy feedback between air and sea [65], in particular in the coastal areas. In order to limit this impact, without slowing down the calculation time with the coupling of an ocean model, we proceeded by using a "slab ocean", also known as a simple Ocean Mixed Layer [28], which updates the SSTs every hour, according to the energy fluxes at the air-sea interface. This approach is based on a simplified model that aims at implementing the SSTs and ocean mixed layer depth into the WRF model. This one-dimensional approach is initialized with SST data, a mixed layer depth value of 40 m and a lapse rate 0.14 K · m −1 . These data are provided by the Copernicus Marine Services (CMEMS) numerical model. WRF modifies the SSTs value according to the energy and momentum fluxes at the interface, not taking into account the ocean dynamics, but making each grid point of the evolve as a function of the energy budget at surface. Table 2 shows a summary of the various forecasts and the codes adopted to name them, while Figure  6 shows a scheme of the modelling setup. Table 2. Codes assigned to name the forecasts and description of the data used in each assimilated forecast.

Forecast code Data assimilated
C none (control run)

S
Convectional data from weather stations: pressure, 2-metre temperature 2-metre relative humidity 10-metre wind speed and direction S + R as in S plus reflectivity data from X-and S-band radars, and radial velocity data from the S-band radar  Figure 6. Modelling setup of the WRF control run C and assimilated runs S, S + R and S + R + M.

Quantitative precipitation forecast verification
When evaluating the performance of different forecasts, a quantitative evaluation of the spatial agreement between predicted and observed values is crucial. A direct numerical comparison can be misleading, especially for variables whose spatial distribution is highly concentrated in a small area, as happens during HPEs. In fact, any forecast that correctly predicts the occurrence of a highly localized heavy rain, may incur in the so called "double penalty" error [35] if it places the event in a nearby area, producing, for example, a root mean squared error (RMSE, see [85]) higher than another forecast which completely misses the prediction. To overcome such a limitation, object-based verification methods have been developed by the scientific community [27] and, currently, software packages for practical applications exist [7]. Anyway these methods exhibit some drawbacks, like the smoothing and filtering the observations undergo and the large number of parameters whose setting turns out somewhat arbitrary.
Considering the features of the phenomenon under examination, we have devised a simple and robust ad-hoc verification method, which does not alter in any way the recorded rainfall values. No arbitrary or subjective parameters are required, except for the ones needed to define the area where the event occurred, namely the center and radius of the circle in which the area of interest is assumed inscribed. For the flash flood under examination, we chose a circle with radius 0.4 degrees in the longitude-latitude forecast domain and centered at the point P C with coordinates (x C,lon , x C,lat ) = (10.64 • E, 43.75 • N), about 30 km to the North-East of the Livorno town. This area includes 91 rain-gauges located at points P i (i = 1, . . . , 91), which measured the highest rain rate in the 6-hour period ending on 0300 UTC of the 10th of September (see Figure 2). By applying affine transformations 1 to the 91 position vectors x i = (P i − P C ) ∈ R 2 , we searched Figure 7. The gray circle contains the 91 rain-gauges (coloured points) that registered the heaviest rainfall in the 6-hour ending at 0300 UTC on the 10th of September. The sample black circle contains the transformed (i.e., roto-translated) 91 rain-gauges obtained by varying the parameters d and φ. For each forecast, the verification procedure extracts the predicted precipitation values (samples shown with the red contours) at the black point locations and calculates the root mean squared error against values reported at the coloured points.
for the transformation minimising the RMSE between the cumulative rain r i measured at station P i and the one f (y i ) predicted by the forecast model f at the transformed points y i ∈ R 2 (see Figure 7). Since we applied a roto-translation, any point y i is determined by: 1 More precisely rigid roto-translations preserving the Euclidean distance; these transformations are usually denoted as the SE (2) group.
where φ is the rotation angle, R φ ∈ R 2×2 is the rotation operator and has the form: and x 0 = (x 0,lon , x 0,lat ) ∈ R 2 is the translation vector. All the parameters of the roto-translation are defined in the longitude-latitude space. We note that the distance, in km, between P C and the centre of any circle translated by a vector x 0 is given by: where x 0,lon , x 0,lat and x C,lat are expressed in radians and R is the Earth radius (approximately 6370 km). The minimisation of the positive defined error function where · stands for the average operator over the index i, determines the particular set of parameters for which the agreement between predicted and observed cumulative rain is the highest possible, at least in terms of RMSE. In this way, the location and intensity errors are disentangled, the former given by the affine transformation parameters amplitude (φ and x 0 ), the latter by the minimised RMSE.
To contend with the validation method described above, standard verification scores were computed. Observed rainfall data registered at the 91 rain-gauges were compared with corresponding modelled data, which were extracted at the 91 nearest grid points containing rain-gauge locations. Standard verification statistics considered are: RMSE (defined in Equation 8), mean error (ME) and the multiplicative bias (mbias). Let r i and f (P i ) be the observed and modelled rainfall values at location P i , respectively, then ME and mbias are defined as follows (see also [85]): Furthermore, to summarise multiple aspects of the model performance in a single diagram, we made use of the performance diagrams (see [67]). Such diagrams plot four skill measures of dichotomous (yes/no) forecasts: probability of detection (POD), success ratio (SR), frequency bias (bias) and critical success index (CSI). Using the 2 × 2 contingency table shown in Table 3

Results
In Figure 8, we show with the shaded colours the quantitative precipitation forecast (QPF) in the 6-hour period ending at 0300 UTC of the 10th of September for the four experiments listed in Table 2. In each panel, the gray circle indicates the area containing the 91 rain-gauges (closed gray points) that registered the highest precipitations. The visual inspection of Figure 8 indicates that the C and S forecasts provide QPF values close to zero in the Livorno area. On the other hand, both numerical experiments predict QPF peaks up to 100-120 mm in an area few tens of kilometres (approximately 40-60 km) to the North of the area of interest, where rain-gauges registered accumulated precipitations up to 30-40 mm (see Figure 2a). Both the S + R and S + R + M forecasts predict high QPF values inland of the Livorno area, however, QPF maxima (in the range 125-150 and 80-100 mm for the S + R and S + R + M forecast, respectively) largely under-estimate actual observed values, which reached approximately 200-230 mm (see Figure 2b).
In Table 4, we show the verification skill scores defined in Equations 8, 9 and 10. The scores were computed using observed rainfall registered at the 91 rain-gauges shown in the gray circle of Figure 8 and the corresponding modelled precipitations extracted at the grid points closest to the rain-gauge locations. The S + R and S + R + M forecasts exhibit the more satisfactory scores; in fact the RMSEs are approximately 40/45 mm, whereas C and S errors are higher and reach approximately 65 mm. The MEs and biases demonstrate the underestimation of all the four predictions; such underestimation is remarkable for C and S (ME -50 mm and mbias < 10 −3 ). On the other hand, the S + R and S + R + M forecasts provide more skillful scores, with the mbias close to the perfect score 1 and MEs approximately -2.5/-3 mm.
In Figure 9, we show the performance diagram for selected rainfall thresholds corresponding to approximately the 20th, 40th, 60th and 80th percentiles of the observational dataset. The plots show that, for precipitation thresholds equal to 17, 34 and 49 mm, the S + R and S + R + M forecasts behave similarly and outperform the C and S forecasts, which have no skill since both matching points lie close to the bottom-right corner. As the precipitation threshold equals 83 mm, which approximately corresponds to the 80th percentile of the observations, the S + R forecast doesn't provide any valuable information, since the matching point lies close to the bottom-left corner. The S + R + M forecast holds some skill as regards the bias (approximately 0.5) and the SR score (approximately 0.3), whereas the skill of CSI ( 10) and POD (< 0.2) is limited. Skill scores for precipitations thresholds greater than the 80th percentile of the observations are poorer or null for all the forecasts (performance diagrams not shown).  In each panel of Figure 8, we also show the result of the roto-translation of the gray circle minimising the RMSE between the observed and predicted rainfall data (the transformed circle and rain-gauge locations are indicated with the black colour). The angle of rotation φ is measured, by definition, counterclockwise from the North direction. For each experimental run, the parameters of the RMSE minimising roto-translation are shown in Table 5. Table 5. Parameters of the roto-translation minimising the RMSE between observed and predicted values extracted at the 91 rain-gauge locations located within the gray and black circles, respectively, shown in Figure 8. RMSE min is the error after the minimising roto-translation; x 0,lon and x 0,lat are the coordinates of the translation vector x 0 defined in Equation 5; d 0 indicates the distance, in km, between the centres of the black and gray circles (i.e, the dashed segment of Figure 8) and is given in Equation 7. The angle φ indicates the rotation amplitude with respect to the North direction, counterclockwise. After the roto-translations, all the four experiments exhibit similar RMSE min , with the experiment S providing the lowest value (29.0 mm) and experiment S + R + M having the largest one (30.1 mm). However, the parameter d 0 indicates that the S + R experiment displaces the transformed black circle by approximately 27 km, whereas the other experiments provide higher displacements, ranging from 60 to 100 km. We stress the fact that not only the displacement is lower when radar data are assimilated, but also oriented along the axis of the perturbation hitting the Livorno town (from South-West to North-East). The angle of the rotation φ reported in Table 5 spans from 196 • for S + R to 351 • for S + R + M.

Forecast code
To further evaluate the impact of assimilating conventional and radar data, in Figure 10 we show the 10-metre wind speed and direction at 0000 UTC on the 10th of September for the four numerical experiments. All the four forecasts reconstruct a well defined convergence line over the Ligurian Sea between southerly and westerly winds, which is responsible for triggering convective rainfall. However, in the C experiment, such convergence line is positioned few tenths of kilometres (approximately 70 km) to the North of the Livorno area, causing the wrong displacement of precipitations. The impact of the conventional data (experiment S) is negligible and doesn't modify the position of the convergence line with respect to the control run C (see panels (a) and (b) in Figure 10). On the other hand, assimilating radar data modifies substantially the 10-metre wind speed and direction forecast, with a better localization of the convergence line, positioned close to the area that registered the heaviest precipitations (see panel (c) in Figure 10). The implementation of the simplified ocean model (see panel (d) in Figure 10) generates prefrontal winds (southern part of the convergence line) more intense of about 8 m/s, and a local pressure decrease of about 1.5 hPa (map not shown); as a consequence, the front-genesis evolves more quickly with greater intensity in the event area. To understand how the precipitation field is modified in the experimental runs, in Figure 11, we show the specific humidity averaged over the entire column of atmosphere at 0000 UTC on the 10th of September. We stress the fact that these maps are the results of both the assimilation procedure and the model evolution in the first hours of integration. The S + R run (panel (c) in Figure 11) clearly shows a tongue of high water vapor intensity extending from the northern tip of the Corsica Island to the Tuscany northern coasts (and further inland), which is not simulated nor in the control run C (panel (a) in Figure  11) and in the assimilated run S (panel (b) in Figure 11). The addition of the simplified marine model to the S + R experiment (panel (d) in Figure 11) moves this line of high water vapour content northwards, close to the Ligurian coasts, slightly increasing its intensity, with values greater than 6.3 g· kg −1 .

Discussion
As regards the infamous rainfall event occurred in the Livorno area (Central Italy) during the night between 9 and 10 September 2017, we evaluated possible improvements arising from the assimilation of both conventional and radar data in the setup of the WRF model, operational at the meteorological agency of the Tuscany region (LaMMA) at the time of the event. In addition, we implemented a simplified ocean model, having a relatively low computational cost, to evaluate any further improvements due to a better description of energy exchanges between air and sea, which are known to play a crucial role in the formation and intensification of HPEs [19,39,60]. Such experimental forecasts were compared against a control run, which didn't assimilate any data neither implement the simplified ocean model.
Since no reliable gridded observed rainfall dataset was available for the Livorno case, as ground-truth data, we used the observations collected at the rain-gauges managed by the Hydrological Service of Tuscany [16]. To assess the accuracy of QPFs, we introduced a novel method (see the sketch in Figure 7), aimed at a joint evaluation of both the intensity and position errors of the forecasts, while preserving the recorded rainfall data (i.e., no interpolation or filtering of the observations was performed to comply with the model data). To further validate the predictions, we also computed standard verification scores and constructed performance diagrams for predefined precipitation thresholds. Figure 8a demonstrates that the control simulation C is not able to correctly predict rainfall peaks in the Livorno area; this is not new and confirms the findings of similar studies [36,48,66]. However, the results shown in Figure 8 and Table 5 suggest that the S + R forecast is the more valuable. In fact, looking at the position error, the d 0 parameter is the lowest, indicating the optimal positioning of the transformed black circle close to the original gray circle. We note that the parameter d 0 for the S + R forecast is about one third the length of d 0 for the S forecast and about one quarter the length of d 0 for the C control run, indicating the relatively higher accuracy of S + R with respect to C and S. For all the forecasts, with the exception of the S + R case, the RMSE minimisation transformation moves the circle including the representative rain observation points inland, next to the area with the highest predicted rain and applying a translation with a slight rotation (the parameter φ is close to 360 • ). In the R + S case, on the contrary, the translation moves the circle towards the seaside with a rotation of φ = 196 • that flips coastline and inland observation points, reversing North with South. Indeed the S + R forecast locates the heaviest rain inland, instead of near the coast as actually happened and further South. Consequently the minimisation process tends to rotate the stations circle by about 180 • , flipping coastline and inland rain-gauges, as well as northern with southern ones. The translation towards seaside indicates that the heaviest rain predicted inland is even heavier than the maximum values recorded on the coast. As regards the intensity error, the lowest RMSE min value is obtained for the S forecast; however all the forecasts provide comparable intensity errors (at least for the selected rain-gauges).
Similar conclusions can be drawn looking at the results shown in Table 4 and Figure 9. Both the S + R and S + R + M forecasts outperform C and S, yielding better scores. However, in lack of any assessment regarding the position errors, it is difficult to evaluate which forecast between the S + R and S + R + M forecast provide the more valuable predictions. We deduce that the position/intensity error information shown in Table 5 and Figure 8 confirm and complement the information obtained from standard verification skill scores shown.
Our conclusions agree with those of [36], who found that assimilating radar (and lighting) data provide better QPFs by changing the water vapour mixing ratio during the assimilation step. Incidentally, we note that the performance diagram shown in Figure 9b well agrees with Figure 16f in [36] and provides similar, or slightly better, results, supporting common conclusions. However, we acknowledge that a direct comparison should be treated with care, because ingested radar data and assimilation techniques are different and because the period over which rainfall data are accumulated is shorter (3 vs. 6 hours).
Our findings further confirms [36,48,54,78,87] that 3D-Var radar data assimilation is an effective method for improving QPFs by correcting the initial conditions of limited-area weather numerical models. In fact, concerning the assimilation of radial velocity data (available at the Aleria S-band radar only), Figure 10 suggests that the impact of remote sensed data is beneficial for a better positioning of the low-level winds convergence line. Moreover, the indirect assimilation of radar reflectivity (available at both the Aleria and Livorno radars) and in turn the modification of water vapour mixing ratio (see Figure  11) is supposed to provide an environment that is conducive for convection; this latter carries towards higher atmospheric levels large contents of moisture (see panels (c) and (d) in Figure 11) and energy favoring cloud formation. Thus, water vapour spatial and temporal variation indirectly provides some indications on how the different model setups may impact the event prediction. Our findings agree with those of [87], who discussed how radial velocity assimilation has a large impact on wind velocity, whereas reflectivity data has a direct impact on hydrometeor analyses. We conclude that, since the origin of the Livorno event took place on the sea, the assimilation of conventional data, available only on land, has a low impact on the accuracy of QPF for the S experiment. On the other hand, assimilating radar data, which are available offshore, provides pertinent and crucial information regarding rainfall triggering and atmospheric moisture content. Furthermore, we claim that such remote sensed data modifies the model dynamics in a way that persists few hours (up to 9 hours) after the assimilation step (which ends at 1800 UTC on the 9th of September).
When looking at the outputs of the S + R + M experiment (see panels (d) in Figures 8, 10 and 11), the forecasts are not dramatically affected by the use of the simplified ocean model. This is consistent with the conclusions drawn by [50], who find that the evolution of the SSTs during the model integration has a marginal impact on the short-range predictions (forecast length less than 18 hours). We can deduce that, for the Livorno case, the exchange of both heat and water vapour between air and sea doesn't play a crucial role as also found by [49]. In fact, the authors argued that ingesting SSTs estimates from Sentinel-3 satellite observations into the WRF model do not improve the Livorno QPF for high precipitation rate. This happens possibly because at the kilometre scale, the SSTs determine the intensity of the warm low-level jet [65], which, in our case, was partially corrected by the ingestion of radial velocity radar data. In our S + R + M forecast, it is observed that the heat southerly fluxes are more intense by about 10-15% if compared to S + R data (map not shown). We found that, although the skill scores of S + R and S + R + M are similar (see Table 4 and Figure 9), the simplified ocean model is able to modify the precipitation field by reducing the QPF peaks close to the coast and thus producing a position error (see panels (c) and (d) in Figure 8 and Table 5).
We claim that assimilating radar data turned out to be effective because we used a covariance matrix of the background errors B, that is the result of a long-term application of the NMC method (approximately ten months). In fact, to estimate B, we used the operational forecasts issued twice a day at the regional meteorological service of Tuscany (LaMMA), which share the same setup (in terms of resolution, number of vertical levels, physical parameterisations, etc. . . ) of the runs here presented. This is a remarkable improvement with respect to previous similar studies; to mention a few works [54] and [48] employed a 1-week and a 1-month period, respectively, to compute the B matrix, [36] and [57] applied the NMC method during the Hydrological cycle in the Mediterranean Experiment -First Special Observing Period (HyMeX-SOP1, which lasted for about two months in 2012, see [33] ), whereas [78] used the default matrix provided by the WRFDA system, which is produced with global data and its use for regional cases is sometimes discouraged [72]. Also more recent works (see [56]) used a time period shorter than one month to compute the covariance matrix of the background errors. However, we acknowledge that further investigations are needed to assess the impact of a long-term application of the NMC method to build a high-level quality B matrix and obtain better QPF data.

Conclusions
Although tremendous improvements were achieved in recent years, the operational forecasting of HPEs in the Western Mediterranean basin still remains a challenging task, in particular in the context of a warming climate. As concerns the Livorno case, the main findings of this work are: • assimilating reflectivity data from X-and S-band radars and radial velocity data from S-band radar significantly improves the description of atmospheric humidity content and low-level winds, resulting in better QPFs, • the application of a simplified ocean model, although modifies the low-level jet associated with the event, scarcely impacts the short-range forecast (length shorter than 12 hours) of precipitation, • the novel QPF verification method introduced in this paper, based on roto-translation RMSE-minimisation, confirms and reinforces the results achieved with standard verification scores, adding more information about the position error of the WRF simulations.
The conclusions we drew support the deployment of a dense network of relatively small radars in the coastal areas of the Western Mediterranean Sea. Such high-resolution (both spatial and temporal) data may have beneficial impacts on short-range numerical weather predictions. In fact, by extracting the maximum value from local observations, such as those collected by X-band radars which are not processed by international meteorological organizations, higher quality analyses can improve the description of finer scale features and thus provide better initial conditions to limited-area weather models.