1. Introduction
Numerical weather prediction (NWP) models are powerful weather forecasting tools that employ a set of equations describing the flow of fluids. These equations are translated into computer code by combining the governing equations, numerical methods, and parameterizations of other physical processes with initial and boundary conditions. In particular, limited area models (LAMs) are currently used to obtain detailed information over a geographic area of interest. They are driven by global models (GMs) with the specific goal of providing atmospheric variables at very high temporal and spatial resolution. LAM formulations include a variety of parameterization schemes, designed to account in a statistical way for the effects of phenomena that are not described by the governing equations or that take place on unresolved scales. A major source of uncertainty in both GMs and LAMs arises from the large number of unconstrained model parameters associated with parameterization schemes. Several studies have demonstrated the importance of this “parameter uncertainty” by perturbing single and multiple model parameters within plausible ranges determined by expert judgment. Oreskes et al. [
1] showed that sensitivity analysis could help to identify the aspects of the system that need further study and additional information. Beven [
2] stated that the numerical values of parameters to be used in the model must be set properly, in order to ensure that the main features of the highly heterogeneous real domains are properly reflected in the model. Jarvinen et al. [
3] developed a theory to use the existing ensemble prediction infrastructures and operational ensemble simulations for an estimation of model closure parameters. This method was applied by Ollinaho et al. [
4] to the medium-range forecast skill of the ECMWF model HAMburg version (ECHAM5) atmospheric general circulation model. Ihshaish et al. [
5] used genetic algorithms (GA) in order to find an optimal set of values of model closure parameters that appear in physical parameterization schemes. Then, they [
6] tested the same scheme with different GA configurations by varying the initial population size in order to get better predictions. The importance of model tuning has been assessed even for long-term climate simulations [
7,
8]. Since uncertain parameter values are responsible for part of the modelling errors, this uncertainty is constrained by calibration or tuning methods that improve the agreement of the model values with available observations. This process requires highly skilled technical personnel who can identify the most sensitive physical and numerical parameters.
The sensitivity of a LAM to a parameter perturbation has been examined in several studies. Baldauf et al. [
9] presented results of the operational NWP COSMO-LM and the related sensitivity activity performed for the convective scale. In 2017, Voudouri et al. [
10] examined the feasibility of calibrating the COSMO-LM model using an approach based on an objective multi-variate calibration method built on a quadratic meta-model (MM), originally developed in [
11] and then adapted for applications to regional climate models [
12]. The MM is a model emulator that performs a calibration by sampling the parameter space with several COSMO-LM simulations and then fitting a (continuous) quadratic regression in this space. This fit allows the forecast field for a given day/region to be reproduced for any parameter combination (taken within a predefined parameter range). It was found that this method is affordable in terms of computing resources and effective in terms of improved forecast quality. Subsequently, Voudouri et al. [
13] applied the proposed methodology for the calibration of COSMO-LM at high horizontal resolution (2 km) over a domain including Switzerland and Northern Italy. They found that this method allows a temperature bias reduction of about 0.2 °C and an improvement of the overall performance of the model. However, it should be noted that the results indicate a relatively low benefit with respect to the computational cost of the method, which remains expensive for regular use of the calibration procedure. The possibility of using automatic model calibration platforms was also investigated by Duan et al. [
14], who developed a platform called “Uncertainty Quantification Python Laboratory”. This platform reduces the number of tunable parameters to a tractable level and implements an optimization algorithm that uses only a small number of model simulations. This is achieved by constructing a meta-model that represents the error response surface of the dynamical NWP model using a finite number of simulations.
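As a rough illustration of the quadratic meta-model idea described above, the following Python sketch fits a full quadratic response surface to a set of sampled parameter combinations by ordinary least squares. It is not the implementation used in the cited works: the function names, the use of a single scalar score per simulation (instead of a full forecast field), and the assumption that parameters are normalized to [-1, 1] are all illustrative choices.

```python
from itertools import combinations


def quadratic_features(x):
    """Feature vector [1, x_i, x_i^2, x_i*x_j (i<j)] of a full quadratic in x."""
    feats = [1.0]
    feats += list(x)
    feats += [xi * xi for xi in x]
    feats += [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    return feats


def solve_linear(a, b):
    """Solve the square system a c = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate design: add or move sample points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    c = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][k] * c[k] for k in range(r + 1, n))
        c[r] = (m[r][n] - s) / m[r][r]
    return c


def fit_metamodel(samples, scores):
    """Least-squares fit via the normal equations (X^T X) c = X^T y.

    samples: list of normalized parameter tuples (hypothetical inputs).
    scores:  one scalar per simulation, e.g. a forecast error (assumption).
    """
    X = [quadratic_features(x) for x in samples]
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(X, scores)) for i in range(p)]
    return solve_linear(xtx, xty)


def predict(coeffs, x):
    """Evaluate the fitted quadratic surface at parameter combination x."""
    return sum(c * f for c, f in zip(coeffs, quadratic_features(x)))
```

Once fitted, `predict` can be evaluated at any parameter combination inside the sampled range at negligible cost, which is what makes an emulator attractive compared with rerunning the full model. The normal-equations solver is chosen here only to keep the sketch dependency-free; a numerically more robust least-squares routine would be preferable in practice.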
In the present work, the results of a sensitivity analysis performed with COSMO-LM over a domain located in southern Italy at 0.009° spatial resolution (about 1 km) are discussed and analyzed. At this very high resolution, convection-resolving NWP models pose further challenges in the process of configuration optimization. The sensitivity analysis, carried out through a tuning procedure, aims to select the parameters that have been shown to play a significant role in determining the model response [
12]. As stated in [
13], the computational resources required for the application of automatic calibration methods are rather heavy due to the high number of parameters to be considered and the related number of simulations to be performed. For example, the cost of the method proposed in [
13] is associated with the number of simulations required to fit the meta-model: as demonstrated in [
11], the minimum number of simulations for calibrating n parameters is equal to 2n + n × (n − 1)/2. Thus, calibrating a model over an entire year at a fine resolution, such as 1 km, requires a considerable amount of computing power. This is why the choice of key parameters plays a crucial role in avoiding a needless increase in computational cost.
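To make the growth of this cost concrete, the short Python sketch below evaluates the expression 2n + n × (n − 1)/2 quoted above. The function name is hypothetical and the snippet only restates the formula; it says nothing about the cost of each individual simulation.

```python
def min_simulations(n):
    """Minimum number of simulations needed to calibrate n parameters,
    following the expression 2n + n(n - 1)/2 quoted in the text."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # n * (n - 1) is always even, so integer division is exact.
    return 2 * n + n * (n - 1) // 2
```

For n = 3, 5, and 10 parameters the formula gives 9, 20, and 65 simulations, respectively, so the quadratic term quickly dominates as more parameters are included.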
The importance of model calibration has been acknowledged by the COSMO Consortium through the establishment of the priority project CALMO (2013–2016), aimed at developing a method supporting an objective calibration of the input parameters of the model [
15]. In particular, the asymptotic turbulence length scale, the mean entrainment rate for shallow convection, and the surface area index of evaporative soil surfaces were the objects of tuning. In this framework, an optimized set of parameters was evaluated for a domain centered over Central Europe, at a resolution of about 2.2 km, and also applied to smaller domains, e.g., Northern Italy [
16]. Moreover, the priority project CALMO-MAX has been in progress since 2017, aiming to extend and consolidate the findings of the previous project, with an additional focus on extreme events. In this context, this work aims to contribute to the selection of the model parameters (and their values) that prove most effective for a proper representation of intense weather events. For this reason, the model sensitivity analysis was carried out considering periods in which the area under study was affected by severe weather conditions.
The paper is organized as follows:
Section 2 contains a description of the model set-up and of the test case considered; in
Section 3, a description of the methodology used is reported; in
Section 4, results are presented and discussed. Concluding remarks are given in
Section 5.
4. Results
4.1. Analysis in Terms of Temperature
The first analysis was performed on the hourly time series of T2m, comparing the data provided by the CIRA ground station with model data obtained with the different configurations at the grid point nearest to the CIRA site.
Figure 2 shows the time series of T2m for observational and model data (reference configuration c0 and sensitivity configurations from c1 to c8) over a period of 96 h, from 01:00 3 November 2017 to 24:00 6 November 2017. In a similar way,
Figure 3 shows the time series of T2m for observational data and model data (reference configuration c0 and sensitivity configurations from c9 to c18). For a better quantification of the performance of the sensitivity runs,
Table 3 shows the daily mean T2m values for each day considered in the 2017 and 2018 events for both observational and model data (all configurations). The maximum deviation from the reference simulation with the initial condition ensemble (cP) is also shown. The last row contains the average cumulative errors with respect to observations. In a similar manner,
Table 4 and
Table 5 show, respectively, the daily maximum and minimum T2m values. It is evident that, with the c0 configuration, the maximum value of the first day is well reproduced, while it is underestimated on the other days. Mean and minimum values are overestimated on all days. A good improvement in the representation of maximum values is achieved with c7, which yields a null bias on 6 November 2017 and a bias reduction on 5 November 2017. Configuration c15 yields a good improvement on 4–5 November 2017 and 19–20 November 2018, but overestimates the maximum values on 3 and 6 November 2017. Good improvements are also achieved for the minimum values; in particular, c7 and c15 allow a better representation of the days of the 2017 event, while c17 allows an improvement on 4 and 6 November 2017. Moreover, c1 performs better on 3 November 2017, while c14 and c16 perform better on 19 November 2018. Finally, for all the cases considered, the effects of internal variability (cP) are limited to small variations (about 1–2%).
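The daily statistics and biases discussed above can be derived from the hourly series along the following lines. This is a minimal Python sketch, assuming complete hourly series that start at the first hour of a day; the function names are illustrative, and the sign convention (model minus observation, so that positive values mean overestimation) is an assumption rather than something stated in the tables.

```python
from statistics import mean


def daily_stats(hourly, hours_per_day=24):
    """Split an hourly series into days and return (mean, max, min) per day."""
    if len(hourly) % hours_per_day != 0:
        raise ValueError("series must cover whole days")
    days = [hourly[i:i + hours_per_day]
            for i in range(0, len(hourly), hours_per_day)]
    return [(mean(day), max(day), min(day)) for day in days]


def daily_bias(model_hourly, obs_hourly):
    """Model-minus-observation difference of the daily mean, max and min.

    Positive values indicate overestimation by the model (assumed convention).
    """
    model_days = daily_stats(model_hourly)
    obs_days = daily_stats(obs_hourly)
    if len(model_days) != len(obs_days):
        raise ValueError("model and observations must cover the same days")
    return [tuple(m - o for m, o in zip(model_day, obs_day))
            for model_day, obs_day in zip(model_days, obs_days)]
```

Applied to a 96 h series such as the one considered here, `daily_bias` would return four triples, one per day, which can then be compared across configurations.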
To overcome the limitation of an analysis restricted to a single spatial point, the effects of the parameter perturbations were investigated over a wider area, comparing model data with SCIA daily observational data. More specifically, six stations located in southern Lazio (namely Arpino, S. Elia Fiumerapido, S. Giorgio a Liri, Formia, Frosinone and Alvito) were considered. For each station, the nearest grid point was selected. Results are presented in terms of average values (average observational values and average model values for the different configurations). In detail,
Table 6 shows the maximum daily values of T2m for 5 and 6 November 2017 and 19 and 20 November 2018. Similarly,
Table 7 shows daily minimum values. The results highlight that the maximum values are always underestimated by c0, but improvements are achieved with several configurations, in particular with c17 for 5 November 2017, with c1 and c17 for 6 November 2017, and with c3 for 20 November 2018. Looking at the individual stations (values not shown in the tables), considerable improvements for the maximum temperature (bias reduction of 0.6 °C) are achieved in Arpino with c3 and c18, and in S. Giorgio a Liri with c7 and c17. Significant improvements for the minimum temperature (bias reduction of 0.5 °C) are achieved in Arpino with c7 and c17 and in Alvito with c15.
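The nearest-grid-point selection and station averaging used here can be sketched as follows. The Python snippet assumes a regular, unrotated latitude/longitude grid described by two one-dimensional coordinate lists, which may differ from the actual model grid geometry; all function names are hypothetical, and the brute-force search is chosen for clarity rather than speed.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_grid_index(lat, lon, grid_lats, grid_lons):
    """Return (i, j) of the grid point closest to the station at (lat, lon)."""
    candidates = ((i, j)
                  for i in range(len(grid_lats))
                  for j in range(len(grid_lons)))
    return min(candidates,
               key=lambda ij: haversine_km(lat, lon,
                                           grid_lats[ij[0]], grid_lons[ij[1]]))


def station_mean(field, stations, grid_lats, grid_lons):
    """Average a 2D model field over the grid points nearest to each station.

    field[i][j] is the model value at (grid_lats[i], grid_lons[j]);
    stations is a list of (lat, lon) pairs.
    """
    values = []
    for lat, lon in stations:
        i, j = nearest_grid_index(lat, lon, grid_lats, grid_lons)
        values.append(field[i][j])
    return sum(values) / len(values)
```

With a grid spacing of 0.009°, the selected grid point lies within roughly half a grid cell of the station, so differences in altitude between the station and the model orography can still matter in complex terrain.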
Even if observational data are not available over the whole domain, it is interesting to analyze how the sensitivity configurations modify the distribution of temperature with respect to the reference one.
Figure 4 shows the T2m distribution for 5 November 2017 at 06:00 (hour 54, corresponding to a minimum value) for the reference configuration c0, together with the differences with respect to c0 obtained with the configurations that produced the largest improvements (i.e., c1, c7, and c17). The CIRA site is highlighted with a blue dot. The data already discussed show that c0 generally overestimates the minimum temperature. The maps of
Figure 4 highlight that c7 and c17 provide lower values over the whole domain (generally reducing the bias), while with c1 benefits are confined to smaller areas, close to the CIRA site. Similarly,
Figure 5 shows the distribution of mean T2m for 3 November 2017 for c0, and the differences obtained with c7, c15, and c17 with respect to the reference configuration. As already noted, c0 overestimates mean T2m by about 1.2 °C at the CIRA site, but the other three configurations significantly reduce this bias. A reduction of temperature is observed over the northern Campania region (eastern part of the domain), while warmer temperatures are found in the northern part of the domain, over the sea, and along the coastal area.
4.2. Analysis in Terms of Precipitation
The second analysis was performed considering daily precipitation values, comparing data provided by ANCE and SCIA with model data obtained with the different configurations. On 3–4 November 2017, observed precipitation values are almost zero everywhere, and these null values are well reproduced by all the model configurations (considering the nearest grid point). On 5–6 November 2017 and 20 November 2018, high precipitation was observed.
Table 8 shows the daily precipitation values on these days, averaged over the whole network of 76 ANCE stations, for observational and model data (all configurations). The maximum deviation from the reference simulation with the initial condition ensemble (cP) is also shown. These values reveal that c0 underestimates the observed value and that improvements are obtained with the configurations c3, c15, c16, and c18. These are all characterized by rlam_heat at its minimum. For all the cases considered, the effects of internal variability (cP) are limited to small variations (about 2–3%). Then, more specifically, we analyzed the behavior of the model at selected stations at different altitudes, namely Benevento (135 m), Grazzanise (12 m), Montemarano (800 m), and Giffoni (250 m).
Table 9 and
Table 10 show the observed and model values for 6 November 2017 and 20 November 2018, respectively, for these stations, with the configurations (i.e., c3, c15, c16, c18) that provided the best results. In Benevento and Grazzanise (low-altitude sites), precipitation is largely underestimated by the reference configuration (by more than 50%), but notable improvements are achieved with c15 and c18. In Montemarano (high altitude), precipitation is slightly underestimated by c0 and improved by c18. In Giffoni (medium altitude), the underestimation with c0 exceeds 50%, but improvements are obtained with c16 and especially with c18 in this case as well.
Table 11 shows the daily values of precipitation for 5–6 November 2017 and for 19–20 November 2018 (observational values and model data with the different configurations), averaged over the six stations taken from the SCIA dataset (already considered for the temperature analysis). Significant improvements are achieved with the same configurations already mentioned for ANCE data, namely c3, c15, c16, and c18. Looking at the individual stations (data not shown), precipitation is always underestimated with the reference configuration, with the exception of Frosinone.
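Percentage underestimations such as those quoted above correspond to a relative bias, which can be computed and compared across configurations as in the following Python sketch. The function names are hypothetical, and ranking configurations by the absolute relative bias of the network-averaged value is only one possible criterion, assumed here for illustration.

```python
def relative_bias_percent(model, observed):
    """Relative bias in percent; negative values mean underestimation."""
    if observed <= 0:
        raise ValueError("relative bias is undefined for non-positive observations")
    return 100.0 * (model - observed) / observed


def closest_configuration(model_by_config, observed):
    """Name of the configuration whose value is closest to the observation.

    model_by_config maps a configuration label to its (e.g. network-averaged)
    daily precipitation; the labels and values are supplied by the caller.
    """
    return min(model_by_config,
               key=lambda name: abs(relative_bias_percent(model_by_config[name],
                                                          observed)))
```

For instance, with made-up numbers, a modelled value of 20 mm against an observed 50 mm gives a relative bias of −60%, i.e., an underestimation of more than 50%.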
Even if gridded data over the whole area are not available, it is interesting to analyze how the sensitivity configurations modify the precipitation distribution with respect to the reference one.
Figure 6 shows the precipitation distribution for 6 November 2017 obtained with the reference configuration and the differences obtained with c3, c16, and c18 with respect to the reference one. These maps confirm that c3 and c18 are able to increase the average precipitation over wide areas of the domain. In particular, c3 and c18 (both characterized by rlam_heat at its minimum) provide larger increases over low-altitude areas, while enhancements are less pronounced in high-orography zones. Configuration c16 provides increases over southern Lazio (where the network of SCIA stations is located, see also
Table 11), while no (or small) variations are found over Campania. We also analyzed the maximum precipitation value over the entire simulated domain (values not shown), finding that it varies strongly among the configurations, ranging from minimum values (c5 and c7) to maximum values (c6 and c13).
4.3. Discussion
The previous analysis has revealed that variations in radqc_fact, fac_root_dp, and kexpdec produce very slight (or null) modifications of both T2m and precipitation values. Regarding the effects of variation of the other parameters, the following general considerations can be drawn on the basis of the results obtained:
A reduction of tkhmin causes a decrease of minimum and mean T2m. Stratification is made more stable, leading to a decrease of night-time air temperature. On the other hand, its increase causes a general increase of temperature, especially of the minimum value (up to 1.5 °C). In fact, an increase of tkhmin implies that the turbulent kinetic energy is maintained in stable conditions, eliminating strong inversions [
10]. A reduction of tkhmin does not cause variations in precipitation, while its increase causes precipitation to grow, since it increases the small convective cloudiness [
24].
A reduction in rlam_heat causes a slight increase of T2m, while its increase does not modify the values of temperature with respect to c0. Generally, an increase of rlam_heat will increase the heat fluxes upward from the warm surface, leading to a larger heating of the lower atmosphere. This effect is more evident during the summer, while in the present test case the effects are less evident. In any case, a non-optimized setting of this parameter does not result in large temperature errors, since the implemented soil moisture scheme adjusts the soil moisture value accordingly. A reduction of rlam_heat causes the largest increase of precipitation, while its increase causes a reduction. In fact, the reduction of this parameter causes an increase of instability, leading to more precipitation. This effect is more evident for convective precipitation during summer.
An increase in v0snow causes a modest increase of precipitation. Variations have slight effects on the minimum value of T2m only, since this parameter is related to the temperature in an indirect way. In fact, the microphysical processes have an impact on the thermodynamics and the hydrological cycle via both direct and indirect feedback mechanisms.
A reduction in uc1 causes an increase of the maximum temperature and a reduction of the minimum, while an increase causes a slight reduction of the maximum temperature and a slight increase of the minimum. A reduction of uc1 increases the critical relative humidity level above the boundary layer, resulting in an increase of mid-level cloud formation, leading to an increase of maximum temperatures [
18]. Variations in uc1 do not provide significant changes in precipitation.
The numerical results related to the four interaction simulations performed (defined in
Table 2) allow drawing the following general considerations:
c15 provides an increase of the maximum temperature and a slight reduction of the minimum and mean T2m. It also provides an increase of precipitation. This configuration is therefore able to improve the representation of both variables.
c16 provides slight improvements in the representation of precipitation.
c17 provides a slight increase of the maximum T2m and a slight reduction of the minimum and mean T2m, improving the simulation of temperature.
c18 is able to increase precipitation (improving its representation), along with an improvement of maximum temperature.
Analyzing the whole set of simulations performed, it emerges that the best improvement for the representation of the mean T2m is achieved with uc1 at its minimum (c7), and even better when combined with rlam_heat at its minimum (c15). Configuration c18 (rlam_heat at its minimum and v0snow at its maximum) provides a good representation of precipitation. In summary, even if the selection of the best configuration is beyond the purposes of the present work, configuration c15 combines the best improvement for the maximum T2m, a slight improvement for the minimum T2m, and the best representation of precipitation.
5. Conclusions
In this paper, the results of sensitivity experiments performed with the COSMO-LM model at very high resolution, over a domain located in southern Italy, are presented. The main aim of this work was to establish a hierarchy of parameter sensitivity that could be useful for applying more advanced optimization techniques, such as those based on meta-models. It was observed that some parameters have a strong impact and could themselves be the subject of further investigation, for a better understanding of the main physical processes in the atmosphere. Evaluation was performed in terms of temperature and precipitation against observational data provided by ground stations.
The present investigation revealed that, over the area considered, the results of COSMO-LM in terms of temperature and precipitation show a great sensitivity to changes related to the physical parameterizations of soil and atmosphere. The effects of internal variability on the analysis presented were estimated and found to produce only small variations. It was found that, for the domain considered, better results in terms of temperature are obtained by choosing the minimum value of the parameter controlling the vertical variation of critical relative humidity for sub-grid cloud formation, and even better when combined with the minimum value of the factor for laminar resistance for heat. This configuration allows a good improvement in terms of temperature bias (up to 0.5 °C) over this complex orographic area. Positive effects are also observed when reducing the minimal diffusion coefficient for heat. Precipitation is generally underestimated by the reference configuration. These biases are partially due to shortcomings of the model in simulating some climate features of the area considered, along with deficiencies in the lateral boundary conditions and internal variability. Improvements in terms of precipitation for this area can be achieved by setting the factor for laminar resistance for heat at its minimum value. An increase in the value of the factor for vertical velocity of snow also provides a positive effect on precipitation. Further adjustments are needed in order to improve the spatial distribution of both temperature and precipitation. In particular, an approach based on an objective multi-variate calibration method built on a quadratic meta-model could be a useful tool for finding a better combination of the values of the tuning parameters, also with respect to additional meteorological fields (e.g., cloud cover). The present results are valid for the considered mesh, while different (coarser) resolutions would require specific analyses.
The findings of the present work will also be applied to the configuration of ICON [
35], a new model based on an icosahedral grid that will replace COSMO in the coming years.
That said, a detailed analysis of the capabilities of COSMO-LM in reproducing extreme events over this area must consider additional periods and events in order to achieve statistical significance. This is suggested as the topic of future work.