## 1. Introduction

Numerical weather prediction (NWP) models are the main source of information on the future state of the atmosphere. As NWP models are based on partial differential equations that describe the evolution of the state of the atmosphere, accurate knowledge of the initial conditions is essential. To address this problem, many methods and approaches have been developed and have become an important part of atmospheric science known as data assimilation (DA) (e.g., [1]).

DA methods were mainly developed in the global NWP model framework and were subsequently adopted in limited-area models (LAMs). A detailed mathematical description of operational DA methods can be found in [2]. A survey of DA methods used in LAMs is presented in [3], along with the challenges in research and development for convection-permitting NWP models. DA combines different sources of information about the state of the atmosphere with the aim of obtaining the best possible estimate of its true state. As all sources of information are imperfect, producing the optimal combination requires that their error statistics be estimated as accurately as possible.

Currently, variational DA is the method of choice for many NWP centers around the world for both LAM and global models. Variational DA seeks the model state (analysis) that is the statistically optimal combination of a background field (usually a short-range forecast) and observations by minimizing a cost function. One of the important components of the DA system is the background-error covariance matrix. It influences the analysis field because it determines the weight of the background field with respect to the observations and determines how the information from observations is spread spatially and temporally to the model grid-point space. Additionally, in a multivariate formulation, the background-error covariance matrix spreads information from one model variable to the others.
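For orientation, the cost function mentioned above can be written in its standard textbook (3D-Var) form. This is a generic sketch rather than the specific formulation of any operational system. The symbols are assumptions introduced here for illustration: $\mathbf{x}_b$ is the background state, $\mathbf{y}$ the observation vector, $H$ the observation operator, $\mathbf{R}$ the observation-error covariance matrix, and $\mathbf{B}$ the background-error covariance matrix.

$$
J(\mathbf{x}) = \frac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathrm{T}}\,\mathbf{B}^{-1}\,(\mathbf{x}-\mathbf{x}_b) + \frac{1}{2}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)^{\mathrm{T}}\,\mathbf{R}^{-1}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)
$$

The analysis is the state $\mathbf{x}$ that minimizes $J$. The first term penalizes departures from the background, weighted by $\mathbf{B}^{-1}$, which is why the structure of $\mathbf{B}$ controls both the weight given to the background and the way observational information is spread in space and between variables.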

In theory, to be able to estimate a background error, the true state of the atmosphere should be known. As this is not possible, one seeks an appropriate surrogate of the background error that should have similar statistical properties; at present, forecast differences are mostly used. The forecast differences are computed either between two forecasts that are valid at the same time but initialized at different times or from an ensemble of forecasts. The so-called NMC method [4] employs the first approach, where differences between 48- and 24-hour or 36- and 12-hour forecasts are usually used to evaluate the climatological background-error covariance matrix (B matrix). The reason for using forecasts separated by 24 hours is that this avoids including errors in the modeling of the diurnal cycle in the background error [5]. The NMC approach is rather simple to use because it requires forecasts that are usually present in the archives, so it was the first choice for many global models and LAMs. Although the NMC method has its potential deficiencies (e.g., [6]), according to Table 4 of [3] it is one of the most widely used methods for obtaining the B matrix. Therefore, interest in studying various aspects of the NMC B matrix has continued in recent years (e.g., [7, 8]). For LAMs, a variant of the NMC method called the lagged NMC method [9] was developed, in which both forecasts, although initialized at different times, use the same global lateral boundary conditions (LBCs), i.e., LBCs from a global run initialized at the same time. Additionally, the shorter forecast uses initial conditions from the global model (interpolated to the LAM grid). The second approach computes forecast differences from an ensemble of perturbed assimilation cycles (e.g., [5, 10, 11]). In the LAM context, such a B matrix is estimated from a sample of differences between ensemble members obtained either from the downscaling of a host ensemble (e.g., [12, 13, 14]) or from an ensemble of perturbed assimilation cycles of the LAM (e.g., [15, 16, 17, 18]). Ensemble-based methods sample forecast differences over some period of time and thus, like the NMC method, provide an estimate of a climatological B matrix. With the increase in available computing power, more attempts have been made to increase the ensemble size and to estimate daily (flow-dependent) B matrices (e.g., [17, 19, 20]). While this approach was found to be beneficial, it is not operationally feasible for many meteorological centers, which is why a climatological B matrix is still widely used.
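The first approach described above (differences between two forecasts valid at the same time) can be illustrated with a minimal sketch. It is not the procedure of any operational system: the function name `nmc_b_matrix`, the array shapes and the synthetic data are all assumptions made for illustration, and no rescaling or spectral modeling of the covariances is attempted.

```python
import numpy as np

def nmc_b_matrix(long_fcsts, short_fcsts):
    """Sample covariance of forecast differences (hypothetical helper).

    long_fcsts, short_fcsts: arrays of shape (n_samples, n_state) holding,
    e.g., 36 h and 12 h forecasts valid at the same times.
    """
    diffs = np.asarray(long_fcsts) - np.asarray(short_fcsts)
    diffs = diffs - diffs.mean(axis=0)             # remove the mean difference
    return diffs.T @ diffs / (diffs.shape[0] - 1)  # unbiased sample covariance

# Synthetic stand-in for an archive of forecast pairs (assumed values).
rng = np.random.default_rng(0)
n_samples, n_state = 90, 5
truth = rng.normal(size=(n_samples, n_state))
long_fcsts = truth + rng.normal(scale=1.0, size=(n_samples, n_state))
short_fcsts = truth + rng.normal(scale=0.5, size=(n_samples, n_state))

B = nmc_b_matrix(long_fcsts, short_fcsts)
print(B.shape)
```

In a real system the state vector is far too large for an explicit matrix, so the covariances are modeled (for example in spectral space) rather than stored directly; the sketch only shows the sampling idea.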

The B matrices obtained by the NMC and ensemble-based estimation methods have been compared for global systems (e.g., [10, 21]) and for LAMs (e.g., [13, 14]). The NMC method suffers from two main drawbacks. First, it includes long forecast ranges (24 or 48 hours), which are usually much longer than those used in the DA cycle. The second drawback is the representation of the analysis step: instead of the analysis differences used in the ensemble method, the NMC method involves analysis increments, as shown in [13]. This leads to overestimation of the correlations in both the horizontal and the vertical. Additionally, in data-poor regions the NMC method is expected to produce small forecast differences, so the background-error variances are likely to be underestimated. While the sampling method has a significant influence on the B matrix characteristics, its influence on the quality of the model forecast was modest for the global models [10, 11, 21]. Previous research has shown that, apart from the sampling strategy, the B matrix characteristics are also influenced by (i) the model resolution (e.g., [12, 16]), (ii) the geographical location (the location of the model domain) (e.g., [10]) and (iii) the weather regime during the sampling period (the seasonal dependency) (e.g., [19]).

This study aims to provide further insights into the differences between NMC and ensemble-based B matrices, with the goal of improving the operational application of those matrices in different data-assimilation systems. To compare different formulations, we estimated three new B matrices, the first using the standard NMC method and the other two using different ensemble-based methods. Because of the computing resources required to run a real-time ensemble of perturbed assimilation cycles (EDA) with many members operationally, our focus was on using EDA to estimate a climatological B matrix over the same period as the NMC method, in order to take into account the influence of the season and weather regime on the B matrix characteristics. Also, to have the same sample size as for the NMC method, we opted to use only two ensemble members. This comparison in the LAM framework differs from most other studies in the field in several respects. First, the NMC and ensemble B matrices were sampled over the same time period. A similar diagnostic comparison in the LAM framework was performed in [13] using samples obtained by downscaling the global model ensemble (as opposed to the ensemble of perturbed assimilation cycles of the LAM used here). It was shown in [16] that the background-error covariances from the downscaled ensemble differed from those calculated from the ensemble of perturbed assimilation cycles of the LAM. The NMC, downscaled global model ensemble and perturbed assimilation cycle LAM ensemble B matrices were also compared in [14], but the sampling in the different methods was performed neither for the same time period nor for periods of the same length. Second, the B matrix was estimated for an NWP model with a 4 km horizontal grid spacing. A similar horizontal resolution was used in [18], but that study dealt only with ensemble methods. Third, the B matrix was estimated for a domain that covers a geographically diverse area of southern Europe, including the Mediterranean Sea, several mountain chains (the Alps, Dinaric Alps, and Apennines), and several plains and lowlands. These topographical features have an important influence on the weather conditions specific to this area and pose challenges to optimal data assimilation [22]. This is likely to result in some differences compared to studies performed in other regions. Fourth, our study used a somewhat smaller domain for the calculation of the B matrix than most of the aforementioned studies. The size of the domain in our experiments resembles the domains of the operational NWP models of many countries in the region, which typically cannot afford large domains because of computational constraints. Because of the small horizontal domain of the model, the influence of LBC perturbations on the characteristics of the ensemble-based B matrix could be enhanced.

The ensemble-based B matrices were estimated from a 2-member ensemble of perturbed assimilation cycles of the NWP model (EDA). The first EDA setup used the same LBCs for all members, while the other setup used perturbed LBCs from the global ensemble. Using the same LBCs for the EDA members was inspired by [14]. Although neglecting the LBC errors led to an unrealistic setup, our approach aimed to test the influence of the LBC perturbations on the characteristics of the B matrix for a relatively small LAM domain, where the influence of the LBCs could be substantial. The statistical properties of the ensemble-based and NMC B matrices were compared, and the influence of using different B matrices in the LAM DA system on the analysis and the quality of the model forecast was assessed. An ensemble-based B matrix with unperturbed LBCs was also estimated and used in [18], with the aim of suppressing the B matrix influence on the large scales in the analysis, as those were included in their LAM from the global model analysis via digital filter blending. As we did not use digital filter blending, suppressing one source of background errors would not be realistic in our forecasting system (except for diagnostic purposes). Therefore, the ensemble-based B matrix with unperturbed LBCs was not used in the verification experiments.
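As a statistical aside on sampling from a 2-member ensemble, the following minimal sketch uses synthetic data to show why differences between two members are commonly rescaled. If both members carry independent errors with the same standard deviation, their difference has twice the error variance, so dividing the differences by $\sqrt{2}$ recovers the single-member error statistics. The variable names and the value of `sigma_b` are assumptions for illustration, and whether such a rescaling was applied in the experiments described here is not stated in this section.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_state = 20000, 3
sigma_b = 0.8  # hypothetical background-error standard deviation

# Two synthetic members: the same "truth" plus independent errors.
truth = rng.normal(size=(n_samples, n_state))
member_1 = truth + rng.normal(scale=sigma_b, size=(n_samples, n_state))
member_2 = truth + rng.normal(scale=sigma_b, size=(n_samples, n_state))

# Var(member_1 - member_2) = 2 * sigma_b**2, hence the 1/sqrt(2) factor.
diffs = (member_1 - member_2) / np.sqrt(2.0)
sigma_est = diffs.std(axis=0, ddof=1)
print(sigma_est)
```

With only two members, each date contributes a single difference field, which is what makes the sample size comparable to that of a forecast-difference method sampled over the same period.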

In Section 2, the methods, the model and the datasets used are described. The results of the diagnostic study and the influence of the different B matrices on the analysis and the quality of the forecasts are presented in Section 3. The main results are summarized in Section 4.

## 4. Summary and Conclusions

Climatological B matrices for the ALADIN-HR4 model were estimated using three error simulation techniques. The characteristics of these B matrices, along with their influence on the analysis and the quality of the forecast, were investigated using spectral and moment-based evaluation in a diagnostic comparison, single-observation experiments and full-observation forecast experiments. The first technique was the standard NMC method. The second and third techniques obtained samples of forecast differences by using the EDA system with a cycling frequency of 6 hours and with 2 members. To test the influence of LBC perturbations, one EDA system had the same LBCs for all members (from the deterministic IFS run), while the other had perturbed LBCs from the global IFS ensemble. For all experiments, the sampling was performed over the same period of almost 3 months.

The examination of the forecast differences using the geographical distribution of the normalized standard deviation showed that neglecting LBC perturbations led to unrealistically small standard deviations near the domain boundaries. It also showed that for the specific humidity, the smallest differences were between the ENS and ENSLBC experiments, which suggests that the humidity field is more sensitive to the method used to sample the forecast error than to the LBCs. For the other variables, especially for the surface pressure and temperature, a notable influence on the standard deviation amplitude came from the LBC perturbations. More importantly, this influence spread over a significant portion of the relatively small ALADIN-HR4 domain. Considering that the B matrix was estimated from temporal and domain averages, the influence of the LBC perturbations could dominate other sources of background errors. The influence of LBC perturbations was confirmed by the diagnostic comparison, which showed that the contribution of the large scales to the shape of the correlation function for the ENSLBC experiment was enhanced and had amplitudes closer to those of the NMC experiment. Nevertheless, the correlation functions from the ENSLBC experiment were shifted to smaller scales compared to the NMC experiment. The ensemble B matrices were further characterized by smaller standard deviations, shorter horizontal length scales and sharper vertical correlations than the B matrix derived using the NMC method, which agrees with previous research results. The influence of the ENSLBC and NMC B matrices on the analysis was tested by performing a single-observation experiment, calculating the mean analysis increments, calculating the fit of the analysis to the observations and assessing the influence of the B matrix on the internal model balances in the analysis.

The single-observation experiment showed that, compared to the NMC experiment, the ENSLBC analysis increments had a smaller spatial extent, with a shape that was broadly similar for all control variables. The most pronounced differences were found for the specific humidity. This agreed well with the mean analysis increments, which differed between the NMC and ENSLBC experiments mostly for the specific humidity, showing apparently excessive moistening of one part of the atmosphere (judging from the fit of the analysis to the observations). A clear benefit of using the ENSLBC B matrix was demonstrated by the model spin-up, which showed that less imbalance was present in the analysis using the ENSLBC method. For both experiments, approximately 3 hours were needed for the model to adjust to the internal balance, and the degree of imbalance was smaller for the ENSLBC experiment.

The quality of the forecast initialized from the local DA system that used different B matrices was assessed by comparing the forecast with in situ data from surface and radiosonde observations during June 2017. The verification results showed that the choice of the B matrix had a small influence on the forecast of the surface parameters. Nevertheless, a statistically significant improvement of the ENSLBC experiment compared to the NMC experiment was found for the mean sea level pressure and for the cloud cover in the first 6–12 hours of the forecast. The differences in the upper-air verification scores were generally not significant, although some slightly better statistics were obtained for the ENSLBC experiment. The 12-hour accumulated precipitation forecasts (accumulation between the +06 and +18-hour forecasts started at 00 UTC) were compared against rain gauge measurements (which measure precipitation at 18 UTC), and better results for the precipitation thresholds of 10 and 30 mm/12 h were obtained for the ENSLBC experiment.

The comparison of the NMC and ENSLBC B matrices, both in a diagnostic sense and in terms of their influence on the analysis and the quality of the forecast, showed that using the ENSLBC B matrix in the variational DA system would be preferred. For all performed comparisons, the most marked differences were noticed for the specific humidity or variables related to it (e.g., cloudiness, precipitation). Smaller, if any, improvement was found for the other variables. With the increased availability of humidity and precipitation information, the sensitivity found in the humidity fields needs to be explored further, as future plans involve the assimilation of humidity observations derived from radar data delivered by the OPERA radar project [39].