Investigation of Local Weighting Filtering on Randomization Technique Estimates in a Data Assimilation System

Mainstream numerical weather prediction (NWP) centers usually estimate the standard deviations of background error by using a randomization technique to calibrate specific parameters of the background error covariance model in variational data assimilation (VAR) systems. However, the sampling size of the randomization technique is typically several orders of magnitude smaller than that of model state variables, and using finite-sized estimates as a proxy for the truth can lead to sampling noise, which may contaminate the estimation of the standard deviation. The sampling noise is firstly investigated in an atmospheric model to show that the sampling noise has a symmetrical structure oscillating around the truth on a small scale. To alleviate the sampling noise, a heterogeneous local weighting filtering is proposed based on distance-weighted correlation and similarity-weighted correlation. Local weighting filtering is easy to implement in the VAR operational systems and has a low computational cost in the post-processing of reducing the sampling noise. The validity and performance of local weighting filtering method are examined in a realistic model framework to show that the proposed filtering is able to eliminate most of the sampling noise dramatically, the details of the filtered results are more visible, and the accuracy of the filtered results is almost the same as that estimated from the larger sample. The signal-to-noise ratio of the optimal filtered field is improved by nearly 20%. A comparison with the widely used spectral filtering approach in the operational system is considered, showing that the proposed filtering method is more efficient to implement in the filtering procedure and exhibits very good performance in terms of preserving the local anisotropic features of the estimates. These attractive results show the potential efficiency of the local weighting filtering method for solving the noise issue in the randomization technique.


Introduction
Numerical weather prediction (NWP) can be expressed by a nonlinear equation system, and the accuracy of the initial state determines the quality of the prediction to a large extent. The variational data assimilation (VAR) system optimizes the combination of observational data and background information to provide the NWP system with the best estimate of the initial state of the meteorological conditions. The background error covariance matrix has a profound impact on assimilation analysis in the VAR system, because the matrix determines the weighting of the background state, influences the observational data on the multivariate analysis and propagates the balanced relationship of model state variables [1][2][3]. However, there are some challenges in specifying the background error covariance matrix. Firstly, the dimension of the matrix is close to O(10 7 ) in a current NWP system, which is too large to be represented explicitly. Furthermore, the computational cost of the numerical storage and model integration is beyond the resources of current supercomputers. Secondly, an understanding of the statistical properties of the background error probability distribution function, which is described by the covariance matrix, is still lacking.
Mainstream NWP centers generally model the background error covariance matrix in the form of control variable transforms (CVT) in the operational VAR system [4][5][6][7][8][9]. The CVT method models the background error covariance matrix with a relatively small number of parameters, and it is shown how to input the standard deviations of the background error of control variables into the covariance model explicitly. The randomization technique has been widely employed to estimate the standard deviations in numerical meteorological dynamic systems when the covariance matrix is model in the form B = UU T [10,11]. However, the estimates of standard deviations may suffer from a few problems when the truth is replaced with randomization estimates. For practical implementation in atmospheric systems, the randomization estimates are typically deficient due to their finite size, which can lead to sampling error in the covariance model [12].
To address the sampling noise, many filtering studies have been performed with the operational system [13,14]. Spatial averaging filter technology based on the characteristic scale of noise and signals has been successfully implemented for operational filtering [15]. Spatial filtering is effective in removing the sampling noise on a small scale, but it requires the manual tuning of the length at which the sample realizations should be averaged, and it is hard to implement automatically in an operational context [16]. A filtering approach known as objective filtering was developed by Raynaud et al., which is a spectral filtering method that can automatically estimate the truncated wavenumber based on the noise energy spectrum [17]. Objective filtering is able to remove most of the detrimental noise, and it works well in terms of model prediction in the spectral space as a kind of spectral filtering. Thus, objective spectral filtering is widely implemented in VAR systems in current operational centers. However, one of the main shortcomings of this approach is that this homogeneous filter is not adaptable to the local structure of the signal [18,19]. In this paper, we propose local weighting filtering to solve the noise issue in the randomization estimates. The proposed filtering approach not only shows very good performance in terms of eliminating sampling noise while preserving the standard deviation estimates but also presents an excellent ability to preserve the local anisotropic features of the estimates.
The structure of this paper is organized as follows: in Section 2, a brief introduction of the randomization technique and spectral filtering in the operational VAR system is given, and then we present the detailed implementation of the proposed filtering. Then, in Section 3, the application of the local weighting filtering is examined experimentally in the VAR system, and the filtered results on the raw estimates are also evaluated in detail. Finally, conclusions and perspectives are discussed in Section 4.

Filtering Materials
In this section, the randomization technique used in the operational system and the concept of local weighting filtering are introduced, with a discussion of the implementation of the filtering approach in the operational context.

Randomization Technique
The specification of the background error covariance matrix is a key component of the four-dimensional variational data assimilation (4D-Var) system, but it remains a great challenge. The structure of the background error covariance matrix is usually modeled in a CVT form, which can be written as an eigenvector decomposition B = UU T . Therefore, the covariance matrix calculated implicitly by the operator U, which is currently decomposed into the parameter transform U p , horizontal transform U h and vertical transform U v in the operational system [20], can be represented as follows: where the matrix B u is the covariance of the unbalanced control vector variables that are thought to be nearly uncorrelated with each other [21][22][23], and S is the spectral transformation for the variables in spectral space. Considering the constraints between multiple variables, Fisher and Courtier suggested that the standard deviation of background error can be diagnosed by the randomization technique to reduce the impact of the imposition of balance conditions on the correlations of background error. Thus, the covariance matrix can be decomposed with the background error standard deviation (Σ u ): where ξ is generated by the randomization technique and each element of ξ is independent of other elements. The background error standard deviation of the control variables Σ u could be estimated using a random number generator, and the other standard deviations are implicitly given by the parameter transform U p . The processes of the randomization technique in the operational VAR system can be summarized as follows: • Generating a sample of M (i.e., M = 50) random vectors 1 , 2 , . . . , m drawn from a Gaussian distribution with ∼ N(0, 1); • Applying U to each vector i to obtain the sample vectors η i of model states in spectral space; • Transforming the sample vectors η into the grid-point space and calculating the sample standard deviation estimates.
The sample estimates give the approximation of the actual standard deviation in the VAR system.

Local Weighting Filtering and Implementation
The standard deviations of the background error estimated by the randomization technique are contaminated by sampling noise, which is inevitably introduced into the estimates because of the finite sample size. Although the sampling noise is reduced by increasing the sample size, it is unaffordable to employ a sample size greater than 100 for estimation in the operational system. Furthermore, the resource-limited sample size, with a compromise between the accuracy of the estimates and operational time constraints, is far smaller than the dimension of the model states. In the operational VAR system, a spectral filtering method [17] is usually employed to reduce the sampling noise by multiplying a spectral coefficient C spec , which is defined by where c s is the wavenumber in the spectral space, and C tr is the truncated wavenumber. A small wavenumber C tr will cut off a large-scale signal in the given field, whereas a large wavenumber will cut off the information at small scales. The spectral filtering method shows great performance on eliminating sampling noise in operational application, but the truncated wavenumber is an empirical process that needs to be tuned manually based on experience, and the filtering effect is greatly reduced when the sampling standard deviation field is not isotropic.
In our work, a heterogeneous filtering method is proposed based on distance-weighted correlation and similarity-weighted correlation. The local weighting filtering incorporates distance-weighted correlation and similarity-weighted correlation, which can maintain the local anisotropic features of the estimates in the filtering procedure. The distance-weighted correlation is determined with the Gaspari-Cohn function [24], abbreviated as the GC function, which is a compactly supported fifth-order piecewise rational function that has a value that decreases from one to zero by the Euclidean distance between the grid points, shown in Figure 1a and defined as where r is Euclidean distance between two points and c is the cut-off region value, which relates to the similarity threshold.The similarity-weighted correlation is determined by the difference of the standard deviation based on the GC function and used to measure the degree of similarity between the filtering realizations: where std(x, y) is the standard deviation field in the filtering domain and ρ s is the threshold value of the similarity between standard deviation fields, which is set adaptively based on the upper whisker of the neighboring filtering realizations.  Figure 1b shows the similarity-weighted correlation of the vorticity standard deviation field at approximately 1000 hPa obtained by the randomization technique. The local weighting filtering is the Schur product of the distance-weighted correlation and similarity-weighted correlation, w = w r • w s , and the filtering becomes more anisotropic and heterogeneous with the introduction of the similarity-weighted correlations of the standard deviation into the filtering weight, which is shown in Figure 1c. This fusion plays an important role in preserving the local characteristics of the filtered standard deviation results.
The filtered results can be updated by the local weighting function and the neighboring realizations in the filtering domain: The application of the local weighting filtering on the operational VAR system is considered, and the implementation flowchart is shown in Figure 2. The local weighting filtering can run directly in the grid-point space to remove the sampling noise without transforming the filtered results between the grid-point and spectral space and is easier to implement in the statistical post-processing of the VAR system. Another striking feature is that the distance-weighted correlation can be calculated offline because the sample realizations are fixed in the grid-point space; therefore, the filter module has a higher parallel efficiency. Once the filtering domain is determined, the threshold range of filtering realizations is adjusted adaptively by the similarity-weighted correlation, and the local weighting becomes flow-dependent with the statistical properties of samples.

Results and Analysis
In this section, the standard deviation estimated by the randomization technique in the atmospheric model is presented as well as the sampling noise, and the application of the local weighting filtering on removing sampling noise is examined in a realistic framework.

Experiment Settings
Applications of the local weighting filtering approach for alleviating noise are considered with the atmospheric model on a low resolution grid with 91 levels. The samples for estimating the standard deviation are generated using the randomization technique with 10 samples, 50 samples, 100 samples and 1000 samples, respectively. The estimations from 10, 50 and 100 samples are the experimental estimates, and the reference estimates of the standard deviation field are estimated offline with 1000 samples. The background error standard deviations estimates of the 1000 samples are sufficiently accurate to represent the true state, which is never exactly known.
There are two filtering groups of experiments. The raw estimates are calculated from 50 samples using the randomization technique, because it is usually unaffordable to employ a sample size over 100 online for estimation in a realistic system, which necessitates a compromise between accuracy and computational costs. The raw estimates with the postprocessing of the local weighting filtering approach to eliminate the sampling noise are denoted as LWF, and the other experimental group using the spectral filtering to remove the sampling noise is marked with SPEC.

Structure of the Sampling Noise
The randomization technique is widely employed to estimate the standard deviations of background error in VAR systems, but the estimates are inevitably contaminated by the sampling noise because of the finite sizes. The characteristics of the sampling noise generated by the randomization technique have an important influence on the effect of filtering methods to eliminate noise. Thus, the investigation of the spatial structure of the sampling noise is explored firstly.
The sampling noise induced by the randomization technique is obtained by subtracting the reference from the estimates of 10, 50, and 100 samples. Figure 3 show the sampling noise of vorticity, divergence and zonal wind at 1000 hPa with different samples and an enlargement of a specific region, respectively. It can be found that the sampling noise is distributed uniformly in the standard deviation estimates, which is presented to be white noise. The noise distribution in highlatitude areas is larger than that in the low-latitude area, which is related to the projection method of the Earth, in which the high-latitude areas are enlarged when mapped to a plane. The structure distribution of sampling noise is similar with the different model variables in Figure 3a-l, and this feature is more easily observed in the enlarged maps (Figure 3d,h,l). This phenomenon shows that the sampling noise caused by the randomization technique has a similar symmetry form regardless of the model variables. Thus, a unified filtering method can be used to filter the sampling noise in the operational model.
There is another prominent feature worth noting in Figure 3: the absolute extreme value of the sampling noise decreases as the sampling size increases, but the structure distributions of sampling noise are still similar with 10, 50 and 100 samples. For instance, the extreme noise values of the vorticity standard deviation are 7.992 × 10 −6 and −7.05 × 10 −6 with 10 samples in Figure 3a, and the extreme values drop to 3.99 × 10 −6 (−3.27 × 10 −6 ) and 2.35 × 10 −6 (−1.97 × 10 −6 ) in the estimates with 50 ( Figure 3b) and 100 (Figure 3c) samples, respectively. In addition, the extreme noise values of the temperature standard deviation are reduced by half as the numbers of samples increase from 10 to 50, but the distribution of the noise still has very high similarity.
The reduction of the maximum noise shows that it is useful to eliminate the sampling noise by increasing the sample size in the randomization technique. However, due to computational cost limitations, the maximum sample size is also much lower than the model dimension of the realistic system. The similarity of the distribution with different sizes indicates that it is not beneficial to adjust the spatial structure of the sampling noise by increasing the sample size.
The variance in the range of sampling noise on the model level with 50 samples for vorticity, divergence, zonal wind and temperature at 1000 hPa are shown in Figure 4. It appears that the sampling noise displays high-frequency oscillations symmetrically on a small scale in the vertical levels of the models, with a pattern close to white noise, which reflects the fact that the noise generated by the randomization technique is different from the estimates of the standard deviations in terms of the structure distribution. Such a difference in the distribution of characteristics is the fundamental basis for the requirement of the filtering techniques to eliminate noise in post-processing.  Figure 5 presents the raw estimates from 50 samples and the reference and filtered results of the standard deviation for vorticity, divergence, zonal wind and temperature. A great difference can be seen in the spatial distribution of the standard deviation fields with the model variables.The high values of standard deviation for vorticity and divergence are more scattered, and the high values for zonal wind and temperature are more concentrated, especially in the southern hemisphere. However, the low values of standard deviation for divergence are found at middle and low latitudes, and the low values of vorticity are more concentrated in the intertropical convergence zone.

Filtered Results
The spatial distributions of different model variables have their own characteristics, but there are also some commonalities. There is a large number of higher standard deviation values that are more concentrated at high latitudes; in contrast, lower latitudes have smaller standard deviations. Relatively large standard deviation values occurred in the southern hemisphere, and smaller standard deviation values are found in the densely observed land area.
The raw estimates shown in Figure 5 captured the main large-scale features of the standard deviations. However, it is can be noticed obviously that the details of the estimates are strongly deteriorated by small-scale sampling noise. Figure 6 shows the details of the randomization technique estimates in a 1D framework, where the large-scale features of the true field are captured by the raw estimates and the small-scale sampling noise oscillates around the true values.
The previous paragraph shows that the spatial distribution of sampling noise differs from the standard deviation estimates, and that difference is the key to removing the sampling noise. The corresponding filtered results using local weighting filtering are illustrated in the third column of Figure 5. It can be noticed that the sampling noise is eliminated dramatically in the filtered results, and the details of the filtered results are more visible and very close to the reference. The signal-to-noise ratios of the filtered vorticity and divergence results are 26.01 and 24.15 (Table 1), respectively, both of which are improved by nearly 20%. Local weighting filtering improves the accuracy of the estimates of small samples by increasing the independent samples through weighting the neighboring realizations, which has shown potential efficiency in eliminating most of the small-scale detrimental noise.

Heterogeneous Filtering
The local weighting filtering approach introduces the similarity-weighted correlations of the standard deviation through the neighboring samples; thus, the proposed filtering method shows very good performance in terms of preserving the local small-scale structure in small-scale regions.
The filtered results of the standard deviation for vorticity at 1000 hPa with 50 samples obtained with the spectral filtering and local weighting filtering are shown in Figure 7. Comparing the raw estimates in Figure 5a, most of the sampling noise is removed in the filtered results, and the scale of the filtered standard deviation is more visible, almost reaching the accuracy obtained from larger samples. It can be seen in Figure 7a that the spectral filtering tends to smooth the sampling noise and the small-scale features of standard deviation estimates. The local feature information has been smoothed in the vicinity of low values after applying the spectral filtering, while the local anisotropic features of the estimates are preserved reasonably well in the local weighting filtering.  Figure 8 shows the proposed filtering and spectral filtering with correlated and heterogeneous noise in a 1D analytical framework. It may also be noticed that the results of spectral filtering (dashed line) are smoother and the small-scale signals are also greatly weakened at the same time. The local weighting filtering becomes more heterogeneous, and localized filtering in the small-scale area helps to preserve the local structure of the signal to a greater extent than the spectral filtering, as shown in Figure 8c.
Spectral filtering is essentially weighted homogeneous filtering, and the smoothing range is determined by the characteristic scale of the noise and signal. The spectral coefficient is a global quantity and cannot change with spatial position, so spectral filtering cannot preserve the local high-frequency signal in the small-scale regions.

Discussion
The randomization technique is widely employed to estimate the standard deviations in the background error covariance matrix model. The samples used for estimating the background error are generated from a group of random vectors in a Gaussian distribution. However, the sample size in the randomization estimation is far smaller than the dimensions of the model states in atmospheric systems, and this undersampling may have a negative effect on the estimates. To reduce the sampling noise introduced by the finite samples, a heterogeneous filtering method is proposed based on distance-weighted correlation and similarity-weighted correlation in this paper.
The experimental studies of the randomization technique estimates are explored in a realistic model. It can be seen that the main large-scale features are well captured by the raw standard deviation estimates with finite samples, but the sampling noise is distributed symmetrically around the standard deviation estimates. The structure distribution of sampling noise is on a small scale and oscillates around the truth like white noise. More precisely, it can be noticed that the structure distribution of sampling noise has a very high similarity in different sample estimates, which suggests that the extreme values of the sampling noise can be decreased by increasing the sample size, but the noise structure cannot be improved. Secondly, the investigations of the sampling noise show that the spatial structure of the noise is smaller than that of the standard deviation field. This difference motivates the application of filtering techniques to increase the accuracy of the estimates.
The validity of the local weighting filtering is examined in an atmospheric model framework. It has been observed that the sampling noise is eliminated dramatically after applying the local weighting filtering, and the filtered results of the 50 sample estimates are even more accurate than those of the reference. The signal-to-noise ratios of the filtered results are significantly improved-especially the vorticity, which was improved 23.56%and the detailed features destroyed by the noise become clearer.
The performance of heterogeneous filtering has been studied by comparing the effects of the spectral filtering with 50 sample estimates. Both the filtered results removed the noise well, but the results of spectral filtering were smoother than the local weighting filtering. Compared with the spectral filtering, where a great deal of local feature information and small-scale signals are smoothed out, the local weighting filtering shows great performance in terms of characterizing the local anisotropic features of signals. This is because the local weighting filtering incorporates the distance-weighted correlation and similarity-weighted correlation; thus, the weighting of the filtering realizations becomes more anisotropic and heterogeneous in the filtering domain.
Finally, the local weighting filtering is easy to implement in the operational context. The filtered results are calculated directly by using the neighboring realizations and the corresponding weighting. A separate filtering module of the local weighting filtering can be inserted in the flowchart of the existing operational process without affecting the model forecast.
However, this will increase the computational cost as the local weighting filtering obtains better local features by using the neighboring realizations in the grid-point space, compared with the spectral filtering, which only sets a truncated wavenumber. On the other hand, the model resolution will have an impact on the filtered results because the neighboring values are weighted in the local weighting filtering. Future work will systematically explore the relative importance of the model resolution and sample sizes for local weighting filtering. Furthermore, a more automatic and flow-dependent filtering scheme could be interesting to consider in forecasting work for strong convective weather.
Author Contributions: X.X. contributed to the investigation, methodology, experiments and manuscript. B.L. and W.Z. contributed to the theme selection, resources, supervision and manuscript revision. X.C. and J.S. contributed to the analysis, funding acquisition and manuscript revision. All authors have read and agreed to the published version of the manuscript.