Comparison of Statistical Dynamical, Square Root and Ensemble Kalman Filters

We present a statistical dynamical Kalman filter and compare its performance to deterministic ensemble square root and stochastic ensemble Kalman filters for error covariance modeling with applications to data assimilation. Our studies compare assimilation and error growth in barotropic flows during a period in 1979 in which several large scale atmospheric blocking regime transitions occurred in the Northern Hemisphere. We examine the role of sampling error and its effect on estimating the flow dependent growing error structures and the associated effects on the respective Kalman gains. We also introduce a Shannon entropy reduction measure and relate it to the spectra of the Kalman gain.


Introduction
A central problem in data assimilation is how best to model the error covariance matrices for the background states and analyses.The aim of data assimilation is to obtain a near optimal estimate of the state of the atmosphere, based on observations and on short term forecasts, that provide the so-called background states with information in the data void areas.Data assimilation methods can be regarded as generalized filters that extract the signal from noisy observations and produce assimilated data sets at each required time and with complete coverage at the required resolution.The accuracy of numerical weather prediction models depends critically upon data assimilation for specifying the initial conditions.Model initialization methods include variational data assimilation (Talagrand & Courtier (1987) [73], Courtier & Talagrand (1987) [13], Courtier et al. (1994) [14], Rabier et al. (1998a) [66], Rabier et al. (1998b) [67], Rabier et al. (2000) [68]), in which an optimal initial condition is found that minimizes differences between observations and model solutions over a given finite time interval, and nudging data assimilation, where the model state is relaxed toward observations dependent on empirically or optimally determined relaxation coefficients (Zou et al. (1992) [81], Vidard et al. (2003) [77]).The Kalman filter approach to data assimilation has become a possible alternative to variational data assimilation due to the development and application of the ensemble Kalman filter (Evensen (1994) [21], Evensen & van Leeuwen (2000) [23], Evensen (2003) [22] and references therein) and ensemble square root filter (Andrews (1968) [3], Tippett et al. (2003) [74] and references therein) methodologies.
Schneider & Griffies (1999) [71], Roulston & Smith (2002) [69], DelSole (2004) [18] and Kleeman (2007) [48] have extensively applied the use of information theoretic functionals to measure the information content of probabilistic numerical weather prediction (NWP) models.In numerical weather prediction the crucial question is how to estimate the current state of the system given observations of the past state.Such an estimate must be obtained recursively in time utilizing new observations as they become available.Zupanski (2005) [82] developed an ensemble data assimilation methodology whereby the information content of observations is considered in terms of the signal and entropy reduction analogous to analysis and forecast covariances respectively (see Shannon & Weaver (1949) [70]).These quantities are then used to determine the maximum likelihood state solution by minimizing a cost function.Eyink & Kim (2006) [24] employed maximum entropy as a basis for selecting weights, means and covariances to model prior distributions using mixture models of weighted sums of Gaussians.The current interest in information theoretic approaches in the weather prediction and data assimilation community has to some degree been motivated by the desire to develop methodologies that go beyond assumptions of linearity and Gaussianity.O'Kane & Frederiksen (2008) [63] quantified the role of non-Gaussian cumulants in ensemble weather prediction using statistical closure methods; in this current work we extend their ensemble prediction method to data assimilation and demonstrate how this methodology relates to the concepts of Shannon entropy and maximum entropy.A detailed discussion of the relationships between entropy and statistical mechanical equilibrium states can be found in Frederiksen & O'Kane (2008) [34].
In numerical weather prediction and data assimilation there are a number of outstanding issues that need to be addressed.For large numerical prediction models, with millions of variables, calculating the full covariance matrices of the background errors as they evolve with time remains an intractable problem.For this reason current data assimilation schemes frequently assume that the background error covariances are homogeneous and even isotropic (Snyder et al. (2003) [72]).In addition the background error covariances are often assumed to be stationary as is the case for the National Meteorological Center's (NMC) spectral statistical-interpolation (SSI) analysis method developed by Parish & Derber (1992) [64]; see also Dee (1995) [17], Rabier et al. (1998b) [67], Derber & Bouttier (1999) [19], McNally et al. (2000) [57], & Buehner (2005) [11], and in the the 3-D variational operational method developed at ECMWF (Courtier et al. (1998) [15]).In the spectral data assimilation method used at the NMC it is assumed that the background error matrices describe horizontally homogeneous correlations for the field variables with correlations in the vertical.More recently 4-D variational (4D-VAR) methods have been implemented operationally, such as at ECMWF (Lorenc (2003) [54], Rabier et al. (2000) [68]), which have been shown to be equivalent to a fixed-interval Kalman smoother (Ménard & Daley (1996) [58]).In common with 3D-VAR (Andersson et al. (1998) [2]), current 4D-VAR methods employ the use of a static background error covariance matrix independent of the background field itself and tuned to ensure balance for any given observational network as discussed in Beck & Ehrendorfer (2005) [4] and Derber & Bouttier (1999) [19].In 4D-VAR the use of a stationary background error covariance model at the beginning of each assimilation cycle means that the full flow dependent background error covariances cannot be propagated from one data assimilation cycle to the next (Lorenc (2003) [54], Wang et al. (2007b) [79]).In both 3D and 4D-VAR this stationary background error covariance model is often nearly homogeneous and isotropic (see Wang et al. (2007b) [79] and references therein).
In both the ensemble Kalman filter (EnKF) (Evensen (1994(Evensen ( , 2003) ) [21,22]) and the ensemble square root filter methodology (EnSF) (Andrews (1968) [3], Tippett et al. (2003) [74]) estimates of the flow dependent background covariance matrices are constructed from ensembles of short term forecasts with slightly differing initial conditions.In the EnKF approach an ensemble of model states is propagated with a fully nonlinear model allowing the error covariance matrix to be calculated with no moment closure required.This allows the construction of the forecast error covariance at any given time by averaging over the ensemble.Because it is generally only practicable to perform of the order of a hundred background forecasts using current operational numerical weather prediction systems additional assumptions must be made about the nature of the background correlations.Such assumptions are required to reduce sampling errors.Currently hybrid data assimilation methods have been trialled in which an ensemble of forecasts is used to calculate flow dependent background error covariances and then augment those from a 3D-VAR data assimilation scheme (Hamill & Snyder (2000) [39]) or optimal interpolation scheme (Wang et al. (2007b) [79]).
Sampling error is one of the fundamental issues that affects our ability to estimate the posteriori state of the atmosphere.Typically, sampling error is caused by an insufficient number of realizations such as occurs when the sample size is less than the number of degrees of freedom.As well, sampling error may arise through the use of perturbed observations.One common approach to reducing sampling error is to assume that background error correlations vanish for points separated by a distance greater than several thousand kilometers (Hamill et al (2001) [40], Buehner (2005) [11]).In these approaches the effectiveness of the assimilation schemes is found to depend on the choice of the localization length scale (see Buehner (2005) [11] and references therein).As well, to ensure that the filtering solution does not diverge from the observations and to keep prior covariances small, it is generally necessary to employ an empirically selected covariance inflation factor (Anderson (2001) [1], Hamill et al (2001) [40]).
Recently alternatives have been introduced that promise to overcome some of the sampling problems associated with the EnKF.Anderson (2001) [1] developed the ensemble adjustment Kalman filter (EnAKF) and applied it to a non-divergent spherical barotropic model.Andrews (1968) [3] formulated the square root Kalman filter and Tippett et al. (2003) [74] showed the relationship between various ensemble square root filter variants including the EnAKF and ensemble transform Kalman filter.Whitaker & Hamill (2002) [80] examined sampling error in a comparison of the EnKF to the EnSF.A recent computationally efficient square root filter variant enabling higher resolution studies called the local ensemble transform Kalman filter (LETKF) has been tested by Miyoshi & Yamane (2007) [61].These assimilation methods share a common approach based on the use of unperturbed observations and form a class of deterministic ensemble square root filters; see Tippett et al. (2003) [74] and references therein.
The calculation of the error covariance of fields described by an evolving nonlinear system requires the solution of an infinite hierarchy of moment equations; this problem is formally identical to the closure problem that arises in statistical approaches to turbulence.Typically Kalman filter methods simply discard cumulants of third order and higher, although these cumulants are often required to ensure that regime transitions in nonlinear models are accurately tracked (Miller et al. (1994) [59], Bengtsson et al. (2003) [5]).More generally it would seem to be desirable to develop a tractable data assimilation scheme that incorporates information about the higher order cumulants.
To treat the general statistical problem of mean fields interacting with inhomogeneous turbulence and topography more general closures are needed.Frederiksen (1999) [27] formulated the quasi-diagonal direct interaction approximation (QDIA) closure for the interaction of general mean flows with inhomogeneous turbulence and topography on an f -plane.O'Kane & Frederiksen (2004), hereafter OF2004 [62], studied the performance of the QDIA, including computationally tractable variants, and found that in regularized form it is in excellent agreement with the statistics of DNS at all scales.Frederiksen & O'Kane (2005), hereafter FO2005 [33], generalized the QDIA theory and numerics to apply to the interaction of inhomogeneous turbulence with Rossby waves, mean flows and topography on a β-plane.The QDIA has also been successfully applied to ensemble prediction studies that have focused on the statistics of error growth during blocking transitions (O'Kane & Frederiksen (2008) [63]).The QDIA closure solves prognostic equations for the statistics of the mean field and the inhomogeneous covariance matrices of the fluctuations or errors.It is called the quasi-diagonal DIA because the off-diagonal elements of the covariance matrix (or two-point cumulant) in spectral space are expressed in terms of the diagonal covariances and response functions, as are the elements of the three-point cumulant.The QDIA theory is a suitable frame work for formulating a statistical dynamical Kalman filter (SDKF) methodology since it provides prognostic equations for the mean field, inhomogeneous covariance matrix and three-point cumulant without sampling error.
The Kalman filter schemes employed in this paper use gains that are diagonal in spectral space to rescale the equations for the mean field and the errors; thus the gains are homogeneous but anisotropic.The elements of the error covariance matrix in spectral space are then scaled by products of gains at the different wavenumbers and similarly for the three-point function.
A major aim of this paper is to examine covariance error estimation comparing the performance of the statistical dynamical filter in which sampling error is eliminated with the results for the ensemble square root (EnSF) and stochastic ensemble Kalman filters (EnKF).We do this in studies with the barotropic vorticity equation for typical atmospheric conditions and in particular examine the issues of sampling that naturally arise when random observational noise is used in ensemble data assimilation in addition to the role of ensemble size.Further we analyze sampling issues when flow correlated perturbations, such as used in square root filters, are employed.As well, we study error covariance gains for each of these filters.We examine the performance of these data assimilation methods during the formation, maturation and decay of a number of large scale atmospheric blocking events that occurred between mid October and mid November 1979 in the Northern Hemisphere.Such regime transitions are typically associated with a loss of predictability during the formation and decay phases and enhanced predictability during the periods of the mature blocks (Frederiksen, Collier & Watkins (2004) [36] and references therein).
The plan of this paper is as follows.In section 2 we describe the barotropic vorticity equation for flow over topography and Rossby wave turbulence on a generalized β-plane and in the presence of a large scale flow U .The barotropic vorticity equations are discretized using Fourier series methods and then the large and small scale evolution equations are written in compact form by making a judicious choice of interaction coefficients.In section 3 we describe the Kalman filter methodologies including the Kalman, extended Kalman and generalized extended Kalman filters.Section 4 develops the ensemble Kalman and ensemble square root filters and the statistical dynamical Kalman filter.In section 5 we define the Shannon entropy (Shannon & Weaver (1949) [70]) and relate the entropy reduction to the Kalman gain spectra.Section 6 defines a number of diagnostic quantities used throughout this paper including the kinetic energy and Kalman gain spectra as well as the palinstrophy production which is a sensitive measure of the small scales.In section 7 we compare the performance of EnKF, EnSF and the closure based SDKF in 5 day data assimilation experiments followed by 5 day forecasts.Section 8 presents results on the performance of the ensemble square root filter including in 30 day assimilation experiments.In section 9 we discuss the implications of our results and summarize our conclusions.Appendix A describes in detail the spectral Kalman filter equations, Appendix B details the mode coupling or interaction coefficients used by the spectral barotropic vorticity equation and statistical dynamical closure equations.A statement of the details of the QDIA equations is given in Appendix C.

Barotropic flow on a β−plane
The direct numerical simulations and statistical closure equations, and the applications to data assimilation, considered in this paper are based on the generalized barotropic vorticity equation described by FO2005 [33].The evolution equation for two-dimensional motion of the small scales over topography on a generalized β−plane is described by the vorticity equation where ζ is the vorticity and the full streamfunction Ψ is given by ψ − U y, with U a large-scale east-west flow and where ψ represents the small scales.Here f 0 is the forcing, ν the viscosity, is the Jacobian and k 0 is a wave number that specifies the strength of the large-scale vorticity corresponding to the solid body rotation vorticity on a sphere.The vorticity is the Laplacian of the streamfunction: The scaled topography h is given by h = 2µgAH RT 0 where µ = sin φ, φ is the latitude, H is the height of the topography, R = 287Jkg −1 K −1 is the gas constant for air, T 0 = 273K is the horizontally averaged global surface temperature, g is the acceleration due to gravity and A = 0.8 is the value of the vertical profile factor at 1000mb.We can also either replace or add to the forcing a Newtonian relaxation term κ(ζ obs − ζ) where κ is the Rayleigh damping and ζ obs are linearly interpolated daily observed analysis fields.
The form-drag equation for the large-scale flow U is the same as on the standard β−plane.With the inclusion of relaxation toward the state U it takes the form Here, α is a drag coefficient and S is the area of the surface 0 ≤ x ≤ 2π, 0 ≤ y ≤ 2π.In the absence of forcing and dissipation, Eqs. ( 1) & (2) together conserve kinetic energy and potential enstrophy.The standard β-plane vorticity equation is obtained by setting k 2 0 to zero.We note however that there is a one-to-one correspondence between the generalized β−plane equation and that for flow on the sphere in the presence of a solid body rotation contribution (FO2005 [33]).Because the blocking regime transitions we consider occur near 60 0 North we use a β−effect of 1.15×10 −11 m −1 s −1 corresponding to a non-dimensional value of β = 1/2 and k 2 0 = 1/2.The non-dimensional barotropic vorticity equation can be derived by choosing a length scale of a/2, where a is the earth's radius, and a time scale of Ω −1 , the inverse of the earth's angular velocity.
From the barotropic vorticity equation, the corresponding spectral barotropic vorticity equations are obtained by representing each of the small-scale terms by a Fourier series where Here Throughout we will use andˆto denote mean and perturbation.
The spectral equations for all scales including U on the generalized β−plane can be combined into a compact form by defining suitable interaction coefficients, representing the large-scale flow as a component with zero wavenumber and extending the sums over wavenumbers to include the zero component.The resulting compact form is where and κ determines the relaxation time scale.Eq. ( 4) holds for all k in the set T = R ∪ 0 where the set R consists of all points in discrete wavenumber space except the point (0,0) and where κ is the Rayleigh friction.Here The complex ν 0 (k) is related to the viscosity ν and the intrinsic Rossby wave frequency ω k by the expression where and introduced a term h −0 that we take to be zero but which could more generally be related to a large-scale topography.In our notation * denote complex conjugate.We note that U is real and we have defined ζ 0 to be imaginary in order to ensure that all the interaction coefficients that we use are defined to be purely real (FO2005 [33]).Note that we distinguish between 0 and −0 in this representation and these components are complex conjugate as is the case for the small-scale components with oppositely signed wave vectors.The interaction coefficients are defined in appendix B.

QDIA closure equations
To determine the mean field we need an equation for the two-point cumulant C k,−l (t, t ) = ζk (t) ζ−l (t ) which in turn requires an expression for the third cumulant and so on.The barotropic vorticity equation expressed in terms of mean and fluctuating fields take the forms Multiplying Eq. (8b) by its complex conjugate and taking the expectation gives the diagonal two-point two-time cumulant in terms of two-and three-point cumulants where we use the convention ) .The QDIA theory closes Eqs.(8a) & ( 9) through prognostic integral equations for the off-diagonal two-point and three-point cumulants and corresponding expressions for

Relationships between physical space and spectral space
For an ensemble of direct numerical simulations the vorticity can be expressed in terms of the mean and fluctuating components: where ζ is a vector with elements ζ(x i , t) in physical space or ζ k (t) in spectral space respectively.The barotropic vorticity equation can be viewed as performing the transformation ζ(t) → ζ(t+ t).In terms of the means and transients we then have the mappings where the error covariance matrix C(t, t ) = ζ(t) ζT (t ) .Here T denotes transpose for real fields and Hermitian conjugate for complex fields.
The QDIA closure equations again perform the transformations in Eq. ( 11).This may be seen by employing Eq. ( 3) for calculating the physical space vorticity from its spectral components and by calculating the elements of the physical space covariance matrix through: For the QDIA equations, in common with all statistical equations, there is no sampling error in the transformations given by Eq. (11a) and Eq.(11b).In contrast, for ensemble averaged direct numerical simulations (DNS) of the barotropic vorticity equation, the estimation of the mean and covariances depend on the number of realizations used in the ensemble.

Kalman filters
A standard derivation of the matrix Kalman filter equations can be found in section 5.4 of Brown (1983) [9].These equations may also be derived simply by using Bayes' theorem (Lorenc (1986) [53]).The Kalman filter theory (Kalman (1960) [44]) implicitly assumes that both the observations and priori (forecast) distributions, based on the background state, are Gaussianly distributed.The posteriori distribution for the analysis is then derived based on the products of these two Gaussians and yields simply the Kalman filter equations (Lorenc (1986) [53], Anderson (2001) [1]).In the standard Kalman filter theory it is also assumed that the background or forecast field variable ζ(x, t) satisfies a linear equation.
The extended Kalman filter (Ghil et al. (1981) [38], Evensen (1992) [20]) (EKF) is formulated entirely in terms of covariances and completely neglects higher order moments.This approach may be considered to be a method of successive linearizations about the evolving nonlinear trajectory; it is a tangent linear approximation for calculating C f (x, t; y, t).In practice, a tangent linear model is used to transform a perturbation ζ(x, t) at some initial time t to a final perturbation ζ(x, t+ t) at time t+ t (Kalnay (2003) [45]).Linearizing about the nonlinear model trajectory will however be inaccurate whenever the perturbations are large.EKF data assimilation studies using the shallow water equations (Budgell (1986) [10]) with applications to oceanographic problems have enjoyed some success.
In order to tackle strongly nonlinear systems, where bifurcations may occur, (Miller et al. (1994) [59]) developed the generalized EKF.The generalized extended Kalman filter (GEKF) uses a moment expansion method in terms of Taylor series to estimate the contributions of the higher order moments to the Kalman gain (Miller et al. (1994) [59]).The GEKF extends the EKF methodology to strongly nonlinear systems by including contributions from the third and fourth order moments to the calculations of the error variance C f (x, t; y, t).This enables the tracking of regime transitions by increasing the estimated Kalman gain which is typically underestimated in EKF comparisons.The inclusion of approximations to the third and fourth moments into the EKF was found to be sufficient to track the reference solution and to capture the bifurcation behavior of chaotic trajectories in both the three-component Lorenz and double well models; however, it was found to be impractical for higher dimensional models.
In general the Kalman filter theory for estimating the a posteriori analysis fields, in terms of the observations and prior forecast fields, takes the following form.Let d = d + d denote the vector of observations with mean d and error d, ζ f is the vector of prior fields, K is the Kalman gain and H is the operator that maps the fields from model space to the space of observations.Then the analysis field is estimated by where ζ = ζ + ζ.The estimate Eq. ( 13) is the optimal linear combination of the observations and the prior.The Kalman gain K is given by where the observational error matrix D = dd T , the forecast error covariance matrix C f = ζf ζfT and T denotes transpose for real fields in grid point space.
The ensemble Kalman filter differs from the Kalman and extended Kalman filters in that Eq. ( 13) is implemented directly for an ensemble of forecast fields, ζ f i = ζ + ζf i , rather than for the mean field and covariance statistics.Again the prior covariance matrix can be calculated by ensemble averaging over the fluctuating components with C f = ζf i ζf i T .In both the ensemble and statistical approaches the Kalman gain is given by Eq. ( 14).
For the statistical Kalman filter equations, the mean analysis field is given by and the analysis error covariance is In contrast, for the ensemble Kalman filter, the analysis fields are calculated for each realization through Then averaging yields the mean analysis fields, the prior and analysis covariances, C f and C a and consequently the Kalman gain can be calculated through Eq. ( 14).

Ensemble data assimilation strategies
In general, calculation of the full error covariance matrix represents an intractable problem for data assimilation based on DNS.In ensemble methods often the number of variables is very much larger than the number of available ensemble members resulting in sampling errors in the estimation of the background error covariance matrix.In particular inadequately sampled ensemble averaged DNS often results in covariance matrices that have poorly resolved and under-estimated error covariances.Sampling errors can manifest themselves as spurious long range correlations requiring some form of spatial localization.In the horizontal plane localization is usually accomplished through the introduction of a correlation cut-off factor of the order of several thousand kilometers beyond which the correlations are zero (Houtekamer & Mitchell (1998) [43]).Studies using covariance localization have been demonstrated to lead to more homogeneous background error covariances (Gaspari & Cohn (1999) [37]).As well as localization a covariance inflation factor (Anderson (2001) [1], Bowler (2006) [7]) is often applied to the analysis perturbations to not only ensure that perturbations have the correct spread at the next time step but to address problems such as filter divergence.
Alternative strategies may involve various other approximations, rather than applying the full matrix Kalman filter methodology in which matrix size depends on the dimension of the state vector.One approach that we shall discuss in some detail involves Kalman gains that are homogeneous, and generally anisotropic, when acting on the mean fields and error fields.In spectral space this means that the Kalman gains are diagonal when acting on the spectral coefficients of the mean field but that the covariances "see" products of gains at different wavenumbers.More restrictive assumptions may also be considered such as approximate homogeneity of covariance matrices (Parish & Derber (1992) [64], Courtier et al. (1998) [15], Buehner (2005) [11]).Vidard et al. (2003) [77] discuss a variational data assimilation method involving the determination of optimal nudging coefficients (ON) that is shown to be closely related to the Kalman filter.In particular they compare the performance of the assimilation when their nudging operator is a full 4D-ON matrix, when it is "pseudo-diagonal" (correction coefficient varies over observation locations) and when it is a constant coefficient over all spatial points at a given time.They find that the results for the diagonal and scalar coefficient nudging parameters are better than for the full 4D-ON even though the latter is more than 60 times more expensive than the scalar case (see also Zou et al. (1982) [81]).Vidard et al. (2003) [77] also note that both the pseudo-diagonal matrix and scalar coefficients give about the same results.
From section 5, and Appendix A following Parish & Derber (1992) [64], we have the full Kalman filter equations in both physical and spectral space.In general we now have four choices 1. Full Kalman filter equations in physical space with N refers to the number of variables in grid point space and hence determines the dimensions of the background error covariance matrices.Similarly M refers to the number of observations and hence the dimensions of the innovation.For an ensemble the number of perturbations is an added consideration.
2. The Kalman filter equations of Appendix A with K × K matrices in spectral space where K refers to the number of interacting wavenumbers in spectral space.
3. Spectral Kalman filter equations with diagonal Kalman gain K k and diagonal observation operator H k (as described in Appendix A and Eqs. ( 18) through (21).
4. If H k are nonzero then we may also have spectral equations in terms of Here we consider choices 3 & 4. Choices 1 & 2 would present difficulties for both the ensemble and square root filters in terms of sampling errors.As in many previous studies (Anderson (2001) [1], Whitaker & Hamill (2002) [80], Bengtsson et al. ( 2003) [5], Kalnay (2007) [46]), we assume for simplicity that H k is unity; however, our results would equally apply to other specifications of H k such as being homogeneous or invertible.In Appendix A we derive the full Kalman filter equations in spectral space (following Parish & Derber (1992) [64]).
Our method of implementing the Kalman gain uses a homogeneous but anisotropic spectral gain to act on the mean field and error fields.Our approach generalizes the SSI method of Parish & Derber (1992) [64] and Buehner (2005) [11] to flow dependent perturbations.

Ensemble Kalman filter
In our approach we use a prescribed observational error with perturbations di k as well as analysis and forecast fields defined by: k where i = 1, . . ., N runs over the ensemble.Throughout the following discussion we will simply assume that the perturbation field denotedˆruns over the entire ensemble i.e. ζi will be assumed.Given the background vorticity field the equation for the analysis at time t is again in terms of a linear interpolation between observations and predictions scaled by the Kalman gain K. d k are taken from a nudged integration of a barotropic model but could in general represent the Fourier transform, Eq. (3), of observed fields ζ obs .
Data assimilation using a Kalman filter methodology estimates the mean analysis field based on the mean forecast and observations as Here, K k and H k are the spectral diagonal elements of K 0 and H 0 defined in Appendix A. The prior field ζ f k satisfies Eq. ( 4) in which we may include a model error through f 0 k = f 0 k + f 0 k where f 0 k is generally taken to be white.As well the analysis covariance is modeled by where C k = C k,−k and the spectral Kalman gain is defined as Eq. ( 18) implies that updating the perturbation field corresponds to covariance matrices given by The EnKF implements Eqs. ( 18) & (20) whereas Eqs. ( 19) & ( 21) follow by consistency.More specifically we approach ensemble filtering following the EnKF formulation of Evensen (1994) [21].From Eq. (18c) for the error fields we can also construct the equation for the higher order cumulants of analysis error.The mean and fluctuating vorticity fields ζ k & ζk see a Kalman gain described in terms of homogeneous but anisotropic covariances in physical space, which is equivalent to one based on diagonal covariances in spectral space (Parish & Derber (1992) [64]).The full analysis error covariances including off-diagonal elements are then generated from the error fields and therefore the two-point cumulants C k,−l are acted on by products of K k and K −l .Similarly the three-point cumulant ζk ζ−l ζl−k is acted on by products of K k , K −l and K l−k and so on to higher order.

Ensemble square root filter
We next consider a spectral formulation of the square root filter consistent with the EnKF formulated previously.As before the Kalman gain acting on the mean field is homogeneous but anisotropic.Square root filters have recently been applied in the meteorological context by Bishop et al. (2001) [8], Anderson (2001) [1], Whitaker & Hamill (2002) [80] and Tippett et al. (2003) [74] following on from the work of Andrews (1968) [3].The square root filter methods were developed to address the problem of underestimated analysis error covariances due to insufficient sampling with the ensemble Kalman filter.
The EnKF methodology requires that observations be perturbed in order that the Kalman filter update equation be satisfied, as seen from section 3.If one naively omits the perturbations on the observations then it has been shown (Whitaker & Hamill (2002) [80]) that the term resulting in C a k being systematically under-estimated.The square root filter approach requires an ensemble filter that does not use perturbed observations yet satisfies the Kalman filter equation for the analysis error.The general solution was developed by Andrews (1968) [3] and employed in various forms by Bishop et al. (2001) [8], Anderson (2001) [1], Whitaker & Hamill (2002) [80] and Tippett et al. (2003) [74].The analysis mean field equation is as for the EnKF, Eq. ( 18), while the analysis error field is given by Our approach seeks to define a homogeneous K for which the diagonal analysis covariance components satisfy The solutions are or Thus from Eq. ( 22) & ( 24), we have where K k is again given by Eq. ( 20).

Statistical dynamical Kalman Filter
The statistical dynamical Kalman filter (SDKF), which is used for the closure based assimilation method, is analogous to the EnKF.However the SDKF implements where the prognostic equations for the statistics ζ k , C k , C k,−l and ζk ζ−l ζl−k are given in Section 3 and in Appendix C.
In our numerical experiments we choose our initial C k,−l (t 0 , t 0 ) to be isotropic; it subsequently evolves to an inhomogeneous C k,−l (t, t ) and take C a (t, t ) = C f (t, t ) for t = t .Higher order cumulants or non-Gaussian contributions are included through Eq. ( 59) of Appendix C. In our numerical experiments we also employ the common assumption that the statistics of dk are Gaussian, so that dk (t) d−l (t) d(l−k) (t) = 0; as well we set D k,−l = 0 for l = k.
To reiterate Eqs.(8a), ( 9), ( 10), ( 54), ( 59), the response function Eqs. ( 57), ( 58) and the Kalman filter Eqs. ( 27) constitute the SDKF.The SDKF includes specific terms for the full background error covariance matrix and non-Gaussian three-point cumulant terms.We note from Eq. ( 27) that while the mean field and diagonal covariance are scaled by the diagonal Kalman gain, the off-diagonal covariance is scaled by products of two gain terms at different wavenumbers.In turn the three-point function is scaled by three gain terms at different wavenumbers.In these respects the form of the SDKF is similar to the EnKF and EnSF.

Entropy measures
The Shannon entropy has the general expression in terms of the probability distribution function P (x).If P (x) is multivariate Gaussian with covariance C then up to an arbitrary constant The Shannon entropy reduction when the information content is increased due to new observations in the assimilation is If C f and C a are diagonally dominant in spectral space we have This is also the expression we obtain if we base our analysis on the statistical mechanical equilibrium expression of the entropy i.e. S = 1 2 k ln C k , applied out of strict statistical mechanical equilibrium (Frederiksen & Bell (1983, 1984) [29,30], Frederiksen & O'Kane (2008) [34]).Now from Eq. ( 19) and hence for small K k H k , on using a Taylor expansion.We may then define the entropy reduction spectrum Thus the behavior of K k relates directly to the entropy reduction through the relations in Eq. ( 36).

Diagnostics
The main diagnostics used in this study are the zonally averaged kinetic energy as well as the palinstrophy production measure.Palinstrophy is a measure of a higher order than enstrophy hence it emphasizes the smaller scales more.In the absence of mean flows and topography the palinstrophy production measure P M reduces to an estimation of the skewness and is a common diagnostic in statistical turbulence theory for the statistics of the small scales.In the current form P M also includes inhomogeneities arising from the nonlinear interaction of the two-point cumulant and the mean field and topography.
As such P M allows us to very sensitively measure error growth due to non-Gaussian correlations and inhomogeneity.
The zonally averaged perturbation (error) ê(k x , t) and mean e(k x , t) kinetic energy spectra are defined as The kinetic energy of the large-scale flow is plotted at zero wave number.
In the flow regime where strong mean-field dominates very weak initial transients our measure, P M , of the small scale differences between the EnKF, EnSF and SDKF effectively becomes a measure of the growth of small scale inhomogeneities.If we define the transient enstrophy where ) then P M can be defined as From Eqs. (39a) & (39b) we see clearly that P M is dependent on the tendency of the diagonal covariance and thus its amplitude reflects sensitively not only the contribution of the off-diagonal terms and any non-Gaussian contributions but the slope indicates the rate at which errors are growing (OF2004 [62], O'Kane & Frederiksen (2008) [63]).

A comparison of ensemble and statistical dynamical Kalman filter methods
In this section we compare EnKF, EnSF and SDKF filters in a range of experiments.These include 5 day assimilation experiments followed by 5 day forecast studies and as well 30 day assimilations experiments.In our studies we focus on 500-hPa Northern Hemisphere atmospheric flows during a 30 day period starting on the 16 th of October 1979.During this period three major Northern Hemisphere atmospheric blocking high low dipoles formed.The first large-scale block began developing over the United Kingdom around the 21 st of October then matured to form a high (over Scandinavia) -low (over Spain) dipole around the 27 th of October.The second occurred in the middle of this period over the Gulf of Alaska on the 5 th of November, amplified and persisted until the 12 th of November and then weakened and moved downstream.A third blocking high formed over the Atlantic Ocean around the 13 th maturing into a high-low dipole on the 16 th and finally decaying during the period between 17 th − 18 th of November.By conducting our data assimilation experiments over this period we are rigorously testing the ability of the various assimilation methods to track the truth trajectory through a number of atmospheric regime transitions.As such our results are analogous to the study of Miller et al. (1994) [59] examining the ability of the general extended Kalman filter (GEKF) methodology to track regime transitions associated with the chaotic trajectory of the Lorenz model.Miller et al. (1994) [59] also consider the strongly non-Gaussian stochastically forced double well.
The dynamics and predictability of these particular events have been analyzed in detail by Frederiksen (1989) [25] and in the bred perturbation ensemble prediction studies of Frederiksen, Collier & Watkins (2004) [36].As well, O'Kane & Frederiksen (2008) [63] compared closure and ensemble averaged DNS results for 5 day forecasts starting after an initial 5 day breeding period; the QDIA closure was used to separate and quantify the various contributions to error growth from the covariances and higher order cumulants and with the DNS providing a means to measure the accuracy of the closure.Figure 1 displays the model Northern Hemisphere topography and the mature block over the Gulf of Alaska on the 6 th November 1979.Note that there is some slight evidence of the Gibbs effect after Fourier transformation.
Throughout, the truth trajectory is calculated by running the barotropic vorticity equation over the desired period with a relaxation Eq. ( 5) where κ = 1day −1 and ζ obs are linearly interpolated daily observed fields.The source term is calculated at each timestep of the unperturbed simulation, stored and then applied to both the perturbed ensemble DNS and mean field equation of the closure for data assimilation and prediction.In all the studies that follow we use the general dissipation ν 0 (k)k 2 = νk 2 + iω k as in Eq. ( 7) where ν = 2.5 × 10 5 m 2 s −1 .We choose the observational error to have an rms of 1 × 10 6 m 2 s −1 and define the nondimensional variance D k (t, t) = 1.826 × 10 −6 k, which results in an almost flat kinetic energy spectrum.Our choice of observational error variance is similar to that employed by Anderson (2001) [1] in studies of data assimilation using the ensemble adjustment Kalman filter that belongs to the same family of square root filters considered here.Throughout we assume the model error variance Q k (t, t) to be zero.As well, in all calculations we use a circularly truncated wavenumber space k = 16 (C16) resolution (which has 797 degrees of freedom).

Performance of QDIA closure
As a basis for the subsequent discussion and to rule out systematic differences between closure and ensemble models we first examine the performance of the QDIA as compared to DNS without data assimilation.In Figure 2 we compare QDIA closure and ensemble average DNS calculations for the period from the 29 th of October to the 3 rd of November 1979 without data assimilation but including the source term S obs k (t) Eq. ( 5). Figure 2a) shows the initial zonally averaged mean and isotropic error kinetic energy fields with Figure 2b) depicting the evolved closure and DNS fields after a five day evolution period.The evolved fields are in close agreement for both mean and perturbation energies and are included to show that there is close agreement between closure and DNS in the flow regime prior to the implementation of the various data assimilation schemes.

Comparison of EnKF with SDKF
In Figure 3 we compare the SDKF to the EnKF run with 3600 and 20 member ensembles after a 5 day period (3 rd November 1979) with 6 hourly data assimilation.There is little difference between the 20 and 3600 EnKF evolved perturbation kinetic energy.This result is in agreement with the study of Houtekamer & Derome (1995) [42] who also considered the role of sample sizes.However, in comparison to the SDKF, the EnKF calculations underestimate the assimilated error variances at day 5 and consequently the initial forecast perturbations.Subsequently the ensemble Kalman filter error variances grow more slowly (see the comparison of the day 5, 8 th November 1979, forecast error variances in Figure 3b).The EnKF assimilates data as described in section 4.1 where dk has newly sampled random phases at every assimilation time.The effect of sampling error due to introduced Gaussian random initial perturbations is to randomize the evolved analysis vorticity field perturbations thus hindering alignment in the direction of the evolving flow instabilities.Figure 3b) clearly shows rapid growth in the SDKF error variances whereas the forecast EnKF calculations show slow error growth whether the sample size is 20 or 3600.The problem is that for each ζf i k we may require a large number of di k to adequately sample the observational error .

Discussion
As noted by Leith (1974) [52] and Pham (2001) [65] the number of forecast perturbations in the vorticity field required to resolve the mean at reasonable resolutions is considerably less than the number required to accurately resolve the covariances.The problem of accurately resolving the second moment is significantly more complex due to sampling error in the tendency equation for C f k arising through the Jacobian terms.FO2005 [33] have discussed the mechanisms by which errors in representing the Jacobian arise starting from diagonal (homogeneous or isotropic) initial conditions in the context of turbulent barotropic flows.In the present study this initial sampling problem arises for each ζf i k at each assimilation time.However the sampling error in estimating ζf i k may to some extent be ameliorated due to convergence toward the leading flow instabilities.
We first considered the role of sampling on the growth of the diagonal error variances by calculating the amplitude of any spurious cross correlations in the EnKF which for the purposes of clarity we will refer to as the "direct" sampling error contribution to the analysis error variance.For 3600 realization ensembles this direct error was typically much less than 1 percent while for 20 realizations it was found to be as large as 10-20 percent.For our covariance modeling studies the sampling error in specifying dk turns out to be crucial.
The more difficult problem arises in estimating di k for each ζf i k in such a way that the tendency of the background variances i.e. the Jacobian terms, are well represented.FO2005 [33] have discussed in some detail the role of sampling error in the Jacobian and have suggested some methods to reduce its impacts in flow regimes typical of atmospheric data assimilation studies.FO2005 [33] demonstrated that for resolutions and flows typical of those considered here that of the order of a few thousand realizations were required to reduce noise associated with sampling error to levels that allowed the Jacobian to be resolved and hence grow.In the current context of the EnKF with perturbed observations NxMxL realizations are required, where N is the number of variables, M the number of perturbations required to resolve the Jacobian and L the number of perturbations required to resolve the statistics of the perturbed observations with M typically equal to L.More encouragingly FO2005 [33] showed that in order to obtain the correct error growth rates due to flow inhomogeneities it is crucial to resolve the Jacobian as opposed to the more difficult problem of resolving the full background error covariances.On the basis of the square root filter results presented in the next section we expect that the Jacobian sampling errors are, as in FO2005 [33], due to terms involving dk .

Performance of the ensemble square root filter
Next we contrast the SDKF and EnKF with the ensemble square root filter (EnSF) on the 3 rd November 1979 after a 5 day period of 6 hourly data assimilation (Figure 4).The EnSF shows much closer agreement to the SDKF with very close agreement for k ≥ 3.Although the results presented in Figure 4 are for 3600 realizations little difference was found for calculations of kinetic energy down to 20 member ensembles.The mean streamfunction fields are shown in Figure 5. Clearly, as far as resolving the mean is concerned, each of the filters produces nearly identical results, at least at the larger scales.In the presence of fast growing instabilities the Kalman gain adjusts itself in order to give increasing weight to the observations.It is therefore of interest to consider the spectra of the gain.In Figure 6 we plot the band averaged Kalman gain where the set S is defined as and the subscript i indicates that the integer part is taken in Eq. ( 41) so that all k that lie within a given radius band of unit width are summed over.Following from Eq. ( 37) we may now define the entropy reduction in terms of the band averaged Kalman gain (for H k = 1 in Eq. ( 36) and so S ∆ (k) may be inferred directly from the spectra of the Kalman gain.x By summing each term in Eq. ( 40) over the set S we obtain a simple ratio at each wavenumber; our initial gain is constant over all modes.As the flow evolves the gain spectra evolve to profiles that peak at the fast growing instability vectors, which range between k = 3 & 8 during this period.Thus, the Kalman gain gives more weight to the observations when the prior contains rapidly growing instabilities.We also notice that the EnKF under-predicts the analysis error covariance while exhibiting a systematically weaker Kalman gain (see Figure 6 b).From Eq. (42) we see that the entropy reduction is correlated with the Kalman gain spectra and that maximum entropy reduction occurs where the gain peaks, as is to be expected.The square root filter performance is much improved as it is better able to capture the evolving spectral gain resulting from the developing large scale flow instabilities.To understand the reason for this we again need to consider the respective equations for the gains.These are In Eq. (43a) there are competing contributions to the amplitude of the analysis field through the rescaling (1 − K k ) ζf k of the growing part and the term K k dk which is stochastic.In Eqs. ( 43b) & (43c) we essentially only have rescaling of the growing contributions to the analysis error.In Eqs.(43a) & (43b) rescaling ζf k transparently results in a rescaling of the constructed full background error covariance matrix.By rescaling the background error variances in the manner of (43c) we are necessarily rescaling the full off-diagonal background error covariance matrix as a consequence of it being a functional of diagonal cumulant and response function matrices in the SDKF formalism.Importantly, assuming Gaussian statistics, the effect of sampling error in the stochastic term of the EnKF Eq. (43a) will be compounded by producing errors in D k,−l in the equation for the two-point cumulant Eq. ( 21), and also result in a spurious three-point non-Gaussian observational error term dk d−l d(l−k) .
In the recent ensemble prediction study of O'Kane & Frederiksen (2008) [63] the palinstrophy production measure was shown to be a very useful measure of the small scale differences between DNS and closure simulations with different approximations for the higher order cumulants.In Figure 7 we compare the evolution of P M (t) for the SDKF with that for EnKF and EnSF.The growth of P M (t) over the assimilation period may be divided into two phases.The initial 12hr period corresponds to the growth of off-diagonal covariances generated by flow inhomogeneities starting from isotropic initial error statistics.The second period of nearly monotonic growth corresponds to the growth of small scale disturbances as the European block decays.Interestingly it is the SDKF and EnKF models that seem most sensitive to the 6 hourly data assimilation with the EnSF showing a much less sensitive small scale response at each assimilation time.Closer examination shows that for the EnKF P M drops at each assimilation time and then gradually increases over the subsequent 6 hour period before the process is repeated.This is due to the sampling error in the Jacobian reducing the covariances and results in the systematic underestimation of the small scale kinetic energy.The SDKF on the other hand undergoes an upward boost in P M at each assimilation time and then relaxes over the interim period.The EnSF is very much smoother in its response to data being assimilated, presumably because the posteriori perturbation has the same phase as the prior, but closer inspection shows similarities with the SDKF filter response.It is also apparent that compared to the EnSF model the EnKF systematically under predicts the palinstrophy production measure over the first 2.5 days.However in the later period it is the EnSF that shows less growth in P M .
Our results are in agreement with Anderson (2001) [1] in that sampling error introduced through the use of perturbed observations and small ensembles is a likely cause of the poor performance of the EnKF compared with square root filters.Our results also suggest that improving the performance of the traditional EnKF may be difficult to achieve in realistic models without tuning parameters.We next examine the performance of the ensemble square root filter EnSF during a 30 day period of 12 hourly data assimilation beginning on the 16 th of October 1979.As discussed earlier, during this period three major large-scale atmospheric regime transitions occurred in the Northern Hemisphere corresponding to blocking high low dipoles.The European block formed in the period leading up to the 27 th of October.The block over the Gulf of Alaska amplified rapidly on 6 th of November to form the mature anomaly on the 8 th .A third block formed over the Atlantic Ocean during the period leading up to the 16 th of November.In Figure 8 we plot the evolved zonally averaged mean and error kinetic energy spectra on the initial day a) and at subsequent 5 day intervals b)-g).Figure 8 clearly shows that the prior covariances remain small, in relation to the mean, throughout the entire 30 day assimilation period.In Figure 9 we show the evolved Kalman gain spectra for the EnSF.From Figure 9 we see that the Kalman gains again reflect the flow dynamics with the strongest gains occurring for the fast growing scales.The spectra are peaked at various wavenumbers between 3 & 8 consistent with the error growth   Frederiksen (1997Frederiksen ( , 2000) ) [26,28].As in Figure 3 maximum entropy reduction occurs at the fastest growing scales.Figure 10 portrays P M over the 30 day period.Because P M is a measure of palinstrophy production steep slopes correspond to periods of rapid growth in the small scale statistics and we note that the evolution is again dominated by the flow inhomogeneities even at the smallest resolved scales.In Figure 11 we estimate the systematic differences between the EnSF methodology and the truth by plotting and comparing it with Here superscript f indicates background forecast and t indicates truth.Figures 10 and 11 are strongly correlated because at the resolution considered the systematic differences between the EnSF and the control (Figure 11) are again dominated by small scale flow inhomogeneities.This is the reason for the broad similarities between Figures 10 & 11. Figure 11 shows that the EnSF methodology performs well over the 30 day assimilation period.The EnSF calculation maintains  a very close trajectory to the truth, that is S ψ D ψ , with little evidence of a systematic drift over the period considered.We note that our test, Eq. ( 44), is similar to that used by Anderson (2001) [1], where a similar measure was calculated at a single grid point.Note that here we are summing over the entire spectral domain and so incorporate all regions of instability.
In Figure 12 we compare the evolved zonally asymmetric, or eddy, streamfunction and variances in physical space on the 6 th of November 1979 when the block over the Gulf of Alaska was amplifying rapidly.We note that the largest amplitudes of the error variance in general occur in the regions of large potential vorticity gradients, as expected from the Raleigh-Kuo barotropic instability criterion.This phase shifting between the evolved zonally averaged asymmetric streamfunction and the error variances during blocking is evident in the earlier instability studies of Frederiksen (1997Frederiksen ( , 2000)) [26,28].

Discussion and conclusions
We have compared covariance modeling methodologies with applications to data assimilation based on statistical dynamical, square root and ensemble Kalman filters within the frame work of diagonal dominance of the Kalman gain.As stated in the introduction our approach can be regarded as a generalization of the spectral statistical interpolation (SSI) method of Parrsih & Derber (1992) [64].The performance of the statistical dynamical Kalman filter has been examined through direct comparison with the ensemble filter methods.We have compared the SDKF to ensemble Kalman filter results for atmospheric regime transitions from strong zonal flows to the development of large scale meridional circulations associated with the formation of blocks in the Northern Hemisphere.The SDKF accurately incorporates information about the higher order cumulants into the Kalman gain enabling close tracking of the analysis trajectory.Many of the problems observed in the EnKF have been shown to arise because of sampling error associated with the use of perturbed observations.In order to further elucidate the role of sampling we have run both the stochastic ensemble Kalman and deterministic square root filters over a very large number of realizations (up to 3600) and compared the results with the SDKF which corresponds to an infinite number of realizations.Our results show that even when the sample size is significantly larger than the state space the EnKF systematically under estimates the gain leading to reduced amplitude of the evolved variances.
Firstly we have compared integrations of the QDIA closure with ensemble averaged DNS of the barotropic vorticity equations in a 5 day forecast experiment.We have found that the QDIA closure compares closely with the statistics of DNS, confirming the findings of our previous studies and showing that there are no systematic differences in the underlying methodologies.We have further demonstrated, in 5 day periods of 6 hourly data assimilations followed by 5 day forecast experiments, that the EnKF significantly under estimates the prior covariances.In addition, the initial forecast perturbations generated by the EnKF method were not able to match the growth demonstrated by the SDKF error variances over the forecast period.The deterministic EnSF was much better able to match the SDKF results due to an improved estimate of the Kalman gain.The failure of the EnKF was then shown to be due to sampling error introducing spurious off-diagonal elements in the prior covariance matrix.This in turn affects the Jacobians in the prognostic equation and consequently the diagonal prior covariances.The source of the sampling error was shown to arise from the use of random observational noise.This process was also shown to lead to an underestimation of the EnKF gain.The deterministic square root filter uses flow dependent perturbations and avoids the sampling problems of the EnKF, where random observational noise is added, by rescaling the prior to have the same theoretical gain as the SDKF.The EnSF was better able to capture the growth of the error covariances due to the fact that while the amplitudes are rescaled the phase directions of the forecast perturbations were preserved during the assimilation.This allows information about the fast growing flow instabilities to accumulate throughout the assimilation cycle in a similar manner to the method of bred perturbations (Toth & Kalnay (1993, 1997) [75,76]; Frederiksen, Collier & Watkins (2004) [36]; O'Kane & Frederiksen (2008) [63]).The spectra of the Kalman gains for each of the deterministic (EnSF), statistical (SDKF) and stochastic (EnKF) variants were compared.All methods displayed bell shaped spectra peaked at the wavenumbers of the evolved fast growing instability modes.We have also compared the evolution of the palinstrophy measure for each of the three filters.In all experiments the square root filter displays very similar performance to the statistical dynamical filter.
We have estimated the systematic differences between the EnSF methodology and the truth in comparison to the observed error variances and found that the EnSF calculation maintains a very close trajectory to the truth with little evidence of a systematic drift over the period.In addition the prior covariances remain small, in relation to the mean, throughout the entire 30 day assimilation period.No localization or inflation factors have been used in these experiments.
We have examined the performance of statistical and ensemble filters in a setting consistent with the assumptions of the QDIA closure.Our motivation for taking this approach has been twofold.Firstly, Table 1.Description of the forward and backward transformation matrices and vectors.it is in general not possible to calculate the full covariance matrices, and hence the full inhomogeneous Kalman gain, for data assimilation based on DNS and hence some method of reducing the dimensionality of the problem is required.Secondly, in light of our results, the method of using a homogeneous but anisotropic gain acting on the mean field and a diagonally dominant inhomogeneous gain acting on the covariances is shown to be likely to be successful in terms of assimilation and filter stability.We should emphasize however, that the SDKF, as well as the EnKF and EnSF, methodology is general and can in principle be used to study data assimilation with the full inhomogeneous Kalman gain.In a future study we plan to examine and compare the performance of the SDKF with both the full inhomogeneous gain and quasi-diagonal gain and to explore issues such as the effect of covariance localization and inflation in relation to filter stability.
A Appendix: Spectral Kalman Filter equations Our approach to the formulation of the Kalman filter theory in spectral space is in the spirit of the approach taken by Parish & Derber (1992) [64].They formulate the Kalman filter equations directly in terms of the spectral amplitudes of the forecast model rather than in terms of physical space grid point values.Their appendix A describes the factorization of the observational operator which transforms from analysis to pseudo-observations.In our case the spectral space analysis is determined from the spectral space model forecast and the observations on the observational grid through the following equation: where ζ has spectral components ζ k .Here the analysis and forecast variables are vectors in spectral space.Following Parish & Derber (1992) [64] we factorize the observational operator and the Kalman gain into transformations between spectral and model grid point space (and back) and between model grid and observational grid space (and back).The factorizations of the operators are summarized in table 1 which also defines the operator H 0 and the spectral space transform of the observations d in terms of the observations d obs grid on the observational grid.
With the operators defined in table 1, we can also write the equation for the analysis in the form where d has spectral components d k .For the more general multi-level problems considered by Parish & Derber (1992) [64] they also have operators (and inverses) that perform EOF analyses in the vertical.
Parish & Derber (1992) [64] and McNally et al (2000) [57] discuss strategies for and the relative merits of the direct assimilation of observations simultaneously as opposed to pre-processing and sequential data selection approaches.
In spectral space the Kalman gain is determined by where H 0 and K 0 are square matrices and If D, C f and H 0 are quasi-homogeneous or diagonally dominant in spectral space with components D k , C f k and H k respectively then so is K 0 with components K k .The Kalman filter equations then take the form of Eqs. ( 18) through (21).

Figure 1 .
Figure 1.500 hPa streamfunction field at 1200 UTC on the 6 th November 1979 in km 2 s −1 (a) and Northern hemisphere topography in m (b).

Figure 2 .
Figure 2. Figure a) shows the initial mean (dotted) and perturbation (dashed) kinetic energy spectra (non-dimensional) as functions of zonal wavenumber on the 29 th October 1979.Figure b) compares the evolved energy spectra on the 3 rd November 1979.The close evolution of both DNS and QDIA mean fields is evident in the near concurrent plots.

Figure 3 .
Figure 3. Evolved mean and analysis error kinetic energy spectra (non-dimensional) as functions of zonal wavenumber after 5 day data assimilation (3 rd November Figure a)) and 5 day forecast periods (8 th November Figure b)).

Figure 4 .
Figure 4. Evolved mean and analysis error kinetic energy spectra (non-dimensional) as functions of zonal wavenumber after a 5 day data assimilation period ending on the 3 rd November 1979 for the SDKF, EnKF and EnSF filters.

Figure 5 .
Figure 5. Assimilated mean streamfunction in m 2 s −1 after a 5 day period of 6 hourly data assimilation ending on 3 rd November 1979 for the SDKF, EnKF and EnSF filters.

Figure 6 .
Figure 6.a) Band averaged Kalman gain spectra (non-dimensional) on the 29 th October and after the 5 th day of data assimilation.b) Comparison of ensemble Kalman, ensemble square root and statistical dynamical re-scaling gains as defined in Eqs.(43a), (43b) & (43c) after the 5 th day of data assimilation.

Figure 7 .
Figure 7.The evolution of the palinstrophy production measure for SDKF, EnKF & EnSF models respectively over the 5 day assimilation period beginning on 29 th October 1979 with 6 hourly data assimilation.

Figure 9 .
Figure 9.The gain spectra K(k) as functions of band averaged wave number for the EnSF calculation on the 16 th October 1979 a) and at every 5 days subsequently b)-g).

Figure 10 .
Figure 10.The evolving palinstrophy production measure over a 30 day assimilation period starting on the 16 th October 1979.

Figure 12 .
Figure 12.The 500hPa EnSF a) mean eddy streamfunction (km 2 s −1 ) and b) variance (km 4 s −2 ) respectively on 6 th November 1979 showing the forming block over the Gulf of Alaska.

H 1 K 1
grid to spectral transform H 1 inverse spectral to grid transform K 2 transforms from obs to model grid H 2 transforms from model to obs gridd = K 1 K 2 d obs grid D = d, d