To provide an accurate forecast, numerical weather prediction (NWP) models must be evolved from accurate initial conditions. Since the true state of the atmosphere is unknown, estimates of these initial conditions must be determined using information both from previous forecasts and from observations. Using the technique of data assimilation, the observations and previous forecasts, known as backgrounds, are weighted by their respective errors and combined to provide a best guess of the state, known as the analysis. Hence, it is important to have an accurate representation of the background and observation error statistics in the assimilation.
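As a minimal illustration of this weighting, consider a single scalar quantity with one background value and one direct observation of it. The sketch below is not the operational scheme; the function name `scalar_analysis` and the numerical values are illustrative assumptions.

```python
def scalar_analysis(x_b, y, var_b, var_o):
    """Combine a scalar background x_b and observation y using their error variances."""
    # Weight given to the observation: close to 1 when the background is
    # uncertain (large var_b), close to 0 when the observation is uncertain.
    k = var_b / (var_b + var_o)
    # The analysis is the background corrected by the weighted innovation.
    x_a = x_b + k * (y - x_b)
    # Analysis error variance for this optimal weight.
    var_a = (1.0 - k) * var_b
    return x_a, var_a


# Illustrative values only: equal error variances give an equal weighting.
x_a, var_a = scalar_analysis(x_b=280.0, y=282.0, var_b=1.0, var_o=1.0)
```

If the error variances passed to such a scheme are wrong, the weight `k` is wrong too, which is why accurate error statistics matter for the analysis.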
The observation error can be attributed to a number of sources. The instrument, or measurement, error is uncorrelated for most instrument types; however, correlated errors are likely to arise from pre-processing errors, errors in the operator that maps between model and observation space, and representativity errors. Representativity errors arise when the observations can resolve scales that the model cannot [1]. The instrument error is often known and well understood, but the contribution from the other error sources is complex and information about them is limited. However, errors arising from the observation operator uncertainty in the context of fast radiative transfer modeling have been considered by, e.g., [3]. Until recently, in operational data assimilation, the observation errors have been assumed uncorrelated and processes such as variance inflation, observation thinning and ‘superobbing’ have been used to either reduce the correlated error or account for the unknown correlations. To improve the accuracy of the analysis and increase the number of observations assimilated, it is necessary to understand and account for the full, potentially correlated, observation error statistics. These error statistics cannot be calculated directly and so must be estimated statistically. Desroziers et al. [5] proposed a diagnostic that provides an estimate of the observation error covariance matrix by considering the statistical average of observation-minus-background and observation-minus-analysis residuals. In theory, it relies on the use of exact background and observation error statistics in the assimilation; however, it has been used successfully in simple model experiments in both variational [6] and ensemble [7] data assimilation systems and to estimate time varying observation errors [9] when the assimilated error statistics are incorrect. Furthermore, recent theoretical work provides detailed insight into how results from the diagnostic can be interpreted when incorrect background and observation error statistics are used in the assimilation [10]. In addition, an improved estimate of the error statistics may be obtained if successive iterations of the diagnostic are applied [5], although this iteration procedure is often not possible in operational systems that are currently unable to assimilate observations with correlated errors.
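The statistical average described above can be sketched in a few lines. The example below is a toy illustration rather than the configuration used in this study: it assumes an identity observation operator, Gaussian errors and exact error statistics in the analysis update, so that the estimate should recover the prescribed covariance. The function names `desroziers_r` and `to_correlation`, and the matrices `R` and `B`, are hypothetical.

```python
import numpy as np


def desroziers_r(d_ob, d_oa):
    """Estimate the observation error covariance from residual samples.

    d_ob, d_oa: arrays of shape (n_samples, n_channels) holding the
    observation-minus-background and observation-minus-analysis residuals.
    """
    d_ob = np.asarray(d_ob, dtype=float)
    d_oa = np.asarray(d_oa, dtype=float)
    n = d_ob.shape[0]
    # Statistical average of the outer products d_oa d_ob^T ...
    cross = d_oa.T @ d_ob / n
    # ... minus the product of the mean residuals.
    return cross - np.outer(d_oa.mean(axis=0), d_ob.mean(axis=0))


def to_correlation(cov):
    """Scale a covariance matrix by its standard deviations."""
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)


# Synthetic two-channel test (assumed values, not SEVIRI statistics).
rng = np.random.default_rng(0)
R = np.array([[1.0, 0.6], [0.6, 1.0]])  # prescribed observation error covariance
B = np.array([[2.0, 0.5], [0.5, 2.0]])  # prescribed background error covariance
n = 200_000
eps_o = rng.multivariate_normal(np.zeros(2), R, size=n)
eps_b = rng.multivariate_normal(np.zeros(2), B, size=n)
d_ob = eps_o - eps_b
# With H = I and the exact gain K = B (B + R)^-1, d_oa = R (B + R)^-1 d_ob.
d_oa = d_ob @ (R @ np.linalg.inv(B + R)).T
r_est = desroziers_r(d_ob, d_oa)
c_est = to_correlation(r_est)
```

In this idealised setting `r_est` approaches `R` as the sample size grows. When the assimilated statistics are not exact, as in an operational system, the estimate need not equal the true covariance and has to be interpreted with care.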
An important set of observations used in NWP are those observed by satellite instruments. Inter-channel correlations have been calculated for observations from satellite instruments such as the Atmospheric Infrared Sounder (AIRS) and Infrared Atmospheric Sounding Interferometer (IASI) using the Desroziers et al. diagnostic [12]. The literature shows that inter-channel observation errors are correlated and that including these errors in the assimilation leads to improved analysis accuracy, better forecast skill scores and the inclusion of more observation information content [6]. As a result, the assimilation of correlated inter-channel errors for IASI observations is now operational at the Met Office. The benefit seen by including correlated inter-channel errors provides motivation to calculate observation error statistics for other satellite instruments. As well as providing potential benefit to the assimilation, the calculation of both the inter-channel and spatially correlated errors may provide information that allows better use of the observations, either by a reduction in thinning, an optimisation of channel selection or by highlighting areas where the observation operator may be improved. In this work, we are the first to use the Desroziers et al. method to calculate inter-channel error correlations for observations obtained using the Spinning Enhanced Visible and InfraRed Imager (SEVIRI) [20]. In contrast to previous work estimating inter-channel error covariances using this method, we demonstrate the variation of results across the geographical domain and the enhanced understanding that this can bring. We also consider whether the SEVIRI observation errors are spatially correlated. We are not aware of any previous work in the literature calculating spatial correlations for satellite data using this method, although we have done this for ground-based weather radar [21].
Our new results show that all the estimated error variances are lower than those used operationally. For the inter-channel correlations, we find that upper level water vapour channels have strongly correlated errors as do the surface channels. We also consider how the inter-channel correlations vary across the domain. These results show that the inter-channel correlations for the surface channels are much stronger in coastal areas. We find that this is the result of mixed pixel (mixed surface type) observations being assimilated. For the horizontal errors, we find that the correlation length scale is larger than the operational thinning distance, with the length scales ranging between 50 km and 80 km dependent on the observation channel. Considering the horizontal correlation for sea-only observations increases the correlation length scale and suggests that errors associated with the mixed pixel observations have a different structure to the errors associated with the sea-only pixels. Accounting for the correlated SEVIRI errors in the data assimilation is expected to be of benefit to the analysis.
Whilst calculating these results, we find that the estimates from the diagnostic are unaffected by a bias in the observations, provided that the means of the observation-minus-background and observation-minus-analysis residuals are subtracted to make the estimation unbiased.
This paper is organised as follows. In Section 2, we describe the Desroziers et al. method that we use to calculate observation error statistics; we also describe the SEVIRI observations and their model representation. The experimental design is described in Section 3. The estimated inter-channel and horizontal error correlation results are presented in Section 4. Finally, we conclude in Section 5.
3. Experimental Design
In this work, we calculate both horizontal and inter-channel correlations for the SEVIRI observations. To calculate these correlations, we use archived observations and background data produced by the operational Met Office system from June, July and August 2013. The analysis fields are produced by rerunning the operational UKV assimilation scheme. To remove any of the model background points that may be affected by the boundary condition spin-up, we only consider observations that are located in the uniform grid area of the model.
As we are using an operational configuration for the assimilation, we are only able to calculate observation error correlations for the thinned SEVIRI data. As a result, when calculating the horizontal correlations, we are unable to consider correlations at a distance of less than 24 km. Experiments were also performed using an unthinned set of data to allow calculation of correlations at 5 km. These results are not presented here due to the suboptimality of the data assimilation scheme with unthinned observations. However, results are briefly discussed in the conclusions. The horizontal correlations are calculated separately for each of the five different channels. We determine the correlation length scale by considering where the correlation becomes insignificant (<0.2) [41]. We calculate inter-channel correlations using data across the entire domain; however, as channels 5 and 6 are available over both land and sea, we also investigate whether the correlations depend on surface type. We also calculate correlations over sub-domains of the model to consider how the correlations vary across the model domain.
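The length-scale criterion can be made concrete with a short sketch. It assumes that the horizontal correlations have already been averaged into separation-distance bins, and it simply reports the first bin whose correlation falls below the threshold, without interpolating between bins. The function name `correlation_length_scale` and the example values are illustrative assumptions, not results from this study.

```python
import numpy as np


def correlation_length_scale(distances_km, correlations, threshold=0.2):
    """Return the smallest separation at which the correlation is below threshold.

    Returns None if the correlation never falls below the threshold.
    """
    distances_km = np.asarray(distances_km, dtype=float)
    correlations = np.asarray(correlations, dtype=float)
    # Sort by separation so that the first crossing is found.
    order = np.argsort(distances_km)
    below = correlations[order] < threshold
    if not below.any():
        return None
    # np.argmax on a boolean array gives the index of the first True.
    return float(distances_km[order][np.argmax(below)])


# Made-up binned correlations, starting at the 24 km thinning distance.
bins_km = [24.0, 36.0, 48.0, 60.0, 72.0]
corr = [0.55, 0.40, 0.27, 0.18, 0.10]
length_scale = correlation_length_scale(bins_km, corr)
```

Because no correlations are available below the thinning distance, any length scale found this way can only be resolved to the spacing of the available bins.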
Calculations using an initial set of operational data highlighted a large bias in the observations for Channel 5. Consequently, the bias correction was updated and a new assimilation was run over the same summer period. Here, we present results from the diagnostics using the bias-corrected set of data. However, we note that, when using the biased data, results obtained from the diagnostics were qualitatively similar to those presented here. This important result highlights that, if the mean residual values are subtracted when computing the diagnostic, then the results are not sensitive to bias present in the original data.
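The effect of subtracting the mean residuals can be illustrated with synthetic numbers. The sketch below covers only the simplest case, a constant offset added to each set of residuals; a real observation bias may vary in space and time, which this toy example does not represent. The function name `centred_cross_covariance`, the stand-in residuals and the offsets are all assumptions made for illustration.

```python
import numpy as np


def centred_cross_covariance(d_oa, d_ob):
    """Average of d_oa d_ob^T after removing the mean of each residual set."""
    d_oa = np.asarray(d_oa, dtype=float)
    d_ob = np.asarray(d_ob, dtype=float)
    a = d_oa - d_oa.mean(axis=0)
    b = d_ob - d_ob.mean(axis=0)
    return a.T @ b / d_ob.shape[0]


rng = np.random.default_rng(1)
n = 5000
# Stand-in residuals for three channels (not produced by a real assimilation).
d_ob = rng.normal(size=(n, 3))
d_oa = 0.4 * d_ob + 0.1 * rng.normal(size=(n, 3))

# Constant offsets mimicking a bias that appears in both residual sets.
shift_ob = np.array([2.0, 0.0, -0.7])
shift_oa = np.array([0.8, 0.0, -0.3])

unbiased = centred_cross_covariance(d_oa, d_ob)
biased = centred_cross_covariance(d_oa + shift_oa, d_ob + shift_ob)
# Without mean subtraction the offsets contaminate the estimate.
uncentred = (d_oa + shift_oa).T @ (d_ob + shift_ob) / n
```

Here `biased` matches `unbiased` to rounding error, whereas `uncentred` differs by roughly the outer product of the two offsets.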
In data assimilation, to make the best use of the observations and obtain an accurate analysis, it is important to have a good understanding of the errors associated with the observations. Recently, observation error statistics for a number of different observation types have been estimated using the diagnostics of [5]. In this work, we use the diagnostics to estimate both horizontal and inter-channel observation error statistics for SEVIRI observations that have been assimilated into the Met Office UKV model. Whilst calculating these results, we find that the estimates from the diagnostic are unaffected by a bias in the observations, provided that the means of the observation-minus-background and observation-minus-analysis residuals are subtracted to make the estimation unbiased.
When considering the variances for both the horizontal and inter-channel statistics, we find that the errors are larger for the upper level water vapour channels compared to the surface channels. This is a result of the uncertainty in estimating upper level water vapour with the radiative transfer scheme. The large variance estimated for Channel 5 may be a result of observations from this channel that are assimilated over low clouds. In general, the estimated variances are much lower than those currently used in the assimilation. The estimation of variances that are lower than those used in the assimilation has also been seen with other satellite observations [12], and theoretical work also suggests that the diagnostic may give an underestimate of the observation error variance under the operational configuration used in this study [10].
When considering the inter-channel correlations, we find that the upper level water vapour channels have significantly correlated errors, as do the surface channels. Although the correlations are likely to arise from a number of different sources, in this case, we suggest that the dominant contribution is from errors in the observation operators, as we see block correlations between channels that share overlapping weighting functions. We also considered how the inter-channel observation errors varied across the domain. The upper level water vapour channels showed little dependence on the surface type. However, both the variances and correlations for the surface channels were increased over coastal areas of the domain. This increase in correlation and variance over coastal regions is the result of ‘mixed pixel’ observations being used in the assimilation. This suggests that the observation quality checks in the coastal areas should be made more rigorous to ensure that only observations over sea are assimilated for the surface channels. This result shows that the diagnostics can highlight potential areas of improvement in the data assimilation scheme.
The estimated horizontal observation error statistics, for the full domain, suggest that there are significant correlations between observation errors in the current operational system. Horizontal correlation length scales range between 30 km and 80 km depending on the observing channel. The upper level humidity channels share similar correlation structures, as do the surface (temperature) channels. We hypothesise that the differing correlation structures between the humidity and temperature sounding channels are a result of errors of representativity [2]. Considering the horizontal correlation for sea-only observations increases the correlation length scale and suggests that errors associated with the mixed pixel observations have a different structure to the errors associated with the sea-only pixels. We conjecture that, in part, the large spatial correlations for the sea-only data set are a result of the spatial error correlations in the sea-surface temperature fields used by the RTTOV model. However, to determine the exact causes of these correlation length scales would require a metrological study, which is beyond the scope of this work.
For all channels, the correlation length scale is larger than the operational observation thinning distance of 24 km. Using an operational configuration, it is not possible to calculate the correlation structure below the thinning distance. To understand the correlations at short distances, an assimilation was run where the SEVIRI data were unthinned. Assimilating the unthinned data resulted in a sub-optimal assimilation and, consequently, sub-optimal analyses. However, results from the diagnostics suggested that the error correlation structure for the unthinned data is similar to the correlation calculated for the operational data set. These results using the unthinned data set suggest that the estimated correlation length scales are not a consequence of the thinning distance or error aliasing.
The results found from this study suggest that SEVIRI observations have significantly correlated spatial and inter-channel observation errors. This implies that, if SEVIRI observations are to be assimilated optimally, the inclusion of correlated observation error statistics in the assimilation system is desirable.