Spatio-Temporal Optimal Interpolation of Aerosol Optical Depth Observations Using a Chemical Transport Model

Abstract: To estimate the spatial and temporal distribution of aerosol optical depth (AOD), we used optimal interpolation (OI). In OI, observational data and a model forecast are linearly combined according to their relative accuracies. The weight coefficients are chosen to minimize the mean-square error of the estimate. To obtain the weight coefficients, correlations between model errors at different grid points are used. In classical OI, only spatial correlations are considered; we used both spatial and temporal correlation functions. To obtain error statistics, we used observations from European stations of the Aerosol Robotic Network (AERONET) of ground-based sun photometers and simulations by the chemical transport model GEOS-Chem, assuming a negligible error of AERONET AOD observations. Estimates of the daily mean AOD distribution over Europe are obtained. The reduction of the root-mean-square error of the OI-based AOD estimate in comparison with the GEOS-Chem model results is discussed.


Introduction
Atmospheric aerosol has a considerable impact on air quality and climate. One of the important characteristics of atmospheric aerosol is the aerosol optical depth (AOD), which is a measure of light extinction by aerosol. The column-integrated atmospheric aerosol load can be derived from AOD observations. The Aerosol Robotic Network (AERONET), a global ground-based network of sun and sky photometers, provides AOD data with low uncertainty [1][2][3][4][5]. However, AERONET observations are sparse in space and time. Chemical transport models can fill in observational gaps: model simulations provide values of AOD in all cells of a regular grid over the domain of interest. A variety of models are used to describe aerosol optical properties, including AOD [6][7][8][9][10]. The drawback of models is their large uncertainty. To obtain a more accurate estimate of the spatial and temporal distribution of AOD, data assimilation can be applied. Data assimilation is a technique for combining observational data with model simulation outputs. Data assimilation approaches are commonly divided into optimal interpolation (OI) [11][12][13], Kalman filtering (KF) [14][15][16], and variational methods [17][18][19]. All of these approaches are based on the minimum mean-square error principle of estimation theory. Each method has advantages and disadvantages depending on the specific application. OI estimates a value of interest at a grid point through a weighted linear combination of observational and modeled data at the point in question and at neighboring observational points, according to the accuracies of the data used. The weighting coefficients are chosen to minimize the mean-square error of the estimate. To obtain the weighting coefficients, correlations between model errors at different grid points are used. A single correlation function is estimated from the available data, assuming homogeneity and isotropy of the field. The model error statistics are assumed to be stationary.
KF is a sequential data assimilation scheme consisting of two steps: the forecast and the analysis. The forecast is made using a dynamical model, into which the estimate obtained at the previous time step is incorporated. The analysis is the same as in OI, except that a forecast error covariance updated at every time step is used instead of a single model error covariance. This reduces the mean-square error of the estimate in comparison with OI. However, if temporal gaps are present in the observational data, there is no improvement over OI, because the values being estimated converge too quickly to the model trajectory [20]. Variational methods are based on minimising an objective function proportional to the square of the distance between the estimate and both the model and the observations. Under some commonly used assumptions, the three-dimensional variational method (3D-Var) is equivalent to OI [21]; the only difference is in the method of solution. In the four-dimensional variational approach (4D-Var), the minimization of the objective function is carried out over a time window. The numerical cost of 4D-Var is very high. OI is much less computationally expensive than the KF and 4D-Var methods.
In classical OI, only spatial correlations are considered. The method can be extended to include the time dimension by using both spatial and temporal correlations. Spatio-temporal optimal interpolation (STOI) fills in not just spatial but also temporal gaps in observations and improves the accuracy of the method. STOI has been used in ocean sciences [22,23]. In [24], we used STOI to combine AERONET observations with calculations by the chemical transport model GEOS-Chem [25,26] to estimate the distribution of AOD at 870 nm over the East European region. In the present work, we assimilated AERONET AOD at the wavelengths of 440, 675, and 870 nm using STOI to obtain the distribution of total AOD over Europe.

AERONET Observations
Observations by AERONET, a ground-based network of sun and sky photometers, are one of the most widely used sources of atmospheric aerosol data. The network consists of more than 500 sites located throughout the world. The photometers measure direct solar and diffuse sky radiation at a number of wavelengths. The AERONET retrieval algorithm [3] derives AOD and other integrated aerosol properties from the direct and diffuse radiation measurements. AERONET observations are often considered a standard for column aerosol properties. The uncertainty of AERONET AOD observations is about 0.01 for wavelengths > 440 nm [4,5]. In this paper, we used AERONET Version 3, Level 2 (cloud-screened and quality-assured) daily averaged total AOD data.

GEOS-Chem Simulation
GEOS-Chem is a global three-dimensional chemical transport model. It is developed and used by research groups worldwide, as it is applicable to a broad range of atmospheric composition problems. The model input includes meteorological data and emission inventories. The archived meteorological fields are from the Goddard Earth Observing System (GEOS) [27]. GEOS-Chem uses the Harvard-NASA Emissions Component (HEMCO) [28] to calculate emissions from different databases. The model output is a set of quantities, such as tracer concentrations in every grid cell, and includes the AOD of major aerosol components at a number of wavelengths, with a transport time step of 15 min. For calculating AOD, GEOS-Chem combines aerosol species into groups according to their optical properties: sulphate-nitrate-ammonium; size fractions of mineral dust; sea salt in accumulation and coarse modes; black carbon; organic aerosols.
In the present work, we used a nested regional application of GEOS-Chem version 12.1.1. The simulation was performed at a horizontal resolution of 0.25° latitude × 0.3125° longitude with 47 vertical σ-layers up to ~80 km. We calculated daily averaged AOD at 440, 675, and 870 nm, as these are standard reference wavelengths in AERONET products. The optical depths of the above-mentioned individual aerosol groups in every 3D grid cell were summed to obtain the optical depth of the total aerosol in the cell. The optical depths of the total aerosol in every vertical layer for a given horizontal grid cell were then summed to yield the total column AOD.
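The two summation steps described above can be sketched as follows. This is a minimal illustration with NumPy, not the authors' processing code; the array layout `(n_groups, n_layers, n_lat, n_lon)` and the function name `total_column_aod` are assumptions made for the example, and the toy values are synthetic.

```python
import numpy as np

def total_column_aod(group_od):
    """Sum optical depths over aerosol groups, then over vertical layers.

    group_od: array of shape (n_groups, n_layers, n_lat, n_lon) holding the
    optical depth of each aerosol group in each 3D grid cell
    (hypothetical layout, chosen for this sketch).
    Returns an array of shape (n_lat, n_lon) with the total column AOD.
    """
    group_od = np.asarray(group_od, dtype=float)
    total_per_cell = group_od.sum(axis=0)   # total aerosol per 3D cell
    return total_per_cell.sum(axis=0)       # column total per horizontal cell

# Synthetic toy input: 5 groups, 47 layers, a 2 x 3 horizontal grid,
# each group contributing an optical depth of 0.001 per cell.
toy = np.full((5, 47, 2, 3), 0.001)
column_aod = total_column_aod(toy)
```

Because optical depth is additive, the order of the two sums does not affect the result; the code follows the order given in the text.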

Spatio-Temporal Optimal Interpolation
In the OI scheme, an analyzed state is related to the forecast state by the equations

$$x_a = x_b + K\,(y - H x_b), \tag{1}$$

$$K = B H^{T} \left(H B H^{T} + R\right)^{-1}, \tag{2}$$

where $x_a$ is a vector containing estimated values at regular grid points, $x_b$ is a vector containing values calculated by a model at regular grid points, $y$ is a vector containing values of observations at the observational points, $K$ is a matrix containing weighting coefficients, $H$ is an observation operator providing the link between the analysis variables and the observations, $B$ is a covariance matrix of model errors, and $R$ is a covariance matrix of observational errors. The matrix of weighting coefficients $K$ is determined by minimizing the mean-square error of the estimate. Equations (1) and (2) define the optimal linear estimator under the assumption that the errors are unbiased, the observational errors are uncorrelated, and the observational and model errors are mutually uncorrelated. In OI, not all available observations are considered, but only those lying in the vicinity of the point being updated. We applied STOI to estimate AOD in Europe in 2015-2016. We considered data from 88 European AERONET sites. The layout of the region and the location of the sites are shown in Figure 1. As the model AOD uncertainty [29] is significantly larger than the AERONET AOD uncertainty, we assumed the observations to be perfect.
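As an illustration of the analysis step, the sketch below assumes the standard OI forms x_a = x_b + K (y − H x_b) and K = B Hᵀ (H B Hᵀ + R)⁻¹. The function name `oi_analysis`, the three-point grid, the error variance, and the exponential covariance are all hypothetical choices for the example, not values taken from the study. Setting R to zero mirrors the assumption of perfect observations.

```python
import numpy as np

def oi_analysis(x_b, y, H, B, R):
    """One OI analysis step: returns the analysis x_a and the gain K."""
    x_b = np.asarray(x_b, dtype=float)
    y = np.asarray(y, dtype=float)
    H = np.asarray(H, dtype=float)
    B = np.asarray(B, dtype=float)
    R = np.asarray(R, dtype=float)

    innovation = y - H @ x_b          # observation minus model at obs points
    S = H @ B @ H.T + R               # innovation covariance
    # K = B H^T S^{-1}; since B and S are symmetric, K^T = S^{-1} H B,
    # which avoids forming an explicit inverse.
    K = np.linalg.solve(S, H @ B).T
    x_a = x_b + K @ innovation
    return x_a, K

# Toy setup (synthetic): three grid points on a line, one observation
# located at the first grid point, perfect observations (R = 0).
positions_km = np.array([0.0, 100.0, 200.0])
dist = np.abs(positions_km[:, None] - positions_km[None, :])
sigma2 = 0.01                              # assumed model error variance
B = sigma2 * np.exp(-0.002 * dist)         # assumed exponential covariance
H = np.array([[1.0, 0.0, 0.0]])            # picks out the first grid point
R = np.zeros((1, 1))
x_b = np.array([0.2, 0.2, 0.2])
y = np.array([0.3])

x_a, K = oi_analysis(x_b, y, H, B, R)
```

With a perfect observation, the analysis reproduces the observed value at the observed grid point, and the correction applied at the other points decays with the assumed error correlation.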
Prior to the implementation of the STOI, we compared GEOS-Chem simulated AOD with AERONET observations. The comparison revealed a bias of −0.032 for 440 nm, −0.025 for 675 nm, and −0.024 for 870 nm. Moreover, the dispersion of AERONET AOD turned out to be significantly larger than that of GEOS-Chem simulated AOD for each wavelength. To correct the discrepancy, we used linear regression. Then we applied STOI using the corrected values of GEOS-Chem simulated AOD.
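A regression-based correction of this kind might look like the sketch below. The text does not state which variable was regressed on which, so the example assumes that observed AOD is regressed on collocated model AOD and that the fitted line is then applied to the model field. The function names and the data are hypothetical and synthetic.

```python
import numpy as np

def fit_linear_correction(model_aod, observed_aod):
    """Least-squares fit of observed = slope * model + intercept."""
    slope, intercept = np.polyfit(model_aod, observed_aod, deg=1)
    return slope, intercept

def apply_correction(model_aod, slope, intercept):
    """Apply the fitted linear correction to model AOD values."""
    return slope * np.asarray(model_aod, dtype=float) + intercept

# Synthetic collocated pairs: the "model" is biased low and has too
# little spread relative to the "observations".
model = np.array([0.05, 0.10, 0.15, 0.20])
observed = 1.5 * model + 0.03

slope, intercept = fit_linear_correction(model, observed)
corrected = apply_correction(model, slope, intercept)
```

A slope different from one adjusts the spread of the model values, and the intercept, together with the slope, removes the mean bias. That is why a single linear fit can address both discrepancies noted above.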
To implement STOI, the spatial and temporal correlation functions must be known. We obtained correlation curves by fitting them to points representing the correlation coefficients of model-minus-observation AOD pairs at two spatial or temporal locations as a function of the distance between them. We then modelled the obtained correlation curves by analytic functions. We chose exponential functions with argument kd, where, for the spatial correlation function, d is the distance in kilometres and k = 0.002 for 440 nm, 0.0025 for 675 nm, and 0.003 for 870 nm; for the temporal correlation function, d is the time interval in days and k = 0.4 for 440 nm, 0.45 for 675 nm, and 0.5 for 870 nm. The spatial and temporal correlations are assumed to be separable.
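The correlation model can be sketched as follows, using the k values quoted above. Two assumptions go beyond the text: the exponential is taken to be decaying, exp(−kd), and separability is taken to mean that the spatio-temporal correlation is the product of the spatial and temporal factors. The names `K_SPATIAL`, `K_TEMPORAL`, and `st_correlation` are illustrative.

```python
import numpy as np

# Decay constants from the text, keyed by wavelength in nm.
K_SPATIAL = {440: 0.002, 675: 0.0025, 870: 0.003}   # per kilometre
K_TEMPORAL = {440: 0.4, 675: 0.45, 870: 0.5}        # per day

def st_correlation(distance_km, lag_days, wavelength_nm):
    """Separable spatio-temporal correlation (assumed product form).

    Assumes rho = exp(-k_s * |distance|) * exp(-k_t * |lag|).
    """
    k_s = K_SPATIAL[wavelength_nm]
    k_t = K_TEMPORAL[wavelength_nm]
    spatial = np.exp(-k_s * np.abs(distance_km))
    temporal = np.exp(-k_t * np.abs(lag_days))
    return spatial * temporal

# Correlation between model errors 250 km and 1 day apart at 440 nm.
rho = st_correlation(250.0, 1.0, 440)
```

Under these assumptions, the e-folding scales 1/k are about 500, 400, and 333 km in space and 2.5, 2.2, and 2 days in time for 440, 675, and 870 nm, respectively.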

Results and Discussion
Using STOI, we obtained an estimate of the distribution of the daily averaged AOD in Europe for 2015-2016. To validate the results, we compared them with independent AERONET observations. We excluded the AERONET sites Granada, Lille, and Minsk (see Figure 1) from the assimilation scheme and performed STOI for July 2015 using data from the 85 remaining sites. We obtained estimates of AOD at each of the excluded sites and calculated the root-mean-square errors of the estimates using AOD observations at those sites, assuming a negligible error of AERONET AOD observations. We then calculated the root-mean-square errors of the model-simulated AOD at those sites. The results of the comparison are shown in Table 1.
Table 1. Root-mean-square errors of the aerosol optical depth (AOD) calculated using GEOS-Chem and assimilated using spatio-temporal optimal interpolation (STOI) as compared to AOD observed by AERONET.
The comparison shows that the reduction in root-mean-square error after STOI, averaged over the three wavelengths, is 68% for Granada, 45% for Lille, and 22% for Minsk. The best improvement among the three sites is achieved for Granada. This is due to the presence of a number of AERONET stations located close to Granada and to the large errors in the GEOS-Chem calculations for Granada during the period under consideration. A relatively poor improvement occurs for Minsk: the East European region is characterized by sparse observations, so the dominant source of errors in the assimilated AOD in this region is the uncertainty in the model results.
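The validation metric can be illustrated with the short sketch below. It computes the root-mean-square error against withheld observations and the percentage reduction relative to the model, for a single site and wavelength. The numbers are synthetic and are not the values from Table 1; the function names are hypothetical.

```python
import numpy as np

def rmse(estimate, observed):
    """Root-mean-square error of an estimate against observations."""
    estimate = np.asarray(estimate, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((estimate - observed) ** 2)))

def rmse_reduction_percent(rmse_model, rmse_stoi):
    """Percentage by which STOI lowers the RMSE relative to the model."""
    return 100.0 * (rmse_model - rmse_stoi) / rmse_model

# Synthetic daily AOD at one withheld site (illustrative values only).
observed = np.array([0.20, 0.30, 0.25, 0.15])
model = observed + np.array([0.10, -0.10, 0.10, -0.10])
stoi = observed + np.array([0.05, -0.05, 0.05, -0.05])

reduction = rmse_reduction_percent(rmse(model, observed), rmse(stoi, observed))
```

To mirror the wavelength-averaged figures quoted above, one would presumably compute this reduction separately at 440, 675, and 870 nm and average the three percentages.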
Generally, STOI is a computationally efficient technique able to decrease the errors significantly in comparison with the model calculation.