Application of Optimal Interpolation to Spatially and Temporally Sparse Observations of Aerosol Optical Depth

: Aerosol optical depth (AOD) is one of the basic characteristics of atmospheric aerosol. A global ground-based network of sun and sky photometers, the Aerosol Robotic Network (AERONET) provides AOD data with low uncertainty. However, AERONET observations are sparse in space and time. To improve data density, we merged AERONET observations with a GEOS-Chem chemical transport model prediction using an optimal interpolation (OI) method. According to OI, we estimated AOD as a linear combination of observational data and a model forecast, with weighting coefﬁcients chosen to minimize a mean-square error in the calculation, assuming a negligible error of AERONET AOD observations. To obtain weight coefﬁcients, we used correlations between model errors in different grid points. In contrast with classical OI, where only spatial correlations are considered, we developed the spatial-temporal optimal interpolation (STOI) technique for atmospheric applications with the use of spatial and temporal correlation functions. Using STOI, we obtained estimates of the daily mean AOD distribution over Europe. To validate the results, we compared daily mean AOD estimated by STOI with independent AERONET observations for two months and three sites. Compared with the GEOS-Chem model results, the averaged reduction of the root-mean-square error of the AOD estimate based on the STOI method is about 25%. The study shows that STOI provides a signiﬁcant improvement in AOD estimates.


Introduction
Aerosol is one of the major pollutants in the atmosphere.Atmospheric aerosol has a considerable impact on air quality, which is crucial for human health and the environment [1][2][3][4].Aerosol particles play an important role in the radiation budget of the atmosphere and affect climate [5].
One of the important characteristics of the aerosol burden in the atmosphere is aerosol optical depth (AOD), which is a measure of light extinction by aerosol.The atmospheric column integrated aerosol load can be derived from AOD observations.AOD was investigated by many authors, e.g., [6][7][8][9][10].Large discrepancies were reported in AOD values retrieved using different satellite instruments and retrieval techniques [11][12][13].A global groundbased network of sun and sky photometers, the Aerosol Robotic Network (AERONET), provides AOD data with low uncertainty [10,[13][14][15][16][17].However, AERONET sites are sparse in space, and observations are limited in time due to weather dependence.
Chemical transport models can provide values of AOD at all cells of a regular grid over the domain of interest.A variety of models is used to describe aerosol optical properties, including AOD [18][19][20][21][22].The disadvantage of the models is considerable uncertainty [22,23].
To obtain a likely true estimate of the AOD distribution on a regular spatial-temporal grid, data assimilation was applied.Data assimilation [24][25][26] is a technique of combining data from different information sources to produce the best possible estimate of the state of the system.Information sources usually include observational data and theoretical knowledge (forecast, model) on the system.Data assimilation is based on the minimum mean-square error principle of the estimation theory, with the estimate of the state of the system being chosen to minimize the uncertainty of the estimate.Data assimilation approaches are commonly divided into optimal interpolation (OI) [24,[27][28][29], Kalman filtering (KF) [7,[30][31][32][33][34], and variational methods [8,[35][36][37][38].Each method has advantages and disadvantages depending on specific applications.
Optimal interpolation, sometimes referred as statistical interpolation, estimates a value of the interest in a grid point through a weighted linear combination of observational and modeled data at the point in question and neighboring observational points, according to the accuracies of the data used.Weighting coefficients are chosen to minimize the meansquare error in the estimate.To obtain weighting coefficients, correlations between model errors in the different grid points are used.Weighting coefficients are not determined anew at each time point.A single correlation function is estimated from available data, assuming homogeneity and isotropy of the field.The model error statistics are assumed to be stationary.
KF is a sequential data assimilation scheme.KF is a two-step process: the forecast and the analysis.The forecast is made using a dynamical model, in which the estimate obtained in a previous time step is incorporated.The analysis is the same as OI.A forecast error covariance updated in every time step is used instead of a single model error covariance.This allows the reduction of the mean-square error in the estimate in comparison with OI.However, if the temporal gaps are present in the observational data, there is no improvement in comparison with OI, because the values being estimated converge too quickly to the model trajectory [39].
Variational methods are based on minimizing an objective function proportional to the square of the distance between the estimate and both the model and the observations.Under some commonly used assumptions, the three-dimensional variational method (3D-Var) is equivalent to OI [40].The difference is only in the method of the solution.In the four-dimensional variational approach (4D-Var), the minimization of the objective function is carried out over a time window.The numerical cost of the 4D-Var is very high.OI is much less computationally expensive than KF and 4D-Var methods.
In the classical OI method, only spatial correlations are considered.The method can be extended to include time dimension by using spatial and temporal correlations.The use of spatial-temporal optimal interpolation (STOI) allows for filling in not only spatial, but also temporal gaps in observations, and improving the accuracy of the method.The use of a temporal correlation is particularly important for temporally sparse observations.STOI was used in ocean sciences [41,42].We developed the STOI method for atmospheric applications.In our previous study [43], we used STOI combining AERONET observations and chemical transport model GEOS-Chem [44,45] calculations, for the estimation of the distribution of AOD at 870 nm over the eastern European region.In that previous study, we estimated the quality of the assimilation by comparing AOD obtained using STOI with observations made using the Microtops II portable sun photometer [46].
The present study aims a further investigate the capability of STOI to improve estimates of AOD by validating against independent AERONET observations.
In Section 2, we described the observation dataset and GEOS-Chem model used for the assimilation and the STOI method.In Section 3, we used STOI to assimilate AERONET AOD at wavelengths of 440, 675, and 870 nm for obtaining the distribution of total AOD over Europe and validated the method for two months and three AERONET sites.In Section 4, we presented the discussion of the results followed by conclusions.

AERONET Observations
One of the widely used sources of atmospheric aerosol data is observations by a ground-based network of sun and sky photometers, AERONET.This network was founded by the National Aeronautics and Space Administration (NASA), USA, and Photométrie pour le Traitement Opérationnel de Normalisation Satellitaire (PHOTONS), France.The network consists of more than 500 sites located throughout the world.The Cimel multichannel, automatic sun, and sky scanning photometers provide measurements of direct solar and diffused sky radiance at several wavelengths in the range 340-1640 nm with reporting frequency of 15 min during clear sky daylight hours [10,47].Since 2015, the latest version of the sun, sky, and lunar photometers CE318-T was introduced in AERONET that provides lunar light measurements in addition to sun and sky observations to retrieve AOD, volume size distribution, refractive index, single scattering albedo, and column water vapor.The Cimel photometers cannot perform measurements under cloudy weather conditions, so intervals between measurements can sometimes reach several days.The AERONET retrieval algorithm [16,48] derives AOD and other integrated aerosol properties from radiance measurements.The AERONET website provides a database of aerosol optical, microphysical and radiative properties for aerosol.AERONET observations are often considered a standard for column aerosol properties.AERONET data are widely used to validate model simulations and satellite observations.An uncertainty of AERONET observations of AOD is approximately ±0.01 for wavelengths >440 nm [16,17].In this study, we used AERONET Version 3, Level 2.0 daily averaged total AOD data.The Version 3 processing algorithm provides automatic cloud screening and instrument anomaly quality controls [48].Version 3 AOD data presents three levels of data.Level 1.0 data are unscreened, and Level 1.5 data are cloud-screened and quality controlled.Level 2.0 data are quality-assured.Level 2.0 AOD data are available within several months after measurement.

GEOS-Chem Simulation
GEOS-Chem is a global three-dimensional chemical transport model.It also enables simulations on local scales.The GEOS-Chem model is developed and used by research groups worldwide as it applies to a broad range of atmospheric composition problems.The team based at Harvard University and Washington University with support from the NASA Earth Science Division, the Canadian National Science and Engineering Research Council, and the Nanjing University of Information Sciences and Technology are providing the general management of GEOS-Chem [45].The model input includes meteorological data and inventories of emissions.The archived meteorological fields are derived from the Goddard Earth Observing System (GEOS) [49].The latest version of GEOS Forward Processing (GEOS-FP) has a horizontal resolution of 0.25 • latitude × 0.3125 • longitude and a temporal resolution of 1 or 3 h, depending on meteorological fields.The GEOS-Chem Classic configuration uses meteorological data on a rectilinear latitude-longitude grid to model horizontal and vertical transport.To describe tracer advection, GEOS-Chem Classic uses a semi-Lagrangian scheme of Lin and Rood [50].GEOS-Chem simulates detailed oxidant-aerosol chemistry in the troposphere and stratosphere.Aerosol chemistry was first included in GEOS-Chem by [51].GEOS-Chem uses the Harvard-NASA Emissions Component (HEMCO) [52] to calculate emissions from different databases.Global anthropogenic emissions, including shipping emissions, with a monthly resolution, are from the Community Emissions Data System (CEDS) inventory [53,54].Annual aircraft emissions are from the Aviation Emissions Inventory Code (AEIC) [55].Emissions from open fires are from the Global Fire Emissions Database (GFED) with monthly resolution [56].Emissions of dust aerosol [57], lightning nitrogen oxides [58], biogenic volatile organic compounds [59], soil nitrogen oxides [60], and sea salt aerosol [61] are computed in GEOS-Chem in dependence on the local meteorological conditions.The model output is a set of quantities, such as tracer concentrations, in every grid cell and others, including the AOD of major aerosol components at several wavelengths with a transport time step of 15 min.The GEOS-Chem model is widely used for estimating the distribution of aerosol species in the atmosphere [51, [62][63][64].
Aerosol tracers of GEOS-Chem include sulfate, nitrate, ammonium, mineral dust, sea salt, black carbon, and organic aerosols.AOD is calculated on the basis of aerosol tracer concentrations, using aerosol optical properties from [65,66].Hydroscopic growth of hydrophilic aerosol particles depending on the ambient relative humidity is taken into account.For calculating AOD, GEOS-Chem combines aerosol species into groups according to their optical properties: sulfate-nitrate-ammonium; size fractions of mineral dust; sea salt in accumulation and coarse modes; black carbon; organic aerosols.
In the present work, we used a nested regional application of GEOS-Chem Classic version v12.1.1.The nested version of GEOS-Chem Classic was first implemented in [67].Our simulation was performed for the European region at the native GEOS-FP horizontal resolution of 0.25 • latitude × 0.3125 • longitude and 47 vertical σ-layers up to ~80 km.We calculated daily averaged AOD at 440, 675, and 870 nm as these are standard reference wavelengths in AERONET products.The optical depths of those mentioned individual aerosol groups in every 3D grid cell were summarized to obtain the optical depth of the total aerosol in the cell.Finally, the optical depths of the total aerosol in every vertical layer for the given horizontal grid cell were summarized to yield the total column AOD.

Spatial-Temporal Optimal Interpolation
In the OI scheme, an analyzed state is related to the forecast state by the equations: Here, x a is a vector of estimated values at regular grid points, x b is a vector of values calculated by a model at regular grid points, and y is a vector of values of observations at the observational points.H is an observation operator providing the link between the analysis variables and the observations.H maps the scattered observation sites to the regular grid.K is a matrix containing weighting coefficients.Thus, the estimate is a weighted average of the forecast (model results) and observations.
The matrix of weighting coefficients, K, can be determined by minimizing the meansquare error E in the estimate, which is done by setting the derivative of E relative to each element K, equal to zero.The solution is where B is a covariance matrix of model errors, and R is a covariance matrix of observational errors.Equations ( 1) and ( 2) define the optimal linear estimator assuming that the errors are unbiased, and the observational errors are uncorrelated.The observational and model errors also are mutually uncorrelated.
Error covariance matrix B cannot be obtained directly.However, it can be estimated statistically.In OI, the correlation function is specified a priori assuming homogeneity and isotropy of the considered field.
We applied STOI to estimate AOD in Europe during the 2015-2016 period.We considered data from 86 European AERONET sites and the layout of the region and location of the sites are shown in Figure 1.Because the model AOD uncertainty [23] is significantly larger than the AERONET AOD uncertainty, we assumed the AERONET AOD uncertainty to be negligible.Prior to the implementation of the STOI, we compared GEOS-Chem simulated AOD with AERONET observations.The comparison revealed a bias of −0.032 for 440 nm, −0.025 for 675 nm, and −0.024 for 870 nm.Moreover, it was revealed that the dispersion of AERONET AOD is significantly larger than the dispersion of GEOS-Chem-simulated AOD for each wavelength.To correct the discrepancy, we used linear regression.After that, the STOI method was applied using the corrected AOD values simulated by GEOS-Chem.
To implement STOI we should know the spatial and temporal correlation functions.We obtained error statistics using observations from AERONET European stations, and simulations by a chemical transport model GEOS-Chem.We calculated spatial correlation coefficients of the model-minus-observation pairs of AOD at two spatial locations depending on the distance between them.We determined temporal correlation coefficients of the model-minus-observation pairs of AOD at two temporal locations, depending on the temporal interval between them.We constructed the correlations curves by fitting to the points presenting the obtained correlation coefficients.After that, we modeled the correlation curves by analytical functions.We chose exponential functions because they guarantee the positive definition of covariance matrices [27].The exponential functions were chosen with argument kd, where d is the distance in kilometers or the time interval in days for the spatial or temporal correlation functions, respectively, and k is a fitting factor.The best fit was provided by k = 0.002 at 440 nm, 0.0025 at 675 nm, and 0.003 at 870 nm for the spatial correlation function, and k = 0.4 at 440 nm, 0.45 at 675 nm, and 0.5 at 870 nm for the temporal correlation function.The obtained spatial and temporal correlation curves are presented in Figure 2. Figure 2 shows that the correlation coefficient value slightly decreases with an increase in the wavelength.A possible reason for such behavior is the larger contribution to AOD at longer wavelengths by coarse aerosol particles, which have higher spatial and temporal variability than fine particles.
We used the correlation functions to construct the covariance matrix of model errors.We calculated the weighting matrix for every spatial-temporal grid cell, using Equation ( 2), with a temporal resolution of 1 day.According to OI, not all available observations were considered: only those lying near the point being updated.We chose the number of observations using empirical selection criteria.Some tuning of numerical parameters is a common practice in data assimilation [34].We accepted the spatial correlation length of 850 km and the temporal correlation length of 5 days.These are the intervals where the correlation coefficient is equal to about 0.1.We assumed the separation of spatial and temporal correlations.We calculated spatial-temporal correlation coefficients for a given distance and time interval as a product of corresponding spatial and temporal correlation coefficients.If the resulting correlation coefficient turned out to be less than 0.08, it was considered zero.The covariance matrix of observational errors was assumed as a zero matrix.The estimates of AOD at every grid cell were determined from Equation (1).

Results
Using the STOI method, we estimated the distribution of the daily averaged AOD in Europe for 2015-2016.Figure 3 shows an example of the distribution of daily mean AOD at 870 nm before STOI (modeled by GEOS-Chem) and after STOI (estimated using assimilated AERONET observations) for 17 July 2015.We chose the date arbitrarily.The example illustrates the change in AOD distribution over Europe after applying STOI.
Figure 3 shows only individual cases and cannot be used for generalized conclusions.The statistical processing of obtained estimates will be performed in the future.However, Figure 3 confirms that the GEOS-Chem simulation underestimates AOD in general.
To validate the results, we compared the daily mean AOD estimated by STOI with independent AERONET observations.The AERONET sites Granada, Lille, and Minsk (see Figure 1, red triangles) were excluded from the assimilation scheme, and STOI for July and November 2015 was performed using data from 83 remaining sites.July is the month with the most AERONET observations.The number of observations is small in November in the northern part of Europe because of many cloudy days.We considered only days with Level 2.0 observations.At the Granada site, there were 29 days with observations in July 2015 and 28 days in November 2015.There were 21 days at the Lille site and 31 days at the Minsk site in July 2015, and only 6 days at Lille and Minsk sites in November 2015.In contrast with the Minsk site, there is a significant amount of AERONET sites close to the Granada and Lille sites.Nineteen AERONET sites were taken into account in the Lille region.The location of these sites is shown in Figure 4b.Twenty AERONET sites were considered in the Granada region.Their location is shown in Figure 4c.
We obtained estimates of daily mean AOD at each of the excluded sites for every day in November and July 2015.We calculated root-mean-square errors of the estimates using AOD observations at those sites when Level 2.0 observations existed, assuming a negligible error of the AERONET AOD observations.The root-mean-square errors (RMSE) of model-simulated AOD were calculated at the Granada, Lille, and Minsk sites.The comparison results are shown in Table 1 for July and Table 2 for November.In parentheses, the reduction in RMSE of the estimate after STOI is shown in tables.

Discussion and Conclusions
In this study, we presented a spatial-temporal optimal interpolation method for evaluating atmospheric aerosol load in locations where direct measurements on-site are unavailable.The observations in neighboring AERONET sites and model retrievals were linearly combined, accounting for their relative errors, to estimate the spatial and temporal distribution of AOD.The AOD values were calculated with the use of STOI to check the STOI method applicability.Results were compared with observations at Granada, Lille, and Minsk AERONET sites in July and November 2015.
In July, the RMSE of the estimate after STOI is significantly less than the RMSE of model-simulated AOD for every three sites at every three wavelengths.After averaging over three wavelengths (Table 1), the reduction in RMSE of the estimate after STOI in July is 68% for the Granada site, 45% for the Lille site, and 22% for the Minsk site.The best improvement among those three sites is achieved for Granada.This improvement is due to the presence of a number of AERONET stations located close to the Granada site, and the significant errors in the GEOS-Chem calculations for the Granada site during the period under consideration.A relatively poor improvement occurs for the Minsk site because of insufficient coverage of observations in the eastern European region.The dominant source of errors in assimilated AOD in this region arises from uncertainties in the model results.
In November, the RMSE of the estimate after STOI for Granada is comparable with that in July.However, the RMSE of modeled AOD was significantly lower in November 2015 than in July 2015.Accordingly, the reduction in RMSE is lower in November.For the Minsk site, the RMSE of both modeled and assimilated AOD is approximately the same in July and November.Averaged over three wavelengths, the reduction in RMSE of the estimate after STOI in November (Table 2) is 26% for the Granada site, and 18% for the Minsk site.For the Lille site, the percentage of STOI RMSE in November is negative: −19% on average, which is due to a perfect agreement between modeled and observed AOD for Lille in November, especially for the 870 nm wavelength.That is most likely a rare case.
In [43], we evaluated the correlation between the observed AOD and the modeled by GEOS-Chem AOD values, using statistics of observations at the 88 European AERONET sites for the 2015-2016 period.The correlation coefficients obtained in [43] were 0.60, 0.53, and 0.45 for wavelengths 440, 675, and 870 nm, respectively.Therefore, the discrepancy between observed and modeled values of AOD is usually significant.
Generally, the reduction of the RMSE averaged over two months, and three sites are about 25%.The result shows that STOI improves an estimate of AOD compared to model calculations.In the locations where the AERONET site density is high, STOI expectedly shows better improvements.In this paper, we assimilated AERONET data only.The satellite data assimilation will be considered using STOI in future work.Although the present study is focused on AOD only, the STOI technique can be applied to other aerosol properties and atmospheric species.

Figure 1 .
Figure 1.Location of the Aerosol Robotic Network (AERONET) sites (black triangles) considered in the assimilation scheme.Red triangles are the sites chosen for validation.

Figure 3 .
Figure 3. Distribution of daily mean AOD at 870 nm (a) modeled by GEOS-Chem and (b) estimated with the use of assimilated AERONET observations for 17 July 2015.The AERONET Granada, Lille, and Minsk sites chosen for validation also differ in the number of neighboring AERONET sites and the distance between sites.Minsk is located in the eastern European region, where observations are sparse.The location of AERONET sites in the Minsk region is shown in Figure4a.Only six AERONET stations operated in 2015 within the correlation length for the Minsk site, and all of them are at a significant distance from this AERONET site.

Figure 4 .
Figure 4. Location of the AERONET sites (black triangles) considered in the assimilation scheme for the Minsk region (a), Lille region (b), and Granada region (c).Red triangles are the sites chosen for validation.

Table 1 .
Root-mean-square errors of the aerosol optical depth (AOD) calculated using GEOS-Chem and assimilated using spatial-temporal optimal interpolation (STOI) as compared to AOD observed by AERONET in July 2015.

Table 2 .
Root-mean-square errors of the aerosol optical depth (AOD) calculated using GEOS-Chem and assimilated using spatial-temporal optimal interpolation (STOI) as compared to AOD observed by AERONET in November 2015.