Spatial Retrievals of Atmospheric Carbon Dioxide from Satellite Observations

: Modern remote-sensing retrievals often invoke a Bayesian approach to infer atmospheric properties from observed radiances. In this approach, plausible mean states and variability for the quantities of interest are encoded in a prior distribution. Recent developments have devised prior assumptions for the correlation among atmospheric constituents and across observing locations. This work formulates a spatial statistical framework for simultaneous multi-footprint retrievals of carbon dioxide (CO 2 ) with application to the Orbiting Carbon Observatory-2/3 (OCO-2/3). Formally, the retrieval state vector is extended to include atmospheric and surface conditions at many footprints in a small region, and a prior distribution that assumes spatial correlation across these locations is assumed. This spatial prior allows the length-scale, or range, of spatial correlation to vary between different elements of the state vector. Various single- and multi-footprint retrievals are compared in a simulation study. A spatial prior that also includes relatively large prior variances for CO 2 results in posterior inferences that most accurately represent the true state and that reduce the correlation in retrieval error across locations.


Introduction
Space-borne estimates of atmospheric composition are providing improved understanding of numerous geophysical processes that are closely connected in the climate system. Satellites such as the Greenhouse Gases Observing Satellite (GOSAT) [1] and the Orbiting Carbon Observatory-2 (OCO-2) [2] have provided a multi-year record of global atmospheric carbon dioxide (CO 2 ) concentration that is improving quantitative inferences for the carbon cycle [3]. The recently launched OCO-3 satellite facilitates small-area investigations over areas such as megacities [4]. These high-resolution satellite products can ultimately be used in flux inversion systems to infer carbon sources and sinks, potentially at regional scales [5,6]. The end-to-end processing pipeline from satellite spectra (Level 1) to inferred carbon fluxes (Level 4) includes multiple stages of inference that require robust uncertainty quantification [7].
An observation made by the OCO-2 instrument at a particular location consists of a spectrum, which is a vector y of calibrated radiances based on photon counts at different wavelengths. From this spectrum, the goal is to infer a state vector x, which includes CO 2 concentrations at different altitudes, along with other atmospheric and surface constituents within the instrument field of view, or footprint, which is 1.3 × 2.25 km in nadir mode. The OCO-2 retrieval algorithm infers the state vector from the observed spectrum by solving an inverse problem involving a physical forward model, which relates the state vector to In this work, we motivate the multi-footprint retrieval with a focus on a hierarchical statistical model for a remote-sensing observing system [7,20]. In this context, a multivariate spatial model for the joint distribution of the state is formulated. Recent research in multivariate spatial statistical modeling has provided flexible models that would have utility in inverse remote-sensing problems. Importantly, these multivariate spatial models allow the smoothness and correlation range of the spatial process to vary across variables of the state while maintaining a valid joint probability distribution [21]. This property is particularly applicable to the heterogeneous state vector considered in a remote-sensing retrieval. For example, CO 2 concentration near the surface in the presence of sources and sinks may have a different correlation range than concentration higher in the atmosphere where it is well mixed. Further, other state vector components, such as surface albedo, are likely to vary on spatial scales different from those of atmospheric constituents.
We invoke a Gaussian-process-based regularization with the Matérn covariance structure to incorporate the spatial dependency in the model. The Matérn covariance model has been widely used in spatial statistics. For a univariate spatial process, the model is parameterized by a range parameter λ that dictates the rate of decay of spatial correlation with distance and a smoothness parameter ν that characterizes the differentiability of the process. The exponential covariance function is a special case with ν = 0.5 [22]. This type of covariance structure has no boundary effects, and thus allows us to straightforwardly specify a joint spatial model that can incorporate different degrees of smoothness for the CO 2 concentrations at different pressure levels. Modeling the cross-covariance in the spatial structure is a major contribution of our article in this regard. In statistical terms, the constraint-based spatial retrieval formulation [15] corresponds to a Gaussian Markov random field (GMRF), specifically the conditional autoregressive (CAR) model [23]. A GMRF prior is also considered in hierarchical Bayesian retrievals of aerosol optical depth (AOD) from spectra observed by the Multi-angle Imaging Spectro Radiometer (MISR) [24,25]. The intrinsic GMRF implied through these spatial priors produces a somewhat inflexible spatial correlation structure, and the models we investigate offer additional flexibility.
The remainder of this paper is organized as follows. In Section 2, we introduce the multi-footprint retrieval methodology and mathematical notation in the context of OCO-2. Section 3 describes a collection of simulation experiments that investigate the spatial retrieval for CO 2 using a reduced-order linear model. Section 4 provides concluding remarks and prospects for future work.

Spatial Retrieval Methodology
We consider spatial retrievals for a set, S = {s 1 , . . . , s n }, of n footprints along a short span of a single OCO-2 polar orbit. Each footprint is indexed by geolocation information (longitude, latitude) s i . Figure 1 shows an example of a small area for an OCO-2 orbit that passed near the Lamont, OK TCCON site in October 2015. This density of observations within a small spatial domain in OCO-2's standard observing mode, as well as observations made in its target mode, are valuable for validation as well as for regional carbon cycle studies [26,27]. In this section, we outline the mathematical framework for estimating CO 2 concentration from the satellite spectra in the operational retrieval configuration and provide an extension to a multi-footprint spatial retrieval. q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q

Model and Notation
At a single footprint s i , the state of the atmosphere and surface is represented as a p-dimensional vector x(s i ) = x 1 (s i ), . . . , x p (s i ) . In addition to CO 2 concentration, other variable and unknown atmospheric constituents are included in the state vector for OCO-2. The additional elements include atmospheric aerosols, surface albedo, and surface air pressure. The numerical investigation in Section 3 examines an implementation with p = 39 [20]. The collection of state vectors in the small area is assembled into an np-dimensional vector, x(S ) = x(s 1 ) , . . . , x(s n ) .
The spectrum observed by the satellite is an m-dimensional vector y(s i ). For a single footprint, OCO-2 observes radiances of up to m = 3048 spectral channels. The spectra are observed in three narrow infrared bands: the O 2 A-band at 0.76 µm, the weak CO 2 band at 1.61 µm, and the strong CO 2 band at 2.06 µm. Each band corresponds to a different spectrometer on OCO-2 and has up to 1016 channels. The vector y(S ) represents the set of all spectra at locations in S. The physical relationships between the state and the spectra are contained in a forward model F(·). In general, the forward model is nonlinear.

The Spatial Objective Function
Many modern retrievals implement a Bayesian framework to infer the atmospheric state from the observed satellite spectra. The L2 retrieval implements an OE methodology for a single footprint at a time. Extending this assumption of independence across footprints, the L2 retrieval algorithm for a collection of footprints maximizes the objective function with respect to x(S ), where N · µ, Σ denotes a multivariate normal (Gaussian) density with mean µ and covariance matrix Σ. The statistical assumptions of a normal prior on the state and a normal data model for the spectra given the state are typical for OE retrievals [10]. Here, V i is a diagonal matrix containing the measurement-error variances, and µ i and Σ i are the prior mean vector and covariance matrix of x(s i ), respectively. The off-diagonal elements of the p × p matrix Σ i characterize the dependence between different elements of the state vector (i.e., between CO 2 at different pressure levels and the other state variables). This prior covariance matrix controls the regularization along the dimension of the state vector, and can be viewed as a "vertical" regularization for the CO 2 profile. Since the forward model is nonlinear, the objective function is optimized numerically and separately for each i using an algorithm, such as gradient descent, Gauss-Newton, or Levenberg-Marquardt [8].
The spatial retrieval instead maximizes the spatial objective function with respect to x(S ), where µ := (µ 1 , . . . , µ n ) , and Σ is a joint np × np covariance matrix with diagonal blocks Σ i (as above) and (i, j)th off-diagonal block cov(x(s i ), x(s j )). These off-diagonal blocks characterize the spatial dependence of CO 2 and other variables in the state vector. This allows us to borrow strength over space. These off-diagonal blocks can be viewed as a "horizontal" regularization. This spatial retrieval strategy has been implemented in multi-footprint retrievals for aerosols in particular [15,16]. These approaches have typically achieved this regularization with spatial smoothness constraints rather than with a spatial statistical model. Even so, the spatial smoothness constraints can be viewed as specific structures for the prior covariance matrix Σ. More generally, parameterizing the cross-correlations offers additional flexibility in the retrieval, but can be challenging. A strategy for the OCO-2 small areas is discussed in Appendix A.1.

State Vector
The radiances observed by OCO-2/3 are sensitive to a number of atmospheric, surface, and instrument properties that are unknown and that vary spatially. These properties are represented in the state vector x(s i ). In the simulation study in Section 3, a state vector of dimension p = 39 following [20] was used. The state vector is composed of four distinct groups of geophysical elements, as outlined in Table 1. The state includes an atmospheric vertical profile of CO 2 , surface air pressure, atmospheric aerosols of varying types, and wavelength-dependent surface albedo in the three spectral bands. Further details on this configuration can be found in [20]. This slightly simplified state vector omits some elements in the full physics state vector [10].
In extending the retrieval to a multi-footprint framework, it is important to consider the nature of spatial dependence for the various elements of the state vector. The horizontal spatial correlation length scale for CO 2 varies vertically. Near the surface, atmospheric CO 2 can be highly variable in the presence of surface sources or sinks. At higher altitudes, away from direct sources and sinks, CO 2 can have larger correlation length scales [28,29]. Aerosols are similarly sensitive to atmospheric transport, but have generally shorter residence times. Surface pressure and clouds have spatial scales connected to weather systems. Surface albedo over land can have short correlation length due to heterogeneity in surface types. A multi-footprint retrieval methodology should have the capability for spatial dependence that varies with the state vector element. While the OCO-2/3 retrieval estimates multiple atmosphere and surface constituents, the mission's primary quantity of interest is known as XCO 2 , which is the column-averaged dry-air mole fraction of CO 2 . This quantity is a weighted average of the CO 2 vertical profile portion of the state vector, The distribution of XCO 2 , given a Gaussian distribution for x(s i ), is given by: By plugging in the posterior distribution for XCO 2 (s i ), this formula can be used to evaluate our proposed procedure as a log score (see Section 3.2).

A Tractable Linear Model
We consider the special case of a linear forward model that can be described by a matrix, F S . We then have the Bayesian model That is, the measurement noise for an individual footprint is assumed to be uncorrelated with that for any neighboring footprints. Similarly, we have essentially assumed F S to be block-diagonal as well. This structure is consistent with a 1D (vertical atmospheric path) radiative transfer model. We make this assumption for the land nadir observations considered in the current work.
In this simple linear case, we can write down the posterior of x S in closed form: We can now examine the effects of Σ and/or F S being (incorrectly) assumed to be block-diagonal.
If both are block-diagonal, then so are Σ −1 x|y and Σ x|y . This would imply that dependence across footprints is being ignored. Further, the posterior covariances for individual footprints will typically be incorrect.
In addition, note that the smaller the noise variance V, the less the prior covariance Σ matters. So, the effect of using a spatial prior will be most pronounced if Σ is "small" relative to V.

Considerations for Degeneracy
If the prior state components are highly correlated across space, the prior covariance Σ may become nearly singular and require special handling. This is often the case after performing the multivariate spatial estimation procedure outlined in Appendix A.1. This issue can be circumvented with a low-rank model based on principal components, which is an approach used in other multi-footprint retrievals [16]. The covariance matrix is written as a product of diagonal standard deviation matrices S and a correlation matrix C, so that the highly correlated elements are directly accessible.
Performing an eigen-decomposition of C, we take the top e eigenvalues that explain sufficient variation and exclude all remaining terms. The result is written as Σ ≈ SP e D e P e S, where P e is the np × e matrix of the top e eigenvectors of the correlation C, and D e is the diagonal matrix of the e leading eigenvalues.
Propagating this low-rank prior in the original state vector space as if it were the original prior is not feasible because it has a singular covariance, and any step involving the precision matrix would fail. We instead reparameterize the model using x S ≈ SP ex + µ, wherex ∼ N(0, D e ) represents the lower rank latent space and cannot be directly interpreted. Using standard techniques, the posterior distribution forx can be shown to take the formx|y = N (μ x|y ,Σ x|y ), with Converting back into the original state vector space only requires the reparameterization, and a similar formula for the variance: V(x S |y) ≈ SP eΣx|y P e S.

Numerical Study
In this section, we present a collection of simulation experiments to assess the performance of the multi-footprint retrieval methodology. The simulations are carried out over spatial domains corresponding to small areas within individual OCO-2 orbits ( Figure 1). Multiple retrieval approaches are compared for each of three small area templates.

Simulation and Retrieval Configuration
We estimated a realistic covariance structure for the spatial distribution of the state at three space-time locations that coincide with TCCON sites; one in Lamont, OK, USA and two in Wollongong, Australia. The procedure for estimating the spatially informed covariance Σ from available data is rather involved and is described in Appendix A.1. The estimated spatial correlation range parameters for each template and state vector element are shown in Figure 2. The estimated range parameters are generally tens of kilometers for most state vector elements, but differences exist among state elements and across the three geophysical templates. Surface albedo correlation ranges are typically smaller than those for atmospheric constituents, such as CO 2 and aerosols.
The procedure also yields an estimate for the mean vector µ. For the purposes of the numerical study, these parameters are denoted µ T and Σ T and serve as the "true" datagenerating parameters for the study. The true mean vectors µ T , along with the operational prior means µ a , are summarized for the three templates in Figure 3 and Table 2. The true mean differs from the prior mean in meaningful ways for XCO 2 , as well as for other key state vector elements, such as surface pressure.   Figure 3. True mean vectors µ T and operational retrieval prior mean vectors µ a for the CO 2 vertical profile at the three TCCON templates. Table 2. True mean vectors µ T and operational retrieval prior mean vectors µ a for selected state vector elements at the three TCCON templates. The CO 2 profile means are displayed in Figure 3. Aerosols are represented with two location-specific types plus cloud ice and water. A small OCO-2 area is identified for each of the three TCCON coincidences. The locations form an 8 × 8 grid that resembles the swath over which the satellite would have collected observations. For each case, an ensemble of spatially correlated state vectors is randomly generated as x S ∼ N (µ T , Σ T ) from the reparameterized multivariate Gaussian prior described in Section 2.5, which has mean and variance parameters estimated following the procedure in Appendix A.1. Since each state vector has 39 components, each sample corresponds to 39 × 64 = 2496 values. For each simulated multivariate field, an ensemble of synthetic radiances is generated from a linear forward model based on the surrogate forward model of [20].

Lamont
Finally, a series of single-footprint and spatially informed retrievals are carried out on the simulated radiances. For each retrieval, the assumed prior mean vector, denoted by µ a , is taken to be the OCO-2 operational prior mean for the appropriate time and location. Importantly, the prior mean is not equal to the true mean, µ a = µ T . This is a realistic situation and has potential impacts on the retrieval bias [30]. Three different choices for the prior covariance Σ a are investigated:

1.
Operational, Σ a = I 64 ⊗ Σ a,0 , where I 64 is an identity matrix with a dimension matching the number of spatial locations. The OCO-2 operational prior covariance for a single footprint, Σ a,0 , is used at all locations, assuming no spatial correlation. This is essentially a single-footprint retrieval. In this case, the prior standard deviations for the CO 2 profile are substantially larger than those in Σ T (see Figure 3 of [30]).

2.
Spatial, Σ a = S a C a S a . The within-footprint operational correlation structure is extended between footprints by averaging parameters (see (A1) in Appendix A.1), yielding a multivariate spatial correlation matrix C a . This is combined with the standard deviations used in the operational retrieval, represented in the diagonal matrix S a .

3.
True, Σ a = Σ T . The prior covariance is set to the true data-generating spatial covariance. The true data-generating covariance Σ T and the spatial covariance Σ a both exhibit numerical instability, so low rank approximations are used, as discussed in Section 2.5. Using 200 dimensions for Σ T and 1000 dimensions for Σ a at each site recovered at least 99% of the variability across all scenarios. For additional numerical stability, the estimated range and smoothness parameters for the Matérn covariance function are truncated to numerically stable intervals of [0, 25] for the range and [0.5, 1.5] for the smoothness.

Results
Here, we summarize the results of the retrieval simulation experiments for the three TCCON templates. Several properties of the retrievals are relevant in this multivariate spatial setting. First, the retrieval properties for the various state vector elements at a single spatial location are summarized. Figure 4 shows the logarithm of the mean squared error (MSE) by the state vector element for a single pixel in the October 2015 Lamont experiment. The log MSE is shown in part due to the changing magnitudes of variability for the various state vector elements. The spatial prior tended to improve MSE, especially for spatially correlated parameters, such as the atmospheric components. However, using the highly informative true covariance in the prior can be detrimental to retrieval performance, as the prior mean misspecification contributes to bias in this scenario [30].
The different retrieval methods also yield different retrieval behaviors across the entire spatial domain. Two important aspects of the spatial behavior for the retrieval of XCO 2 are illustrated in Figures 5 and 6. The spatial retrievals result in smaller retrieval-error standard deviations, which is illustrated through XCO 2 credible intervals for a portion of the October 2015 Lamont template in Figure 5. Science investigations that use OCO-2 data involve combining retrievals in various ways [3], and an understanding of the correlation of retrieval errors is often critical. Figure 6 displays a series of correlation matrices of the XCO 2 retrieval error for the three choices of prior covariance. The operational single-footprint approach yields errors that are strongly spatially correlated, while the spatial prior reduces spatial correlations in the error for the October 2015 Lamont case. Figure 7 summarizes the mean absolute error (MAE) for XCO 2 by location for the Lamont template. While the errors are relatively uniform across the small area, they are smallest in magnitude near the center, where the spatial retrieval provides the most information from surrounding locations.    We aggregate the results for XCO 2 across multiple simulations in Table 3 by computing the mean squared error (MSE) and two forms of a mean log score. The log score evaluates the likelihood of the generated truth given the posterior distribution implied by the retrieval (e.g., [31]). These scores are examples of proper scoring rules or metrics that characterize predictive distributions. In practical terms, proper scoring rules are optimized when predictive distributions are as narrow as possible while still capturing the true state for a sufficient percentage of the time. The marginal log score uses the variance of the XCO 2 estimate for each location, ignoring the off-diagonal covariance terms relating XCO 2 measurements at different locations. For example, with 100 generated samples x S , the joint L and marginal log scores mL are

L({XCO
Here, σ (k) XCO 2 |y corresponds to the ith diagonal component of the kth simulation realization's XCO 2 posterior covariance Σ (k) XCO 2 |y . Since the operational prior assumes independence across footprints, the joint and marginal scores will be the same for this prior choice. For all three templates, the spatial prior yields the most desirable outcomes for both the marginal and joint scores. The scores are consistent with Figure 4, showing that using the true covariance as the prior is far too optimistic due to the prior mean misspecification. Table 3. Performance of multiple retrieval approaches from three simulation experiments. The joint and marginal log scores are defined in Equations (3) and (4)  Although it is of secondary interest, we also compute a mean absolute error (MAE) and MSE for the posterior estimate of the full state vector in case one of the components not related to XCO 2 is poorly estimated. Performance for the full state is consistent with the XCO 2 estimates. The important takeaway from this simulation is that the improvement in posterior accuracy and precision due to a spatial model is closely related to the covariance parameters. As the parameters strengthen cross-correlation, the benefit of a spatial model becomes more obvious.

Discussion
We have developed a multi-footprint retrieval approach for estimating atmospheric CO 2 from remote-sensing observations over small spatial areas. The statistical properties of the retrieval approach were investigated with simulation experiments with a realistic simplified physical forward model. A prior that combines spatial correlation across footprints with a large prior variance for individual levels of the CO 2 vertical profile often provides improved precision over single-footprint retrievals ( Figure 5). In addition, the multi-footprint approach can reduce the spatial correlation of retrieval errors, which enhances utility for downstream scientific use of the retrieval results [3].
The multi-footprint retrieval approach has demonstrated utility for other remotesensing applications [15,16], and the methodology in this work can set the stage for further development of spatial retrievals for CO 2 and other trace gases. Our simulations suggest that the actual spatial dependence as well as the assumed spatial dependence in the prior retrieval distribution have an impact on the retrieval precision and magnitude of spatial correlation in retrieval errors. Widespread implementation would need additional investigation of these interactions, as well as of the role of the within-footprint correlation structure. The multi-footprint retrieval may have added value in situations that are challenging for the current single-footprint retrieval, including low signal-to-noise situations, such as dark surfaces and high-latitude observations [32].
Other aspects of the observing modes for OCO-2 and OCO-3, as well as other greenhouse-gas-observing missions, could benefit from the multi-footprint retrieval approach. Both OCO-2 and OCO-3 periodically observe in target mode, where multiple observations at varying geometries are collected over an area near a validation site (e.g., TCCON) in a short time span [33]. Further, OCO-3 uses a pointing mirror assembly (PMA) for observations at varying geometries and has instituted a two-dimensional sweeping mode known as snapshot area mode (SAM). These modes produce observations over small spatial regions and could be combined by invoking prior distributions that account for inherent spatial dependence in the surface and atmospheric states. In addition, the spatial coverage of OCO-2/3 retrievals is incomplete in the presence of clouds. Cloudy footprints are typically screened out before the retrieval is attempted. A multi-footprint approach would still not use cloudy radiances, but could provide a mechanism for inferring conditions given nearby clear scenes.
There are several parameters in F, V i , µ i , and Σ that need to be estimated in order to implement the multi-footprint retrieval methodology. In principle, it would be possible to do this online as part of the multi-footprint retrieval with the aid of a hierarchical statistical model [7]. However, this might be quite computationally challenging, and would likely require joint retrievals for very large spatial fields as well as additional tools for Bayesian inference, such as Markov-chain Monte Carlo methods. Currently, the withinpixel parameters are estimated offline based on different sources of information and expert knowledge (Appendix A.1).
Using OCO-2 Level 2 products for characterizing the multivariate and spatial dependence of the state (i.e., the off-diagonal blocks in Σ) has inherent advantages and disadvantages. The OCO-2 products have the necessary spatial resolution and span the ranges of the plausible geophysical conditions anticipated. The estimation procedure outlined in Appendix A.1 aims to account for some artifacts introduced by the ill-posed nature of the retrieval, but remaining information on spatial correlation can still be limited, particularly for the retrieved CO 2 [14]. The additional parameters necessary for spatial retrievals could be estimated offline using atmospheric transport models [34] if the spatial resolution is suitable.
The current study has demonstrated some statistical properties of the multi-footprint retrieval for a linear forward model. Combined with the assumed Gaussian distributions, this setting provides tractable analytical results. In the operational setting, the forward model is moderately nonlinear, and the retrieval involves iterative numerical optimization. The forward-model evaluation on successive iterations adds noticeably to the computational expense of the retrieval. In a multi-footprint setting, the cost function (2) requires evaluation of the forward model for all footprints. Any computational gains or losses would depend on the overall number of iterations for the spatial retrieval. These and other operational computational challenges warrant further investigation.
If it turns out that more iterations of the nonlinear-least-squares solver are necessary for the same degree of convergence, there are several compromises that could be made to reduce the computational effort. First, we could use a statistical emulator of the forward model to obtain good starting values for the algorithm. Second, it is, of course, always possible to run the optimization for only a fixed number of iterations, at which point the achieved solutions might still be better than the current retrievals at the same computational effort. Finally, we could carry out "sequential retrievals", meaning single-pixel retrievals conditioned on the previous retrievals, as opposed to the simultaneous multipixel retrievals proposed above. More specifically, based on some ordering of the pixels in a small region, a regular retrieval would be carried out for the first pixel. The resulting posterior distribution, together with the assumption of spatial dependence, would then imply a prior distribution for the next pixel, which would be "tighter" (i.e., more peaked and informative) than the original prior, and so forth.
The multi-footprint retrieval provides additional regularization of the joint atmosphere and surface state in the horizontal spatial dimension. This constraint could enable relaxation of the regularization within a single footprint, particularly for the vertical profile of CO 2 . The use of the larger-variance spatial prior in Section 3.1 provides an initial investigation of this idea, but further adjustments to the within-footprint correlation structure could prove useful. This investigation would be valuable for the CO 2 retrieval problem as well as for the retrieval of vertical profiles of other trace gases, such as O 3 and CH 4 .

Data Availability Statement:
The data for the simulations presented in this study are available on request from the corresponding author. OCO-2 data products are publicly available available at the NASA Goddard Earth Science Data and Information Services Center (GES DISC) https: //disc.gsfc.nasa.gov/datasets?page=1&keywords=OCO-2.

Abbreviations
The following abbreviations are used in this manuscript: In order to carry out the simulation experiment in Section 3, a realistic probabilistic model for the state is needed. This model incorporates both within-footprint correlation for different state vector components (e.g., correlation in the CO 2 vertical profiles) as well as spatial correlation across footprints in a small area. This multivariate spatial statistical model is constructed to represent the small areas within the single OCO-2 orbits studied in Section 3. Since the areas considered are small in extent and to maintain focus on the statistical complexity of the multivariate nature of the problem, the spatial covariance of the state is taken to be isotropic. Therefore, the model is x(·) ∼ GP(µ(·), C), where GP is a Gaussian process with mean function µ and cross-covariance function C. For a given small area, the main challenge is to estimate the cross-covariance function C.
For the cross-covariance function C, we assume a product of a covariance matrix G describing the dependence between different state variables for a single footprint and of a Matérn correlation with a different range parameter λ k and smoothness parameter ν k for each variable x k (·). This covariance function can be viewed as a combination of the nonstationary covariance functions proposed in Paciorek and Schervish [35] and Stein [36] in an expanded space with an artificial latent dimension (in addition to the two geospatial dimensions of latitude and longitude) that distinguishes the different variables. The resulting covariance C kl (s i , s j ) between a combination of state variables and locations is where M is the Matérn correlation function, and λ kl = (λ k + λ l )/2. The question is then how to estimate G and the spatial parameters {(λ k , ν k ) : k = 1, . . . , p}. Actual OCO-2 retrievals from these small areas are a possible source of information for estimating the within-footprint and spatial correlation structures. However, the retrieval error for individual footprints is non-negligible [12,14] and should be accounted for when attempting to estimate the characteristics of the underlying process x. We propose a statistical model for the actual spatially indexed OCO-2 retrievalsx(s i ), x(s i ) = x(s i ) + i , where i ∼ N(0, Ω) is the retrieval error and where x(s i ) and i are independent. Then, for a single footprint, var(x(s i )) = G + Ω. Therefore, if we can obtain var(x(s i )) and Ω, we can simply obtain G as var(x(s i )) − Ω. Since it is difficult to disentangle the roles of G and Ω in the actual OCO-2 data products, we estimate Ω from a retrieval system simulation experiment for each small area. This type of simulation framework has been used for several retrieval error investigations for OCO-2 [20,30,37]. The simulation experiment produces an ensemble of synthetic state vectors x sim and retrievalsx sim . The distribution of i ∼ N(0, Ω) is estimated from this ensemble. With the estimate of Ω in hand, we combine this with an empirical estimate S of var(x(s i )) from the OCO-2 data. Then, a within-footprint covariance for the state vector x is G = S −Ω.
Estimation of G was carried out separately on four blocks of the state vector: CO 2 vertical profile, surface pressure, albedo, and aerosols. Where necessary, the estimation was constrained to the nearest positive definite matrix [38]. These computations were carried out using the Matrix package in the R statistical computing environment [39].
For estimation of spatial dependence, we use the OCO-2 retrievals for each state vector component x k (s i ) separately. For each small area, OCO-2 retrievals were assembled for any orbits within 300 km of the corresponding TCCON site within the month of interest. We estimate spatial dependence parameters using restricted maximum likelihood (REML), assuming a constant mean for each orbit. The variances and covariances are modeled as follows: var(x k (s i )) = φ k (G k,k + Ω k,k ) cov(x k (s i ),x k (s j ) = φ k G k,k M ν k s i − s j /λ k .
The diagonal elements, G k,k and Ω k,k , of the previously estimated matrices are fixed at this stage. Since the empirical variability ofx k (s i ) can differ from the sum of these components in some cases, a single scaling parameter, φ k , is estimated, along with the Matérn range λ k and smoothness ν k .
To review, here are the key steps in the spatial model estimation procedure: 1.
Run a single-footprint simulation experiment of the full retrieval system for the location of interest. 2.
Estimate the retrieval error covariance Ω from the simulation results. 3.
Assemble OCO-2 retrievals for orbits in the month of interest within 300 km of the TCCON site.

4.
Estimate the within-footprint covariance G from the OCO-2 retrievals.

5.
Estimate the spatial correlation parameters λ k and ν k from the OCO-2 retrievals, one state vector element at a time.
The above procedure has the practical advantage that the available data align exactly with the necessary structure of the state vector and the desired spatial resolution. However, the OCO-2 data products will still have incomplete information about the underlying physical mechanisms driving spatial correlations. Alternative sources for estimation would include transport models.