Assimilation of GOSAT Methane in the Hemispheric CMAQ; Part I: Design of the Assimilation System

: We present a parametric Kalman ﬁlter data assimilation system using GOSAT methane observations within the hemispheric CMAQ model. The assimilation system produces forecasts and analyses of concentrations and explicitly computes its evolving error variance while remaining computationally competitive with other data assimilation schemes such as 4-dimensional variational (4D-Var) and ensemble Kalman ﬁlter (EnKF). The error variance in this system is advected using the native advection scheme of the CMAQ model and updated at each analysis while the error correlations are kept ﬁxed. We discuss extensions to the CMAQ model to include methane transport and emissions (both anthropogenic and natural) and perform a bias correction for the GOSAT observations. The results using synthetic observations show that the analysis error and analysis increments follow the advective ﬂow while conserving the information content (i.e., total variance). We also demonstrate that the vertical error correlation contributes to the inference of variables down to the surface. In a companion paper, we use this assimilation system to obtain optimal assimilation of GOSAT observations.


Introduction
Methane is the second most important greenhouse gas (GHG) after CO 2 , contributing to about 16% of the anthropogenic radiative forcing of all types of GHGs [1][2][3]. The globally averaged methane concentration has risen sharply since 2007, while the largest annual increase of methane (15.85 ± 0.47 ppb) was recorded in 2020 [4,5]. Owing to a stronger warming potential and a significantly shorter lifetime compared to CO 2 , the reduction of methane has drawn much attention in GHG mitigation policy [6][7][8][9]. Furthermore, a higher methane concentration increases tropospheric ozone production, which is a critical problem, especially in populated areas in the Northern Hemisphere [10,11]. Methane is also identified as a major factor in the production of stratospheric water vapor, which indirectly affects the oxidation of other pollutants in the atmosphere [12].
Because of its impact on atmospheric chemistry and climate, better characterization of methane concentrations, and subsequently, inverse modelling of its emissions, has received significant attention over the past decade [1,[13][14][15][16][17][18]. While most past studies have focused on the global inversion of methane sources at low spatial resolution, inversions over regional domains and at higher resolutions are important for constraining anthropogenic sources, such as fugitive and high-emitting sources from industry (e.g., oil and gas fracking, natural gas production) [19][20][21][22]. Inverse modelling of methane on a highly resolved regional domain requires an accurate estimate of both the initial concentrations within the domain and the inflows of methane at the lateral boundaries of the regional model and their uncertainties. In other words, the contribution from emissions on a short time scale (e.g., a week to a month) of a regional domain can be significantly smaller than the inflow mass from the lateral boundary [23]. From an estimation point of view, the emissions signal (in inverse modelling) is masked by a much larger contribution from the state. Therefore, it is important to reduce the state uncertainty to a level comparable to the signal that we want to extract in inverse modelling. The scheme we have developed and verified by cross-validation in this two-part paper permits us to estimate the error variance of the state, which hopefully will be reduced enough by assimilation to extract the signal of the emissions.
Estimating the concentrations (i.e., in our case, the model state), whether for the purpose of constraining the initial conditions or boundary conditions or the whole domain in space and time, is a typical problem of data assimilation. We may contrast it with an inverse modelling problem where the variable to be estimated is not the model state, but rather a model parameter, such as the emissions. Furthermore, we should remark that the forward model in the data assimilation usually relates the estimated variable (i.e., a model concentration) with the observed variable (e.g., a column measurement). Hence, the forward model in the data assimilation context depends only on the measurement or instrument characteristics, such as the vertical averaging kernel or the radiative transfer problem when the radiance measurement is the observation (i.e., Level 1B data acquired by satellites [24]). In inverse modelling, the forward model relates, instead, the estimated parameter (i.e., emissions) with observations, and thus has to include the transport model (i.e., dynamic model of the atmosphere). See the tutorials in Kasibhatla et al. (2000) [25] for the commonalities and differences between inverse modelling and data assimilation.
Data assimilation has issues of its own, and several methods have been developed, such as the ensemble Kalman filter (EnKF), 4-dimensional variational (4D-Var), and hybrid ensemble-variational methods (see Asch et al. (2016) [26] for a detailed review). These methods require several model integrations (on the order of 30 to 100) for the state estimation. Ensemble methods are well-adapted for nonlinear atmospheric models, whereas variational methods are well-fitted for nonlinear observations operators (i.e., forward model in data assimilation context). However, regardless of nonlinearity, all data assimilation methods face the challenge of their applicability to large state-space models (e.g., atmospheric models state-space). For 3D-and 4D-Var, the applicability of the assimilation algorithm to large state-space is made possible with the introduction of the adjoint of operators combined with the use of an initial or background error correlation, assumed to be homogeneous and isotropic. In this case, the effect of multiplication of an (homogeneous and isotropic) error covariance with a state vector can be obtained as a sequence of operators without involving the storage of extremely large covariance matrices [27][28][29][30] (see Massart et al. [31] for 4D-Var assimilation of methane). For EnKF methods, applicability to large state-space arises from the use of a limited number of model integrations (e.g., 30-100 ensemble members) in combination with a localization performed with a Schur product of a tapering correlation function. The localization not only eliminates the spurious correlations at large distances but also increases the rank of the sample covariance to a value larger than the dimension of the state-space (as required by the assimilation algorithm), thus providing a full rank forecast error covariances to assimilate observations [32]. A comparison between EnKF and 4D-Var chemical data assimilation methods (i.e., to estimate the state, not the emissions) was conducted using a stratospheric chemistry transport model [33,34].
Another approach designed for the assimilation of long-lived chemical species is based on simplified Kalman filtering. This method is well-adapted to large state-space, linear problems, and exploits the properties of the continuity equation. In this scheme, the error variance is computed explicitly using only a single model integration. This assimilation method, historically also known as the sequential filter, was first used in the assimilation of long-lived chemical species in the stratosphere, both in case studies and over long time periods. In particular, it was used for the assimilation of MOPITT CO to perform a reanalysis of stratospheric ozone and to diagnose model error [35][36][37][38][39][40]. This scheme is, in fact, part of a larger class of algorithms under development called parametric Kalman filters (PKF), where additional covariance parameters, such as correlation lengths, are evolved in time [41,42]. In its simplest form, where only the error variance is evolved, we will call this algorithm the PvKF for parametric variance Kalman filter. The PvKF requires only two model integrations, one for the state estimate and the other for its error variance (or uncertainty) and seems well-adapted for the assimilation of methane as a long-lived species. Considering a lifetime of about 10 years [43,44] and limited atmospheric chemistry [13], together with a linear observation mapping with smooth averaging kernels (such as those from GOSAT), renders the whole assimilation problem quasi-linear. In terms of algorithm computational efficiency, the applicability of the parametric Kalman filter (or PvKF) to large state-space is made from the correspondence between the continuous and discrete representations of the different operators, more specifically between spatial correlation functions and the corresponding correlation matrices. This idea was originally developed for the Optimal Interpolation (OI) method. Although OI is often used in conjunction with a data selection procedure to process a large number of observations by batches, we may not need such a procedure (data selection) if the number of profiles per time step is relatively small so that matrices in observation space can be directly inverted. The main idea underlying the applicability of OI to large state-space comes from computing the error correlations on the fly, as needed, using a continuous correlation function between pair of points without the need for the storage of a matrix in state-space. It also has the advantage that spatial correlation functions can be represented on any grid [45]. Note that the comparison of the computational cost mentioned above is for different assimilation schemes (not emissions inversion). The purpose of this paper is to design assimilation of GOSAT observations using the PvKF approach with the hemispheric version of the Community Multiscale Air Quality (CMAQ) model. In a companion paper, we use this assimilation system to obtain optimal assimilation of GOSAT observations and realistic error covariance parameters, including observation and model error and correlation lengths. We envision that the development of the current hemispheric assimilation system will be used in the future on a limited domain for joint data assimilation and emission inversion using the regional CMAQ model.
The organization of the current paper is as follows. In Section 2, first, we present the modifications made to CMAQ to include methane as an evolving species and, in particular, how anthropogenic and natural emissions of methane can be incorporated into the SMOKE processing system. Then, we explain how to use the observation averaging kernels and a priori provided by the GOSAT data in a form useful for the hemispheric CMAQ grid and for assimilation. In Section 3, we describe the PvKF and how the scheme can be adapted for a large state-space, such as the CMAQ model. Section 4 details other aspects of the assimilation system, specifically, the initial conditions, the observation bias correction, and the formulation of homogeneous isotropic correlation models in the uniform hemispheric CMAQ grid. We close this section with a description of the parameters used in the error covariance matrices. Finally, in Section 5, we conduct simulated observation experiments, in particular the one-observation experiment that permits the verification of the algorithm and the assumptions used for propagation of error variance. We conclude Section 5 with an estimation of the computational cost of such an algorithm.

Modifications of the CMAQ Model to Handle Methane Transport and Emissions
CMAQ is a limited-area model developed by the U.S. Environmental Protection Agency (EPA) [46] that is used as a regional model driven at the lateral boundaries by the hemispheric version of this model (referred to as H-CMAQ) [47]. H-CMAQ, based on CMAQ v5.3, is used here to simulate and assimilate methane in the Northern Hemisphere. For all practical purposes, a hemispheric model does not need to account for the interhemispheric exchange of mass for time scales up to several months [12]. The domain covers the Northern Hemisphere on a polar stereographic projection (see Appendix A), which includes 187 × 187 grid cells horizontally with a 108 km grid spacing and 44 vertical layers extended from the surface to the model top at 50 hPa. The modified vertical structure of H-CMAQ maintains finer resolution than regional CMAQ above the boundary layer, aiming at a better representation of transport processes, also suitable for long-lived species [47]. We modified H-CMAQ to account for methane concentrations and emissions. Methane can be configured either as an inert trace gas or a reactive gas-phase species oxidized by hydroxyl radical (OH). Figure S2 shows an example of concentration loss due to chemical reactions after two weeks. We considered reactive methane with the gas-phase model within CMAQ v5.3 and based on the CB06 chemical mechanism [48]. The error variance is treated as a chemically inert tracer in CMAQ, subject only to an advection-only transport scheme. Our data assimilation system includes methane emissions of both anthropogenic (~60%) and natural (~40%) sources. For the anthropogenic emissions, we obtained all available source sectors from the Emission Database for Global Atmospheric Research (EDGAR v6) [49,50]. These sources come with 0.1 • × 0.1 • horizontal resolution and monthly temporal resolution emission grid maps. We processed them using Sparse Matrix Operator Kernel Emissions (SMOKE v4.5) [51] to provide hourly gridded methane emissions into the model. Wetlands are the primary source of natural emissions, accounting for about 80% of the total [13,52]. We used monthly wetlands emissions from WetCHARTS v1.0 with the full ensemble mean [53] and mapped it into our domain using a flat hourly/weekly temporal profile.

GOSAT Observation Operator for Data Assimilation
Greenhouse Gas Observing Satellite (GOSAT) was launched in January 2009 by the Japanese Space Agency (JAXA) [54]. It is in a Sun-synchronous orbit at an altitude of 666 km with a 3-day revisit time and an equator overpass time at about 13:00 local time. One main goal of GOSAT is monitoring the abundance of methane in Earth's atmosphere. Due to the high sensitivity of the shortwave infrared (SWIR) GOSAT retrieval at the surface, coupled with a suitable spatiotemporal resolution, the assimilation of atmospheric methane and inverse modelling of its sources and sinks are suitable [55][56][57][58][59].
The instrument, TANSO-FTS, onboard GOSAT, has a field of view with a 10.5 km diameter footprint operating in a cross-track scanning mode. It measures the abundance of methane by analyzing the backscattered solar radiance spectrum in the SWIR near 1.6 µm. A column-average dry-mole fraction of methane (XCH 4 ) represents the instrument retrieved observation, which corresponds to the methane average volume mixing ratio (V MR ≡ y o ) of a partial column atmosphere with a given surface,p S , and top pressure, p T . Two approaches are used to derive retrieval algorithms: Full Physics (FP) and Proxy (PR). The FP method integrates a sophisticated radiative transfer model and solely relies on methane modelling and its corresponding errors, while the PR algorithm provides more data points but relies on an accurate CO 2 model simulation and its retrieval (XCO 2 ) [56]. Both algorithms were developed at the Netherlands Institute for Space Research (SRON) and Karlsruhe Institute for Technology (KIT) [55], and their products are available through the ESA GHG-CCI initiative, https://climate.esa.int/en/projects/ghgs/ (accessed on 20 November 2021) [57]. We use both products, but mainly focus on PR due to higher density and/or better coverage [55,56].
For assimilation purposes, in addition to the retrieval data, we use supplementary products [57,60]. Thus, each retrieval also includes vectors of the normalized columnaverage kernel, A, pressure levels, p l , at which the average kernels are derived, and the corresponding vector of a priori, y p . PR and FP retrieval algorithms offer 5 and 13 levels of data in the vertical, respectively. A layer-based approach described in Bergamaschi et al. (2007) [61] is applied to compute the model partial-column value, y m , equivalent to the retrieval, y o y m represents the mapped concentration of H-CMAQ on the pressure layers of the observation, and ω is the vector of pressure layers weights, whose elements are expressed as where p S is the surface pressure and p T is the model top pressure. Note that Equations (1) and (2) are in the retrieval space so that all the vectors represent the vertical grids within a single retrieval. Now, let us consider the 3D model estimate of methane concentrations at time, t, in a vectorized form, X, with N x dimension. The observation vector, Y o , consists of a set of retrievals at approximately the same time but different locations from model grid points. It also provides the same type of quantity but with a dimension quite smaller than the model (N y N x ). Assuming a linear relationship between Y o and X, we have a linear observation operator, H.
In fact, H is a combination of two linear operators, comprising a horizontal, H h , and a vertical, H v , operator, where H h interpolates the components of X from the horizontal model grid points to observation locations using a bi-linear interpolation function, and H v transforms X at the geographical location of observations to a vector equivalent to Y o (using avergaing kernels). Thus, the assimilation problem consists of finding the best estimate of X with its error statistics, which together are called the analysis.

Background of the Assimilation Scheme
First, let us define a correlation function on an arbitrary grid following Gaspari and Cohn 1999 [45]. It is sufficient to define a correlation function in R 3 × R 3 , so that any subspace (or manifold) of R 3 (e.g., the surface of a sphere) also define a correlation on that subspace. This property was used in this study to define underlying homogeneous isotropic correlation functions (with periodicity on a sphere), which are then mapped onto the polar stereographic grid of H-CMAQ, which has a uniform grid spacing on the projected plane. PvKF uses these concepts, together with the property that for the advection equation, the error variance can be forecast without knowing the error correlation [62]. Thus, in a PvKF framework, the error variances are dynamically evolving according to the model's advection scheme, but the spatial error correlations are kept fixed and are computed using the same approach as taken by OI.
The algorithm of PvKF is decomposed into two steps; a forecast step and an analysis step, which accounts for the effect of observations. A covariance function, P(x, x , t) , is a function of a pair of points, x = (x , y, z) and x = (x , y , z ), at time t. It is related to the correlation function,C(x , x , t) , through the standard expression where σ is the error standard deviation (i.e., σ 2 (x, t) is the error variance at the point x and time t). In PvKF, the error correlation is generally time-invariant, homogeneous and isotropic in the horizontal (i.e., depends only on the horizontal distance). 3D spatial correlations are constructed using a horizontal/vertical separability assumption, and for the horizontal correlation is assumed to be homogeneous and isotropic. Note that the separability assumption has been verified for long-lived species in the stratosphere by Menard et al. (2019) [63] (see Figure S3 therein).

Forecast Step
The forecast step consists of two model integrations, one for the state and the other for the error variance, where M represents the chemical transport model based on the atmospheric diffusion equation [64,65] of CMAQ, and M * denotes the advection model of error variances based on the CMAQ native advection scheme. X(x, t) represents the chemical concentration as a function of 3-dimensional model space x = (x, y, z) and time t. Note that the error covariance forecast step of the PvKF does not follow the standard Kalman filter equation, but uses the continuous formulation of propagation of error covariances, which can be solved by using the method of characteristic as discussed in Cohn (1993) [62]. This approach was applied in several studies of assimilation of long-lived species [35][36][37]40]. Using the advection of error variance also has two advantages. First, it avoids the loss of error variance in standard Kalman filter formulation, such as in EnKF (see Section 2 in Menard et al. (2021) [66]). Secondly, it significantly reduces the computational cost compared to other methods (e.g., EnKF). The M * operator is, in fact, the integration from t − δt to t of where V is the 3-dimensional advection wind, and q(x, t) accounts for the model error variance growth (e.g., any other processes that are not advection; we discuss further in Section 4.4 and in Part II). It is important to note that in the context of data assimilation, the effect of the observations usually has the largest impact on error covariances, so that it is common to use a simpler model to forecast the error covariances or to run the adjoint model (e.g., incremental 4D-Var [67] used operationally for weather forecasting using a lower resolution model with simplified physics). Here, as in the incremental approach, we use only advection in the forecast of error variances. After the integration time, δt, we obtain a forecast concentration, X f (x, t), and a forecast error variance, (σ f ) 2 , from which the covariance function can be reconstructed by applying the correlation function from Equation (5). Therefore, a forecast error covariance function, P f (x, x , t), is obtained, considering that the correlation function is time-invariant (i.e., stationary). Thus, and equivalently in matrix notation where Σ f is a diagonal matrix of forecast error standard deviations and ⊗ is the Kronecker product of matrices. The matrix form is useful to write out the analysis step.

Analysis Step
Let us start as if H is only a horizontal operator (we will discuss the vertical aspects in the following Section 3.4). A computational simplification in the analysis step arises when observations are considered horizontally as point measurements. This allows simplification of the Kalman gain matrix and the analysis error covariance and variance. Horizontally, a point measurement observation operator can be modelled as a delta function at the observation location,x o . Specifically, if f (x) is a continuous function of space, x, and the horizontal observation operator, H h (as previously defined in Section 2.2), applied on f (H h [ f ] ), gives the value of the function at the observation location, f (x o ), we can write The H h [·] operator is thus associated with a delta function, δ(x − x o ). Although we never use this representation of the observation operator on the state fields themselves since they are represented discretely on a grid, we do apply this definition when we consider the application of the observation operator on error covariance functions. Thus, we have the following, [ (14) where subscript i and j refer to model grid points, whereas subscript o refers to the position of the observations. Considering the covariance function between any pair of observation locations, Suppose now we have K observations (with different locations) that are used in an analysis. The error covariance matrix in observation space (i.e., between each pair of observations) is HP f H T (a K × K matrix), where each element of this matrix is given by Equation (15). With this preamble, we are now able to formulate the analysis step for both the state estimate and the error variance. The analysis state is written as usual and requires the computation of the Kalman gain matrix K, where R represents the observation error covariance matrix. We should remark that in data assimilation, the observation error is not simply the instrument error, but should also contain an error of representativeness since, ultimately, observations are used to update the model state (see Section 4.4).
All state quantities (i.e., the forecast and the analysis) are written as vectors of dimension N x , where N x is the number of model grid points. We recall that Y o is a vector of dimension K that contains all observations processed in data assimilation. The matrix HP f is sometimes called the matrix of representors, where each observation location comes with all the model grid points. We can think of it as a vector of impact covariance functions. HP f can be denoted in column form as where each p i is a column vector (K × 1) that represents the forecast error between all observations and a single model grid point that we call the sensitivity error covariance functionin matrix form, (HP f ) T . The analysis update of the error covariance, called the analysis error covariance in data assimilation, takes the form, is derived using Equation (17).
In particular, an element (i, j) of the analysis error covariance matrix, P a , can be conveniently written as where represents the modelled innovation covariance matrix. The analysis error variance, i.e., when j = i, is simply computed as The analysis error variance is then used as an initial condition to integrate the variance through Equation (9). This completes the algorithm.
The next section deals with the three-dimensional observation operator (Equation (4)), which includes the vertical structure of the averaging kernel, rendering the analysis update more complex.

Analysis Step with 3D Observation Operator Using Averaging Kernels
The three-dimensional observation operator is given in Equation (4). First, it consists of interpolating the 3D field horizontally, as in Equation (12), to get vertical profiles at the observation location and then applying the vertical observation operator, H v , using the instrument averaging kernel to get the observation equivalent quantity. The vertical observation operator in function form is denoted as H v(z) . To apply it, first, we use a mapping function (i.e., vertical interpolation,V z , on z) to convert the model vertical concentrations to the vertical layers of the observation retrieval, so that y m (see Equation (1)), the a priori y p , and the averaging kernel A are in the same vertical coordinates. Thus, in an observation location, we get where f o denotes the vertical model concentrations at the observation location and z * is the vertical coordinate as in averaging kernels.
In the second step, H v(z) carries out y m (z * , t) in Equation (1) and computes a single value equivalent to the retrieval,y m (See Section 2.2 for an explanation to Equation (1)).
Note that the quantities in bold (A, y p , ω, y m ) are defined on the vertical layers of the averaging kernel, and the non-bold (y m ) is in fact a scalar quantity representing the column-averaged quantity. Therefore, H v(z) maps a vector in the model vertical space to a scalar in the observation space.

The observation operator H, which is applied on functions, is the composition of
which is equivalent to the matrix form of the H in Equation (4). Before applying the observation operator on a covariance function, we need to specify on which variable it is applied. If it is applied on the spatial variable, x, it is equivalent, in matrix form, to a left multiplication of the observation operator on the covariance matrix, However, if the observation operator is applied on the spatial variable, x , it is then equivalent to a right multiplication of the transpose of the observation operator on the covariance matrix, Using the separable form of the covariance function in Equation (10), we thus get where H v(z) is to denote, in functional form, the vertical observation operator operating on z (and not z ). However, to use the vertical observation operator, the expression in square brackets in Equation (29) has to be written in vector/matrix form. Note that in matrix form, where N lev is the number of vertical levels in the model. This matrix can actually be written as diag(σ is a row vector that depends on the (second) vertical coordinate, z . Overall, the application of the observation operator in Equation (29) produces a 3D spatial field of elements, Based on Equation (28), a similar expression is obtained, where we let the reference observation location be o (2). To compute the (simulated) innovation covariance matrix (i.e., HP f H T ) we combine Equation (27) with Equation (28), and account for a pair of non-coincident observations, o(1), o(2), we get where β o(1),o (2) is an element of a N lev × N lev covariance matrix, 3.5. An Overview of the Assimilation Algorithm Figure 1 presents a flowchart of the forecast and analysis steps of the PvKF algorithm described above. The algorithm's forecast step (left side) involves two parallel model simulations, one for the state's transport with the atmospheric diffusion equation [64,65] and the other for the advection of the error variance (Equation (9)). In the first part of the analysis step (right side), the covariance function is convolved with the observation operator on the left for a series of available observations at a given time (Equation (27)). This is followed by a second convolution with the observation operator on the right for the same observations to obtain the innovation covariance matrix (Equations (28) and (33)-(35)). The second part of the analysis step (Equation (22)) performs a Cholesky decomposition and inversion of the modelled innovation covariance matrix. Finally, the analysis and analysis error variance increment are computed for each set of observations at a time t (Equations (16) and (23)). A complete PvKF assimilation algorithm in a matrix form is presented in Appendix B. As mentioned earlier, the PvKF assimilation relies on the continuous properties of operators, which makes it a suitable scheme for evolving the analysis and its error variance. An example of the error variance evolution with synthetic GOSAT observations over land is shown in Figure 2, emphasizing the combination of the 3rd part of the analysis step and the 2nd part of the forecast step of the flowchart (Figure 1) in creating the analysis error variance evolution. In Figure 2, the error variance is initiated with a constant field using a 5% error (standard deviation) of the globally averaged methane concentration (Figure 2a). The reduction of error variance occurs in the presence of every observation at a particular time and location based on the correlation function (Figure 1-analysis step) while the updated error variance field is propagated with the flow (Figure 1-forecast  step). The reduction gradually grows over land (Figure 2b) by assimilating a new batch of observations over the same regions according to GOSAT revisit time (i.e., 3 days). After 8 days, the analysis error variance field (Figure 2c) shows a noticeable reduction over Western Asia, followed by Central Asia, Eastern Africa, and the southern/eastern part of North America, mainly due to denser GOSAT observations and higher innovations at those regions. Then, owing to the advection, the reduction of error variance spread out with the flow over other regions with fewer or no observations, such as oceans and higher latitudes. On day 12 (Figure 2d), most of the lands, except polar regions and near the Equator, are subjected to a significant error reduction, suggesting a higher impact of the GOSAT observations due to their density over those areas. The maximum reduction (light blue) indicates a 20-40 times reduction in the initial error, while assimilating more observations within PvKF after 16 days does not provide a tangible reduction in the analysis error ( Figure 2e). Nonetheless, over time, the variance reduction significantly spreads out over the entire domain, particularly over the oceans with no observations (Figure 2f). This highlights the behaviour of the assimilation system that provides an error estimation over the whole domain.

Initial Conditions
The impact of the initial condition can persist long in the model. For different species, it is recommended that H-CMAQ be initialized to the clean background and driven based on its emissions and chemical transport [47]. However, due to an almost well-known background field of methane, the long lifetime of atmospheric methane (~10 years) [43], and the high level of uncertainties in its source and sink [68], conducting a lengthy model integration might be unnecessary. Furthermore, in the context of data assimilation, the sensitivity to the initial conditions disappears rapidly with time [35].
In this study, we initialize the model using prescribed vertical profiles as well as a 2D concentration field obtained from previous global analyses. Our model setup uses the approach of Olsen et al. (2013) [69], which is derived from a nonlinear polynomial fit to global models and various types of measurements from 2003 to 2006. This initial field, taken as a first guess, is a function of latitude and altitude varying smoothly from the North Pole to the Equator with no temporal and longitudinal variation. The details of this preparation can be found in Xiong et al. (2008) [70]. Maintaining the shape of the vertical profiles, we rescale the initial field to our simulation period, April 2010, based on the annual/monthly increase in globally-averaged atmospheric methane [71].
With the aim to remove the potential biases in the model, we obtain accurate surface observations during the assimilation period from GLOBALVIEWplus CH4-ObsPack v3.0 [72] compiled by National Oceanic and Atmospheric Administration (NOAA) (Figure 3a). In our bias correction, we assume that a significant part of bias arises from the emissions at the surface, and the rate of change of concentration with height remains the same (shape of the vertical profile). In fact, we multiply the vertical profiles by a constant that depends on the discrepancy at the surface (we assume that vertical mixing in the troposphere is unchanged). Note that the model vertical coordinate is extended up to the upper troposphere/lower stratosphere, and the profiles are adapted from the analysis of global methane studies [69,70]. This assumption is made mainly due to the fact that other accurate observations such as Total Carbon Column Observing Network (TCCON) are used for the validations of our assimilation results (see Section 5 of Part II).
The bias removal involves adjusting the initial concentration field (i.e., initial guess) using a latitudinal linear regression model to match the surface observations (i.e., flasks and towers). Accordingly, first, the monthly averaged linear fit of Observation-Model against latitude using the first guess model initial conditions is obtained (Figure 3b-green dots). Next, the achieved fit is rescaled (with respect to globally averaged concentration from NOAA [71]) to our simulation period in April 2010 (Figure 3c-red dots), and eventually, the fit is employed to remove the bias of the entire space of the initial conditions (Figure 3d-blue dots). It indicates that the latitude-based bias in the initial conditions is subtracted, resulting in a better (unbiased) agreement of the model and the surface observations. This could imply that the polynomial function [69] deriving our first guess initial conditions is too smooth; thus, it tends to overpredict the lower latitude concentrations compared to the surface in-situ observations. In addition, the hemispheric absolute mean bias is decreased from 17.1 ppb for the rescaled initial to 4.3 ppb for the bias-corrected initial. Note that the negative values in Figure 3b-d correspond to the positively biased model forecasts (i.e., model overestimation). We consider the model that is initialized with the bias-corrected field as our nearly unbiased model and rely on this to remove the potential bias between GOSAT and H-CMAQ. An illustration of the bias correction of GOSAT with respect to H-CMAQ is provided in the following subsection. Note that the bias correction here is only applied for one month (i.e., April 2010) and may not be representative for another month or a longer period; thus, one needs to take into account the same type of bias correction for the time of assimilation.

Observation Bias Correction
GOSAT provides high accuracy retrieval data (~0.7% precision) with reasonable nearsurface sensitivity and global coverage, making it a strong candidate for methane assimilation analysis [55,73]. The potential bias in GOSAT XCH 4 retrieval has been addressed previously by evaluating the data against other types of measurements [74][75][76][77]. Both PR and FP versions of GOSAT data used here are post-processed and validated against surfacebased Fourier transform infrared (FTRS) methane column abundance from TCCON. It was shown that the difference between XCH 4 retrieval and TCCON correlates with the albedo, α, at 1.6 nm in band 2 [57].
In addition to the retrieval bias, GOSAT can still have biases relative to atmospheric chemical transport models. This bias most likely originates from the emissions as well as the limitation of global models to realistically simulate the methane in the stratosphere, particularly at higher latitudes [78][79][80]. For example, Turner et al. (2015) [58] showed that their GEOS-Chem simulation of GOSAT features a positive latitudinal bias, where the mean Model-Observation is adjusted based on a fit to a quadratic regression function. Cressot et al. (2014) [52] obtained a linear regression fit as a function of air mass factor (AMF) to correct for the biases in GOSAT with their chemistry-transport model, LMDz-SACS. Following them, we parametrize the bias in GOSAT with respect to our unbiased CMAQ simulation. First, the model is mapped to observation space using our observation operator (H) described in Section 2.2. Next, GOSAT-CMAQ is obtained as a function of retrieval parameters. Accordingly, a separate linear regression fit of GOSAT-CMAQ is computed for each of latitude, air mass factor (AMF), solar zenith angle θ s , and satellite viewing zenith angle, θ v (Figure 4). Except for the θ v with no specific correlation pattern, the largest correlation of GOSAT-CMAQ is found with respect to latitude (r = 0.43), followed by θ s (r = 0.40) and AMF (r = 0.33). All three fits show similar behaviour, indicating that the discrepancy is smoothly increasing toward the larger values of those parameters (Figure 4a [52]. Perhaps it is related to the fact that LMDz is a low-resolution general circulation model (GCM), whereas CMAQ and GEOS-Chem appear more as chemical transport models with higher resolution. The positive bias of CMAQ (Model > Observation) was found previously in other atmospheric chemistry models such as GEOS-Chem and addressed mainly as extratropical stratospheric bias due to the extra meridional stratospheric transport [80][81][82] as well as polar vortices [83].  Figure 5 compares the Observation-Model before bias correction (GOSAT 0 -CMAQ) and after bias correction with respect to latitude (GOSAT bias(lat)-CMAQ). The hemispheric absolute mean bias (MB) significantly decreased from 32.9 ppb to 0.9 ppb, while the residual standard deviation (σ) turned out to be slightly smaller (20.1 ppb to 18.1 ppb) after bias correction. As expected from the correlation values in Figure 4, using AMF or θ s instead of latitude to remove the potential bias of GOSAT resulted in a less significant bias reduction because of a larger hemispheric σ and MB. Figure S1 in the Supplementary Materials indicates a similar comparison using bias-corrected GOSAT data with respect to AMF and θ s . In addition, in an attempt to maximize the information in the regression model, we adopted a simple multivariate linear regression [84]. However, we found that such an algorithm cannot considerably improve our prediction model (see Tables S1-S3 in Supplementary Materials). This suggests that all three parameters are dependent and can be highly correlated to each other. Therefore, a bias-corrected GOSAT-CMAQ with only latitudinal adjustment could provide a reasonable bias correction to our data. After bias correction, the model discrepancy with observations can be primarily random and could be attributed to the random emission errors, the correlation lengths (L h , L v ), and random errors in the observations (ε o ), initial (ε i ), and forward model (ε q ), for which we provide a description in Section 4.4.

Construction of Spatial Correlation Functions on the H-CMAQ Grid
The H-CMAQ grid is uniform on the projected plane. Nearly all spatial correlation models rely on an underlying uniform and isotropic grid representation. This is what most variational assimilation systems take into account, and further, assume either an infinite or periodic domain. Here, we will explain how to construct spatial error correlations on the H-CMAQ grid while representing an underlying homogeneous isotropic correlation function. The procedure consists in constructing a homogeneous isotropic and periodic correlation function on the surface of a sphere that is independent of the grid, and then mapping it onto the H-CMAQ grid.
The construction of homogeneous isotropic correlation functions on the surface of a sphere has been developed in Gaspari and Cohn (1999) [45]. Background information on the method is also given in Ménard (2000) [35]. A correlation function is a function of a pair of points (i , j). Consider a sphere of radius a, with position vectors (from the centre of the sphere) r i , r j for the points i and j. We define a Cartesian coordinate system in R 3 ,(x C , y C , z C ), where we use the superscript C to recall the Cartesian space. A point i (or j) on the surface of the sphere that has (in a spherical coordinate system) a longitude ϕ i and colatitude (or inclination) θ i is related to the Cartesian coordinates, as follows and similarly for the grid point j. The chordal distance D i j between the position vectors Using a chordal distance to define a "distance" for a correlation function on a sphere has the double advantage that a correlation model originally defined in R can be used in R 3 to define a correlation on a sphere and has the property of being periodic [45].
Three models have been considered in this study: the Gaussian (or double exponential) model, the First-Order-Auto-Regressive (FOAR), or exponential model, and the Second-Order-Auto-Regressive (SOAR) model Note that L c denotes a correlation length, defined from the curvature at the origin, while L * c or L * * c is a model parameter adapted from Ménard et al. (2016) [85]. We recommend always using L c , as it is a physically meaningful quantity.
The second step consists of mapping the H-CMAQ grid onto the corresponding point on the surface of the sphere (see Appendix A for information on the polar stereographic projection used in H-CMAQ). As mentioned in Gaspari and Cohn [45] (1999), if there exists a one-to-one transformation, (see Equation (A1)) then we can define a correlation function with respect to the (x p , y p ) coordinates. The computation then goes as follows where C is one of the correlation models above (Equations (38)-(40)).

Observation, Model and Initial Error Covariance Modelling
As in any assimilation system, the input error covariances need to be specified. In this section, we describe the modelling and assumptions of those error covariances. It is generally assumed that observation errors, model errors, and initial errors are uncorrelated, and furthermore that the observation and model errors are serially (temporally) uncorrelated. These are the standard assumptions used to derive the Kalman filter. We should note that because of the dynamics and the interplay with the analysis, as the forecast error becomes realistic, it also becomes correlated in space and time and with past observation errors (this is a standard result in Kalman filtering theory; see, for example, Jazwinski (1970) [86]).
Let us assume that the observation error R and the model error Q are both spatially uncorrelated. The observation error has two components: the measurement error (ε m ) provided by the instrument team and the representativeness error (ε r ) arising from the mismatch between the subgrid-scale represented in the observation and the gridded model quantity [87,88]. We assume that the observation error covariance is diagonal and takes the form of R = (ε o ) 2 I.
Many studies estimate the representativeness error as an additive source of error to the measurement error [35,36,[89][90][91]. However, we tune the overall observation error by considering a multiplicative correction factor applied to the measurement error as denoted in Equation (43). A multiplicative factor is consistent with other types of errors (i.e., modelling and initial error) used in this study. In fact, it is used in the majority of data assimilation papers [35,92]. Therefore, the representativeness error is included as part of the correction factor f o .
where R denotes the observation error covariance after tuning. Note that initially, no representativeness error is accounted for R . The model error (ε q ) is assumed to be proportional to the analysis [35]. We simply assume a uniform accumulation of model error in time, where it becomes almost equivalent to a particular time average of the analysis, X a (x, t), (i.e., monthly averaged analysis is used in this study). Model error covariance is considered as a diagonal matrix, Q = (ε q ) 2 I [93,94], and a relative model error standard deviation, f q , is defined for tuning Q, which has the form Q = ( f q X a (x, t)) 2 I, (44) where Q denotes the model error covariance after tuning. Note that model error variance, q, in Equation (9) is equivalent to the diagonal elements of Q . We attempt to tune the initial concentration error (ε i ) separately, although it is often integrated as part of the modelling error. The initial error covariance matrix is also assumed to be diagonal. We assume that the initial error before tuning is about 5% of the initial concentration (i.e., ε i = 0.05 X f 0 ) with the same spatial distribution. By conducting a similar analysis as in Section 4.2 on the first day of the simulation, we found that the mean standard deviation of the residual (Observation-Model) is relatively smaller (~20 ppb ≡ 1.2%). The parametric form of the tuned initial error covariance is where P 0 represents the initial error covariance after tuning.

One-Observation Experiment
We conduct a one-observation experiment using a single simulated observation. It is a standard experiment to verify that the mechanics of the assimilation system is working and can also be used to verify the validity of some assumptions in the formulation [27,95]. The one-observation experiment can also be used to compare different assimilation systems [96].
When a single observation is assimilated, the analysis increment has the same spread in space as the spatial correlation function-a property that can be derived directly from the analysis equation and Kalman gain.
With the one-observation experiment, first, we will verify that our formulation of error covariances (in R 3 × R 3 and projected onto the polar stereographic grid of H-CMAQ) does yield a homogenous isotropic correlation. Secondly, we will show that our assumption to use only advection to propagate the error variance is adequate for assimilation on a time scale relevant to GOSAT assimilation. Errors in the wind, emissions, and any errors not accounted for our simple advection of error variance such as horizontal and vertical diffusion effect on error variance will contribute to the model error.
The assimilation system is examined with a single simulated observation and arbitrary error parameters, including f i = 0.05, f q = 0.015, f o = 0.5. Note that the initial concentration is equivalent to the bias-corrected concentration field (see Section 4.1), and initial variance is obtained from Equation (45) where ε i = 0.05 X f 0 . These quantities are kept the same for all the cases in this experiment. In this experiment, we consider the SOAR correlation model with a horizontal correlation length of 500 km and 10 model vertical layers (starting from the surface) of the sigma-pressure coordinate (referred to 10σ l length scale).  (1)) generated with the model while accounting for the GOSAT averaging kernels and a priori. To define the magnitude of the synthetic observations, we assume a multiplicative factor of 1.4 (40% higher) to the model column-averaged concentrations. The assimilation consists of propagating the analysis and its error variance for a duration of 3 days after the single observation is taken into account. We only examine 3 days, which corresponds to the GOSAT revisit time, that is, the time after which a new batch of observations becomes available for the same region.
For the observation in the high latitude (Figure 6a-d), the analysis increment and the error variance reduction are shown near the surface, whereas for the observation in the lower latitude (Figure 7a-d), they are demonstrated at about 600 hPa (~layer 23). By comparing Figure 6a with Figure 7a, we note that the error correlation, captured by the analysis increment, is indeed homogeneous and isotropic. The apparent increase in radius of the spatial spread in lower latitudes is due to the polar stereographic projection (we verified that the length scale is, in fact, the same). By comparing the propagated analysis increment with the propagated error variance reduction (i.e., Figures 6b,d and 7b,d), we found that the transport of the analysis increment and the reduction of error variance are quite similar in terms of their spread and spatial distribution. This suggests that the propagation of the error variance using advection only is a reasonable assumption to make, since it maintains a similar structure as the methane fields. Furthermore, we found a comparable weight of the analysis error variance at the start day (Day 0) and after 3 days; in particular, we note that the maximum reduction of error variance has not changed more than 5% after 3 days (i.e.,  In another series of experiments, we examine the contribution of error correlation to address the vertical estimation properties, especially the quantities at the surface. To illustrate the propagation in the vertical direction, we obtain a vertical cross-section (Figure 7a-d) passing through the observation at Day 0, starting from (28 • N, 120 • W) to (15 • N, 80 • W). The observation is at the center of the analysis increment and center of the error variance reduction (Figure 7a,c), also called increment/reduction. Figure 8a,c illustrates the increment/reduction at the moment when a single observation is assimilated. Several factors such as the averaging kernel, the model layer pressure weight (ω), and the correlation length scale based on the SOAR correlation model define the increment/reduction patterns of the analysis and error variance. A relatively large correlation length (L v = 10σ l ) results in a stretched pattern of increment/reduction along the vertical direction. The vertical distribution also indicates that they are mainly influenced by ω (Figure 9c), resulting in a maximum increment/reduction at mid-troposphere (~600 hPa, or 23rd layer). On Day 3, these quantities are shown on the same plane. The largest increments/reductions are shifted toward the southeast of the initial point (Figure 7d), mainly due to advective transport. The large distension is mainly influenced by the wind pattern, while the diffusion improves the smoothness of the field, especially in lower levels (Figure 6b,d). We should note that the higher increment/reduction to the lower troposphere and near-surface enhances the ability of the assimilation in constraining the surface quantities such as emissions within a short period of assimilation integration (i.e., 3 days).   Figure 8b,d indicate that there is reasonable consistency between the propagation of analysis increment and variance reduction in the vertical direction. It shows that the vertical effects from a single observation can persist for at least 3 days. Furthermore, a significant part of analysis increment and error variance reduction remains in the lower elevation with a downward shift of the maximum values. These increments/reductions are quite small between the mid to upper troposphere, except for another slight increase at the upper layers, which occurs likely due to a zero-flux assumption across the model top boundary [46,98]. Hence, similar to the surface, part of the increments/reductions lingers near the upper layers. We examine ( Figure 9) the impact on the analysis increments and error variance reductions for different vertical correlation lengths (L v =1σ l , 10σ l , 20σ l ). It indicates that the error variance reduction is significantly more sensitive than the analysis increment to the vertical correlation length scale. As mentioned earlier, the layer pressure weights normalize the adjustments and allocate more impact on the mid-layers. We remove this effect by considering uniform layer pressure weights, which results in emphasizing the influence of the averaging kernels on the vertical profile of the analysis increment and error variance reduction during the analysis step (Figure 9a,b,d,e). Thus, higher sensitivity of GOSAT retrieval to the surface associated with its average kernel (Figure 9f) results in a similar effect on the analysis (Figure 9d). Likewise, the error variance, whose vertical sensitivity is more substantial than the analysis, exhibits a larger influence from the surface (Figure 9e). It implies that for observations with the highest retrieval sensitivity to the near-surface (e.g., GOSAT), integrating the reduction of error variance with the assimilation scheme enhances our ability to retrieve information from the surface. For a single-observation problem, one can theoretically show that the error variance reduction maintains a quadratic relationship with the analysis increments (Equation (23)). Therefore, besides the continuous estimate of the analysis and its uncertainty, the distinctive feature of PvKF in explicitly computing the error variance can be encouraging to infer surface quantities (i.e., emission inversion).

Timing (Computational Efficiency)
The computational cost of advanced assimilation schemes is often compared against the cost of one model integration. The cost of the forecast step in PvKF involves two model simulations: the forecast of concentration and the forecast of the error variance. Thus, the forecast step of PvKF requires a little less than twice the computational time of the H-CMAQ forecast simulation, since the chemistry and diffusion are turned off for the propagation of error variance. When observations and analyses are included, the total computational cost depends on how many observations are assimilated per time step. The computational time required to perform the analysis step increases rapidly as the number of observations increases, due to the inversion of the innovation covariance matrix (or solving the equivalent linear problem) and the computation of the analysis error variance. Note that, contrary to 4D-Var, the storage space of PvKF always remains the same for any length of integration.
The computational cost of assimilation with real GOSAT data (developed by SRON/KIT) should also take into account the number of actual quality-controlled observations. Obtaining quality-controlled observations consists of several steps. In addition to using the quality-controlled observations provided by the instrument team, we apply two additional filters on real GOSAT data before including them in the assimilation. First, we remove outliers whose departure from the global mean of the observations is three times larger than its standard deviation, and second, we conduct a thinning process on the observations with the aim of providing uncorrelated observation errors [99] (see Section 3 of Part II for more details). Accordingly, the filtered GOSAT data holds below 100 retrievals per hour, which is about one-third of observations before filtering (i.e., full GOSAT data). Note that the quality control steps are performed as pre-processing steps; hence, they do not impact the computational cost of assimilation.
In order to make an assessment of the computational cost of the PvKF algorithm, we investigate the assimilation of an arbitrary number of observations per hour. For this, we have an arbitrary number of simulated observations that imitate the behaviour and influence of real GOSAT observations. All the simulated observations are retrieval products and have averaging kernels and a priori columns similar to the retrieved XCH 4 , and are basically proxy retrievals (i.e., PR) of GOSAT observations developed by SRON/KIT. The advantage of this approach is that we can generate an arbitrary number of "retrieved" observations. Therefore, the simulated observations could be generated at other times than for the retrievals. Both model (H-CMAQ) and analysis are carried out for a one-hour simulation of size 187 × 187 × 44 in the computational domain with 108 km resolution. Figure 10 compares the computational time of the analysis step (dashed red line) as normalized to one model integration (solid black line). Contrary to the PvKF forecast (solid blue line), the analysis cost largely varies depending on the number of assimilated observations, attaining an equivalent cost to one model integration at about 850 observations. The total assimilation time of PvKF (dashed green line) is obtained by adding the analysis time with approximately twice the model running time. For the number of observations equivalent to the real GOSAT data (both filtered and full data), the analysis step is much shorter than the model (i.e.,~0.1 per model integration). In this case, matrix inversion of the innovation covariance matrix (i.e., Γ −1 = (HP f H T + R) −1 in Equations (17) and (20) uses Cholesky decomposition [100] to solve either the inversion of the innovation covariance matrix or a system of linear equations Γb = d. Thus, our hourly assimilation scheme with real GOSAT data maintains approximately twice the H-CMAQ computational time. Nonetheless, for the higher number of measurements (e.g., >500) per hour generated synthetically with GOSAT, our computational time estimate increases exponentially to near O(p 3 ), which is comparable to the order of a Cholesky decomposition to compute the matrix inversion (i.e., it is dominated by the analysis step computation). As a remark, we can roughly compare the computational efficiency of the PvKF assimilation described above with other popular schemes such as 4D-Var and EnKF. For example, 4D-Var entails the adjoint of CMAQ where the computation of the backward mode requires at least twice (and up to three times) the time as the forward simulation [101]. In addition, several forward-backward iterations are required until the algorithm's convergence. The adjoint computation also requires a relatively large (~10 times the model) amount of storage space [102]. The computational time of EnKF assimilation also highly depends on the number of ensembles carried out with the model that adopts it. In general, dozens of ensembles are expected to maintain proper assimilation [32]; for example, Peng et al. (2015) [103] used 48 ensembles in their CFI-CMAQ. In summary, PvKF assimilation can achieve a high level of computational efficiency compared to the other typical assimilation schemes. We demonstrate the performance of this framework using real GOSAT observations in the second part of this paper.

Summary and Conclusions
We have designed an assimilation system based on the parametric (variance only) Kalman filter, or PvKF, with the hemispheric CMAQ model and GOSAT methane observations (the assimilation scheme can be used for any long-lived species). The scheme is capable of providing analysis together with its uncertainty (i.e., error variance) while being computationally cheaper than other popular data assimilation schemes such as 4D-Var and EnKF. The analysis is derived sequentially with a minimum hourly batch of observations that could maintain real-time assimilation. The uncertainty is obtained by advection of error variance using the H-CMAQ model with a predefined error correlation model. The assimilation system was tested with a single simulated observation experiment to verify basic properties, as well as the conservation of information and the appropriateness of using advection of error variance. This scheme does not assume a perfect model, nor does it require the development of adjoint or ensemble simulations.
To conduct PvKF methane assimilation, we modified H-CMAQ to include methane transport, chemical reaction with OH, and emissions from both anthropogenic (EDGAR v6) and natural (WetCHARTS v1.0) source categories. Preparation of an unbiased initial field and addressing the bias in GOSAT with respect to independent surface measurements (GLOBALVIEWplus CH4-ObsPack v3.0 compiled by NOAA) is demonstrated. We found that a latitudinal correction provides a reasonable bias correction to our data. Thus, the model discrepancy with observations can be primarily attributed to the random errors in the model, observations, and emissions.
Results using simulated observations indicate that the expected behaviour of the analysis error variance and analysis increment, which is derived from the model, is fairly consistent, while the information content (i.e., total variance) is conserved. Note that this may not be the case for EnKF if inflation is not added. We also demonstrate that the effect of a single observation can persist within a period of the GOSAT revisiting time, which is about 3 days. In addition, it is shown that the vertical error correlation could assist in deducing quantities at or close to the surface.
We also discussed the computational cost of the PvKF assimilation against an arbitrary number of observations per time step. We found that the assimilation scheme with GOSAT maintains approximately twice the H-CMAQ computational time, while a larger number of observations per hour (e.g., >500), which is not the case of GOSAT, suggests an exponential increase in the computational time. Nonetheless, with a couple of thousands of observations per hour, the method still performs acceptably but is slower. We emphasize that this assimilation system does not assume a perfect model, in agreement with the fact that the wind and emissions are not entirely known. Yet, PvKF is computationally advantageous compared to 4D-Var and EnKF.
The main limitation of this method is related to the lifetime of the species. PvKF is welladapted to long-lived species, such as methane, and still applies to shorter lifetime chemical species, but with the caveat that a smaller fraction of the total forecast error variance is explained by the advection of error variance. In this case, the residual error variance (i.e., unexplained error variance) is captured by the stationary model error. Thus, for chemical species with a shorter lifetime, the PvKF more resembles an OI (Optimal Interpolation) scheme. Another limitation is that the framework's feasibility depends on the observation characteristics (e.g., observation number and density). The larger number of observations we assimilate, the more accurate analysis we may obtain. However, the assimilation scheme is limited to a certain number of observations due to the computational capacity of PvKF, as explained in Section 5.2. In addition, increasing the number of observations can result in a higher spatial density, which increases the error correlation in observation space. This contradicts the necessary condition to obtain an optimal PvKF analysis (see Part II, Sections 2 and 3). Therefore, the number of observations may limit both the efficiency and the quality of the assimilation.
In general, we found that the PvKF algorithm is sufficiently adaptable as a lightweight scheme for carrying long-lived tracers inside a chemistry-enabled atmospheric model. In a companion paper, we will discuss this framework with the objective of obtaining optimal assimilation of GOSAT and deducing realistic statistics for transport, observations, and model parameters, including correlation lengths.
A potential application of this algorithm, which has not yet been pursued in this study, is to improve the inverse modelling of methane emissions on a highly resolved regional domain by integrating an accurate initial and the inflows of methane concentrations at the lateral boundaries of the regional model and their uncertainties.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/rs14020371/s1, Figure S1: (a) GOSAT bias(AMF) -CMAQ with air mass factor bias correction, (b) GOSAT bias( θ s ) -CMAQ with solar zenith angle bias correction. MB and σ denote domain-wide mean bias and standard deviation of the residual. Table S1: Evaluating the multiple (iterative) regression algorithm based on two measures: Mean Square Error (MSE) and the correlation coefficient r. Latitude, θ s , and AMF represent the order of parameters in the multiple regression algorithm, respectively. Table S2: Evaluating the multiple (iterative) regression algorithm based on two measures: Mean Square Error (MSE) and the correlation coefficient r. θ s , latitude, and AMF represent the order of parameters in the multiple regression algorithm, respectively. Table S3: Evaluating the multiple (iterative) regression algorithm based on two measures: Mean Square Error (MSE) and the correlation coefficient r. AMF, θ s , and latitude represent the order of parameters in the multiple regression algorithm, respectively. Figure S2: Hemispheric spatial distribution of the relative methane concentration loss due to chemical reactions. It shows daily average values after two weeks of the model forecast.   For the transformation into the horizontal gridded model, we use the latitude and longitude of the center of each gridcell. According to CMAQ documentation [98], we account for the Earth radius of 6,370,000 m, the true latitude of 45 • , and the meridian taken as the true longitude. Note that the polar stereographic projection is integrated into the observation operator, so that each observation point is transformed using the same set of parameters and equations described above.