Impact of Molecular Spectroscopy on Carbon Monoxide Abundances from TROPOMI

The impact of SEOM–IAS (Scientific Exploitation of Operational Missions–Improved Atmospheric Spectroscopy) spectroscopic information on CO columns from TROPOMI (Tropospheric Monitoring Instrument) shortwave infrared (SWIR) observations was examined. HITRAN 2016 (High Resolution Transmission) and GEISA 2015 (Gestion et Etude des Informations Spectroscopiques Atmosphériques 2015) were used as references against which the spectral fitting residuals, retrieval errors, and inferred quantities were assessed. It was found that SEOM–IAS significantly improves the quality of the CO retrieval by reducing the residuals to TROPOMI observations. The magnitude of the impact depends on the climatological region and the spectroscopic reference used. The difference in the CO columns was found to be rather small, although discrepancies appear for selected scenes, in particular for observations with elevated molecular concentrations. A brief comparison to Total Column Carbon Observing Network (TCCON) and Network for the Detection of Atmospheric Composition Change (NDACC) observations also demonstrated that both spectroscopies yield similar columns; however, the smaller retrieval errors in the CO inferred with SEOM and the Speed-Dependent Rautian and line-Mixing (SDRM) profile turned out to be beneficial in the comparison of post-processed mole fractions with ground-based references.


Introduction
Many species present in the atmosphere influence Earth's radiative transfer by absorbing, emitting, and scattering electromagnetic energy at certain wavelengths [1]. The interaction of radiation with matter makes molecular spectroscopy a powerful tool in investigating the composition, distribution, and evolution of atmospheric constituents.
Figure caption (fragment). ... [30]) was used to calculate the absorption cross sections of the individual molecules. Note that CO is only responsible for ≈1% of the total optical thickness τ in that spectral range.
In order to infer the amount of atmospheric constituents from an observed spectrum, an accurate description of molecular absorption at different pressure p and temperature T levels is mandatory. In high resolution line-by-line (lbl) models, the cross section $k_m$ of a molecule m is calculated by the superposition of many lines l, where each line is the product of a temperature dependent line strength $S_l^{(m)}$ and a normalized line shape function g, with $\int_{-\infty}^{+\infty} g(\nu)\,\mathrm{d}\nu = 1$, that describes mechanisms such as pressure and Doppler broadening [31]. Therefore, the best possible knowledge of the spectral parameters, such as line position ν, line intensity S, line width γ (air- and self-broadening), temperature exponent n, lower-state energy E, and their variation with T and p, is required. However, advances in high resolution absorption spectroscopy and the advent of sensors, such as TROPOMI, with wide spectral ranges at rather high spectral resolutions and excellent signal-to-noise ratios (SNR) have indicated discrepancies between spectroscopic models and observations [32-37]. It was found that physical processes beyond the broadening mechanisms described by the Voigt function should be taken into account for accurate atmospheric characterization. Moreover, studies [24,38-42] also indicate that improved molecular spectroscopy is crucial to eliminate systematic residuals in atmospheric spectra and that trace gas retrievals in the SWIR will benefit accordingly.
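The superposition of lines described above can be sketched in a few lines of Python. This is a minimal illustration with the classic Voigt profile only (not the SDRM profile discussed in this paper); the function names, the two demo lines, and their strengths and widths are hypothetical values chosen for illustration, not entries of any line list.

```python
import numpy as np
from scipy.special import wofz  # Faddeeva function w(z)


def voigt_profile(nu, nu0, gamma_l, gamma_d):
    """Area-normalized Voigt line shape g(nu) in cm.

    gamma_l: Lorentz (pressure) half width at half maximum in cm^-1
    gamma_d: Doppler half width at half maximum in cm^-1
    """
    sigma = gamma_d / np.sqrt(2.0 * np.log(2.0))  # Gaussian standard deviation
    z = ((nu - nu0) + 1j * gamma_l) / (sigma * np.sqrt(2.0))
    return wofz(z).real / (sigma * np.sqrt(2.0 * np.pi))


def cross_section(nu, lines):
    """Sum of line strength times line shape over all lines.

    lines: iterable of (position, strength, gamma_l, gamma_d) tuples.
    """
    k = np.zeros_like(nu)
    for nu0, strength, gamma_l, gamma_d in lines:
        k += strength * voigt_profile(nu, nu0, gamma_l, gamma_d)
    return k


# Hypothetical demo lines near 4295 cm^-1 (illustrative numbers only).
nu = np.linspace(4245.0, 4345.0, 200001)
demo_lines = [(4295.0, 1.0e-21, 0.05, 0.005), (4296.0, 5.0e-22, 0.06, 0.005)]
k = cross_section(nu, demo_lines)

# Numerical check of the normalization of a single line shape.
area = np.sum(voigt_profile(nu, 4295.0, 0.05, 0.005)) * (nu[1] - nu[0])
```

The normalization check stays slightly below one because the slowly decaying Lorentzian wings extend beyond the finite grid.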

Spectroscopic Line Data and Line Profiles
SEOM-IAS (Scientific Exploitation of Operational Missions-Improved Atmospheric Spectroscopy; SEOM in the following) is an improved line parameter database of H₂O, CH₄, and CO (available on Zenodo [43,44]) compiled within the framework of an ESA project according to the needs of the TROPOMI instrument. Fourier transform spectrometer (FTS) and continuous wave cavity ring-down spectroscopy (CRDS) measurements (performed at the German Aerospace Center (DLR) and at Université Grenoble Alpes, respectively) were analyzed in the 4190-4340 cm⁻¹ spectral range. The spectroscopic database was obtained from high resolution FTS measurements employing a multireflection cell with absorption path lengths from 14.4-168 m and a temperature range of 198-361 K (12 pure and 32 air-broadened CH₄ measurements, 1 pure and 4 air-broadened CO measurements, 7 pure and 23 air-broadened H₂O measurements, and 4 pure HDO measurements). A multispectrum fitting software [45] developed at DLR was used for the analysis of the measured spectra. For modeling of absorption lines in the multispectrum fitting, a quadratic speed-dependent hard collision model based on the implementation of the Hartmann-Tran (HT) profile was used [35,46-49]. In order to account for line-mixing, the profile was extended using the first and second order perturbation approximation by Rosenkranz [50] and Smith [51]. The CRDS measurements served as validation.
The HT profile with vanishing correlation (η = 0) reduces to the speed-dependent Rautian (SDR) profile [52-57]. The transmissions in Figure 1 were computed with line-mixing included, i.e., with the SEOM Speed-Dependent Rautian and line-Mixing (SDRM) profile. Figure 2 shows a close-up view of the molecular cross sections for CO, CH₄, and H₂O near 4295 cm⁻¹, where all three molecules possess a fairly strong and almost co-located transition. For each molecule, two cross sections were calculated, one with the SDRM profile (including the extended set of line data) and another with the classic Voigt model, both using SEOM line data as input. The differences between the two turned out to be around two orders of magnitude smaller than the cross sections themselves, with the maximum disagreement located close to the line center positions.

Previous Studies
An initial validation of SEOM tested the line list with atmospheric spectra from solar occultation measurements, and the new database was found to be a significant improvement over High Resolution Transmission (HITRAN) 2016 (M. Birk, personal communication).
An assessment of the operational TROPOMI CO product for various spectroscopic inputs, including SEOM, was recently published by Borsdorff et al. [58]; however, only the spectroscopic data was substituted, while the remaining retrieval settings were identical to the ones of the operational processing. The study quantified the quality of the spectral fits and biases in the CO column and found that "updating the CH₄ cross sections is the main reason for the improved CO product". They concluded that molecular spectroscopy data plays a key role for the quality of the retrieval.
In a recent study [59], we examined the impact of SEOM spectroscopic information on CO total columns from SCIAMACHY and found that the best retrieval results for SEOM line data are obtained when higher-order 'beyond Voigt' effects in molecular absorption are taken into account. The outcome indicates that, strictly speaking, the classic Voigt profile is not an adequate line model for the SEOM line list (confirmed by M. Birk, personal communication), although the largest share of the improvement in the fitting residuals (≈3% on average, up to 15% for individual observations) is attributed to the updated line parameters, while the line profile contributes less. Although the CO mole fractions increased by 4-11% using SEOM spectroscopy, the difference to HITRAN 2016 (H16) was found to be significant for only a limited set of spectra (see Reference [59], Figures 7 and A4).

Methodology
In this study, we investigated the impact of the SEOM spectroscopy on the retrieved CO from TROPOMI SWIR observations by comparing the spectral fitting residuals, deduced columns, and corresponding retrieval errors with the most recent releases of HITRAN 2016 (H16, High Resolution Transmission [60]) and GEISA 2015 (G15, Gestion et Etude des Informations Spectroscopiques Atmosphériques 2015 [61]). Besides the target gas CO, the impact on the interfering, and hence co-retrieved, species CH₄ and H₂O (including their isotopologues) was examined. Note that the reason to stay with the current version of GEISA (2015 instead of 2019) was that not all molecules required for the retrieval of CO in the specified spectral range were updated at the time of submission (R. Armante, personal communication).

Retrieval Setup
The retrievals in this study were performed with the latest version of the scientific retrieval algorithm BIRRA (Beer Infrared Retrieval Algorithm [25,62]) which has been developed at the German Aerospace Center (DLR) since about 2005. In addition to enhancements in the GARLIC (Generic Atmospheric Radiation Line-by-line Infrared Code [63,64]) forward model described in [59], this most recent version of BIRRA incorporates TROPOMI calibration key data (CKD), such as tabulated instrument spectral response functions (ISRF).

Input Data
An updated framework provides auxiliary data for the prototype retrieval of CO abundances from TROPOMI. Since information on the amount of CO is inferred by optimally varying forward model parameters during the inversion process, the quality of the input data affects the accuracy of the retrieval.

Calibrated Level 1b Spectra
The TROPOMI level 1b data (version 1.0) from band 7 contain spectrally and radiometrically calibrated Earth radiance and solar irradiance spectra in the 2305-2345 nm (≈4338-4265 cm⁻¹) spectral range (see Figure 3). These quantities already include corrections from the CKD that account for several effects, such as offset, dark-current, pixel-quality, non-linear response, and noise, and were derived during the on-ground calibration campaign prior to launch [65,66]. In-flight, the CKD of, e.g., the pixel-quality, the ISRF, and the stray-light correction is monitored by TROPOMI's calibration unit and updated over the lifetime of the instrument if necessary. This is crucial as the operational level 0-1b processor marks data with quality assessment flags, e.g., in order to exclude bad and dead pixels that are deemed unusable for generation of the level 2 product. The actual number of available pixels in the selected retrieval window between 4277.20 and 4302.90 cm⁻¹ depends on the bad and dead pixel-mask (BDPM) and ranges from 146-154 for the observations considered in this study. Note that TROPOMI is commanded to perform a solar irradiance measurement near the day-night terminator at the northern side of the orbit only every 15 orbits, i.e., approximately once every calendar day.
Figure caption (fragment). ... Kurucz [67] and an equivalent black body radiator at 5777 K, respectively. Both were added for illustrative purpose only.
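Since the text switches between wavelength (nm) and wavenumber (cm⁻¹), a short conversion sketch may help when checking the quoted ranges. The function name is a hypothetical helper; the numerical inputs are the band and window limits quoted above.

```python
def wavelength_nm_to_wavenumber(wavelength_nm):
    """Convert a vacuum wavelength in nm to a wavenumber in cm^-1."""
    return 1.0e7 / wavelength_nm


def wavenumber_to_wavelength_nm(wavenumber):
    """Convert a wavenumber in cm^-1 to a vacuum wavelength in nm."""
    return 1.0e7 / wavenumber


# Band 7 limits quoted in the text (2305-2345 nm).
band7 = [wavelength_nm_to_wavenumber(w) for w in (2305.0, 2345.0)]

# Retrieval window quoted in the text (4277.20-4302.90 cm^-1), expressed in nm.
window_nm = [wavenumber_to_wavelength_nm(v) for v in (4302.90, 4277.20)]

print(band7, window_nm)
```

Note that increasing wavelength corresponds to decreasing wavenumber, which is why the limits of the range appear in reversed order.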

The Instrument's Spectral Response
The forward model needs to include an accurate description of the ISRF S in order to model the physics of the measurement with adequate accuracy. In the SWIR, the TROPOMI ISRFs $S_{ij}$ vary across the spectral and spatial dimensions and are provided for each of the 1 ≤ i ≤ 256 ground pixels, as well as for 1 ≤ j ≤ 24 equally spaced central wavelengths of the spectral axis ranging from 2298 to 2344 nm [68]. Eight tabulated ISRFs remain within the range of sufficiently strong CO absorption lines defining our fitting window; hence, interpolated response values were used for most spectral pixels. The rather smooth variation in the spectral dimension is beneficial for the interpolation of responses to pixels where no tabulated values are available (see Figures 4 and 5). Nonetheless, accounting for those variations in the instrument's response is important, particularly when testing spectroscopic data and models.
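The interpolation of tabulated responses to spectral pixels without a tabulated ISRF can be illustrated as follows. This is a sketch under stated assumptions: linear interpolation between the two neighboring tabulated responses is one simple choice and not necessarily the scheme used in BIRRA, and the Gaussian shapes and widths are hypothetical stand-ins for the real CKD tables. Only the 24 central wavelengths between 2298 and 2344 nm are taken from the text.

```python
import numpy as np


def interpolate_isrf(center, tab_centers, tab_isrfs):
    """Linearly interpolate tabulated ISRFs to an arbitrary central wavelength.

    tab_centers: sorted central wavelengths of the tabulated responses
    tab_isrfs:   one response per row, all sampled on a common offset grid
    Returns weights on the offset grid that sum to one.
    """
    j = np.clip(np.searchsorted(tab_centers, center) - 1, 0, len(tab_centers) - 2)
    w = (center - tab_centers[j]) / (tab_centers[j + 1] - tab_centers[j])
    isrf = (1.0 - w) * tab_isrfs[j] + w * tab_isrfs[j + 1]
    return isrf / isrf.sum()


# Hypothetical table: Gaussian responses with slowly varying width.
tab_centers = np.linspace(2298.0, 2344.0, 24)   # nm, as quoted in the text
offsets = np.linspace(-1.0, 1.0, 201)           # nm, assumed offset grid
widths = np.linspace(0.10, 0.12, 24)            # nm, assumed widths
tab_isrfs = np.exp(-0.5 * (offsets[None, :] / widths[:, None]) ** 2)

isrf = interpolate_isrf(2325.0, tab_centers, tab_isrfs)
```

The interpolated weights can then be applied to a monochromatic spectrum sampled on the same offset grid around the pixel center to obtain the modeled detector signal.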
During on-ground calibration, van Hees et al. [68] found that the accuracy of the ISRF CKD is well within the requirements for trace-gas retrievals. Moreover, van Kempen et al. [66] found that the differences between in-flight and on-ground CKD measurements are small, and no corrections need to be applied.
Figure caption (fragment). ... wavenumber (x = ν) domain, respectively.

Atmospheric Input Data
The physical description of a measurement by the forward model requires input for some atmospheric state-variables, such as pressure p, temperature T, and specific humidity q, since, e.g., the cross section $k_m$ of a molecule needs to be calculated at different atmospheric levels in order to accurately model lbl absorption through the atmosphere.
Note that BIRRA (see [25], Section 2.2.2) utilizes a separable least squares approach where the state-vector x is separated into two vectors η ⊂ x and β ⊂ x comprising the nonlinear and linear parameters, respectively [70], and that initial guess values are only required for the nonlinear parameters. In the forward model, the 'true' optical depth of a molecule $\tau_m$ is described as

$$\tau_m(\nu) = \alpha_m \int n_m^{\mathrm{ref}}(s)\, k_m\big(\nu, p(s), T(s)\big)\, \mathrm{d}s$$

with $\alpha_m \in \eta$ and $n_m^{\mathrm{ref}}$ the initial guess molecular number density. In Figure 6, the initial guess mole fractions for CO and CH₄ are shown. Both resemble AIRS (Atmospheric Infrared Sounder) version 6 initial guess profiles [71,72] with varying concentrations from the northern hemisphere to the southern hemisphere. The AIRS CO initial guess comes from MOZART (Model for OZone And Related chemical Tracers [73]) monthly mean hemispheric profiles, while CH₄ is described by a function of latitude and altitude. Pressure and temperature, as well as the specific humidity, were taken from the 4-times daily reanalysis product [74].
Figure 6. Initial guess mole fraction profiles for CO (four panels on the left) and CH₄ (right) resembling the Atmospheric Infrared Sounder (AIRS) version 6 first guess. The CO profiles are provided with monthly granularity for both hemispheres, while only January, April, July, and October (clockwise from top-left) are shown here.
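The idea of a separable least squares fit can be sketched with a toy forward model. The example assumes a single scaling factor α acting on a reference optical depth and a second-degree polynomial for the reflectivity, which enters linearly; for every trial α the polynomial coefficients follow from a linear least squares solve, so that the outer iteration only varies α. The synthetic optical depth, the true parameters, and all function names are hypothetical; this is not the BIRRA implementation.

```python
import numpy as np
from scipy.optimize import least_squares


def linear_fit(alpha, nu, tau_ref, obs, degree=2):
    """Solve the linear subproblem (polynomial coefficients) for fixed alpha."""
    transmission = np.exp(-alpha * tau_ref)
    x = nu - nu.mean()  # centered wavenumber for better conditioning
    basis = np.vander(x, degree + 1, increasing=True) * transmission[:, None]
    coeffs, *_ = np.linalg.lstsq(basis, obs, rcond=None)
    return coeffs, basis @ coeffs


def projected_residual(params, nu, tau_ref, obs):
    """Residual as a function of the nonlinear parameter only."""
    _, model = linear_fit(params[0], nu, tau_ref, obs)
    return model - obs


def retrieve_alpha(nu, tau_ref, obs, alpha0=1.0):
    """Outer nonlinear fit; only alpha needs an initial guess."""
    sol = least_squares(projected_residual, [alpha0], args=(nu, tau_ref, obs))
    coeffs, _ = linear_fit(sol.x[0], nu, tau_ref, obs)
    return sol.x[0], coeffs


# Synthetic, noise-free test spectrum (all numbers are illustrative).
nu = np.linspace(4277.2, 4302.9, 150)
tau_ref = (0.5 / (1.0 + ((nu - 4285.0) / 0.5) ** 2)
           + 0.8 / (1.0 + ((nu - 4295.0) / 0.5) ** 2))
x = nu - nu.mean()
true_alpha = 1.1
true_coeffs = np.array([0.3, 1.0e-3, -1.0e-5])
obs = (true_coeffs[0] + true_coeffs[1] * x + true_coeffs[2] * x ** 2) \
    * np.exp(-true_alpha * tau_ref)

alpha_hat, coeffs_hat = retrieve_alpha(nu, tau_ref, obs)
```

Eliminating the linear parameters in this way reduces the dimension of the nonlinear search and removes the need for initial guesses of the reflectivity coefficients.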

Cloud Filtering and Topographic Information
Auxiliary input data on clouds were obtained from the S5P-NPPC product, which is available for the TROPOMI bands 3, 6, and 7 (SWIR). The Visible Infrared Imaging Radiometer Suite (VIIRS [75]) aboard the Suomi-NPP (Suomi National Polar-Orbiting Partnership [76,77]) spacecraft flies ahead of S5P in loose formation by 3.5 min in local time of the ascending node and reports cloud information with high spatial resolution on nominal and scaled TROPOMI fields of view (FOV). The cloud mask data is grouped in four classes, namely confidently cloudy, probably cloudy, probably clear, and confidently clear. Prior to the CO retrieval, the cloud fraction was calculated for the FOV of TROPOMI's band 7 scaled by a factor of 1.5 (in the along- and across-track dimensions), and the ratio of the 'confident' and 'probable' classifications was formed. Besides the retrieved CH₄ absorption, those quantities serve to identify conditions that might lead to errors in the retrieved columns due to modifications of the light path by clouds and aerosols (scattering), which are not yet considered in the forward model. In particular, an observation was rejected if the cloud fraction specified in the S5P-NPPC product exceeded 10% or if the number of VIIRS pixels that fall into the 'probable' classification (i.e., not the 'confident' classification) exceeded 20%. These rigorous filter criteria avoid observations with large retrieval inaccuracies caused by scattered photons [78] and minimize any bias that arises from changes in the retrieval's vertical sensitivity by modifications in the column averaging kernel (CAK; see [23], Section 5) shown in Figure 7 (also see Reference [28], Section 3 and Figure 4, and Reference [79], Figure 3).
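The two rejection criteria can be written as a small predicate. The thresholds (10% cloud fraction, 20% 'probable' classifications) are those stated above; the function and argument names are hypothetical, and the cloud fraction is assumed to be already available as a number between 0 and 1.

```python
def accept_observation(cloud_fraction, n_probable, n_total,
                       max_cloud_fraction=0.10, max_probable_ratio=0.20):
    """Return True if an observation passes the cloud filter.

    cloud_fraction: cloud fraction in the scaled field of view (0..1)
    n_probable:     VIIRS pixels classified as 'probably' cloudy or clear
    n_total:        all VIIRS pixels in the scaled field of view
    """
    if n_total <= 0:
        return False  # no cloud information available, reject conservatively
    probable_ratio = n_probable / n_total
    return (cloud_fraction <= max_cloud_fraction
            and probable_ratio <= max_probable_ratio)


# Illustrative cases with made-up pixel counts.
clear_scene = accept_observation(0.02, 10, 200)       # passes both criteria
cloudy_scene = accept_observation(0.35, 10, 200)      # cloud fraction too large
uncertain_scene = accept_observation(0.02, 80, 200)   # too many 'probable' pixels
```

Rejecting scenes without any VIIRS pixels is an additional assumption of this sketch and is not stated in the text.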
Furthermore, the calculation of the double-path transmission between the reflection point (e.g., Earth's surface) and observer and between Sun and reflection point (see Equation (1)) requires topographic information on terrain elevation. Therefore, the ETOPO global relief model [80] with 2-min grid spacing and an adequate vertical and horizontal datum provides elevation data for each TROPOMI observation in the radiative transfer calculation.

Vertical Sensitivity and Relation to Priors
In the context of profile retrievals, the sensitivity of the inversion process to the true atmospheric state is given by the averaging kernel. Nonetheless, column density retrievals also have some altitude dependent sensitivity, i.e., the perturbation of elements in the state-vector x at different altitudes results in a non-uniform retrieval response [81] (Sections 2 and 3). Figure 7 shows the altitude sensitivity for three elements of BIRRA's state-vector. The vertical sensitivity for the target gas and CH₄ is found to be close to unity across the full range of TROPOMI observer zenith angles (OZA), while the CAK of H₂O tends towards zero at higher altitudes, where the retrieval is less sensitive to the true atmospheric state.

Assessing the Quality of the Fit
Standard diagnostics are used to assess deficiencies in the forward model of the retrieval. In particular, the scaled 2-norm of the spectral fitting residual

$$\sigma^2 = \frac{\lVert I_{\mathrm{obs}} - I(x) \rVert^2}{m - n}$$

is a suitable criterion as it becomes smaller the better the forward model I(x) can mimic the measurements $I_{\mathrm{obs}}$. Note that m designates the number of available TROPOMI measurements in our fitting window and depends on the BDPM (see Section 3). n is the number of elements in the state-vector x, which is constant and, in the setup for this study, includes the molecular scaling factors α ⊆ η and a second-degree polynomial in wavenumber r ⊆ β representing the surface reflectivity. Furthermore, it is important to note that the absolute value of σ² depends on the SNR and, since no normalization is applied, strictly speaking, only residuals from the same observation should be compared against each other. Nonetheless, it is deemed permissible to compare the scaled norm of the fitting residual within a region as the environment is quite homogeneous in terms of temperature, humidity, and, even at TROPOMI's spatial resolution, surface reflectance.
Errors in molecular spectroscopy can introduce systematic spectral residuals that in consequence result in larger retrieval errors of the corresponding quantities according to

$$\Xi = \sigma^2 \left( J^{\mathrm{T}} J \right)^{-1} .$$

The least squares covariance matrix Ξ contains the Jacobians $J \equiv \partial I / \partial x$ of the fitted parameters, and its diagonal elements yield the (squared) errors of the state-vector components [25,82]. Therefore, besides the analysis of the residual norms in Section 3.1, the fitted state-vector elements α, along with their error estimates, are examined in Section 3.2.
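The two diagnostics can be computed in a few lines. The sketch assumes the usual least squares expressions, i.e., the squared residual norm divided by m − n and the covariance given by this variance times the inverse of JᵀJ; the linear toy model, its four parameters, and the noise level are hypothetical and only serve to exercise the formulas.

```python
import numpy as np


def fit_diagnostics(obs, model, jacobian):
    """Scaled residual norm, covariance matrix, and parameter errors."""
    m, n = jacobian.shape                 # measurements, state-vector elements
    residual = obs - model
    sigma2 = residual @ residual / (m - n)
    covariance = sigma2 * np.linalg.inv(jacobian.T @ jacobian)
    errors = np.sqrt(np.diag(covariance))  # one-sigma errors of the parameters
    return sigma2, covariance, errors


# Linear toy problem with 150 'pixels' and 4 parameters (illustrative only).
rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 150)
jacobian = np.vander(grid, 4, increasing=True)
x_true = np.array([1.0, 0.5, -0.2, 0.1])
obs = jacobian @ x_true + rng.normal(0.0, 0.01, size=grid.size)

x_hat, *_ = np.linalg.lstsq(jacobian, obs, rcond=None)
sigma2, covariance, errors = fit_diagnostics(obs, jacobian @ x_hat, jacobian)
```

Because the covariance scales with the residual variance, any systematic residual left by deficient spectroscopy directly inflates the reported parameter errors.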

Results
The investigation was performed for a set of TROPOMI orbits in 2019 that cover various climatological regions, namely the Sahara, Central-Europe, Amazonia, and Siberia (for details, see Table 1). Aside from Central-Europe, the areas were selected according to different pairs of temperature and humidity values, and Europe contains strong anthropogenic sources (cities, large harbors, airports, etc.), as well as CO background levels (rural areas, many alpine regions, etc.). The individual orbits are given in Table 1 and were selected based on (low) cloud coverage. Nonetheless, post-processing steps include rigorous cloud filtering, the removal of non-converged retrievals, and the disposal of measurements with very small SNRs (e.g., observations above large bodies of water, such as lakes, rivers, etc.).

Spectral Fitting Residuals
The elements of the residual vectors according to Equation (2) are depicted in Figure 8. The histograms are separated by region, starting with Sahara at the top-left and depicting Central-Europe, Siberia, and Amazonia in the clockwise direction. The residuals follow a normal distribution (except for Sahara) with an expected value around zero, indicating that the majority of the measurement errors are random errors, such as instrument noise. This is crucial in order to get the so-called best linear unbiased estimate for the state-vector x [83]. In the Sahara region, however, the distribution of residuals deviates from the Gaussian form, in particular around the center of the curve and most significantly for G15. This deviation over the Sahara is reduced for the SDRM retrievals. Note that the SEOM-Voigt case only considers spectral parameters that describe the mechanisms of pressure and Doppler broadening and was included in order to discriminate the impact of line data versus model. Figure 9 shows one residual vector for a randomly picked measurement per region. It reveals that the modeled spectra for both H16 and G15 exhibit the largest disagreements in a spectral region close to 4295 cm⁻¹. This feature is significantly reduced when using SEOM line data and virtually eliminated when the SDRM line profile is applied (see Figure 2), which also results in a rather uniform distribution of the residuals across wavenumbers. In addition, G15 reveals some discrepancy around 4293 cm⁻¹. Note that spectral ranges with increased differences show the same positive or negative deviations across geographic regions, indicating that the radiance (transmission) is persistently over- or underestimated for those wavenumbers.
In Figure 10, the individual detector-pixel residuals are examined for all measurements across seasons. The average of the absolute differences is given by E(|ρ|). It shows that the pixels close to 4295 cm⁻¹ consistently exhibit major disagreements and that the retrievals over the Sahara (first panel) show the largest discrepancies on average since the absolute values of the elements in ρ depend on the SNR (see Section 2.4). Furthermore, in order to identify molecular transitions that possibly cause the discrepancies, optical depths of the absorbing molecules CO, CH₄, and H₂O are depicted in three separate but aligned panels below. It becomes obvious that the disagreements around 4295 cm⁻¹ coincide with three rather strong and overlapping absorption lines that were shown in Figure 2. In particular, SDRM and H16 do not agree in that part of the spectrum, and the overall spectral fit quality is improved by ≈4-8% for SDRM retrievals. The substitution of H16 line data with SDRM for each molecule individually shows that the impact on the fit quality is small when CO is replaced (<0.5%) but improves considerably when H₂O (>7%) and CH₄ (>5%) are updated. The given numbers are averages across all investigated regions. Since the residuals for G15 are similar to those for H16 (except for some pixels between 4280-4285 cm⁻¹ and around 4293 cm⁻¹ as indicated in Figure 9), they are not depicted in Figure 10.

Impact on Retrieved Columns and Corresponding Errors
The effect of absorption line data on the retrieved CO total columns $N_m$ [molec cm⁻²] and corresponding errors is shown in Figures 11-14. Each figure depicts the results for one region and contains the target and co-retrieved quantities according to

$$N_m = \alpha_m \int n_m^{\mathrm{ref}}(z)\, \mathrm{d}z .$$

The mole fractions for some molecular number densities $n_m^{\mathrm{ref}}$ are shown in Figure 6. It is important to note that H₂O and CH₄, although the latter has strong absorption lines across the CO spectral fitting window and can be used to identify light path modifications [78], are byproducts primarily considered due to their spectral interference with the target gas. The distribution of the errors for CH₄ and H₂O is shown in Appendix A.
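A total column of this kind is the retrieved scaling factor times the vertically integrated reference number density. The following sketch evaluates that integral with the trapezoidal rule; the exponential atmosphere, the scale height of 8 km, the surface air density, and the constant CO mole fraction of 100 ppbv are hypothetical values chosen so that the result falls into the range of columns discussed in this section.

```python
import numpy as np


def total_column(alpha, z_cm, n_ref):
    """Scaling factor times the trapezoidal integral of n_ref over altitude.

    z_cm:  altitude grid in cm
    n_ref: reference number density in molec cm^-3 on that grid
    """
    dz = np.diff(z_cm)
    return alpha * np.sum(0.5 * (n_ref[1:] + n_ref[:-1]) * dz)


# Hypothetical reference atmosphere (illustrative numbers only).
z_km = np.linspace(0.0, 100.0, 1001)
scale_height_km = 8.0
air_density_surface = 2.5e19          # molec cm^-3
co_mole_fraction = 100.0e-9           # 100 ppbv, constant with altitude
n_ref = co_mole_fraction * air_density_surface * np.exp(-z_km / scale_height_km)

column = total_column(1.0, z_km * 1.0e5, n_ref)   # km -> cm
print(f"{column:.3e} molec cm^-2")
```

With these assumptions the column is close to 2 × 10¹⁸ molec cm⁻², and a retrieved scaling factor different from one rescales it proportionally.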
For the Sahara region depicted in Figure 11, the majority of CO columns lie between 1.0 and 2.5 × 10¹⁸ molec cm⁻², and the histograms for SDRM and HITRAN are similar. The CO retrieval errors according to Equation (4) are illustrated in the top-right and are below 1.0 × 10¹⁷ molec cm⁻² across spectroscopies, with the median around 1.7 × 10¹⁶ molec cm⁻² in the case of SDRM and around 2.2 × 10¹⁶ molec cm⁻² for the other two line lists. Note that the majority of the CO errors for SDRM are even below 6.0 × 10¹⁶ molec cm⁻². Although there is almost no absolute difference in the medians of the CO columns for the SDRM and H16 distributions, they were found to be significantly different according to the non-parametric Kolmogorov-Smirnov test (p-value < 1.0 × 10⁻⁵) [84]. This also holds true for the SDRM and G15 distributions (CO difference ≈ 1.0 × 10¹⁷ molec cm⁻²), and the magnitude of their retrieval errors is significantly different as well.
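A two-sample Kolmogorov-Smirnov test of this kind is available in SciPy as `scipy.stats.ks_2samp`. The sketch below uses synthetic, normally distributed columns (hypothetical means, spread, and sample size, not TROPOMI data) to show that two large samples with nearly identical medians can still differ significantly.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)

# Synthetic CO columns in molec cm^-2 for two spectroscopies (made-up values).
columns_a = rng.normal(1.80e18, 2.0e17, size=5000)
columns_b = rng.normal(1.85e18, 2.0e17, size=5000)   # slightly shifted

result = ks_2samp(columns_a, columns_b)
relative_median_shift = (np.median(columns_b) - np.median(columns_a)) \
    / np.median(columns_a)
significant = result.pvalue < 1.0e-5

print(result.statistic, result.pvalue, relative_median_shift)
```

The test compares the empirical cumulative distributions, so with thousands of observations even a shift of a few percent in the median can produce a very small p-value.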
A similar CO distribution is observed for Central-Europe in Figure 12, but values cover a greater range. Again, SDRM and H16 yield similar concentrations, while G15 is significantly different based on the distribution of errors. Moreover, the SDRM-based CO product is, again, the most precise, i.e., it incorporates the smallest fitting residuals and thus retrieval errors.
In Figure 13, the CO columns over Amazonia show a rather different distribution where SDRM retrievals yield significantly higher concentrations. The difference is also significant for the two co-retrieved molecules. The errors in the top-left are larger compared to the two previous regions; however, SDRM-based errors are, again, the smallest. The difference in CO for SDRM-H16 and SDRM-G15 was found to be 2.4 × 10¹⁷ and 1.3 × 10¹⁷ molec cm⁻², respectively. The errors are consistently <1.0 × 10¹⁷ molec cm⁻², which indicates that the differences in SDRM retrieved CO concentrations over Amazonia are significant with respect to both other line lists, suggesting that the 'true' value might be outside the specified error range of the product. Furthermore, it should be noted that the S5P-overpass in the second quarter ...
The distribution of CO over Siberia in Figure 14 is similar to that for Sahara and Central-Europe, i.e., the histograms for SDRM and HITRAN almost resemble each other, while GEISA is shifted towards higher columns on average. Again, this shift is larger than the product error and thus considered to be significant. In contrast to the other regions, the CH₄ columns for G15 are also shifted. For H₂O, the medians are rather equally spaced across cases, as is true for the errors.

CO over Amazonia and Central-Europe
The trend of differences between SDRM and H16-based CO mole fractions over Amazonia and Europe is depicted in Figure 15. The standard set of filter criteria described in Section 3 was applied on a mostly sunny, high pressure influenced day over Central-Europe on 21 September 2019 (orbit 10046) and a day with rather average cloud coverage over Amazonia on 17 August 2019 (orbit 9553). Note that only the cloud filter was applied for the calculation of the differences since filtering on errors, etc., would create datasets of different size with even fewer measurements available for subtraction. Over both regions, the differences tend to be larger for higher CO mole fractions, although the trend is within the error of most observations (see second column of Figure 16). While the SDRM retrieval errors for background CO concentrations range between ≈0.5-1.5 ppbv, the errors over elevated concentrations are somewhat larger, though the relative error is rather similar in either case. The spatial distribution of the CO differences is depicted in the third column of Figure 16, while SDRM-based mole fractions are depicted in the first column. In particular, the difference plot over Europe reveals a striping pattern in the satellite's along-track direction. It is a well-known but not yet understood feature of push-broom spectrometers that changes from orbit to orbit [58].

Comparison to Ground-Based Observations
In order to estimate the quality of the retrieval product, a comparison with observations from co-located TCCON (Total Column Carbon Observing Network [85,86]) and NDACC (Network for the Detection of Atmospheric Composition Change) ground-based (g-b) measurements was carried out.
Filtering of the TROPOMI retrievals is crucial in order to compare valid CO mole fractions to g-b references. However, it is important to note that since the errors in the retrievals depend on the spectroscopy, filter criteria based on errors might be appropriate for one retrieval (e.g., SDRM-based) but not the other (e.g., H16-based). The values given in Table 2 are mean and median values for the respective TROPOMI overpass on the specified day. Since SDRM inferred CO columns exhibit smaller errors (see Section 3.2), one set of strict filter criteria leads to different numbers of observations remaining for comparison after post-processing. The actual number of remaining measurements after filtering depends primarily on the weather conditions at the time of overpass and on the surface characteristics around a station. If no TROPOMI observations remained within a reasonably small radius around the g-b station after filtering (e.g., <50 km), the radius for co-location was increased in steps of 50 km up to 200 km. The mean value was calculated for measurements that remained after strict filtering (taking retrieval errors, etc., into account), while the median was computed for the non-filtered retrieval output (i.e., only filtered for clouds). By only rejecting cloudy pixels using the S5P-NPPC product (see Section 2.2.4), both retrievals deliver the same number of CO mole fractions after post-processing but include observations with large errors.
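The stepwise enlargement of the co-location radius can be sketched as follows. The step of 50 km and the limit of 200 km are taken from the text; the use of great-circle (haversine) distances on a spherical Earth, the function names, and the example coordinates are assumptions of this illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def colocate(station_lat, station_lon, lats, lons, step_km=50.0, max_km=200.0):
    """Return the smallest radius with at least one observation and its indices."""
    dist = haversine_km(station_lat, station_lon,
                        np.asarray(lats, dtype=float),
                        np.asarray(lons, dtype=float))
    radius = step_km
    while radius <= max_km:
        idx = np.flatnonzero(dist < radius)
        if idx.size > 0:
            return radius, idx
        radius += step_km
    return None, np.array([], dtype=int)


# Hypothetical station at 48N, 11E with two filtered observations.
radius, idx = colocate(48.0, 11.0, [49.0, 48.3], [11.0, 11.0])
```

If no observation is found within 200 km, the sketch returns `None`, so that the corresponding overpass can be skipped in the comparison.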
The comparison to g-b reference observations in Figure 17 shows that differences vary across sites. Although no consistent over- or underestimation of BIRRA retrieved CO mole fractions from TROPOMI is obvious, most TCCON sites observe larger values. In accordance with results from Section 3.3, the validation shows that SDRM-based retrievals yield larger columns on average (also see Reference [59], Figure 9).
Table 2. Daily mean and median values for the SDRM- and H16-based CO mole fractions from TROPOMI measurements compared to Total Column Carbon Observing Network (TCCON) and Network for the Detection of Atmospheric Composition Change (NDACC) g-b observations. 'Non-filtered' specifies that only cloudy TROPOMI pixels were eliminated, while 'Filtered' additionally considers retrieval errors in the post-processing steps. 'Radius' designates the maximum distance for co-location, i.e., only TROPOMI observations from within that distance were compared to the g-b site. Values in brackets designate the number of observations after post-processing.
Figure caption (fragment). ... Table 2. Note that the mean was calculated upon the strictly filtered retrieval output, while the median includes all cloud-free retrievals.

Spectral Residuals
The results in Section 3.1 indicate that the SEOM line data has a positive impact on the retrieval by causing significantly smaller residuals on average. Furthermore, the impact on the CO fitting residuals underlines results from previous studies [58] and demonstrates that improvements in spectroscopy are important, particularly for retrievals from measurements with sufficient SNR and high spectral resolution [59].

CO Mole Fractions
As shown in Section 3.2, the number of measurements where differences in CO mole fractions become significant depends considerably on the specific sets of line data compared. The results in Section 3.3 indicate that the increase in retrieved CO is within the error bar of most observations, particularly with respect to H16. Nonetheless, Figures 15 and 16 also demonstrate that discrepancies can become significant over selected scenes, predominantly for measurements over elevated CO concentrations. A comparison to Reference [59] (Figure 7) shows that CO retrievals from SCIAMACHY using SEOM spectroscopy also exhibit a tendency towards larger values. Moreover, the outcome in Section 3.3 indicates that the ratio of change in CO to its errors is roughly proportional across most geographic areas between SCIAMACHY and TROPOMI.

Validation
Since g-b observed CO columns can vary considerably throughout a day (>15 ppbv), representation errors, such as spatial and temporal mismatch, should be taken into account when interpreting differences for a single TROPOMI overpass. Due to advection of CO over time (e.g., at 3 m s⁻¹), TROPOMI observations at some distance from the g-b station also provide a valuable source of information in the comparison and should be considered as well.
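A quick estimate shows how the quoted advection speed relates to the co-location radii used in the comparison. The function names are hypothetical, and a constant wind speed along a straight path is assumed.

```python
def advection_distance_km(wind_speed_m_s, hours):
    """Distance travelled by an air mass at constant wind speed."""
    return wind_speed_m_s * hours * 3600.0 / 1000.0


def travel_time_hours(distance_km, wind_speed_m_s):
    """Time an air mass needs to cover a given distance at constant speed."""
    return distance_km * 1000.0 / wind_speed_m_s / 3600.0


per_hour = advection_distance_km(3.0, 1.0)     # km covered in one hour
time_50km = travel_time_hours(50.0, 3.0)       # hours to cover 50 km

print(per_hour, time_50km)
```

At 3 m s⁻¹ an air mass covers roughly 11 km per hour, so an observation 50 km upwind of a station samples air that reaches the site within about five hours.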
It is also important to note that TCCON data is calibrated to World Meteorological Organization (WMO) in situ trace gas measurement scales in order to tie its observations to in situ measurements [86,87]. The systematic difference between CO from TCCON and observations that are not tied to the WMO scale was examined in detail by Kiel et al. [88]. They found that this correction factor (1.0672 for the GGG2014 dataset) is the main source of the observed difference (also see Reference [62]) and that the choice of different spectroscopic line lists has only minor influence on the overall bias. This is in accordance with the results in Section 3.4, although different spectroscopies were compared.

Summary and Conclusions
The investigation of the impact of molecular spectroscopy on CO total columns from TROPOMI SWIR observations found that SEOM line data with the adequate model improve the spectral fit quality by reducing the residuals to TROPOMI measurements with respect to both H16 and G15.
The results demonstrate that molecular spectroscopy has a significant effect on the precision of the CO retrieval. The reduced spectral fitting residuals and smaller retrieval errors were found to be statistically significant across the examined regions, making the SDRM-based CO product more precise. The magnitude of the impact depends on the climatological region and spectroscopic reference and ranges from ≈10–20% (up to 30% for individual observations with respect to G15). Updates in the H2O and CH4 cross sections were identified as the main reason for the improved fit quality. These findings underline the important role that accurate spectroscopic information plays in meeting the mission requirements.
In contrast to the fitting residuals, the differences in CO columns between SDRM and H16 were found to be rather small across most regions (≤ 3%), while some larger discrepancies were found for individual observations with elevated molecular concentrations, particularly over Amazonia. Similar to the spectral residuals, the average disagreements to SDRM are larger for G15, with the largest differences to SDRM-based retrievals found over Siberia. In the other two examined regions, the impact is less significant with respect to H16 but stays significant for the majority of G15-based retrievals.
The comparison to TCCON and NDACC g-b observations revealed that the smaller retrieval errors in the SDRM-inferred columns are beneficial when comparing post-processed CO mole fractions to g-b references, since stricter filter criteria can be applied to the TROPOMI observations within a given distance from the station.
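The kind of filtering referred to here can be sketched as follows. This is a minimal illustration, assuming a great-circle distance criterion and a threshold on the relative retrieval error. The function names, dictionary keys, thresholds, coordinates, and values are hypothetical and are not the filter settings used in this study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def colocate(observations, station, max_dist_km, max_rel_err):
    """Keep observations close to the station with a small relative error."""
    kept = []
    for obs in observations:
        dist = great_circle_km(obs["lat"], obs["lon"],
                               station["lat"], station["lon"])
        rel_err = obs["xco_err"] / obs["xco"]
        if dist <= max_dist_km and rel_err <= max_rel_err:
            kept.append(obs)
    return kept

# Hypothetical station and observations (ppbv); not data from this study.
station = {"lat": 47.48, "lon": 11.06}
observations = [
    {"lat": 47.50, "lon": 11.10, "xco": 95.0, "xco_err": 1.5},  # near, precise
    {"lat": 47.60, "lon": 11.30, "xco": 97.0, "xco_err": 6.0},  # near, imprecise
    {"lat": 49.00, "lon": 13.00, "xco": 90.0, "xco_err": 1.0},  # far, precise
]
selected = colocate(observations, station, max_dist_km=50.0, max_rel_err=0.03)
```

With smaller retrieval errors, more observations pass a given error threshold, so the threshold can be tightened without reducing the co-located sample too strongly.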
Overall, many aspects of the findings underline recommendations from earlier investigations [41,42] and are in good agreement with similar conclusions from Hochstaffl and Schreier [59].

Table 2 was provided by J.N., R.S. and Y.T. (for details, see Table A1). All authors have read and agreed to the published version of the manuscript.
Funding: The first author receives funding from the DLR-DAAD Research Fellowships Program which is offered by the German Aerospace Center (DLR) and the German Academic Exchange Service (DAAD). The laboratory spectroscopy work was funded by ESA within the SEOM-IAS project (ESA/AO/1-7566/13/I-BG). The TCCON station Garmisch has been supported by the European Space Agency (ESA) under grant 4000120088/17/I-EF and by the German Bundesministerium für Wirtschaft und Energie (BMWi) via the DLR under grant 50EE1711D. The Paris TCCON site has received funding from Sorbonne Université, the French research center CNRS, the French space agency CNES, and Région Île-de-France.

Acknowledgments:
We would like to thank Thomas Trautmann, Günther Lichtenberg, and Peter Haschberger for constructive criticism of the manuscript. We also acknowledge the TCCON and NDACC ground-based Fourier transform spectrometer networks for providing data. TCCON data used in this publication were obtained from the TCCON Data Archive, hosted by CaltechDATA: https://tccondata.org. The dataset references are listed in Table A1. The NDACC data were obtained from sites listed in Table A2 and are publicly available via ndacc.org.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are frequently used in this manuscript: