The investigation was performed for a set of TROPOMI orbits in 2019 that cover various climatological regions, namely the Sahara, Central-Europe, Amazonia, and Siberia (for details, see
Table 1). Aside from Central-Europe, the areas were selected to represent different combinations of temperature and humidity, whereas Central-Europe contains strong anthropogenic sources (cities, large harbors, airports, etc.), as well as
background levels (rural areas, many alpine regions, etc.). The individual orbits are given in
Table 1 and were selected for their low cloud coverage. Nonetheless, the post-processing includes rigorous cloud filtering, the removal of non-converged retrievals, and the rejection of measurements with very low SNRs (e.g., observations over large bodies of water, such as lakes and rivers).
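The three post-processing criteria named above can be sketched as a simple predicate. This is only an illustration: the field names and both threshold values are hypothetical and are not taken from the retrieval setup described here.

```python
from dataclasses import dataclass


@dataclass
class Retrieval:
    cloud_fraction: float  # 0 (clear sky) to 1 (overcast); hypothetical field
    converged: bool        # whether the least-squares fit converged
    snr: float             # signal-to-noise ratio of the measured spectrum


def passes_postprocessing(r, max_cloud_fraction=0.1, min_snr=50.0):
    """Return True if a retrieval survives all three filters.

    Both thresholds are placeholder values chosen for illustration only.
    """
    return r.converged and r.cloud_fraction <= max_cloud_fraction and r.snr >= min_snr


def filter_retrievals(retrievals, **thresholds):
    """Keep only the retrievals that pass every post-processing criterion."""
    return [r for r in retrievals if passes_postprocessing(r, **thresholds)]
```

A low-SNR scene over water would, for instance, be rejected by the last condition even if it is cloud-free and the fit converged.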
3.1. Spectral Fitting Residuals
The elements of the residual vectors according to Equation (
2) are depicted in
Figure 8. The histograms are separated by region, starting with the Sahara at the top-left and continuing clockwise with Central-Europe, Siberia, and Amazonia. The residuals follow a normal distribution (except for the Sahara) with an expected value around zero, indicating that the majority of the measurement errors are random, e.g., instrument noise. This is crucial in order to obtain the so-called best linear unbiased estimate for the state-vector
[
83]. In the Sahara region, however, the distribution of residuals deviates from the Gaussian form, in particular around the center of the curve and most significantly for G15. This deviation over the Sahara is reduced for the SDRM retrievals. Note that the SEOM-Voigt case only considers spectral parameters that describe pressure and Doppler broadening and was included in order to separate the impact of the line data from that of the line model.
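One simple way to quantify how closely a set of residuals follows a zero-mean Gaussian is to inspect its first four moments: for normally distributed residuals the mean, the skewness, and the excess kurtosis should all be close to zero. The sketch below is illustrative only and is not the procedure used to produce the histograms; the function name is hypothetical.

```python
import math


def moment_summary(residuals):
    """Return (mean, standard deviation, skewness, excess kurtosis).

    Uses population (biased) moments; the input must not be constant,
    otherwise the variance is zero and the ratios are undefined.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # zero for a Gaussian
    return mean, math.sqrt(m2), skewness, excess_kurtosis
```

A distribution that is distorted around its center, as described for the Sahara, would typically show up as a non-zero excess kurtosis even if the mean stays near zero.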
Figure 9 shows one residual vector for a randomly picked measurement per region. It reveals that the modeled spectra for both H16 and G15 exhibit the largest disagreements in a spectral region close to
. This feature is significantly reduced when using SEOM line data and virtually eliminated when the SDRM line profile is applied (see
Figure 2), which also leads to a rather uniform distribution of the residuals across wavenumbers. In addition, G15 reveals some discrepancy around
. Note that spectral ranges with increased differences show the same positive or negative deviations across geographic regions, indicating that the radiance (transmission) is persistently over- or underestimated for those wavenumbers.
In
Figure 10, the individual detector-pixel residuals are examined for all measurements across seasons. The average of the absolute differences is given by
. It shows that the pixels close to
consistently exhibit major disagreements and that the retrievals over the Sahara (first panel) show the largest discrepancies on average since the absolute values of the elements in
are dependent on the SNR (see
Section 2.4). Furthermore, in order to identify molecular transitions that possibly cause the discrepancies, optical depths of the absorbing molecules
,
, and
are depicted in three separate but aligned panels below. It becomes obvious that the disagreements around
coincide with three rather strong and overlapping absorption lines that were shown in
Figure 2. In particular, SDRM and H16 do not agree in that part of the spectrum, and the overall spectral fit quality is improved by
for SDRM retrievals.
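The per-pixel averaging of absolute residuals can be sketched as follows, assuming a plain arithmetic mean over all measurements for each detector pixel. The function name and the nested-list layout are hypothetical.

```python
def mean_abs_residual_per_pixel(residual_vectors):
    """Average |residual| per detector pixel over all measurements.

    residual_vectors: list of equally long lists, one residual vector
    (observed minus modeled spectrum) per measurement.
    """
    n_meas = len(residual_vectors)
    n_pix = len(residual_vectors[0])
    if any(len(v) != n_pix for v in residual_vectors):
        raise ValueError("all residual vectors must have the same length")
    return [
        sum(abs(v[i]) for v in residual_vectors) / n_meas
        for i in range(n_pix)
    ]
```

Because absolute values are averaged, positive and negative deviations at a given pixel do not cancel, so persistently mismatched pixels stand out regardless of the sign of the mismatch.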
The substitution of H16 line data with SDRM for each molecule individually shows that the fit quality changes only slightly when
is replaced (<0.5%) but improves considerably when
(>7%) and
(>5%) are updated. The given numbers are averages across all investigated regions. Since the residuals for G15 are similar to those for H16 (except for some pixels between 4280–4285
and around
as indicated in
Figure 9), they are not depicted in
Figure 10.
The averages of the scaled norm
are itemized by region and season in
Table 1. The results confirm the above-mentioned findings, i.e., the retrievals using SEOM line data yield smaller discrepancies with respect to the TROPOMI observations. More precisely, SDRM-based retrievals reduce
with respect to H16 and G15 by approximately 10–15
and 15–20
, respectively. The smaller differences between SDRM and H16 are likely attributable to the fact that the two sets of line data are not completely independent and some of the updates from SEOM are already included in H16 [
58].
3.2. Impact on Retrieved Columns and Corresponding Errors
The effect of absorption line data on the retrieved
total columns
and corresponding errors is shown in
Figure 11,
Figure 12,
Figure 13 and
Figure 14. Each figure depicts the results for one region and contains the target and co-retrieved quantities according to
The mole fractions for some molecular number densities
are shown in
Figure 6. It is important to note that
and
, although the latter has strong absorption lines across the
spectral fitting window and can be used to identify light path modifications [
78], are byproducts primarily considered due to their spectral interference with the target gas. The distribution of the errors for
and
is shown in
Appendix A.
For the Sahara region depicted in
Figure 11, the majority of
is distributed between
and the histograms for SDRM and HITRAN are similar. The
retrieval errors according to Equation (
4) are illustrated in the top-right and are below
across spectroscopies with the median around
in case of SDRM and around
for the other two line lists. Note that the majority of the
errors for SDRM are even below
. Although there is almost no absolute difference in the medians of the
columns for the SDRM and H16 distributions, they were found to be significantly different according to the non-parametric Kolmogorov-Smirnov test (
p-value
) [
84]. The same holds true for the SDRM and G15 distributions (
difference ≈
); in addition, the magnitude of their retrieval errors is significantly different.
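The two-sample Kolmogorov-Smirnov test compares the empirical cumulative distribution functions of two samples, which is why it can flag distributions as different even when their medians nearly coincide. The sketch below computes the test statistic exactly and the p-value with the common asymptotic series approximation; it is an illustration and not necessarily the implementation used for the numbers reported here. The function names are hypothetical.

```python
import bisect
import math


def ks_statistic(a, b):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:  # the supremum is attained at one of the data points
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_p_value(d, n, m, terms=100):
    """Asymptotic p-value with a small-sample correction (approximation)."""
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:  # the alternating series converges poorly here; p is ~1
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))
```

With large samples, as is typical for a full orbit segment, even a small value of the statistic yields a very small p-value, so statistical significance should be read together with the size of the difference relative to the retrieval error.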
A similar
distribution is observed for Central-Europe in
Figure 12, but the values cover a greater range. Again, SDRM and H16 yield similar concentrations, while G15 is significantly different based on the distribution of errors. Moreover, the SDRM-based
product is, again, the most precise, i.e., incorporates the smallest fitting residuals and thus retrieval errors.
In
Figure 13, the
columns over Amazonia show a rather different distribution, with SDRM retrievals yielding significantly higher concentrations. The difference is also significant for the two co-retrieved molecules. The errors in the top-left are larger compared to the two previous regions; however, SDRM-based errors are, again, the smallest. The difference in
for SDRM-H16 and SDRM-G15 was found to be
and
, respectively. The errors are consistently <
which indicates that the differences in SDRM retrieved
concentrations over Amazonia are significant with respect to both other line lists, suggesting that the ‘true’ value might be outside the specified error range of the product. Furthermore, it should be noted that the S5P-overpass in the second quarter.
The distribution of
over Siberia in
Figure 14 is similar to that for the Sahara and Central-Europe, i.e., the histograms for SDRM and HITRAN closely resemble each other, while GEISA is shifted towards higher columns on average. Again, this shift is larger than the product error and thus considered to be significant. In contrast to the other regions, the
columns for G15 are also shifted. For
, the medians are rather equally spaced across cases, as is also true for the errors.
3.3. over Amazonia and Central-Europe
The trend of differences between SDRM and H16-based
mole fractions over Amazonia and Europe is depicted in
Figure 15. The standard set of filter criteria described in
Section 3 was applied to a mostly sunny day under high-pressure influence over Central-Europe on 21 September 2019 (orbit 10046) and to a day with rather average cloud coverage over Amazonia on 17 August 2019 (orbit 9553). Note that only the cloud filter was applied for the calculation of the differences since filtering on errors, etc., would create differently sized datasets with even fewer measurements available for subtraction.
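The reason for restricting the difference calculation to the cloud filter can be illustrated with a small sketch: a pixel-wise subtraction is only possible for pixels present in both retrieval outputs, so any filter that depends on the spectroscopy shrinks the common set. The dictionary layout keyed by a pixel identifier is a hypothetical choice for this example.

```python
def paired_differences(sdrm, h16, cloud_free):
    """Pixel-wise SDRM minus H16 differences for cloud-free pixels.

    sdrm, h16: dict mapping a pixel identifier to the retrieved mole fraction.
    cloud_free: iterable of pixel identifiers that passed the cloud filter.
    Only pixels present in both retrievals can be subtracted.
    """
    common = sorted(set(cloud_free) & set(sdrm) & set(h16))
    return {pid: sdrm[pid] - h16[pid] for pid in common}
```

If each retrieval were additionally filtered on its own errors, the two input dictionaries would contain different pixels and the intersection would become smaller still.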
Over both regions, the differences tend to be larger for higher
mole fractions, although the trend is within the error of most observations (see second column of
Figure 16). While the SDRM retrieval errors for background
concentrations range between ≈0.5–1.5
, the errors over elevated concentrations are somewhat larger, though the relative error is rather similar in both cases. The spatial distribution of the
differences is depicted in the third column of
Figure 16, while SDRM-based mole fractions are depicted in the first column. In particular, the difference plot over Europe reveals a striping pattern in the satellite’s along-track direction. This is a well-known but not yet understood feature of push-broom spectrometers that changes from orbit to orbit [
58].
3.4. Comparison to Ground-Based Observations
In order to estimate the quality of the retrieval product, a comparison with observations from co-located TCCON (Total Carbon Column Observing Network [
85,
86]) and NDACC (Network for the Detection of Atmospheric Composition Change) ground-based (g-b) measurements was carried out.
Filtering of the TROPOMI retrievals is crucial in order to compare valid
mole fractions to g-b references. However, it is important to note that since the errors in the retrievals are dependent on the spectroscopy, filter criteria based on errors might be appropriate for one retrieval (e.g., SDRM-based) but not the other (e.g., H16-based). The values given in
Table 2 are mean and median values for the respective TROPOMI overpass on the specified day. Since SDRM-inferred
columns exhibit smaller errors (see
Section 3.2), one set of strict filter criteria leads to different numbers of observations remaining for comparison after post-processing. The actual number of remaining measurements after filtering depends primarily on the weather conditions at the time of overpass and the surface characteristics around a station. If no TROPOMI observations remained within a reasonably small radius around the g-b station after filtering (e.g., <50 km), the radius for co-location was increased in steps of 50 km up to 200 km. The mean value was calculated for measurements that remained after strict filtering (taking retrieval errors, etc., into account), while the median was computed for the non-filtered retrieval output (i.e., only filtered for clouds). By only rejecting cloudy pixels using the S5P-NPPC product (see
Section 2.2.4), both retrievals deliver the same number of
mole fractions after post-processing but include observations with large errors.
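The stepwise enlargement of the co-location radius can be sketched as follows. The great-circle distance via the haversine formula, the tuple layout of the observations, and the inclusive comparison at the radius boundary are assumptions made for this illustration; only the 50 km step and the 200 km upper limit are taken from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def colocate(station, observations, start_km=50.0, step_km=50.0, max_km=200.0):
    """Find the smallest radius that contains at least one observation.

    station: (lat, lon); observations: list of (lat, lon, value) tuples.
    Returns (radius_km, observations_within) or (None, []) if nothing
    lies within max_km of the station.
    """
    radius = start_km
    while radius <= max_km:
        hits = [
            obs for obs in observations
            if haversine_km(station[0], station[1], obs[0], obs[1]) <= radius
        ]
        if hits:
            return radius, hits
        radius += step_km
    return None, []
```

The mean (strictly filtered input) or the median (cloud-filtered input) would then be computed over the values of the returned observations.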
The comparison to g-b reference observations in
Figure 17 shows that differences vary across sites. Although no consistent over- or underestimation of BIRRA retrieved
mole fractions from TROPOMI is obvious, most TCCON sites observe larger values. In accordance with results from
Section 3.3, the validation shows that SDRM-based retrievals yield larger columns on average (also see Reference [
59]
Figure 9).