Validation of Carbon Trace Gas Profile Retrievals from the NOAA-Unique Combined Atmospheric Processing System for the Cross-Track Infrared Sounder

This paper provides an overview of the validation of National Oceanic and Atmospheric Administration (NOAA) operational retrievals of atmospheric carbon trace gas profiles, specifically carbon monoxide (CO), methane (CH4) and carbon dioxide (CO2), from the NOAA-Unique Combined Atmospheric Processing System (NUCAPS), a NOAA enterprise algorithm that retrieves atmospheric profile environmental data records (EDRs) under global non-precipitating (clear to partly cloudy) conditions. Vertical information about atmospheric trace gases is obtained from the Cross-track Infrared Sounder (CrIS), an infrared Fourier transform spectrometer that measures high resolution Earth radiance spectra from NOAA operational low earth orbit (LEO) satellites, including the Suomi National Polar-orbiting Partnership (SNPP) and follow-on Joint Polar Satellite System (JPSS) series beginning with NOAA-20. The NUCAPS CO, CH4, and CO2 profile EDRs are rigorously validated in this paper using well-established independent truth datasets, namely total column data from ground-based Total Carbon Column Observing Network (TCCON) sites, and in situ vertical profile data obtained from aircraft and balloon platforms via the NASA Atmospheric Tomography (ATom) mission and NOAA AirCore sampler, respectively. Statistical analyses using these datasets demonstrate that the NUCAPS carbon gas profile EDRs generally meet JPSS Level 1 global performance requirements, with the absolute accuracy and precision of CO 5% and 15%, respectively, in layers where CrIS has vertical sensitivity; CH4 and CO2 product accuracies are both found to be within ±1%, with precisions of ≈1.5% and ⪅0.5%, respectively, throughout the tropospheric column.


Introduction
The U.S. National Oceanic and Atmospheric Administration (NOAA) Joint Polar Satellite System (JPSS) is a NOAA-operational low earth orbit (LEO) satellite series that features the hyperspectral infrared (IR) Cross-track Infrared Sounder (CrIS) [1] and Advanced Technology Microwave Sounder (ATMS) [2] systems. Four satellites are planned to fly in the same orbit over the next two decades, beginning with the NOAA-20 satellite (which was referred to as JPSS-1 or J-1 prior to launch in late 2017); the series was preceded by the Suomi National Polar-orbiting Partnership (SNPP) satellite launched in late 2011. The CrIS instrument is an advanced IR Fourier transform spectrometer (FTS) that obtains sensor data records (SDRs) consisting of well-calibrated IR Earth emission spectra over three bands (longwave 650-1095 cm⁻¹, midwave 1210-1750 cm⁻¹, and shortwave 2155-2550 cm⁻¹), with 2211 channels in full spectral-resolution (FSR) mode (maximum optical path difference of 0.8 cm for all three bands and spectral resolution ∆ν = 0.625 cm⁻¹, with 713, 865 and 633 channels in the longwave, midwave and shortwave bands, respectively) [3]. The CrIS spectra allow for retrieval of atmospheric vertical profile environmental data records (EDRs) with the best possible vertical resolution (≈2-7 km for temperature and water vapor throughout the troposphere), comparable to that of predecessor sounding systems, namely the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) Metop-series Infrared Atmospheric Sounding Interferometer (IASI) [4,5] and the National Aeronautics and Space Administration (NASA) EOS-Aqua Atmospheric Infrared Sounder (AIRS) [6,7]. The NOAA-operational EDR retrieval algorithm for operational hyperspectral thermal IR sounders (viz., CrIS and IASI) is the NOAA-Unique Combined Atmospheric Processing System (NUCAPS) [8,9].
The NUCAPS algorithm is based upon the heritage methodology developed for the EOS-Aqua AIRS and is a modular implementation of the multi-step NASA AIRS Science Team retrieval algorithm Version 5 [10,11].
NUCAPS SNPP previously ran on CrIS spectra with reduced resolution in the midwave and shortwave bands (1.25 cm⁻¹ and 2.5 cm⁻¹, respectively) due to truncated interferograms in those bands during operational processing [3]; these reduced-resolution spectra have been referred to as "nominal" or "normal" resolution as this was the originally planned operational resolution of the CrIS SDR. However, offline production of CrIS FSR began in December 2014 [3,12], with operational Interface Data Processing Segment production starting in March 2017. The move to FSR was motivated in part by a demonstration study showing the impact of the CrIS spectral resolution on the retrieval of the carbon monoxide EDR [13]. Given that the CrIS FSR mode has been operational since then (i.e., for the remainder of the SNPP lifetime as well as the follow-on JPSS satellite series, beginning with NOAA-20), the NUCAPS system was upgraded to run in FSR mode using the Stand-Alone Radiative Transfer Algorithm (SARTA) [14] delivered by the University of Maryland Baltimore County (UMBC). For more details on the NUCAPS algorithm theoretical basis and user applications, the reader is referred to other papers [9,10] and/or the Algorithm Theoretical Basis Document (ATBD) available online [15].
The Earth emission spectra (i.e., SDRs) measured by CrIS, IASI and AIRS contain information about the atmospheric temperature (T) and moisture (q) profiles, along with trace gases including O3, CO, CH4, CO2, SO2, HNO3 and N2O. The NUCAPS physical retrieval module [15] retrieves these individual parameters in a sequential fashion, using channels rigorously determined to be sensitive to each parameter [16], beginning with cloud-cleared radiance spectra (i.e., clear-column IR spectra which are derived with the help of the collocated ATMS data) [10], followed by T, q, ozone (O3) and the remaining trace gases, with the results output on the radiative transfer model (RTM) (or radiative transfer algorithm, RTA) 100 layer grid (T output is on layer boundaries or "levels"). The NUCAPS algorithm solves for the trace gases in an effort to optimize the retrieved thermodynamic (T and q) profile EDRs [9], but the long-term investments in the CrIS and IASI sounders onboard future operational NOAA and EUMETSAT LEO satellite missions (as indicated above) have motivated the exploitation of these space assets for the routine production of the carbon trace gas EDRs, namely carbon monoxide (CO) [17,18], methane (CH4) [19], and carbon dioxide (CO2) [20].
The validation of the NUCAPS T, q and O3 profile EDRs with respect to high-quality reference datasets has been previously reported in Nalli et al. [21,22], where it was demonstrated that the SNPP EDRs meet JPSS Level 1 requirements; additional independent assessments of the SNPP T and q EDRs versus other reference datasets have been reported elsewhere [23,24]. Similar performances have been established for the EDR products from the NOAA-20 satellite, launched since the original SNPP validation effort. Since that time, the NUCAPS algorithm development team has focused on improvements to the operational carbon trace gas EDR products mentioned above. The improvements include updated a priori profiles (based on current zonal climatologies) and RTA tuning (empirically removing residual biases between the model and observations), along with optimized quality assurance (QA) flags (based upon the algorithm chi-square, χ², degrees-of-freedom, and other quality measures). Thus, in this paper we focus our attention on validating the operational NUCAPS (offline v2.8) CO, CH4 and CO2 trace gas EDRs; additional details on the NUCAPS carbon trace gas retrievals can be found in a forthcoming paper (Warner et al., manuscript in prep for Atmos. Chem. Phys.).

Methodology
Carbon trace gas EDR validation was a new requirement within the JPSS calibration/validation (cal/val) program [25] beginning with the transition to the full spectral-resolution (FSR) CrIS NUCAPS. The JPSS Level 1 requirements for carbon trace gas profile EDRs, given in Table 1, are defined for the global ensemble of total column, cloud-cleared cases. These requirements serve as the program metrics by which the system is considered to have reached Validated Maturity and to meet mission requirements. Satellite sounder validation methodology has been well established for T, q and O3 profile EDRs within previous validation work, with the various coarse-layer statistical uncertainty characterizations conducted relative to baseline reference datasets (i.e., "truth") roughly classified within a "hierarchy" [26]. Profile statistics for layer gas concentrations are defined in terms of fractional errors, including systematic (i.e., bias or "accuracy"), random (i.e., 1σ variability or "precision"), and total combined error (i.e., root mean square error, RMSE). For carbon trace gases we have adopted a similar hierarchical approach based upon available reference datasets, consisting of (1) numerical model global comparisons, (2) satellite EDR intercomparisons, (3) surface-based observing network assessments, and (4) intensive field campaign in situ data assessments. Those at the base of the hierarchy may be readily employed during the early cal/val stages (or anytime thereafter) of a satellite's lifetime, whereas those near the top are employed during later stages. These are briefly overviewed below.
Numerical model output (analysis and/or forecast interpolated to NUCAPS footprints) enables rapid comparison with large, global datasets obtained during "Focus Days" (i.e., days selected for the acquisition of global SDRs that are used for retrieving EDRs using the latest versions of offline code) and as such is extremely useful for early evaluation of the algorithm and for identifying gross problem areas [27]. Numerical models used for such comparisons include the European Centre for Medium-Range Weather Forecasts (ECMWF), the NOAA CarbonTracker [28], and the Copernicus Atmosphere Monitoring Service (CAMS) [29]. Such analyses are useful in identifying regional or spectral biases. However, dynamical models (e.g., ECMWF) do not constitute independent correlative data given that they assimilate radiances, and generally do not model chemistry and/or surface fluxes.
Trace gas EDRs obtained from other satellite sensors or algorithms provide quasi-independent observations for global intercomparisons. Like numerical model comparisons, this approach also allows for the acquisition of large, global data samples that can facilitate early consistency checks. In addition, such data (depending on the sensor/algorithm) may be more reliable than model analyses, especially in the case of previously validated EDR products, thus providing additional global confidence. AIRS is extremely useful for this purpose given that it is a mature, high-resolution IR sounder that runs an end-to-end algorithm similar to NUCAPS, with the Aqua satellite flying in the same 01:30 and 13:30 local equator crossing time orbit. Other satellite sounder EDR datasets include those from the Tropospheric Monitoring Instrument (TROPOMI) onboard the Copernicus Sentinel-5 Precursor, the NASA Orbiting Carbon Observatory-2 (OCO-2), the Greenhouse gases Observing Satellite (GOSAT), and the Aura Microwave Limb Sounder (MLS). However, a limitation of these data for validation is that they may possess similar retrieval error characteristics (in the case of AIRS [27]) or different vertical sensitivity, and thus ultimately would require proper treatment of each sensor's averaging kernels [30] (cf. Appendix A).
Ground-based, remotely sensed observations obtained periodically from surface-based observing networks provide independent truth datasets with a global distribution reasonably representing global latitude zones, roughly analogous to radiosonde observations (RAOB) for temperature and moisture. The most notable example of such a dataset is the Total Carbon Column Observing Network (TCCON) [31], a ground-based network of uplooking solar-spectrum FTS instruments that obtain total column measurements of trace gases (discussed more in Section 3.1). A newer source of in situ data is the balloon-borne AirCore sampling system [32,33], which provides vertical profiles. NUCAPS EDR collocations with these ground-based networks thus provide independent datasets for statistical assessments [26]. However, limitations in these datasets include the time latencies needed for acquiring reasonable collocation sample sizes, uncertainties in unit conversions, and different sensitivities to atmospheric layers.
At the top of the validation data hierarchy are intensive aircraft campaigns that provide episodic, but generally comprehensive sets of in situ and remotely sensed vertical profile data from multiple ascents and descents of dedicated aircraft flying over a specified region. Aircraft campaigns thus allow for detailed performance specification over regions of interest. Examples of trace gas campaigns suitable for SNPP validation include the Atmospheric Tomography (ATom) mission [34] (discussed in Section 3.3) and, previously, the HIAPER Pole-to-Pole Observations (HIPPO) [35] campaigns. The specific datasets used for NUCAPS trace gas validation are detailed in Section 3 below.
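The fractional error statistics named earlier in this section (bias, 1σ variability, and RMSE) can be illustrated with a minimal sketch. It assumes an unweighted sample and a population (1/N) standard deviation, so that RMSE² = bias² + precision² holds exactly; the statistics used for the actual assessments follow [26] and may apply weighting or other details not shown here. The function name and the sample values are hypothetical.

```python
import math

def fractional_error_stats(retrieved, truth):
    """Return (bias, precision, rmse) of the fractional errors, in percent.

    Assumption: unweighted statistics with a population (1/N) standard
    deviation, so that rmse**2 == bias**2 + precision**2.
    """
    errors = [100.0 * (r - t) / t for r, t in zip(retrieved, truth)]
    n = len(errors)
    bias = sum(errors) / n                                   # systematic error
    precision = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)  # random error
    rmse = math.sqrt(sum(e * e for e in errors) / n)         # total combined error
    return bias, precision, rmse

# Hypothetical retrieved and "truth" values in arbitrary units.
retrieved = [1.02, 0.97, 1.05, 1.01]
truth = [1.00, 1.00, 1.00, 1.00]
bias, precision, rmse = fractional_error_stats(retrieved, truth)
```

With these definitions the three quantities are not independent: any two determine the third.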

Data
Following the hierarchical approach described in Section 2, multiple complementary correlative truth datasets are relied upon to provide independent measurements for validation. We have leveraged three datasets for this purpose, namely uplooking spectrometer total-column data from TCCON, balloon-borne profiles from AirCore, and finally aircraft in situ vertical profile data from ATom. Existing satellite EDR datasets from other platforms (viz., Aqua AIRS and TROPOMI) have also been utilized for global intercomparisons to demonstrate the NUCAPS products look qualitatively reasonable and geographically consistent (Warner et al., manuscript in prep for Atmos. Chem. Phys.). Specifics of the layer and unit conversions required for conducting quantitative statistical assessments of NUCAPS (Section 4) are explicitly described for completeness and reproducibility.

TCCON
The Total Carbon Column Observing Network (TCCON) [31] is a ground-based network of Bruker 125HR uplooking solar-spectrum FTS instruments that obtain spectral measurements in the near-IR region that encompasses the CO, CH4, CO2, N2O, and O2 absorption bands [31], thus comprising an independent data source for validating NUCAPS. The interferograms are collected with a 45 cm optical path difference (45 cm was chosen deliberately to optimize retrievals of CO2 in the 6000 cm⁻¹ band), yielding a spectral resolution of ≈0.02 cm⁻¹. The total column retrieval of these trace gases is achieved via a retrieval algorithm (called GFIT) that includes both the forward and inverse model calculations. The inverse algorithm employs least-squares fitting by scaling an a priori profile [31]. The a priori profiles used in the GFIT system, x_0, were obtained from the NOAA National Centers for Environmental Prediction, National Center for Atmospheric Research (NCEP/NCAR) analysis. TCCON station data can facilitate intercomparisons (acting as a "transfer-standard") between retrievals from multiple satellites.
The total column trace gases retrieved by TCCON are in dry mole fractions (DMF), whereas the NUCAPS algorithm retrieves trace gas layer abundances (in molecules/cm²) on the 100 RTA model layers. Thus, a conversion scheme must be implemented. Furthermore, because TCCON and NUCAPS have fundamental differences in vertical sensitivity, it is desirable that the TCCON column averaging kernels (AKs) be utilized in the integration of the NUCAPS observation. For explicitness, the conversion scheme and application of TCCON column AKs in the integration of NUCAPS retrieved trace gas profile EDRs are detailed in Appendix A.
In practice the column AKs are dependent only on the solar zenith angle, θ, of the measurements [31]. Thus a single set of column AKs from the Lamont Site, a(θ), is provided for gridded values of θ = 10°, 15°, ..., 85° (shown in Figure 1), which can then be interpolated to the solar zenith angle of the measurement. From Figure 1 a fundamental limitation in the utility of TCCON data for IR sounder (e.g., NUCAPS) validation becomes evident, namely the TCCON tendency for higher sensitivity in the upper layers of the atmosphere, except at larger θ for CH4 and CO2. The TCCON vertical sensitivity must be taken into account when comparing against the sounder retrieved EDRs, which typically have peak sensitivity in the mid-troposphere (discussed more in Section 4).
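The interpolation of the gridded column AKs to the solar zenith angle of a measurement can be sketched as follows. This is a minimal illustration that assumes linear interpolation in θ and clamping outside the 10°-85° grid; the text above does not specify the interpolation scheme, and the AK values in the table are hypothetical placeholders rather than actual TCCON kernels.

```python
from bisect import bisect_right

def interp_column_ak(sza_deg, sza_grid, ak_table):
    """Linearly interpolate column AK profiles (one per grid angle) to sza_deg.

    Assumptions: linear interpolation in angle; angles outside the grid are
    clamped to the nearest grid profile.
    """
    if sza_deg <= sza_grid[0]:
        return list(ak_table[0])
    if sza_deg >= sza_grid[-1]:
        return list(ak_table[-1])
    i = bisect_right(sza_grid, sza_deg) - 1          # grid angle at or below sza_deg
    w = (sza_deg - sza_grid[i]) / (sza_grid[i + 1] - sza_grid[i])
    return [(1.0 - w) * a0 + w * a1
            for a0, a1 in zip(ak_table[i], ak_table[i + 1])]

sza_grid = list(range(10, 90, 5))                    # 10, 15, ..., 85 degrees
# Hypothetical AK values on three levels, varying linearly with angle.
ak_table = [[0.8 + 0.002 * s, 1.0, 1.2 - 0.002 * s] for s in sza_grid]
ak = interp_column_ak(32.5, sza_grid, ak_table)
```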

AirCore
The NOAA Global Monitoring Laboratory (GML) AirCore sampling system [32,33,36] is an innovative in situ sampling approach that employs long, coated stainless-steel tubes to collect a sample of the ambient atmospheric air column (i.e., a "core" analogous to an ice-core). The tubes are open at one end, filled with a "fill gas" (with known levels of CO2, CH4, and CO), and configured in a tight coil so that they can be deployed upon a suitable platform, notably helium or hydrogen-filled 3000 g balloons. The AirCore is evacuated upon balloon-borne ascent and then fills with ambient air upon parachuted descent. However, unlike a radiosonde, the sampling package is tracked during its return from ≈30 km altitude to the surface (e.g., via a parachute in the case of a balloon) and subsequently sealed and recovered, where it can then be brought back to the lab for analysis using a laboratory-grade trace gas analyzer (e.g., Picarro, Inc.). AirCore thus allows in situ measurement of mole fraction samples for various trace gases (e.g., CO, CH4 and CO2) without requiring an aircraft or onboard data transmission system (e.g., as with an ozonesonde). A distinct advantage of AirCore is the capability for multiple deployments with a distributed geographic coverage over land. In this capacity AirCore has promise to be a surface-based network like TCCON, albeit with calibrated, high-resolution profiles measured in samples that survey 98% of the column, somewhat analogous to an ozonesonde network.
Because the length scale of diffusion is <0.5 m over the time it takes to analyze an AirCore sample (≈4 h), >100 discrete samples can be measured in a 100 m AirCore tube. The resultant profile resolution surpasses the 100-layer forward model grid employed in the retrieval. Thus we follow the approach documented in Nalli et al. [26] (Appendix B op. cit.), performing molecular integrations of column densities for each trace gas constituent, allowing us to redivide the atmospheric path to the 101 layer boundaries, which then allows the computation of the effective layer values in a physically rigorous manner. The conversions to NUCAPS RTA layer abundances therefore require concurrent measurements of temperature and water vapor, which are obtained from a radiosonde package flown on the AirCore payload.
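The idea of redividing a high-resolution profile onto coarser layer boundaries while conserving the number of molecules can be sketched as below. This is only an illustration of the conservation principle under simplifying assumptions (layer amounts already expressed in column units, uniform distribution within each fine layer, a generic monotonically increasing vertical coordinate); the actual procedure in [26] also involves the temperature and water vapor conversions and may differ in detail. The function name and the numbers are hypothetical.

```python
def regrid_conserving(fine_edges, fine_amounts, coarse_edges):
    """Redistribute layer amounts from fine layers onto coarse layers.

    fine_edges:   N+1 increasing boundaries of the fine layers
    fine_amounts: N layer amounts (e.g., molecules/cm^2 per fine layer)
    coarse_edges: increasing boundaries lying within the fine range
    Assumption: the amount is spread uniformly within each fine layer.
    """
    cumulative = [0.0]                      # integrated amount at each fine edge
    for amount in fine_amounts:
        cumulative.append(cumulative[-1] + amount)

    def cumulative_at(z):
        for k in range(len(fine_amounts)):
            lo, hi = fine_edges[k], fine_edges[k + 1]
            if lo <= z <= hi:
                return cumulative[k] + (z - lo) / (hi - lo) * fine_amounts[k]
        raise ValueError("coarse edge outside the range of the fine grid")

    c = [cumulative_at(z) for z in coarse_edges]
    return [c[j + 1] - c[j] for j in range(len(c) - 1)]

fine_edges = [0.0, 1.0, 2.0, 3.0, 4.0]      # hypothetical fine boundaries
fine_amounts = [4.0, 3.0, 2.0, 1.0]         # hypothetical fine-layer amounts
coarse_edges = [0.0, 1.5, 4.0]
coarse_amounts = regrid_conserving(fine_edges, fine_amounts, coarse_edges)
```

Because the coarse amounts are differences of a single cumulative integral, their sum equals the sum of the fine amounts over the same range, which is the property that simple interpolation of mixing ratios does not guarantee.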

ATom
The Atmospheric Tomography (ATom) mission [34] deployed an extensive gas and aerosol measurement payload on the NASA DC-8 aircraft for global-scale sampling of the atmosphere, profiling continuously from 0.2-12 km altitude. Flights occurred during all four seasons, originating from the Armstrong Flight Research Center in Palmdale, California, USA, flying north to the western Arctic, south to the South Pacific, east to the Atlantic, north to Greenland, before returning to California across central North America or the North American Arctic. Figure 2 shows the flight paths for the 2016-2018 sampling periods (ATom-1, -2, and -4). ATom-1 and -2 data were first used for SNPP NUCAPS development and validation prior to our J-1 validation effort; we subsequently obtained ATom-4 data for NOAA-20 validation (note that ATom-3 was still pre NOAA-20), and simply combined it with our existing ATom-1 and -2 collocation data for SNPP going forward. During ATom flights, the aircraft repeatedly ascended to 10-12 km, leveled-off, then descended to different heights at different rates. The raw aircraft data are recorded as a function of time, with reported altitudes featuring small-to-medium scale fluctuations throughout any given flight. Correlative truth profiles must thus be extracted only from smooth, continuous ascent/descents, disregarding small-scale altitude fluctuations and periods when the aircraft leveled off. Through trial and error, we devised an approach for extracting these profiles from the flight data based upon three criteria, namely the ascent/descent rate, ∆z/∆t (to find actual ascents/descents), the time difference between successive ascending/descending profiles (given that level-off periods separate the ascents and descents), and the thickness of the interquartile range covered by the profile (to ensure reasonably complete tropospheric profiles).
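The three extraction criteria described above can be sketched in code. This is a hypothetical illustration, not the procedure actually used: the threshold values (`min_rate`, `max_gap`, `min_iqr`), the function name, and the synthetic flight are all assumptions, since the text does not give the values arrived at by trial and error.

```python
from statistics import quantiles

def extract_profiles(times, alts, min_rate=2.0, max_gap=60.0, min_iqr=1000.0):
    """Return (start, end) sample indices of continuous ascents/descents.

    times in s, alts in m. All thresholds are hypothetical:
      min_rate: minimum |dz/dt| (m/s) to count as climbing or descending
      max_gap:  maximum time (s) between samples of the same profile
      min_iqr:  minimum interquartile range of altitude (m) for a profile
    """
    groups, current, prev_sign, prev_time = [], [], 0, None
    for i in range(1, len(times)):
        rate = (alts[i] - alts[i - 1]) / (times[i] - times[i - 1])
        sign = (rate >= min_rate) - (rate <= -min_rate)   # +1, -1 or 0
        if sign == 0:
            continue                       # level-off or small fluctuation
        if current and (sign != prev_sign or times[i] - prev_time > max_gap):
            groups.append(current)         # direction change or long gap
            current = []
        current.append(i)
        prev_sign, prev_time = sign, times[i]
    if current:
        groups.append(current)

    profiles = []
    for g in groups:
        if len(g) < 2:
            continue
        q1, _, q3 = quantiles([alts[i] for i in g], n=4)
        if q3 - q1 >= min_iqr:             # keep reasonably deep profiles only
            profiles.append((g[0] - 1, g[-1]))
    return profiles

# Synthetic flight: 10 m/s climb to 10 km, 10 min level-off, 10 m/s descent.
times = [10.0 * k for k in range(261)]
alts = ([100.0 * k for k in range(101)] + [10000.0] * 60
        + [10000.0 - 100.0 * k for k in range(1, 101)])
profiles = extract_profiles(times, alts)
```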
In this work we use the NOAA Picarro G2401-m in situ measurements of CO, CH4 and CO2 from ATom. More information on the ATom NOAA Picarro data can be found at https://espo.nasa.gov/atom/instrument/NOAA_Picarro. Because the 10 s average ATom Picarro data are given in mixing ratios in ppm (or dry air mole fractions) at a vertical resolution comparable to the RTM/RTA layering, we simply interpolate these data to the RTA levels (layer boundaries), then convert to layer abundances (molecules/cm²) for the statistical assessment of the NUCAPS EDRs. This is in contrast to the molecule-conservation approach required for high-resolution data (e.g., AirCore) as briefly alluded to in the previous section.
Relevant to the JPSS requirements, the ATom statistics on total columns in Section 4 are computed as follows. NUCAPS performs retrievals of CO and CH4 concentrations (as well as H2O and O3) in layer abundance space (molecules/cm²). Therefore column assessments for CO and CH4 are performed for total column quantities by integrating the NUCAPS retrieved layer abundances; CO2, on the other hand, is retrieved in mixing ratios (ppm), and thus we need only take the mean for the total column (CO2 is treated differently given that CO2 channels are used first in the physical retrieval steps for the T profile retrievals). The column abundance for atmospheric species x (viz., CO and CH4) is defined as the vertical integral of the number density N_x between the measurement level height z and the top measurement height z_t
$$\Sigma_x(z) \equiv \int_{z}^{z_t} N_x(z')\,dz' \qquad (1)$$
For the NUCAPS retrieval on the RTA layers, the total column may be computed from the finite difference formula [26]
$$\Sigma_x(z_s) \approx \sum_{L=1}^{L_b-1} N_{x,L}\,\delta z_L + F_{BL}\,N_{x,L_b}\,\delta z_{L_b} \qquad (2)$$
where z_s is the surface altitude and the quantities N_{x,L} δz_L are the NUCAPS retrieved layer abundances for gas species x and RTA layer L (of thickness δz_L), L_b is the bottom partial layer, and F_BL is the bottom-layer multiplier factor defined as
$$F_{BL} \equiv \frac{p_s - P_{l_b-1}}{P_{l_b} - P_{l_b-1}} \qquad (3)$$
where p_s is the surface level (boundary) pressure, and P_{l_b} and P_{l_b−1} are the bottom-layer boundary pressures (i.e., the pressures of the bottom two levels, l_b and l_b − 1).
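The total column integration with a partial bottom layer can be sketched as follows. The sketch assumes that layers are ordered from the top of the atmosphere downward and that the bottom-layer multiplier is the fraction of the bottom layer's pressure thickness lying above the surface, F_BL = (p_s − P_upper)/(P_lower − P_upper); the function names, pressure levels and abundances are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def bottom_layer_factor(p_surf, p_upper, p_lower):
    """Fraction of the bottom layer (in pressure) that lies above the surface.

    Assumption: F_BL = (p_s - P_upper) / (P_lower - P_upper), where P_upper
    and P_lower are the pressures of the two levels bounding the bottom layer.
    """
    return (p_surf - p_upper) / (p_lower - p_upper)

def total_column(layer_abundances, level_pressures, p_surf):
    """Sum layer abundances from the top down to the surface pressure.

    layer_abundances[k] lies between level_pressures[k] and
    level_pressures[k + 1]; pressures increase downward.
    """
    total = 0.0
    for k, amount in enumerate(layer_abundances):
        p_upper, p_lower = level_pressures[k], level_pressures[k + 1]
        if p_surf >= p_lower:
            total += amount                          # full layer above surface
        elif p_surf > p_upper:
            total += bottom_layer_factor(p_surf, p_upper, p_lower) * amount
            break                                    # partial bottom layer
        else:
            break                                    # layer is below the surface
    return total

# Hypothetical coarse grid (hPa) and layer abundances (arbitrary units).
level_pressures = [100.0, 500.0, 900.0, 1100.0]
layer_abundances = [1.0, 2.0, 4.0]
column = total_column(layer_abundances, level_pressures, p_surf=1000.0)
```

In this example the surface at 1000 hPa cuts the lowest layer in half, so only half of its abundance contributes to the column.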

NUCAPS Retrievals
The NUCAPS retrieval sensitivity to state profile parameters (e.g., trace gas concentration) can be inferred from the retrieval AKs. The AK matrix is theoretically defined as A ≡ ∂x̂/∂x [37,38,39,40], where A is a square matrix dimensioned m × m, m being the number of layers for the retrieved (i.e., estimated) and "true" (correlative) profiles, x̂ and x, respectively. Note that the retrieval x̂ is related to x via the measurement equation x̂ = I[F(x, b), b, c], where F is the forward model with parameters b (e.g., spectroscopy), and I is the inverse model (i.e., retrieval), with parameters c not included in F (i.e., unrelated to the measurement) [26,39]. In the case of the NUCAPS algorithm, trapezoidal basis functions are used in the physical retrievals of each parameter (e.g., CO, CH4, CO2), and thus the corresponding A matrices must be transformed to "effective AKs" on the RTA layers, A_e (dimensioned n × n, where n ≡ 100 > m is the number of RTA layers), the details of which can be found in earlier papers [26,40]. Figure 3 shows zonal-mean NUCAPS effective AKs taken from a global Focus Day (23 January 2020) for the tropics, northern and southern hemisphere (NH and SH) midlatitude, and polar zones. The Focus Day includes on the order of 220,000 NUCAPS retrievals over the entire globe, which, generally speaking, is considered representative of the range of global atmospheric conditions. The plots show the RTA column (or area, i.e., the row-sum along the first dimension) effective AKs for the CO, CH4, and CO2 channels [16] (subplots a-c, respectively). It can be seen that the peak sensitivities comprise broad layers. For CO, the peak roughly spans 600 to 300 hPa and remains fairly constant from the poles to the tropics. However, sensitivity is markedly less in the polar zones, which is related to the lower tropopause, with sensitivity lowest in the SH plausibly due to lower ambient concentrations associated with substantially reduced source regions.
There may also be some seasonal variability not accounted for in the Focus Day sample (which was during boreal winter). For CH4 and CO2, the peak sensitivities are somewhat lower in magnitude and higher in altitude than CO (≈400-200 hPa and 300-200 hPa, respectively), with the height and sensitivity likewise decreasing with latitude zone.
The ability of the CrIS sensor to provide information about the trace gas profiles is also demonstrated by considering the NUCAPS algorithm degrees-of-freedom (DoF), defined as the sum of the A matrix diagonals, representing the total vertical information content [9]. Figure 4 shows the NUCAPS DoF for CO, CH4 and CO2 for the same Focus Day as in Figure 3. DoF for CO are mostly ≥1 (i.e., contain ≥1 independent pieces of information from the CrIS spectra) for most of the globe equatorward of the polar zones, with the exception of some high altitude locations, whereas areas with DoF ≥1 for CH4 and CO2 are primarily limited to the tropics (where the tropopause is at a higher altitude). Generally we expect greater retrieval skill in the regions with higher DoF.
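The two diagnostics discussed in this section, the DoF (trace of the AK matrix) and the column AK, can be computed as in the sketch below. The 3 × 3 matrix is a hypothetical stand-in for an effective AK matrix, and the column AK is taken here as the sum over the first index of the matrix, which is one reading of "the row-sum along the first dimension"; if the intended convention is the sum over the second index, the two loops in `column_ak` would simply be swapped.

```python
def degrees_of_freedom(A):
    """DoF as the sum of the diagonal elements (trace) of the AK matrix."""
    return sum(A[i][i] for i in range(len(A)))

def column_ak(A):
    """Column AK: for each layer j, sum A[i][j] over the first index i.

    Assumption: this index convention is one reading of the text; the
    alternative convention sums over j for each i.
    """
    n = len(A)
    return [sum(A[i][j] for i in range(n)) for j in range(n)]

# Hypothetical 3-layer AK matrix (real effective AKs are 100 x 100).
A = [[0.3, 0.1, 0.0],
     [0.2, 0.4, 0.1],
     [0.0, 0.1, 0.2]]
dof = degrees_of_freedom(A)
col = column_ak(A)
```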

Results and Discussion
In the following sections, the NUCAPS carbon trace gas retrievals are statistically validated versus the collocated baseline datasets described in Sections 3.1-3.3. In these analyses, we apply essentially the same collocation methodology as that used for our earlier ozone profile validation [22], whereby we impose a space-time collocation criterion in an effort to strike a balance between collocation mismatch uncertainty and sample size.
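A space-time collocation test of this kind can be sketched as follows. The sketch assumes a great-circle (haversine) distance on a spherical Earth and uses default thresholds of 100 km and ±2 h, which are the values quoted for the AirCore collocations later in this section (±1.5 h is used for ATom); the function names, the dictionary layout and the example coordinates are hypothetical.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_collocated(retrieval, reference, max_km=100.0,
                  max_dt=timedelta(hours=2)):
    """True if the retrieval lies within max_km and max_dt of the reference."""
    distance = great_circle_km(retrieval["lat"], retrieval["lon"],
                               reference["lat"], reference["lon"])
    dt = abs(retrieval["time"] - reference["time"])
    return distance <= max_km and dt <= max_dt

# Hypothetical reference launch and two candidate retrievals.
launch = {"lat": 40.0, "lon": -105.0, "time": datetime(2019, 6, 1, 19, 30)}
near = {"lat": 40.5, "lon": -105.0, "time": datetime(2019, 6, 1, 20, 15)}
far = {"lat": 41.0, "lon": -105.0, "time": datetime(2019, 6, 1, 20, 15)}
matches = [is_collocated(r, launch) for r in (near, far)]
```

Tightening either threshold reduces mismatch uncertainty at the cost of sample size, which is the trade-off described above.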

TCCON Stations (SNPP Focus Days: Apr Jun Aug Oct Dec Feb)
As mentioned in Section 3.1, statistical comparisons of NUCAPS with TCCON require unit conversions, as well as integration of the NUCAPS 100 layer profiles. In this case, the NUCAPS profiles (in layer abundances, molecules/cm²) are first converted to dry mole fractions and then integrated into a total column value with or without the TCCON AKs applied, as detailed in Appendix A. The results for SNPP NUCAPS retrievals versus individual TCCON stations, ordered from south to north, are summarized in Figure 6; these plots show reasonable consistency of the SNPP NUCAPS retrievals versus individual TCCON stations (similar results were obtained for NOAA-20, but not shown here due to space constraints). The positive bias evident in the CO results may in part be due to the different vertical sensitivities between the NUCAPS and TCCON measurements, as evidenced by column AK peak altitudes shown in Figures 1 and 3. The TCCON vertical sensitivities for CO are weighted toward the upper troposphere and above, whereas sensitivities for CH4 and CO2 transition to the troposphere for larger solar zenith angles, with a crossover point roughly in the mid-troposphere (≈450 hPa). NUCAPS retrievals, on the other hand, being derived from passive thermal IR spectra, tend to have peak sensitivity weighted toward the mid-troposphere. In addition to the different instrument sensitivities, however, there is also a known problem in the TCCON X_CO scaling, wherein the TCCON data were scaled down by ≈6.7% to match older aircraft data. There is now less confidence placed in this value given that it has changed as more recent in situ profiles have been added for comparison [62], and thus it is believed that this scaling factor also contributes to the observed discrepancy between the NUCAPS and TCCON CO.
The results for the complete QA-ed samples (N = 472, 422, 540, yields = 74%, 67%, 85% for CO, CH4 and CO2, respectively) are summarized as scatterplots and histograms in Figures 7 and 8, respectively. The scatterplots show reasonable correlation between the total column retrievals and TCCON measurements (r = 0.89, 0.63, 0.86 for CO, CH4 and CO2, respectively), and the histograms show roughly Gaussian distributions in the errors. Featured in the histograms are results with (blue) and without (red) the TCCON AKs applied to the integrations, which for CH4 and CO2 show very little difference when the AKs are applied. However, for CO a somewhat larger bias is seen when TCCON AKs are applied. At first this may seem counterintuitive, but this is likely because, as already mentioned, the TCCON AKs for CO (unlike CH4 and CO2) all peak above the upper troposphere/lower stratosphere (UT/LS), whereas the NUCAPS AKs for CO peak in the mid-troposphere. Thus, greater weight is given to the UT/LS when TCCON AKs are applied to NUCAPS, and given that NUCAPS has no skill above 100 hPa, we therefore would expect less agreement in the total column results.

Statistical Analysis versus AirCore Baseline
Like TCCON, AirCore data can provide spot-checks and an additional evaluation method for comparing results from multiple satellites (viz. SNPP and NOAA-20). NOAA/GML provided us with 42 complete AirCore profiles launched over the period of 22 March 2018 to 30 January 2020. The AirCore balloon launches were timed for LEO satellite overpasses, specifically the Orbiting Carbon Observatory-2 (OCO-2) within the NASA A-Train constellation (01:30 and 13:30 local equator crossing time orbit), which fortuitously collocate with the SNPP and NOAA-20 overpasses in the same afternoon orbit. NUCAPS fields of regard (FORs) are included within ∆r ≤ 100 km radius and ∆t within ±2 h of the AirCore launches; Figure 9 shows the locations of collocated NOAA-20 NUCAPS FORs along with the AirCore launch sites. One may see that the samples are primarily located over North America, with a handful located in Europe. The original "high density" profiles were reduced to the NUCAPS 100 RTA layer abundances as described in Section 3.2. To perform the unit conversions it was necessary to utilize the AirCore payload InterMet-1 radiosonde temperature and relative humidity (RH) measurements, but negative RH values were sometimes reported by the sonde in the UT/LS. To get around this problem, we simply adjusted these values to a small positive number (1%), as the stated accuracy of the InterMet-1 RH sensor is ±5%. To justify this, we performed a simple sensitivity test to determine the error in layer abundance for a +1% RH perturbation (performing conversions with 1% added to the entire RH profile and then subtracting the unperturbed values). The results are presented as a function of ambient water vapor mixing ratio (ppmv) and pressure altitude in Figure 10, where it can be seen that the error from a 1% RH adjustment is negligible. Because of the limited DoF and vertical sensitivity of the instrument, we also use AKs in our evaluation of the NUCAPS carbon trace gas profile retrievals.
Thus the analysis will include results based upon "smoothed" correlative truth data, x_s, which is obtained by applying the NUCAPS effective AKs A_e to the original high-resolution truth profile x [26,39,40]
$$\ln(\mathbf{x}_s) = \ln(\mathbf{x}_0) + \mathbf{A}_e \left[ \ln(\mathbf{x}) - \ln(\mathbf{x}_0) \right] \qquad (4)$$
where x_0 is the a priori profile. Using x_s in place of x in the statistical analyses effectively removes the null-space error associated with the limited vertical resolution inherent in the radiances used by the retrieval algorithm. However, caution must be exercised when using this approach. When the algorithm possesses little-to-no sensitivity to a profile parameter x, the AK matrix becomes a null matrix, A_e → 0_{n,n}, and the second term on the right in Equation (4) goes to zero. In this case, both the smoothed truth profile and the retrieval reduce to the a priori, x_0. Although the result would indicate that the retrieval system is self-consistent and working properly, it would also give the misleading appearance of a perfect retrieval of the true atmospheric state, which is definitely not the case. Figures 11 and 12 show the resulting statistical comparisons of the collocated NOAA-20 NUCAPS retrievals versus the AirCore profiles, without and with AKs applied, respectively. The results for CO, CH4 and CO2 are shown in the left, middle and right plots, respectively. From Figure 9, we recall that the AirCore profiles are all located over Northern Hemisphere (NH) land-based sites (viz., North America and Europe). We subsequently found that several of these profiles exhibited very large gradients that are well outside the theoretical vertical resolution limitations of the CrIS sensor, with vertical gradients in the AirCore high-resolution profiles not well captured by the NUCAPS climatological a priori. The profile statistics for AirCore are shown on the coarse layers defined by the NUCAPS algorithm trapezoidal basis functions, similar to the statistical analyses of NUCAPS T(p)/H2O/O3 profiles [21,22]. For this small NH continental sample, the NUCAPS retrievals are found to exhibit a somewhat positive bias in CO and CH4, both without (Figure 11) and with AKs (Figure 12) applied.
The latter indicates biases not arising from null-space errors (25% and 2.5%, respectively), and thus there is systematic error in the layers where NUCAPS has sensitivity. This results from AirCore profiles with higher observed concentrations (not shown here) in the lower troposphere (700 hPa), decreasing rapidly toward the mid-troposphere (≈500 hPa), then increasing again toward the upper troposphere (≈200 hPa). NUCAPS, on the other hand, has sensitivity in the mid-troposphere (Figure 3), and the a priori concentrations generally decrease with height. In contrast, the CO 2 retrievals exhibit a very small negative bias (≈0.5%), most of which is apparently null-space error, as seen in the results with AKs applied (Figure 12, right), indicating that the retrieval is accurate in the layers of sensitivity (Figure 3). Precision magnitudes (random errors) for all three gases are roughly comparable to their accuracies (systematic errors), with some of those errors originating from their null spaces, especially for carbon monoxide and, to a lesser extent, carbon dioxide. Given the limited size and geographic representation of the sample, these results should not be considered definitive or globally representative, but they do offer insight into the challenges inherent in retrieving regional profiles over land. More importantly, these first-use results demonstrate the potential utility of the AirCore sampling system for operational trace gas validation.
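As an illustration of the AK smoothing and of the null-space caveat discussed above, the following is a minimal sketch and not the operational NUCAPS code. It assumes the standard log-space form of Equation (4), ln(x s ) = ln(x 0 ) + A e [ln(x) − ln(x 0 )]; the function name, the three-layer profiles and the AK matrices are hypothetical and chosen only for demonstration.

```python
import math

def smooth_truth(x_true, x_apriori, ak):
    """Smooth a truth profile with an effective AK matrix in log space.

    Assumed form: ln(x_s) = ln(x_0) + A_e [ln(x) - ln(x_0)].
    """
    n = len(x_true)
    # Log-space departure of the truth from the a priori in each layer
    departure = [math.log(x_true[j]) - math.log(x_apriori[j]) for j in range(n)]
    x_smoothed = []
    for i in range(n):
        # Row i of A_e weights the departures from all layers
        weighted = sum(ak[i][j] * departure[j] for j in range(n))
        x_smoothed.append(math.exp(math.log(x_apriori[i]) + weighted))
    return x_smoothed

# Hypothetical three-layer CO profiles (ppbv); illustrative values only
x_true = [120.0, 90.0, 70.0]
x_apriori = [100.0, 85.0, 75.0]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
null = [[0.0, 0.0, 0.0] for _ in range(3)]

full = smooth_truth(x_true, x_apriori, identity)  # perfect sensitivity
none = smooth_truth(x_true, x_apriori, null)      # no sensitivity
```

With an identity AK matrix the smoothed profile reproduces the truth, whereas with a null AK matrix it collapses onto the a priori, which is the situation in which a comparison against smoothed truth would falsely suggest a perfect retrieval.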

Statistical Analysis versus ATom Baseline
The in situ global data from the ATom intensive campaigns are considered to be at the top of our validation "hierarchy" (cf. Section 2). Thus, while we relied more on the TCCON analyses for the developmental phases of the trace gas algorithms (per the hierarchical approach), we give higher weight to the ATom data for a final quantitative evaluation of the NUCAPS carbon gas EDR product performance relative to the metrics defined by the JPSS Level 1 requirements summarized in Table 1. Although JPSS requirements are applicable to the total system error (including null-space error), it is nevertheless imperative to include AKs in the validation of the carbon trace gases as in Section 4.2, with the caveats discussed above in that section. Similar to the analyses for TCCON and AirCore, NUCAPS FORs are collocated within a radius of ∆r ≤ 100 km and a time window of ∆t within ±1.5 h of the ATom measurements. Figure 13 shows the dates and locations of SNPP and NOAA-20 NUCAPS FORs collocated with the midpoint of extracted profiles from the ATom-1, -2, and -4 campaigns. These maps show the excellent global zonal representation of the validation sample, albeit primarily over oceans. Although the NUCAPS retrievals may generally be "easier" (i.e., more accurate) over ocean surfaces (i.e., where the surface emission/reflectance properties are relatively uniform and well characterized relative to the retrieval uncertainties) [67], this is not always the case [68], and operational satellite data have been demonstrated to make their greatest impact over the data-sparse oceans [69]. Thus the ATom data are of singular value for our validation.
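The space-time collocation criteria stated above (∆r ≤ 100 km, ∆t within ±1.5 h) can be illustrated with the following minimal sketch. It is not the collocation code used for this paper; the haversine great-circle distance, the dictionary record layout, the field names and the sample coordinates are all assumptions made for demonstration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def collocate(fors, truth, max_km=100.0, max_hours=1.5):
    """Return the FORs within max_km and max_hours of a truth observation."""
    matches = []
    for f in fors:
        dist = great_circle_km(f["lat"], f["lon"], truth["lat"], truth["lon"])
        dt = abs(f["time_h"] - truth["time_h"])
        if dist <= max_km and dt <= max_hours:
            matches.append(f)
    return matches

# Hypothetical records: times are hours since a common reference epoch
truth = {"lat": 0.0, "lon": 0.0, "time_h": 12.0}
fors = [
    {"id": "a", "lat": 0.5, "lon": 0.5, "time_h": 12.5},  # close in space and time
    {"id": "b", "lat": 2.0, "lon": 0.0, "time_h": 12.0},  # too far away
    {"id": "c", "lat": 0.1, "lon": 0.1, "time_h": 14.0},  # too late
]
matches = collocate(fors, truth)
```

Only the first FOR satisfies both criteria in this example; a production implementation would typically also handle date-line crossings in time bookkeeping and use a spatial index for large samples.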
Based on the NUCAPS-ATom collocation samples, the global profile error statistics for the NUCAPS retrievals (IR accepted cases, clear to partly cloudy, with trace gas QA applied (Warner et al., manuscript in prep for Atmos. Chem. Phys.)) are computed versus the ATom NOAA Picarro baseline; as before (cf. Section 4.2), the results are summarized in Figures 14-17, with CO, CH 4 and CO 2 statistics shown in the left, center, and rightmost plots. Because the ATom profiles generally exhibit smaller vertical gradients and are closer to the NUCAPS a priori profiles (as opposed to AirCore), we display these results on the original 100 RTA layers (as opposed to trapezoidal coarse layers). For reference, the JPSS Level 1 global specification requirements (Table 1) for accuracy (bias) and precision (variability) are included in the plots with dashed gray lines. Figures 14 and 15 show results for NOAA-20 and SNPP, respectively, and these are followed by Figures 16 and 17, which show the results with the NUCAPS AKs applied to the ATom profiles.
In the leftmost plots of Figures 14 and 15 we find that the CO accuracy (bias) for the broad layer between 400-600 hPa (which corresponds to the region where the algorithm has maximum sensitivity) is reasonably close to, or within, JPSS requirements; CH 4 and CO 2 biases, on the other hand, are well within requirements throughout the troposphere, with the CH 4 bias statistically close to zero (at the 2σ level) below 400 hPa. Precisions (variabilities) for CO and CH 4 fall somewhat outside the requirements, whereas the CO 2 precision meets requirements throughout the entire tropospheric column.
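The comparison of layer statistics against requirement thresholds can be illustrated as follows. This is a minimal sketch under stated assumptions: accuracy is taken as the mean, and precision as the sample standard deviation, of the percent relative difference between retrieval and truth within one layer; any weighting used in the actual analyses is omitted, and the sample values and thresholds are hypothetical placeholders rather than the Table 1 specifications.

```python
import math

def bias_and_precision(retrieved, truth):
    """Mean and sample standard deviation of percent relative differences."""
    rel = [100.0 * (r - t) / t for r, t in zip(retrieved, truth)]
    n = len(rel)
    bias = sum(rel) / n
    variance = sum((d - bias) ** 2 for d in rel) / (n - 1)
    return bias, math.sqrt(variance)

def meets_requirement(bias, precision, max_bias, max_precision):
    """Return (accuracy_ok, precision_ok) for the given thresholds in percent."""
    return abs(bias) <= max_bias, precision <= max_precision

# Hypothetical collocated values for a single layer (arbitrary units)
retrieved = [102.0, 98.0, 104.0, 100.0]
truth = [100.0, 100.0, 100.0, 100.0]

bias, precision = bias_and_precision(retrieved, truth)

# Placeholder thresholds in percent; not the JPSS Level 1 values
accuracy_ok, precision_ok = meets_requirement(bias, precision, max_bias=1.5, max_precision=1.0)
```

In this toy sample the bias passes the placeholder accuracy threshold while the variability exceeds the placeholder precision threshold, mirroring the kind of mixed outcome described above for CH 4 .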

Conclusions and Future Work
This work has presented the formal validation of NOAA-20 and SNPP NUCAPS IR atmospheric carbon trace gas profile EDRs (CO, CH 4 and CO 2 ), in continuation of the validation of the T, q and O 3 profile EDRs described in earlier papers [21-24]. Because of the NUCAPS cloud-clearing methodology, the NUCAPS atmospheric profile EDRs, including trace gases, are retrieved under global, non-precipitating conditions, allowing the benefit and advantage of twice-per-day (per satellite) global yields on the order of 40-70%.
The NUCAPS IR sounder validation strategy employs a "hierarchical" approach drawing upon multiple independent baseline truth datasets [26], including TCCON ground-based spectrometers, AirCore profiles, and ATom aircraft-based in situ profiles. Based upon these globally representative data, we have conducted ongoing statistical analyses (per the JPSS Cal/Val Program) that have provided guidance for the recent NUCAPS trace gas algorithm improvements validated in this work (Warner et al., manuscript in prep for Atmos. Chem. Phys.). The NUCAPS optimal estimation (OE) physical retrievals generally improve upon the climatological a priori (not shown here due to space limitations) where CrIS has sensitivity (Figure 3). We have subsequently shown here that the carbon trace gas EDRs (CO, CH 4 , and CO 2 ) from the latest version of NUCAPS are performing reasonably and within expectations. It is noted that the truth data used in these analyses span all global climate zones (tropical, midlatitude and polar), as well as land and ocean locations (Figures 5, 9 and 13). Based upon our analysis comparing to global in situ vertical profiles from the ATom campaigns, it has been shown that the NUCAPS CrIS-FSR carbon trace gas profile EDRs generally meet JPSS Level 1 global performance requirements (Tables 1 and 2), with the exception of the stringent 1% CH 4 precision specification, which may be extremely difficult to achieve in practice.
Future work on the NUCAPS trace gas products includes optimization of the damping parameters, implementation of QA for the CO 2 retrievals, improvements to the SARTA forward model surface emissivity first guess (land, ocean and snow/ice), as well as exploring additional trace gas products (e.g., NH 3 , SO 2 , isoprene, PAN) and collaborations with in situ data providers (e.g., NOAA/GML). The NUCAPS AKs are planned to be included as standard output in a future version of the operational NetCDF files (currently the AKs are output only to offline binary files), and the NUCAPS algorithm will also be supported operationally for data from the EUMETSAT Metop-B, -C and Metop-SG hyperspectral IASI systems. Finally, we express our appreciation to the four anonymous reviewers who provided constructive feedback that we used to improve the quality of this paper. The views, opinions, and findings contained in this paper are those of the authors and should not be construed as an official NOAA or U.S. Government position, policy, or decision.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

and the water vapor partial pressure is computed from

$$p_q(P) = P \, \frac{Q_v(P) \cdot 10^{-6}}{1 + Q_v(P) \cdot 10^{-6}}, \qquad (A8)$$

where Q v is the volume mixing ratio in parts per million (ppmv). The retrieval dry mole fraction is then computed from Equation (A2) as X d (P) = X(P) where from Equation (A6)

Application of TCCON Column AKs
Rodgers and Connor [30] formulated the theoretical basis for performing rigorous intercomparisons of remotely sensed atmospheric soundings obtained by instruments with differing measurement characteristics. For total column estimates from two observing systems, C 1 and C 2 , the expected difference is given by [30]

$$C_1 - C_2 = (\mathbf{a}_1 - \mathbf{a}_2)^T (\mathbf{x} - \mathbf{x}_c) + (\varepsilon_1 - \varepsilon_2), \qquad (A11)$$

where a 1 , a 2 are the column averaging kernels and ε 1 , ε 2 are the column measurement errors for each sensor, respectively, x is the "true" atmospheric profile state (implicitly in dry mole fraction, omitting the subscript d in vector notation for convenience), and x c is the central tendency of the ensemble (assumed to be Gaussian); we take the subscripts "1" and "2" to denote NUCAPS and TCCON, respectively. The corresponding variance, σ 2 , is given by [30]

$$\sigma^2_{C_1 - C_2} = (\mathbf{a}_1 - \mathbf{a}_2)^T \mathbf{S}_c (\mathbf{a}_1 - \mathbf{a}_2) + (\sigma_1^2 + \sigma_2^2), \qquad (A12)$$

where S c is the background covariance matrix. Given a known "true" profile state, x, along with S c , Equations (A11) and (A12) can be used to verify rigorously whether a collocated NUCAPS and TCCON column observation are consistent within their theoretical measurement limitations. However, in the current application we are given only a profile estimate (NUCAPS retrieval, x̂ 1 ) and a column estimate (TCCON observation, C 2 ) for the purpose of evaluating NUCAPS using TCCON as a reference, while the "true" profile state remains unknown. Given the significant differences between each system's AKs (cf. Figures 1 and 3), and that the NUCAPS x̂ 1 is an OE retrieval, we estimate the TCCON observation of that state by integrating x̂ 1 using the TCCON AKs [30]

$$C_{12} = C_0 + \mathbf{a}_2^T (\hat{\mathbf{x}}_1 - \mathbf{x}_0), \qquad (A13)$$

where C 0 and x 0 denote the TCCON column and profile a priori, respectively. This equation roughly follows from Equation (A11) by assuming a 1 ≡ i (the unit vector), x ≡ x̂ 1 (the NUCAPS retrieved profile is used in lieu of the unknown truth), ε 2 ≈ 0 (the TCCON measurement is accurate), and x c ≡ x 0 (i.e., the ensemble central tendency is captured by the TCCON a priori).
C 12 can then be used in place of C 1 for comparisons against the TCCON observations, C 2 , in empirically estimating ε 1 .
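The column operations described above may be sketched as follows. This is an illustrative example only, not the code used for the analyses. It assumes that Equation (A13) takes the form C 12 = C 0 + a 2 T (x̂ 1 − x 0 ), which follows from Equation (A11) under the stated assumptions, and that the profile vectors hold layer contributions to the column (i.e., any pressure weighting is already folded in) so that the unit vector simply sums them. All names and numerical values are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def tccon_smoothed_column(c0, a2, x1_hat, x0):
    """Assumed form of Eq. (A13): C_12 = C_0 + a_2^T (x1_hat - x_0)."""
    return c0 + dot(a2, [xr - xa for xr, xa in zip(x1_hat, x0)])

def difference_variance(a1, a2, s_c, var1, var2):
    """Eq. (A12): (a_1 - a_2)^T S_c (a_1 - a_2) + (var1 + var2)."""
    d = [p - q for p, q in zip(a1, a2)]
    s_d = [dot(row, d) for row in s_c]  # matrix-vector product S_c d
    return dot(d, s_d) + var1 + var2

# Hypothetical three-layer example; values are layer contributions to the column
x0 = [600.0, 700.0, 500.0]        # TCCON a priori profile
x1_hat = [610.0, 690.0, 505.0]    # NUCAPS retrieved profile
c0 = sum(x0)                      # TCCON a priori column
a2 = [0.9, 1.0, 1.1]              # TCCON column averaging kernel
a1 = [1.0, 1.0, 1.0]              # unit vector, as assumed for NUCAPS

c12 = tccon_smoothed_column(c0, a2, x1_hat, x0)
c_unit = tccon_smoothed_column(c0, a1, x1_hat, x0)

# Diagonal background covariance and column error variances (illustrative)
s_c = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
var_diff = difference_variance(a1, a2, s_c, var1=1.0, var2=0.25)
```

With a unit-vector AK the smoothed column reduces to the plain sum of the retrieved profile, which provides a simple consistency check on the implementation.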
The right-hand side of Equation (A13) integrates the NUCAPS profile in a manner approximating what TCCON would have observed (under the same ambient environmental conditions) by applying the TCCON AKs within the integration. The two terms are computed as follows: