Uncertainty Characterization and Propagation in the Community Long-Term Infrared Microwave Combined Atmospheric Product System (CLIMCAPS)

The Community Long-term Infrared Microwave Combined Atmospheric Product System (CLIMCAPS) retrieves multiple Essential Climate Variables (ECVs) describing the vertical structure of the atmosphere from hyperspectral infrared measurements made by the Atmospheric Infrared Sounder (AIRS, 2002–present) and its successor, the Cross-track Infrared Sounder (CrIS, 2011–present). CLIMCAPS ECVs are profiles of temperature and water vapor, column amounts of greenhouse gases (CO2, CH4), ozone (O3) and precursor gases (CO, SO2) as well as cloud properties. AIRS (and CrIS) spectral measurements are highly correlated signals of many atmospheric state variables. CLIMCAPS inverts an AIRS (or CrIS) measurement into a set of discrete ECVs by employing a sequential Bayesian approach in which scene-dependent uncertainty is rigorously propagated. This not only linearizes the inversion problem but also explicitly accounts for spectral interference from other state variables so that the correlation among ECVs (and their uncertainty) may be minimized. Here, we outline the CLIMCAPS retrieval methodology with a specific focus on its sequential scene-dependent uncertainty propagation system. We conclude by demonstrating continuity in two CLIMCAPS ECVs across AIRS and CrIS so that a long-term data record may be generated to study the feedback cycles characterizing our climate system.


Introduction
Modern-era hyperspectral infrared sounders measure emitted radiance at the Top Of Atmosphere (TOA) in hundreds of narrow spectral channels. Inversion methods convert these radiances (Level 1 sounding measurements) into vertical quantities of temperature and moisture as well as greenhouse and pollutant gases (Level 2 retrieval products) for target applications. The Bayesian Optimal Estimation (O-E) inversion framework is widely accepted by the scientific community as the most suitable for retrieving science quality products because it allows uncertainty propagation per datum [1][2][3][4][5][6][7][8]. (We use the term "datum" here to refer to an individual Level 2 retrieval footprint.) Despite a good framework, however, the retrieval of multiple discrete Essential Climate Variables (ECVs; [9]) from sounding measurements for climate applications remains exceedingly complex. Most inversion algorithms either ignore or oversimplify the confounding influence of trace gases on temperature and moisture retrievals. Others that retrieve the full set of atmospheric state variables all at once ignore cross-correlation among variables by using block-diagonal error covariance matrices. For these reasons, differences between sounding ECVs from multiple instruments and retrieval algorithms can confound understanding [10,11].
We present here a novel approach for the retrieval of multiple ECVs from a single spectral measurement that fully accounts for and minimizes cross-correlations with a methodology that quantifies and propagates scene-dependent uncertainty sequentially. This specific source of uncertainty is often overlooked in traditional O-E systems (e.g., [12]) because it is difficult and cumbersome to quantify. The fact is that hyperspectral infrared radiances are highly correlated spectral signatures of many coincident variables. For example, the amount and height of clouds affect the observing capability of tropospheric ECVs (~800 cm$^{-1}$); the amount of water vapor molecules affects mid-tropospheric methane (CH4, ~1300 cm$^{-1}$); surface emissivity affects carbon monoxide (CO, ~2150 cm$^{-1}$); the rate at which temperature warms or cools with pressure (known as lapse rate) affects vertical resolution and, when low, hinders the ability to retrieve fine structural features in temperature and water vapor. In short, no set of narrow spectral channels is sensitive to only one atmospheric variable. Even when channels are selected to have high signal-to-noise for only one target retrieval variable [13], they remain sensitive to other atmospheric state variables, which becomes a source of uncertainty in O-E inversion. What complicates matters is that the interference from background state variables is highly scene-dependent, with each variable changing differently over time and space. If this source of uncertainty is not quantified, then the inversion system will propagate it into the retrieved ECVs instead of into their associated uncertainty estimates, ultimately limiting the usefulness of these ECVs in climate applications and, worse, leading to misinterpretation.
Historically, inversion methods are designed to retrieve state variables for weather applications [14][15][16][17][18][19][20][21][22][23] that can tolerate a degree of scene-dependent uncertainty and correlation because the high rate of change in weather systems greatly exceeds product uncertainty. It is when one considers climate applications that depend on the quantification of a low rate of change and subtle correlations among ECVs over large space-time scales that product uncertainty must be carefully treated and fully characterized. In this paper, we introduce the Community Long-term Infrared Microwave Combined Atmospheric Product System (CLIMCAPS) that we designed to retrieve ECVs from a multi-decadal harmonized record of the Atmospheric Infrared Sounder (AIRS, in low-Earth orbit since 2002) and Cross-track Infrared Sounder (CrIS, in a similar orbit since 2011) for climate applications. The CLIMCAPS inversion methodology was derived from the AIRS Science Team (AST) implementation of a Bayesian O-E approach [5][6][7][8][24][25][26][27] for the AIRS instrument [2,28]. CLIMCAPS shares its AST heritage with the NOAA-Unique Combined Atmospheric Processing System (NUCAPS) that was operationally implemented as the AST version 5.9 retrieval algorithm [29] for the Infrared Atmospheric Sounding Interferometer (IASI) on MetOp-A/B/C, CrIS on SNPP, as well as the Joint Polar Satellite System (JPSS) series [30] of which NOAA-20 is the first. Like NUCAPS, CLIMCAPS retrieves multiple atmospheric and diagnostic variables with which to characterize the atmospheric state [31][32][33]. These include profiles of temperature, water vapor, greenhouse and pollutant gases (O3, CO, CH4, SO2, HNO3, and N2O), cloud top pressure, cloud fraction, surface temperature and emissivity as well as an array of diagnostic metrics that characterize scene complexity and retrieval quality.
Unlike these legacy systems, CLIMCAPS adopts a novel scene-dependent uncertainty propagation scheme to retrieve ECVs that are largely independent of each other and the background atmospheric state. Scene-dependent uncertainty is one of the main sources of uncertainty in remote sounding systems and must be quantified and propagated to characterize ECV cross-correlations and correctly interpret differences when compared to other Climate Data Records (CDRs). CLIMCAPS evaluates the information content of each datum individually and does not make a priori assumptions about the type or amount of information a measurement should contain. With this it maintains the true instrument observing capability over long time periods. Our work is motivated by a desire to move beyond single variable estimates of the climate system by creating a long-term record of multiple coincident ECVs that are spectrally uncorrelated and also well-characterized so that complex feedback cycles may be studied. With this we address one of the remaining challenges in climate science today [34].
In this paper, we adopt the international metrological norms for "error" (bias or accuracy) and "uncertainty" (doubt or precision) as was done in similar work elsewhere [10,35,36]. We start with an overview of the modern-era hyperspectral infrared instruments on low-Earth orbiting platforms (Section 2.1) followed by a generalized overview of the O-E framework (Section 2.2.1) and how we adapted it to achieve scene-dependent uncertainty propagation in CLIMCAPS (Section 2.2.2). This is followed by a discussion of our choice for a priori estimates of temperature and moisture (Section 2.2.3) and a discussion of retrievals in cloudy scenes (Section 2.2.4). We present results for a single datum to demonstrate the type and amount of uncertainty that propagates from one ECV (temperature) to another (water vapor) (Section 3.1) and then expand our scope to demonstrate systematic effects at larger scales for six global days across four seasons (Section 3.2). We conclude with a summary and discussion of future work (Section 4).

Hyperspectral Infrared Sounders
Modern-era hyperspectral infrared sounders are preceded by the High-resolution Infrared Radiation Sounder (HIRS), which measures TOA radiance in broad spectral bands with a record that dates back to 1979. It makes an important contribution with its long-term record of retrieved cloud properties [37], but it took the launch of hyperspectral infrared sounders to enable the observation of multiple coincident variables characterizing the atmosphere's thermodynamic structure and its chemical composition. Hyperspectral measurements with narrow spectral channels characterize atmospheric variables in narrow pressure layers to achieve a vertical resolution of 1 to 5 km depending on height. The sounders on the EOS/Aqua, MetOp, SNPP and NOAA-20 satellites (Table 1) have the accuracy and precision to make direct and diagnostic measures of atmospheric change [38]. In sun-synchronous orbits, these observing systems achieve daily global coverage. EOS/Aqua was launched in 2002 with a local overpass time of 13:30. SNPP was launched in 2011 in a similar afternoon orbit with NOAA-20 following in 2017. These instruments have had a large impact on the quality of numerical weather prediction models through the assimilation of channel subsets from cloud-free scenes [14,39,40,41,42]. Moreover, the operational and scientific value of retrieved sounding products is reflected in the literature with studies on a wide range of subjects, from weather and climate feedback studies to regional radiative processes and the forecasting of extreme weather events [17,20,22,43,44,45]. SNPP and EOS/Aqua now have more than seven years of overlap. Demonstrating data continuity between these two platforms is important for extending the hyperspectral infrared record into future decades when subsequent launches of the CrIS instrument (Table 1) will occur.

Table 1. Overview of the three modern-era hyperspectral infrared (IR) instruments on low-Earth orbiting satellite platforms. CrIS on SNPP at first transmitted data at nominal spectral resolution (NSR) but later switched to full spectral resolution (FSR), so we list statistics for both. These instruments measure the top of atmosphere radiance with spatial footprints known as fields of view (FOV), and CLIMCAPS retrieves soundings from a cluster of FOVs known as the Field of Regard (FOR). Instrument noise is given in brightness temperature units as the noise equivalent delta temperature (NEDT) for a scene at 250 K.
4 Field of Regard is the technical term but we refer to a retrieval footprint as the "datum" in this paper; 5 Longwave infrared; 6 Midwave infrared; 7 Shortwave infrared.
Combining multi-instrument products in a long-term record for climate applications, however, is not trivial, as shown for temperature retrievals from CrIS, IASI and AIRS using a statistical regression scheme [10]. That study concludes that, despite similar observing capabilities and regression coefficients derived independently for each instrument to promote stability [46,47], temperature retrievals from AIRS, IASI and CrIS do not form a homogenized (continuous) record but instead propagate Level 1 uncertainty that can only be explained by evaluating instrumentation differences. AIRS [48] is a grating spectrometer and integrates its instantaneous Field Of View (FOV) in a spatial point spread function with a complicated shape (an oval "tophat" [49]) that overlaps by ~25% along the scanline. CrIS [50] is a Michelson interferometer and has FOVs that are circular and independent of each other but that rotate along the scanline. Spectrally, CrIS measures each scene with three detectors (longwave, mid-wave and shortwave infrared), whereas AIRS has many detectors that observe radiation in much narrower spectral bands. Each detector has a spatial footprint that varies slightly from all others, so in complex cloudy scenes an AIRS measurement made up of many detector footprints differs from a CrIS measurement of the exact same scene with only three detector footprints. AIRS and CrIS instrument differences can therefore be pronounced in complex cloudy scenes [10]. Other differences involve instrument noise as well as spectral line shapes and resolution. CLIMCAPS is primarily designed to generate a long-term record of multiple atmospheric ECVs from AIRS [48] and CrIS [50] but, once demonstrated, can readily be extended to include IASI (Table 1) for applications that require a higher frequency of observations about the diurnal cycle.

Bayesian Optimal-Estimation
The inversion of hyperspectral infrared radiance measurements (Level 1) into discrete atmospheric variables (Level 2) is an under-constrained, ill-posed, non-linear problem. We discuss here how inversion can be linearized and regularized (not accounting for iteration) within a Bayesian O-E framework [8] in broad, simplified terms by using the equations as outlined in the seminal text by Rodgers [1]:

$$\hat{x} = x_a + G\,(y - y_b) \qquad (1)$$

where $\hat{x}$ and $x_a$ are the retrieved and a priori variables, respectively. $G$ (Equation (2)) is the gain matrix that transforms information from radiance space ($y$) into atmospheric state space ($x$). The gain matrix is defined as:

$$G = \left(K^T S_m^{-1} K + S_a^{-1}\right)^{-1} K^T S_m^{-1} \qquad (2)$$

where $K$ is the Jacobian that describes the sensitivity of forward model calculated radiances ($y_b$) to the target variable. Radiative transfer (or forward) models require as input a set of variables that describes the full atmospheric state (background variables, $x_b$, together with the target variable, $x_a$) as close to the true state as possible so that TOA radiances can be calculated with accuracy. $S_m$ is the measurement covariance matrix that is made up of a few components including random instrument noise, channel correlation and an estimate of the forward model uncertainty. Each retrieval system defines this matrix differently, depending on its assumptions. Our definition of $S_m$ for CLIMCAPS is described below (Figure 1; Section 2.2.4). $S_a$ is the a priori covariance matrix that characterizes the uncertainty in the existing knowledge about a target variable. In generic terms, a Bayesian inversion updates the prior estimate of a target variable, $x_a$, with information about the true state of that variable from the TOA radiance measurements, $y$. The contributions that $x_a$ and $y$ respectively make to the final solution are weighted by $S_a$ and $S_m$.
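The linear update described above can be illustrated with a small numerical sketch. This is not the CLIMCAPS implementation: it assumes a single, non-iterated linear step with the gain matrix in its standard Rodgers form, and the function names (`gain_matrix`, `linear_update`), the toy dimensions and the random Jacobian are all hypothetical.

```python
import numpy as np


def gain_matrix(K, S_a, S_m):
    """Gain matrix G = (K^T S_m^-1 K + S_a^-1)^-1 K^T S_m^-1."""
    S_m_inv = np.linalg.inv(S_m)
    S_a_inv = np.linalg.inv(S_a)
    # Solve the linear system rather than forming the outer inverse explicitly.
    return np.linalg.solve(K.T @ S_m_inv @ K + S_a_inv, K.T @ S_m_inv)


def linear_update(x_a, y, y_b, K, S_a, S_m):
    """One linear Bayesian update: x_hat = x_a + G (y - y_b)."""
    G = gain_matrix(K, S_a, S_m)
    x_hat = x_a + G @ (y - y_b)
    return x_hat, G


# Toy problem (all values are illustrative assumptions, not instrument data).
rng = np.random.default_rng(0)
n_state, n_chan = 3, 5
K = rng.normal(size=(n_chan, n_state))   # Jacobian of radiance w.r.t. state
S_a = 4.0 * np.eye(n_state)              # loose a priori constraint
S_m = 0.01 * np.eye(n_chan)              # small measurement covariance
x_true = np.array([1.0, -0.5, 2.0])
x_a = np.zeros(n_state)                  # a priori state
y_b = K @ x_a                            # forward-model radiance at the prior
y = K @ x_true                           # noise-free "measurement"

x_hat, G = linear_update(x_a, y, y_b, K, S_a, S_m)
print("a priori error  :", np.linalg.norm(x_a - x_true))
print("retrieval error :", np.linalg.norm(x_hat - x_true))
```

Because the a priori covariance is finite, the retrieved state is pulled toward the truth but never reaches it exactly, which is the weighting between $x_a$ and $y$ that the paragraph describes.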
One of the most commonly used quality metrics in analysis and assimilation of satellite sounding retrievals is the averaging kernel, $A = GK = \partial\hat{x}/\partial x$, that describes the sensitivity of the retrieved variable to the true state of that variable. It ranges between 0 and 1 (Figures 4 and 6) and is interpreted as the degree to which the true state is represented in the retrieved solution, i.e., how much information the measurements contributed. If $A = 1$, then the measurements completely replace the a priori estimate in the solution. If $A = 0$, then the solution is the a priori estimate. Neither of these two extreme cases ever holds true, because the information content of a measurement is never negligible (i.e., where $A = 0$ and $S_m \to \infty$), nor is it ever free of error and uncertainty (i.e., where $A = 1$ and $S_m = 0$).
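The two limiting cases can be checked numerically. The sketch below is a toy illustration under assumed values: it uses a hypothetical 3-channel, 2-level Jacobian, an identity a priori covariance, and the trace of A (often called the degrees of freedom for signal) as a summary of how much the measurement contributes as the measurement variance changes.

```python
import numpy as np


def averaging_kernel(K, S_a, S_m):
    """A = G K, with G the standard linear O-E gain matrix."""
    S_m_inv = np.linalg.inv(S_m)
    G = np.linalg.solve(K.T @ S_m_inv @ K + np.linalg.inv(S_a), K.T @ S_m_inv)
    return G @ K


# Hypothetical Jacobian: 3 channels sensitive to 2 state elements.
K = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.7, 0.3]])
S_a = np.eye(2)

# Trace of A for very small, moderate and very large measurement variance.
noise_variances = [1e-4, 1.0, 1e4]
dofs = [float(np.trace(averaging_kernel(K, S_a, v * np.eye(3))))
        for v in noise_variances]

for v, d in zip(noise_variances, dofs):
    print(f"measurement variance {v:g}: trace(A) = {d:.4f}")
```

As the measurement variance shrinks, trace(A) approaches the number of state elements (A close to the identity); as it grows, trace(A) approaches zero and the solution falls back on the a priori estimate.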
Figure 1. Flow diagram of the CLIMCAPS sequential retrieval process and propagation of scene-dependent uncertainty for a specific datum, or instrument footprint at a specific time. Step 1 involves the retrieval of cloud properties (amount and cloud top pressure) that are used to clear the cloud signal from the measured radiance spectrum ($R$). All subsequent profile variables (temperature, water vapor and trace gases) are retrieved from this "cloud cleared" radiance spectrum ($R_{CC}$) in a specific order. The most linear inversion, namely temperature ($\hat{T}$), is done first (Step 2). Once the scene-specific temperature is known, water vapor ($\hat{q}$) is retrieved followed by ozone ($\hat{O}_3$), carbon monoxide ($\widehat{CO}$) and nitric acid ($\widehat{HNO}_3$) (Steps 3 through 6). With the scene-specific trace gas quantities now known, temperature can be retrieved a second time, but with more spectral channels because the uncertainty due to interfering gases can be quantified precisely (Step 7). The remainder of the CLIMCAPS greenhouse and precursor gases are then retrieved (Steps 8 through n). Each retrieval step employs a unique subset of $R_{CC}$ channels. The CLIMCAPS Bayesian retrieval approach depends on the definition of two error covariance matrices, one for the a priori estimate ($S_a$) and another for the radiance measurement ($S_m$). $S_a$ is calculated off-line as a statistical ensemble uncertainty and remains static from datum to datum as well as through all the retrieval steps. $S_m$, on the other hand, is datum-specific and updated at each step. $S_m$ is an aggregate term of the random instrument noise, systematic channel correlations, forward model uncertainty as well as uncertainty due to interfering background state variables ($S_b$). This diagram highlights the propagation of temperature and water vapor retrieval uncertainty specifically and therefore does not group these terms with the background error term. Matrix $K$ is the forward model weighting function, or partial derivative of the forward model with respect to the retrieval variable. This sequential retrieval approach with scene-dependent error propagation linearizes each retrieval step as much as possible to promote product stability with known uncertainty estimates in clear and cloudy scenes.
The error covariance matrix of the inverse solution, $\hat{S}$ (Equation (3)), for each retrieved variable within a datum is broadly made up of three sources, namely, (i) uncertainty in the "null space", or that part of the solution without any contribution from the measurement (first term on the right-hand side of Equation (3)), (ii) uncertainty about the measurement contribution to the solution (second term), and (iii) uncertainty due to interference from background state variables not directly retrieved (third term). Most retrieval systems only account for the first two components of $\hat{S}$ by assuming $S_b$ to be negligible for the sake of algorithmic simplicity and computational efficiency. While the interference from background variables can be reduced through channel selection, for example, it can never be rendered negligible under all atmospheric conditions. The TOA radiance spectrum is a highly correlated signal of multiple atmospheric state and Earth surface variables. If $S_b$ is set to zero, then the interference from background state variables is treated as information about the target variable. This spectral correlation between target and background variables in Level 1 radiances will affect the quality of Level 2 products in ways that cannot be quantified a posteriori because it was not propagated explicitly as a source of uncertainty but instead as a source of geophysical information about the target variable. This poses a serious risk to data analysis and application, especially once Level 2 retrievals are aggregated into long-term Level 3 and Level 4 CDRs that can amplify systematic effects and lead to misinterpretation and wrong conclusions about trends and cross-correlations in the data. We, therefore, make the case that it is critically important to quantify and propagate all three terms in $\hat{S}$ (Equation (3)) as rigorously as possible, especially as far as the retrieval of ECVs goes.
The clear and transparent reporting of all sources of error and uncertainty is vital for promoting understanding and fostering trust in satellite sounding products.
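For orientation, the three contributions to $\hat{S}$ listed above can be written in the general error-budget form found in Rodgers [1]. This is a sketch under stated assumptions and not a transcription of Equation (3): $I$ is the identity matrix, $S_\epsilon$ stands for the measurement covariance excluding the background term, and $K_b$ stands for the Jacobian of the forward model with respect to the background variables; none of these three symbols is defined in the text above.

```latex
\hat{S} \;=\;
\underbrace{(A - I)\, S_a\, (A - I)^{T}}_{\text{(i) null space}}
\;+\;
\underbrace{G\, S_\epsilon\, G^{T}}_{\text{(ii) measurement}}
\;+\;
\underbrace{G\, K_b\, S_b\, K_b^{T}\, G^{T}}_{\text{(iii) background interference}}
```

In this form, setting $S_b = 0$ removes the third term entirely, so any radiance signal caused by background variables can only be absorbed by the first two terms or by the retrieved state itself.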

Scene-Dependent Uncertainty Propagation
With scene-dependent uncertainty, we mean the doubt that exists about the full atmospheric state at a specific datum (9 CrIS/AIRS FOVs; ~50 km at nadir; Table 1). It is a source of uncertainty in satellite sounding retrievals because spectral channels are sensitive to multiple atmospheric variables. The target retrieval may be tropospheric water vapor (H2O), but spectral channels sensitive to H2O may simultaneously be sensitive to temperature (CO2 absorption lines) and clouds. A set of channels selected for a target variable may have high information content for a single variable but weak interference from many other state variables. In CLIMCAPS, we implemented a sequential retrieval approach with which to parameterize, quantify and propagate scene-dependent uncertainty so that the retrieved ECVs may be fully characterized and understood. Scene-dependent uncertainty has high variability over time and space, and by accounting for it in a transparent and rigorous manner we may correctly interpret variability in CLIMCAPS ECVs from scene to scene, day to day and year to year. Climate science and applications depend on data products where all sources of uncertainty, however small, are propagated rigorously [35,36,51].
In this paper, we describe the CLIMCAPS sequential retrieval approach for temperature and atmospheric gases, water vapor specifically. A description of Earth surface variables (emissivity, reflectivity and skin temperature) is beyond scope. This, along with all relevant technical details, will be covered in full within the CLIMCAPS Algorithm Theoretical Basis Document (ATBD) that is in preparation by the same authors. Broadly speaking, the steps in a CLIMCAPS sequential retrieval are in order as follows: cloud cleared radiances, temperature (T), H2O, O3, CO, HNO3, and then again T, followed by CH4, CO2, SO2 and N2O (Figure 1).
Cloud cleared radiances are retrieved first because clouds are the largest source of scene-dependent uncertainty in satellite sounding observations. Spectrally accounting for clouds and quantifying their uncertainty within each datum is key to accurately inverting a TOA radiance spectrum. A scene can be complex (e.g., pyrocumulus clouds over a mega-fire, multi-layer broken cloud fields, severe storm outbreak) or relatively simple (e.g., uniform cirrostratus or altocumulus decks). Once the cloud amount and cloud top pressure are retrieved and removed from the radiance measurement (known as "cloud clearing", Section 2.2.4), a profile of T is retrieved because a good estimate of scene-dependent temperature is required in the retrieval of all minor gases, including H2O. Following clouds and temperature are H2O, O3, CO and HNO3, in that order (Steps 3-6, Figure 1). Once the column density of these gases is known for the retrieval datum, T is retrieved a second time but with channels sensitive to H2O included because they contain tropospheric T information and help resolve vertical structure in narrower pressure layers. Please note that the second T retrieval uses the same a priori estimate and error covariance matrix (Section 2.2.3) as the first T retrieval, i.e., $x_a$ and $S_a$ are identical in Steps 2 and 7 (Figure 1). Only the scene-dependent uncertainty and channel set are different.
A sequential retrieval approach has several key benefits. Firstly, it is very fast, because inversion is done one ECV at a time, using only a small subset of channels sensitive to the target ECV, e.g., out of a total of 2211 CrIS channels, CLIMCAPS uses 120 for temperature and 66 for water vapor (Figure 2). These channels are selected for their information content relative to the target ECV as well as their spectral purity, or weak sensitivity to other variables. Secondly, a sequential retrieval is stable and robust because it solves for the most linear components first; retrieving variables with varying degrees of non-linearity simultaneously can cause instability in the solution and mask the real source of uncertainty. Lastly, a sequential approach allows CLIMCAPS to fully characterize and propagate scene-dependent uncertainty from one retrieval step to the next on a per datum basis, because the background error covariance matrix ($S_b$) can be updated with the retrieval error covariance matrix from the previous step.
Figure 2. Cross-track Infrared Sounder (CrIS) brightness temperature (BT) spectra for five different atmospheric states varying in percent cloud cover from black to light grey as follows: 0%, 20%, 40%, 60% and 80%. CLIMCAPS employs a subset of channels in the retrieval of each ECV to linearize the inversion problem, maximize information content and minimize scene-dependent uncertainty. Here we show the channel subsets for the three CLIMCAPS ECVs discussed in this paper, namely, 62 channels for cloud properties (green), 66 channels for water vapor (blue) and 120 channels for temperature (orange).
The first step is to retrieve the cloud cleared radiances (Step 1, Figure 1). As input, the measurement error covariance matrix, $S_m$, is defined as the sum of the radiance error covariance matrix ($\delta R\,\delta R^T$, Section 2.2.4, Equation (6)) and the uncertainty in the background atmospheric state, $S_b$, which is defined as the sum of the a priori error covariance matrices for T, H2O, O3, CO, CO2, N2O, CH4, SO2, and HNO3. The retrieval of all subsequent variables is done with the cloud cleared radiances, which now have their radiance error covariance matrix updated with scene-dependent uncertainty about the state of clouds in the datum ($\delta R_{cc}\,\delta R_{cc}^T$, Section 2.2.4, Equation (7)).
Step 2 (Figure 1) is the first T retrieval, with the measurement error covariance matrix now defined as the sum of $\delta R_{cc}\,\delta R_{cc}^T$ and the uncertainty in the background atmospheric state, which is the sum of the a priori error covariance matrices for H2O, O3, CO, CO2, N2O, CH4, SO2, and HNO3. Water vapor is retrieved next (Step 3, Figure 1), and this time the background error term is updated with the retrieved temperature error covariance matrix, such that it is the sum of the a priori error covariance matrices of the remaining background variables plus the retrieved temperature error covariance matrix. The same holds for steps 4 through n, where n is the number of retrieved ECVs. Each retrieval step is done retaining the scene-dependent uncertainty from all previous steps and using a different subset of spectral channels depending on the target ECV. CLIMCAPS retrieved ECVs are characterized with error covariance matrices, averaging kernels, cloud clearing uncertainty, and numerous other diagnostic metrics to enable full uncertainty propagation from Level 2 into subsequent Level 3 and Level 4 climate data records.
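The bookkeeping in these steps can be sketched in a few lines. The code below is a simplified illustration, not the CLIMCAPS algorithm: it assumes linear retrieval steps, it assumes each background covariance is mapped into radiance space as K S K^T before being summed (the text above only states that the terms are summed), it uses the same channels for every step, and every name and number in it is hypothetical.

```python
import numpy as np


def retrieve_step(K_target, S_a_target, S_m):
    """Retrieval error covariance of one linear O-E step."""
    S_m_inv = np.linalg.inv(S_m)
    return np.linalg.inv(K_target.T @ S_m_inv @ K_target
                         + np.linalg.inv(S_a_target))


def to_radiance_space(K, S):
    """Assumed mapping of a state-space covariance into radiance space."""
    return K @ S @ K.T


def sequential_uncertainty(order, jacobians, priors, S_rad):
    """Walk through the retrieval order, updating S_b after every step."""
    current = dict(priors)   # covariance currently assumed for each variable
    retrieved = {}
    for target in order:
        # Background term: every variable except the one being retrieved.
        S_b = sum(to_radiance_space(jacobians[v], current[v])
                  for v in current if v != target)
        S_m = S_rad + S_b
        # The target always starts from its static a priori covariance.
        S_hat = retrieve_step(jacobians[target], priors[target], S_m)
        retrieved[target] = S_hat
        current[target] = S_hat   # later steps see the retrieved covariance
    return retrieved


# Toy setup: 6 channels, temperature ("T") and water vapor ("q"), 2 levels each.
rng = np.random.default_rng(1)
n_chan = 6
jac = {"T": rng.normal(size=(n_chan, 2)), "q": rng.normal(size=(n_chan, 2))}
priors = {"T": 4.0 * np.eye(2), "q": 1.0 * np.eye(2)}
S_rad = 0.05 * np.eye(n_chan)   # stands in for the cloud cleared radiance term

result = sequential_uncertainty(["T", "q"], jac, priors, S_rad)
print("trace of prior T covariance    :", np.trace(priors["T"]))
print("trace of retrieved T covariance:", np.trace(result["T"]))
print("trace of retrieved q covariance:", np.trace(result["q"]))
```

In this toy setting, the water vapor step sees a smaller background term than it would if temperature were still represented by its a priori covariance, which is the practical effect of carrying the retrieved uncertainty forward from one step to the next.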

Prior Estimate of Atmospheric State
Profiles of temperature and moisture can be derived from TOA infrared radiances using a set of regression coefficients that correlates a measured spectrum with a statistically probable atmospheric state. Both the AST and NUCAPS algorithms use regression retrievals as first guess in their O-E inversion. The AST version 6 algorithm adopted a stratified neural network regression [52][53][54][55][56] while NUCAPS maintains a globally trained linear regression [57]. In general, regression retrieval algorithms can achieve reasonable accuracy, as has been demonstrated for NUCAPS, AST and elsewhere in the direct-broadcast Community Satellite Processing Package (CSPP) Dual-Regression algorithm [16,46,47,57,58,59]. A statistical regression retrieval, however, is inappropriate as an a priori constraint for CLIMCAPS retrievals for several important reasons (outlined in Table 2). A regression retrieval is highly scene-dependent, but its uncertainty can only be calculated statistically for a large ensemble afterwards and thus cannot be known at the time of retrieval. Moreover, it depends on the very same radiance measurements from which the variables are retrieved in the O-E inversion step. This means that it not only enhances instrument effects but also propagates and even amplifies the scene-dependent error and correlation among state variables. CLIMCAPS requires an a priori estimate and associated error covariance matrix that are largely independent of the spectral measurements and retrieval datum (space and time) so that all scene-dependent uncertainty can be attributed to known sources and carefully propagated. It is important that all sources of uncertainty can be fully characterized in the retrieved ECVs to help make sense of differences when compared to other ECVs and CDRs. We selected the MERRA2 reanalysis product [60,61] as the CLIMCAPS a priori estimate for temperature and water vapor and below give a detailed breakdown of the reasons why (Table 2).
We calculate the a priori error covariance matrix (Figure 3) for temperature and water vapor from an ensemble (233,135 profiles) of differences between the European Center for Medium-Range Weather Forecasts (ECMWF) [62,63] and MERRA2 from four global focus days in 2015: 1 January, 1 April, 1 July and 1 October. ECMWF serves as an estimate of the true atmospheric state and a benchmark for the best estimate of the atmospheric state at the space-time resolution of AIRS and CrIS (~50 km at nadir, twice a day). Before differencing, the ECMWF and MERRA2 profiles are spatially and temporally interpolated to the AIRS (or CrIS) space-time resolution. They are then vertically interpolated to the 100-level retrieval pressure grid and smoothed using a 3-point running mean to remove random noise effects. All profiles where MERRA2 and ECMWF differences exceed 10 K in temperature or 500% in water vapor are rejected. In the upper stratosphere (< 50 hPa), the typically large differences in model profiles taper off.
The a priori error covariance matrix (Figure 3) for temperature and water vapor remains static from datum to datum because MERRA2 uncertainty does not vary at the same scale of the AIRS and CrIS measurements. With MERRA2 as a priori estimate and its static error covariance matrix in the CLIMCAPS inversion system, we can be sure that all scene-dependent uncertainty and variability in the retrieved ECVs come from the radiance measurements alone.
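The ensemble calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the CLIMCAPS code: the function names, the choice to leave the top and bottom levels unsmoothed, the rejection of a profile when any level exceeds the threshold, and the use of a mean-removed sample covariance are all assumptions made for the example. The profiles are assumed to be already interpolated to a common space-time and pressure grid.

```python
import numpy as np

def running_mean_3(profiles):
    """3-point running mean along the vertical axis (end levels left unsmoothed)."""
    smoothed = profiles.astype(float).copy()
    smoothed[:, 1:-1] = (profiles[:, :-2] + profiles[:, 1:-1] + profiles[:, 2:]) / 3.0
    return smoothed

def apriori_error_covariance(reference, prior, max_abs_diff):
    """Covariance of (reference - prior) over an ensemble of co-located profiles.

    reference, prior: arrays of shape (n_profiles, n_levels).
    max_abs_diff: rejection threshold (e.g., 10 K for temperature); a profile
    is dropped if any level exceeds it (assumed rejection rule).
    """
    diff = running_mean_3(reference) - running_mean_3(prior)
    keep = np.all(np.abs(diff) <= max_abs_diff, axis=1)
    cov = np.cov(diff[keep], rowvar=False)  # (n_levels, n_levels), mean removed
    return cov, int(keep.sum())

# Toy ensemble: four profiles on five levels, one of which is an outlier.
prior = np.zeros((4, 5))
reference = np.array([[1.0] * 5, [-1.0] * 5, [20.0] * 5, [0.0] * 5])
s_a, n_kept = apriori_error_covariance(reference, prior, max_abs_diff=10.0)
```

In the toy ensemble the 20 K profile is rejected and the remaining three profiles define a 5 × 5 matrix. For water vapor, the same function could be applied to percent differences with a threshold of 500.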

Cloud Clearing
Clouds cause the highest scene-dependent uncertainty in sounding observations from space-borne infrared instruments [10,25,26,27,47,64,65,66,67]. By clouds, we mean any type of cloud, fog or smoke that prevents a "spectral" view of the Earth surface. There are three main approaches to clouds in retrieval systems today: (i) avoid cloudy scenes, (ii) model the full suite of cloud parameters, or (iii) remove cloud effects from measured spectra. Each of these choices holds significant implications for the quality and nature of retrieved ECVs and derived CDRs.
Clouds can be avoided altogether by limiting sounding retrievals to clear-sky atmospheres. This is the simplest approach because the inversion of clear-sky radiances is a well-known problem and can be done with high accuracy. However, avoiding cloudy scenes means that every day a significant portion of the global atmosphere is not observed. This causes large systematic sampling errors in long-term records, because clear-sky scenes typically make up only about 5% of CLIMCAPS soundings on any given global day. Modelling clouds, on the other hand, is the most complex approach, since the effects of clouds on infrared radiances are very difficult to deconvolve. Soundings can be retrieved in the presence of clouds but at the risk of introducing large systematic errors, because the infrared sounding of clouds is highly under-constrained and strongly dependent on inversion constraints. These include robust a priori estimates of cloud amount, height and optical properties as well as forward models that accurately calculate radiative transfer through clouds, which is an exceedingly complex problem with large associated uncertainties.
There are retrieval methods that attempt to directly solve for clouds using single footprint infrared radiances [68,69] but these are experimental at best and do not quantify or propagate cloud uncertainty rigorously enough for ECVs at global scales. The CLIMCAPS inversion system, instead, adopts a "cloud clearing" approach, which was developed in the 1970s [70][71][72] and is still operational today [2,29]. Cloud clearing removes the radiative effect of clouds from the infrared measurements without requiring knowledge of complex cloud parameters such as single scattering albedo, cloud phase or cloud density as a function of height for multiple cloud particle types. Perhaps the most significant advantage is that cloud clearing uncertainty can be accurately quantified for each datum and then propagated into subsequent retrieval steps so that ECV information content may be maximized.
The basic assumption of cloud clearing is that within a retrieval datum, the cloud types are constant and only the cloud fraction varies. With nine AIRS (or CrIS) FOVs within each datum (~50 km at nadir, Table 1) there is a high probability that the cloud fraction in at least one FOV varies from the rest. For all scenes with some contrast in cloud cover, a single cloud cleared radiance spectrum can be retrieved from the nine FOVs using a linear least squares approach. Geophysically, this is reasonable, because TOA infrared spectra respond proportionally to the amount of clouds in the measured footprint (Section 2.2.2; Figure 2). In this sense, the cloud clearing algorithm uses a datum's "spatial information content" (spatial heterogeneity in cloud sensitive channels), not the "spectral information content" (radiative sensitivity to cloud type, height and optical properties). The spatial information content is high where a datum is partly filled with clouds and the spectral differences among the nine FOVs are high as a result (Figure 2 illustrates how spectra are proportional to percent cloud cover). With nine FOVs sampling a partly cloudy scene, cloud clearing finds a cloud-free radiative pathway past the clouds from TOA to Earth surface. Spatial information content is low when a datum is homogeneous with no variation in cloud amount and thus no variation in cloud sensitive channels from FOV to FOV. This happens in the presence of uniform cloud decks (100% cloudy) or where each FOV is covered by the same amount of clouds (e.g., when all nine FOVs have 40% cloud cover). The dependence on spatial information content makes cloud clearing robust even in very complex cloudy scenes because it requires no information about the amount, height or optical properties of clouds. However, the downside is that the spectral region sensitive to clouds is also sensitive to the lower troposphere and Earth surface. This means that a datum over coastlines, with part water and part land, can be mistaken for a cloudy scene (large differences in CC spectral channels from FOV to FOV; Figure 1), or alternatively, very low clouds (e.g., in the boundary layer) can be mistaken for a cool clear-sky scene.
A cloud cleared radiance spectrum (R_cc) is a least squares linear combination of the nine FOVs within each CLIMCAPS datum (Equations (4) and (5)). It is described in detail elsewhere [2], but we give a brief overview here for the sake of clarity. In Equation (4), R_cc is derived from R̄, the average radiance in a datum, R_j, one of nine measured FOV radiance spectra, and η_j, the least squares approximation derived for each datum (3 × 3 FOV). In Equation (5), η_j is derived with λ being the regularization factor and x_b the background atmospheric state used to calculate a clear-sky TOA spectrum. δRδR^T is the measurement uncertainty (Equation (6)), defined as the sum of the average random instrument noise component (NE∆N) and the forward model uncertainty, which is calculated off-line for an ensemble of cases and held constant through all retrievals. In Equation (6), C is the channel correlation matrix, which is the identity matrix for AIRS and the apodization correlation matrix for CrIS. S_m is therefore a combination of random noise (NE∆N), channel correlation (C) and systematic effects due to assumptions in the radiative transfer model (ε). δRδR^T is updated for the retrieved cloud cleared radiances with A, a scalar that quantifies the amplification in random noise due to cloud clearing (see [2] for a detailed discussion), δηδη^T, the least squares error covariance matrix (9 × 9), and S_r, the transformation matrix from FOV to spectral space with N channels.
CLIMCAPS uses R_cc and R_j to calculate cloud top pressure for up to two layers as well as up to 18 effective radiative cloud fractions to help diagnose product quality and scene complexity in subsequent research studies.
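To make the idea of a least squares linear combination of nine FOVs concrete, the following Python sketch solves a generic regularized (ridge) least squares problem for the weights η_j and forms R_cc = R̄ + Σ_j η_j (R̄ − R_j), the form commonly used in the cloud clearing literature. It is an illustrative simplification under stated assumptions and not the CLIMCAPS implementation: the function name `cloud_clear`, the diagonal noise weighting, the use of all channels rather than a cloud-sensitive subset, and the direct use of a known clear-sky spectrum in place of one computed from x_b are hypothetical choices for this example.

```python
import numpy as np

def cloud_clear(fov_radiances, clear_estimate, noise_var, lam=1e-3):
    """Ridge least squares estimate of a cloud cleared spectrum.

    fov_radiances: (n_fov, n_chan) measured spectra, one row per FOV.
    clear_estimate: (n_chan,) clear-sky spectrum for the background state.
    noise_var: (n_chan,) per-channel noise variance (diagonal weighting assumed).
    lam: regularization factor.
    """
    r_bar = fov_radiances.mean(axis=0)           # datum-average spectrum
    contrast = (r_bar - fov_radiances).T         # (n_chan, n_fov): R_bar - R_j
    target = clear_estimate - r_bar              # what the combination must explain
    w = 1.0 / noise_var
    lhs = contrast.T @ (w[:, None] * contrast) + lam * np.eye(contrast.shape[1])
    rhs = contrast.T @ (w * target)
    eta = np.linalg.solve(lhs, rhs)              # one weight per FOV
    return r_bar + contrast @ eta, eta

# Synthetic datum: one cloud type, cloud fraction varying from 10% to 90%.
r_clr = np.linspace(80.0, 100.0, 20)             # clear-sky spectrum
r_cld = 0.5 * r_clr                              # fully cloudy spectrum
cloud_fraction = np.linspace(0.1, 0.9, 9)
fovs = (1 - cloud_fraction)[:, None] * r_clr + cloud_fraction[:, None] * r_cld
r_cc, eta = cloud_clear(fovs, r_clr, noise_var=np.full(20, 0.01))
```

In this synthetic scene every FOV is a linear mixture of one clear and one cloudy spectrum, so the contrast among FOVs is enough to recover the clear-sky spectrum. If all nine cloud fractions were equal, the contrast matrix would vanish and the result would fall back to the datum average, which mirrors the low spatial information content case described in the text.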

Datum-Specific Uncertainty Metrics
The sensitivity of TOA infrared radiance to the atmospheric state varies depending on localized conditions. For example, the higher the load of CO molecules in the upper troposphere (e.g., due to a large fire event), the stronger its spectral signature in the 2150 cm−1 wavenumber region; or, the larger the difference in temperature between Earth surface and boundary layer, the higher the sensitivity to fine-scale thermodynamic structures, and so on. The Bayesian averaging kernel (A; Section 2.2.1) has found widespread use in the weather and climate communities as an intuitive metric of uncertainty. CLIMCAPS follows the AST and NUCAPS approach in calculating A [73]. The diagonal vector of A (Figure 4) is the degree to which measurements contribute to the final solution as a function of pressure. A Bayesian retrieval (Section 2.2.1) at a specific pressure level, P(l), has total information content equal to unity, with a weighted contribution from the measurements equal to A(l, l) and a weighted contribution from the a priori estimate equal to [1 − A(l, l)]. In this way, A is a metric of the scene-dependent measurement information content within a datum, i.e., the amount of unique information contributed by the measurement given the constraints imposed by the inversion algorithm (e.g., channel selection and choice of a priori). Alternatively, when considering retrieval uncertainty at P(l), A(l, l) quantifies the fraction contributed by S_m, and [1 − A(l, l)] represents the null space and quantifies the fraction contributed by S_a.
For a target variable, A is not constant (Figure 4) because it depends on (i) the accuracy of the a priori estimate, as well as (ii) the information content in the measurements at a given point in time and space. If the a priori estimate approximates the true state, then A → 0 irrespective of the information content in the measurement about the target variable, because the true state is already known and no new information can be added from the measurements to improve it. Alternatively, if the a priori estimate is a generalized climatology and does not resemble localized conditions, then A → 1 if the measurement information content about a target variable is high, but A → 0 if it is low, irrespective of the quality of the a priori estimate. A is thus a complex metric and in isolation does not fully characterize the retrieved ECV uncertainty; it should only be interpreted within the context of the retrieval system and the assumptions made within. It is for this reason that a one-to-one comparison between averaging kernels from different retrieval systems does not yield much insight and should never be mistaken as an indicator of the accuracy of the retrieved variable. The CLIMCAPS A for temperature (Figure 4, top row) is relatively stable across pressure levels and latitudinal zones, as indicated by the narrow error bars. For water vapor (Figure 4, bottom row), A has high variability, both vertically and spatially. A map of the total vertical information content, known as the degrees of freedom and calculated as the sum of the A diagonal, Σ A(l, l), reveals spatial patterns of uncertainty in temperature and water vapor (Figure 5).
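The behavior of A and the degrees of freedom can be illustrated with the textbook optimal-estimation form A = (KᵀS_m⁻¹K + S_a⁻¹)⁻¹ KᵀS_m⁻¹K, where K is the Jacobian of the forward model. The sketch below uses this general form as an assumption; the CLIMCAPS calculation of A follows [73] and may differ in detail. The function name and the toy matrices are hypothetical.

```python
import numpy as np

def averaging_kernel(jacobian, s_m, s_a):
    """Textbook optimal-estimation averaging kernel.

    jacobian: (n_chan, n_levels) sensitivity of radiance to the state.
    s_m: (n_chan, n_chan) measurement error covariance.
    s_a: (n_levels, n_levels) a priori error covariance.
    """
    kt_sm_inv_k = jacobian.T @ np.linalg.inv(s_m) @ jacobian
    return np.linalg.solve(kt_sm_inv_k + np.linalg.inv(s_a), kt_sm_inv_k)

# Toy case: three levels, three channels, unit sensitivity and unit noise.
k = np.eye(3)
s_m = np.eye(3)

# Loose prior (variance 3): the measurement dominates, A(l, l) = 3 / (3 + 1).
a_loose = averaging_kernel(k, s_m, 3.0 * np.eye(3))
dof_loose = np.trace(a_loose)   # degrees of freedom = sum of the diagonal

# Tight prior (variance 1e-6): the prior is trusted and A approaches zero.
a_tight = averaging_kernel(k, s_m, 1e-6 * np.eye(3))
```

The two cases reproduce the limits described above: with a loosely constrained a priori estimate the diagonal of A is 0.75 and the degrees of freedom are 2.25, while a tightly constrained a priori estimate drives A toward zero regardless of the measurement sensitivity.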
As discussed above (Section 2.2.2, Figure 1), the retrieval of temperature is done in two separate steps; the first temperature retrieval contains scene-dependent uncertainty about clouds only while using static uncertainty estimates for all background variables, namely H2O, CO, O3 and HNO3. The second temperature retrieval updates its scene-dependent uncertainty estimates for clouds as well as the background variables. Scene-dependent knowledge of tropospheric H2O enables the inclusion of additional channels (sensitive to both temperature and H2O) so that thermodynamic structure may be retrieved with higher vertical resolution. The reason for this multi-step iteration in temperature can perhaps be best understood with averaging kernels (Figure 6), where we see an increase in tropospheric values of A once scene-dependent knowledge about clouds and trace gases was fully characterized (magenta lines).
Figure 6. CLIMCAPS averaging kernels for temperature from three individual datums centered on the [latitude, longitude] location in the title at 01h30 on 15 October 2018. CLIMCAPS retrieves temperature twice, once after cloud clearing (blue) and a second time after the retrieval of atmospheric gases (magenta): H2O, CO, O3, and HNO3. "Step 2" and "Step 7" refer to the steps outlined in Section 2.2.2, Figure 1. The CLIMCAPS sequential retrieval approach allows scene-dependent error propagation to stabilize inversion and maximize information content.
CLIMCAPS calculates and outputs an error covariance matrix, Ŝ, for each retrieved ECV at every datum. Broadly speaking, Ŝ is the sum of three sources of uncertainty (Section 2.2.1; Equation (3)), namely the null space (I − A), instrumentation, and interference from background state variables. In CLIMCAPS we combine the latter two terms into one term since they are both spectrally based (Section 2.2.2). For temperature, the null space error covariance matrix is high (Figure 7a) and weakly correlated, whereas the magnitude of the spectral error covariance matrix is lower (Figure 7b) with stronger correlation, especially in the stratosphere. For tropospheric H2O (Figure 7c,d), the contributions from the two sources of uncertainty are similar in magnitude but correlation is introduced by uncertainty propagated from the measurements (broader orange feature in Figure 7d). All these sources of uncertainty and error contribute to the magnitude and structure of the averaging kernel of a retrieved ECV at a specific datum in time and space.
Figure 7. For a single CLIMCAPS datum, the retrieval error covariance matrix for temperature can be parameterized as a contribution from (a) the null space that propagates a priori estimate uncertainty, and (b) the spectral space that propagates all sources of error and uncertainty with a spectral component, such as instrument error, uncertainty due to spectral interference from background state variables as well as forward model uncertainty. The same is true for H2O retrievals with error covariance matrices for the (c) null space, and (d) spectral space. The color scale is highly non-linear to highlight the subtle off-diagonal structures.
The averaging kernels for three different CLIMCAPS scenes highlight the scene-dependence of information content (Figure 6). For the same three scenes, we can evaluate the null space and spectral error covariance matrices, respectively, to help understand the retrieval and its A better (Figure 8).
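The split of the retrieval error covariance into a null space term and a spectral term can be sketched with the standard optimal-estimation identities, Ŝ = (I − A) S_a (I − A)ᵀ + G S_m Gᵀ, where G is the gain matrix and A = GK. The example below assumes this textbook form and folds instrument noise and background-state interference into a single S_m, as the text describes for CLIMCAPS; the function name, Jacobian and covariances are hypothetical toy values, not CLIMCAPS quantities.

```python
import numpy as np

def error_budget(jacobian, s_m, s_a):
    """Return (null_space_cov, spectral_cov) for a linear Bayesian retrieval."""
    s_m_inv = np.linalg.inv(s_m)
    hessian = jacobian.T @ s_m_inv @ jacobian + np.linalg.inv(s_a)
    gain = np.linalg.solve(hessian, jacobian.T @ s_m_inv)   # G
    a = gain @ jacobian                                      # averaging kernel
    i_minus_a = np.eye(a.shape[0]) - a
    null_space_cov = i_minus_a @ s_a @ i_minus_a.T           # a priori contribution
    spectral_cov = gain @ s_m @ gain.T                       # measurement contribution
    return null_space_cov, spectral_cov

# Toy problem: four levels, six channels with broad, overlapping sensitivity.
levels = np.arange(4)
k = np.exp(-0.5 * (np.arange(6)[:, None] * 0.6 - levels[None, :]) ** 2)
s_m = 0.04 * np.eye(6)                                       # combined spectral uncertainty
s_a = 0.5 ** np.abs(levels[:, None] - levels[None, :])       # correlated prior
null_cov, spec_cov = error_budget(k, s_m, s_a)
total = null_cov + spec_cov
```

Under these assumptions the two terms sum to (KᵀS_m⁻¹K + S_a⁻¹)⁻¹, and the diagonal of the total never exceeds the a priori variance, which is one way to check that the decomposition has been implemented consistently.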

Retrieved Essential Climate Variables-Temperature and Water Vapor
One of the dominant sources of uncertainty in satellite sounding retrievals is tied to local weather conditions and is therefore highly variable from datum to datum and day to day. How, then, can we assemble CDRs from satellite soundings to characterize long-term and large-scale patterns and processes? How can we use data whose quality depends on weather patterns in climate research? The answer is simple but not trivial: with rigorous uncertainty characterization, propagation and reporting. Knowledge about uncertainty in ECVs is as important as the retrieved quantity itself [35,36,51].
We designed CLIMCAPS to retrieve ECVs that can be assembled into CDRs for the study of feedback cycles and other climate processes, by adopting two key design elements that distinguish it from other operational retrieval systems, such as NUCAPS and AST. These are (i) an a priori estimate that is largely independent of the same sources of scene-dependent uncertainty as the measurements, and (ii) the sequential retrieval of multiple ECVs from a single measurement with full uncertainty propagation to account for all sources of scene-dependent effects.
We can demonstrate the value of CLIMCAPS with a statistical analysis of retrieved temperature and water vapor for six global days from Aqua/AIRS and SNPP/CrIS spanning all four seasons: 1 July 2013, 1 January 2015, 1 April 2015, 1 July 2015, 1 October 2015 and 14 January 2016. In order to highlight the differences between CLIMCAPS and inversion systems that employ statistical a priori estimates for temperature and water vapor without formal error propagation, we compared CLIMCAPS against NUCAPS. We also compared NUCAPS and AST v.6, and can confirm that the retrievals from these systems yield similar results for the same six global days (not shown).
With respect to ECMWF, NUCAPS temperature T(p) error (bias) oscillates between −1 K and 1 K for day and night, while CLIMCAPS T(p) error (bias) remains within 0.5 K irrespective of time of day (Figure 9). CLIMCAPS uncertainty (precision or standard deviation) for T(p) is less than half that of NUCAPS, which routinely exceeds 1.5 K in the boundary layer. With a statistical regression a priori estimate, NUCAPS scene-dependent variation in information content is amplified because the measured spectrum of a single footprint is used twice, first in deriving the statistical a priori that uses all spectral channels and then in the O-E retrieval of the final state (Section 2.2.1). This means that variation in retrieval quality from footprint to footprint is high because measurements with low information content will result in high retrieval uncertainty and vice versa. This is not the case for CLIMCAPS, which has MERRA2 reanalysis as a priori. In CLIMCAPS only a small fraction of the measured spectrum is used twice. MERRA2 assimilates a small subset of AIRS and CrIS channels and it does so for clear-sky scenes only. Moreover, AIRS and CrIS are only two of many different instruments that MERRA2 assimilates; the contribution from any one instrument to a MERRA2 modelled state at any point in time is thus very low. NUCAPS temperature and moisture retrievals have high scene-dependent variability as quantified by the standard deviation with respect to ECMWF for different latitudinal zones (polar versus mid-latitude) and surface types (land versus ocean). On the other hand, CLIMCAPS statistics for the same scenes are consistently lower with minimal variability along the vertical axis (Figure 10).
For the same set of focus days and with ECMWF as estimate of the true state, we can illustrate how CLIMCAPS, unlike NUCAPS, achieves consistency in retrieved ECVs between two different instruments, Aqua/AIRS and SNPP/CrIS.
The NUCAPS global temperature, T(p), error profiles of AIRS and CrIS (Figure 11a) have strong variations with pressure that range between −1 K and 1 K. The vertical distribution of these variations from AIRS is different from that of CrIS. The NUCAPS global moisture, Q(p), error profiles have similar mismatches in vertical error (bias) between AIRS and CrIS, albeit with lower variation than T(p). The CLIMCAPS global T(p) and Q(p) error profiles (Figure 11b) are similar for AIRS and CrIS. We see this retrieval error consistency for CLIMCAPS across instruments also within latitudinal zones (Figure 12).
Figure 11. Global statistics of (a) NUCAPS from SNPP/CrIS (solid lines) and Aqua/AIRS (dashed lines) as well as (b) CLIMCAPS from SNPP/CrIS (solid lines) and Aqua/AIRS (dashed lines) as the bias with respect to ECMWF of temperature T(p) in units Kelvin and moisture Q(p) as percent profiles from six focus days.
In summary, CLIMCAPS minimizes the propagation of scene-dependent spectral uncertainty by having an a priori estimate for temperature and water vapor that is largely independent of the spectral measurements. NUCAPS, on the other hand, uses the full set of AIRS and CrIS spectral channels in deriving its a priori estimate for temperature and water vapor, thus amplifying scene-dependent errors and uncertainty. CLIMCAPS also quantifies and propagates uncertainty due to spectral interference from background state variables, which means that during inversion this spectral component can be correctly assigned as uncertainty about the retrieved variable. If the spectral interference from background state variables is not explicitly quantified in the Bayesian equation, then it is treated as information about the retrieved variable during inversion. This not only introduces instabilities in the retrieval system, but perhaps most importantly, affects product uncertainty estimates in a way that confounds understanding and can lead to misinterpretations in climate applications and scientific research.
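The bias and standard deviation profiles used in this comparison can be computed as in the short Python sketch below. It is a generic illustration, assuming that retrieved and reference profiles have already been co-located on a common pressure grid; the function name, the `relative` switch for percent differences and the toy numbers are hypothetical and are not taken from the CLIMCAPS or NUCAPS evaluation code.

```python
import numpy as np

def profile_statistics(retrieved, reference, relative=False):
    """Bias and standard deviation per pressure level.

    retrieved, reference: arrays of shape (n_profiles, n_levels).
    relative: if True, differences are expressed in percent of the reference
    (as is common for moisture); otherwise in the units of the input.
    """
    diff = retrieved - reference
    if relative:
        diff = 100.0 * diff / reference
    bias = np.nanmean(diff, axis=0)        # systematic error per level
    precision = np.nanstd(diff, axis=0)    # scatter about the bias per level
    return bias, precision

# Toy temperature example (Kelvin): two profiles on two levels.
t_ret = np.array([[251.0, 280.0], [253.0, 284.0]])
t_ref = np.array([[250.0, 280.0], [252.0, 282.0]])
t_bias, t_std = profile_statistics(t_ret, t_ref)

# Toy moisture example (percent): differences of +/-10% that cancel in the mean.
q_ret = np.array([[11.0, 18.0], [9.0, 22.0]])
q_ref = np.array([[10.0, 20.0], [10.0, 20.0]])
q_bias, q_std = profile_statistics(q_ret, q_ref, relative=True)
```

The moisture example shows why both statistics are reported: a zero bias can coexist with a 10% standard deviation when errors of opposite sign cancel across footprints.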

Conclusions
The Community Long-term Infrared Microwave Combined Atmospheric Product System (CLIMCAPS) retrieves multiple Essential Climate Variables (ECV) from measurements made by modern-era microwave and hyperspectral infrared sounding instruments on low-Earth orbiting satellites. These are profiles of temperature and water vapor, column amounts of greenhouse gases (CO2, CH4), ozone (O3) and precursor gases (CO, SO2) as well as cloud properties. The hyperspectral Atmospheric InfraRed Sounder (AIRS) was the first instrument of its kind and was launched in 2002 on the Aqua satellite. Its successor, the Cross-track Infrared Sounder (CrIS), has been in a similar orbit since 2011. Plans are in place to continue this atmospheric sounding capability well into the 2040s. CLIMCAPS retrieves ECVs from both AIRS and CrIS so that a multi-decadal Climate Data Record (CDR) may be assembled.
The unique contribution we make with this paper is to discuss how scene-dependent uncertainty can be quantified and systematically propagated through a satellite sounding retrieval system to decompose a highly correlated spectral signature (Level 1) into multiple ECVs (Level 2) with minimal correlation. We demonstrate how the correct treatment of scene-dependent spectral effects results in retrieval products that have global consistency across seasons and instruments.
Future work will focus on validating CLIMCAPS Level 2 scene-dependent uncertainty estimates by using measurements from numerous field campaigns. Once validated, we will develop a methodology with which to rigorously propagate Level 2 uncertainty into Level 3 and Level 4 CDRs to support the study of complex feedback cycles and to characterize the climate system with multiple thermodynamically consistent ECVs.

Funding: This research was funded by NASA, grant award number 80NSSC18K0975.