Atmospheric GNSS RO 1D-Var in Use at UCAR: Description and Validation

: This paper describes, along with some validation results, the one-dimensional variational method (1D-Var) that is in use at the University Corporation for Atmospheric Research (UCAR) to retrieve atmospheric proﬁles of temperature, pressure, and humidity from the observation of the Global Navigation Satellite System (GNSS) radio occultation (RO). The retrieved proﬁles are physically consistent among the variables and statistically optimal as regards to a priori error statistics. Tests with idealized data demonstrate that the 1D-Var is highly effective in spreading the observational information and conﬁrm that the method works as designed and expected, provided that correct input data are given. Tests for real-world data sets show that the retrieved proﬁles agree remarkably well with global weather analyses and collocated high vertical resolution radiosonde observations, and that the 1D-Var can produce value-added retrievals with respect to a priori proﬁles. We also ﬁnd that the retrieved proﬁles are of exceptional long-term stability, suggesting that the 1D-Var can provide an excellent climate data record.


Introduction
The Global Navigation Satellite System (GNSS) Radio Occultation (RO) technique [1][2][3] has become established as a climate-quality benchmark observing system for measuring the atmosphere's thermodynamic states [4,5]. It provides accurate, precise, high-verticalresolution, globally distributed sounding observations under all weather conditions and regardless of Earth's surface condition. These unique strengths make RO a valuable data source for Numerical Weather Prediction (NWP), as evidenced by routine, operational use of the data at weather forecasting centers worldwide [6][7][8]. In addition, RO becomes increasingly relevant for climate studies as its data record lengthens. This stable, drift-free, self-calibrating data record is ideal for monitoring the Earth's changing climate. The data coverage is of good global and temporal coverage, which reduces the sampling error in climate system studies [4]. Aside from weather forecasting and climate change monitoring, there is a considerable body of literature where RO data are utilized to study atmospheric processes, e.g., those related to the tropopause, the planetary boundary layer, atmospheric waves and tides, significant weather events, and so on. RO data also provide unique and valuable information on the ionosphere [9][10][11][12], but these applications are not discussed in this paper.
The RO technique is based on the GNSS signal recorded by a receiver onboard a low-Earth orbit (LEO) satellite as the GNSS satellite sets or rises behind the Earth's atmospheric limb. The primary measurement is the signal's phase delay from which the ray's bending angle and atmospheric refractivity can be deduced in sequence. While the bending angle and refractivity profiles can be assimilated into NWP models directly, these data are not convenient for many research studies of atmospheric processes and phenomena. Accordingly, efforts have long been made to provide data products of greater geophysical relevance and, ultimately, to make RO data more useful to the science community. Specifically, these decades to produce RO-based profiles of temperature, moisture, and pressure [38]. These 1D-Var profiles show good agreement with radiosonde observations [39][40][41][42][43] and have been used by numerous researchers around the world for many research problems, e.g., investigation of atmospheric processes and structures [44][45][46], validation of other observing systems [47,48], and testing of physical models and retrieval algorithms [49,50].
As the 1D-Var profiles are increasingly being used, many data users request information regarding the retrieval method and the data it produces. Recently, CDAAC started producing "wetPf2" data with an updated 1D-Var algorithm [51]. Hence, it is an opportune time to introduce data users to the atmospheric 1D-Var used by CDAAC for data production. This paper describes the 1D-Var algorithm and presents some validation results. The description here is not intended to provide an in-depth theoretical treatise, but the central ideas and some specific implementation details on which most of the data users' inquiries were focused.

GNSS RO Overview
Many previous studies, e.g., [1][2][3]27] described the observing principle of the RO technique. Only a brief overview of the observation technique is given here, focusing on the parts pertinent to the atmospheric 1D-Var. The primary observables are the GNSS signal's phase and amplitude recorded by a receiver onboard a LEO satellite during an occultation event, which occurs over a brief period (a few minutes) when a GNSS satellite sets or rises behind the Earth's atmospheric limb from the standpoint of the LEO satellite. By relating the time-frequency information content of the occulted signal to the time-varying occultation geometry, one can determine a profile of the bending angle as a function of the impact parameter, e.g., [52,53]. While the ionosphere exerts considerable influence on the ray's bending, the bulk of the first-order ionospheric effect can be eliminated by using two carrier frequencies and forming a linear (the so-called ionosphere-free) combination of the dual-frequency observations at some point in the data processing chain.
The inverse Abel transform of the bending angle gives the refractive index [54,55]: where α is the bending angle, a is the impact parameter, and n is the refractive index that can be approximated in the neutral atmosphere as follows [56]: where N is the refractivity, T is the temperature in K, p is the (total) atmospheric pressure in hPa, and e is the water vapor pressure in hPa. The coefficients are k 1 = 77.6 hPa K −1 and k 2 = 3.73 10 5 K 2 hPa −1 .
In a dry atmosphere, the moisture term (second rightmost) vanishes and as a result the refractivity is proportional to the air density ρ D as per the ideal gas law: The sequential application of Equations (3)-(5) is the traditional dry retrieval mentioned earlier. With the promise that the temperature and pressure in Equation (2) are closely known, the water vapor is separable. This corresponds to the traditional humidity retrieval.

The UCAR 1D-Var 2.2.1. Theoretical Framework
The COSMIC 1D-Var (hereafter, referred to as the 1D-Var) searches for the best likelihood thermodynamic states of the atmosphere for a given set of the observation, a priori profile, and their error covariances. Such a solution attains the minimum of a quadratic cost function when all the relevant errors are random with Gaussian probability density functions [57]: where J is the cost function, x is the state vector, x b is the background (a priori x), y o is the observation vector, H is the observation operator (often called the forward model), B is the error covariance matrix (ECM) of x b , R is the ECM of y o , and superscripts "T" and "−1" denote the matrix transpose and inverse, respectively. The observation vector represents a RO profile which can either be of refractivity (as a function of geometric height) or of bending angle (as a function of impact parameter). The state vector comprises atmospheric profiles of temperature, pressure, and moisture and are required for the forward modeling. The observation operator converts the state variables into the RO observed physical quantity and interpolates it to the observed heights. The state and observation vectors and their ECMs can be large and the observation operator can be complex. Hence, it is generally impossible to acquire an analytic solution to the minimization problem. The 1D-Var searches for the minimum of J through a Newton-type iteration for which the gradient of J is calculated with the adjoint technique [20,21], which provides an efficient means to evaluate the gradient. Practical applications do not often use Equation (6). Instead, mathematically equivalent yet computationally efficient forms are used. The 1D-Var makes use of the incremental formulation [58]: where , and H is the linearized version of the observation operator. The incremental formulation reduces the computational complexity of the minimization problem by circumventing the nonlinearity of H. Taking one step further, the 1D-Var employs the control-variable transform [23,59] which introduces a control vector v to rewrite Equation (7) as follows: where v = B −1/2 δx with B 1/2 being a square root of B. The control-variable transform turns B into an identity matrix-which is trivial to deal with [60]-and lowers the condition number of the Hessian of J, leading to faster convergence of the minimization [61]. To further accelerate the minimization, the 1D-Var reduces the elements of the control vector. Stratospheric humidity in the heights above 30 km is excluded. The pressure profile is Remote Sens. 2022, 14, 5614 5 of 32 reduced to a single-level (bottom) pressure. This is because the pressure profile contains redundant information to the extent that the hydrostatic balance is valid. In addition, pressure errors are highly correlated with themselves and with temperature and moisture errors, which impedes the search of the steepest descent of the cost function. The minimization starts with v = 0 and each iteration evaluates J and ∇ v J to update v in a manner similar to the Newton's method: where superscripts indicate the numbers of iterations, f (∇ v J) is the direction of steepest descent, and γ is the step size. The steepest direction and the step size are determined with a quasi-Newtonian limited-memory algorithm for large-scale optimization [62], which takes into account the past few iterations. In most cases, the minimization converges in less than 50 iterations. The convergence is declared when the change of the cost function with iteration is trivial and the norm of the gradient falls below a prescribed threshold. The control vector acquired upon completing the minimization, v a , gives the analysis increment, δx a = B 1/2 v a , which in turn furnishes the analysis vector, x a = x b + δx a . The retrieval process does not require the conversion of the state vector into the control vector, v = B −1/2 δx. What is needed instead is the reverse transform, δx = B 1/2 v, to compare observations to their model equivalents. This is important since the inversion of a large matrix is expensive and less stable, and even impossible when the condition number is large [63]. On the other hand, B 1/2 , which can be any matrix that satisfies B = B 1/2 B T/2 , is definable with a series of algebraic operations that transform the control vector into the state vector [64,65].

Observation Operators
The observation types admissible to the 1D-Var are neutral atmospheric bending angle and refractivity. The 1D-Var obtains these observations from a lower-level data product of CDAAC (atmPrf). Kuo et al. [66] and Schreiner et al. [67] describe the data processing leading to the atmPrf. The observation operator involves the forward observation modeling, which comprises the conversion of the state variables into the observed physical quantity and the vertical interpolation to the data points of the observation. Considering that the interpolation is trivial, we focus on the physical conversion. In addition, the refractivity represents a local state of the atmosphere which can be computed with Equation (2). The bending angle can be modelled using the forward Abel transform, the inverse of Equation (1): The forward Abel transform provides the bending angle profile corresponding to a refractivity profile given as a function of the height. Unfortunately, there are a few practical difficulties to overcome: a singularity at the lower limit, an infinite upper bound, and discretization error. One can avoid the singularity at x = a by using quadrature rules that do not require the integrand evaluation at the endpoint. The default discrete Abel transform used in the 1D-Var, amongst many options available, is a form based on the relation dx (x 2 −a 2 ) 1/2 = ln x + x 2 − a 2 1/2 [16]: where x = nr is the refractional radius and subscripts indicate grid indices in top-down order. This form is chosen mainly because of its numerical stability. While some other forms based on higher-order quadrature rules tend to be more accurate, they suffer from occasional overshooting when adjacent levels come too close in the refractional radius, i.e., x i − x i+1 → 0 . The overshooting occurs when the refractivity gradient approaches the critical refraction threshold, −157 km −1 , regardless of the layer's physical (altitudinal) depth, z i − z i+1 . In addition, the refractivity gradient becomes more difficult to evaluate accurately when x i − x i+1 → 0 . Moreover, the computed refractivity gradient (and the physical interpretation) becomes dubious when the refractional radius is inverted, i.e., x i+1 > x i . In the case of a super-refraction, the modeled bending angle is not a monotonic function of the height. To circumvent these difficulties, we check the monotonicity of the refractional radius before evaluating the forward Abel transform. To do so, we examine the refractivity gradient downwards, starting from 5 km. When the refractivity gradient negatively exceeds −150 km −1 for the first time, super-refraction is assumed to occur within the layer. In this circumstance, we truncate a priori profiles at the layer's top and exclude the lower part from being used for the discrete forward Abel transform. Using a finite upper bound for the infinite integral can cause significant modeling error at high altitudes. For instance, when Equation (10) is evaluated with NWP data and the integrand above the model top is disregarded, the computed bending angle is to be smaller than the observation. The modeling error can be reduced by raising the upper bound. To do so, however, the atmospheric structure above the model top must be specified by some means. A simple approach that can serve the purpose is employing the assumption that the refractivity above the model top continues to decrease exponentially, which is widely exercised in data assimilation where no information is available about the atmosphere outside of the model domain. However, the assumption of exponentially varying refractivity is equivalent to assuming that the temperature above the model top remains constant, which hardly ever occurs in the real atmosphere. For instance, the temperature in the thermosphere often exceeds 1000 • C, far greater than the typical stratospheric temperature. The reason is that as the density becomes less and less with increasing height, the atmosphere responds more strongly to solar insolation. For this reason, we specify the temperature above the model top using the empirical model of the US Naval Research Laboratory (NRL), MSIS [68]. The rationale behind this approach is that the temperature variability in the mesosphere and thermosphere can be explained to a good degree by atmospheric tides, seasonal insolation change, and the solar cycle, for which MSIS can serve as an acceptable first-order model. The air pressure is hydrostatically reconstructed from the MSIS temperature in the bottom-to-top direction under the assumption of a constant air composition, as done by [69]. This applies only to non-linear forward modeling.
The accuracy of the discrete Abel transform is dependent on the step size of the vertical integration. Because NWP data are available on finite model levels, Equation (11) can result in a considerable numerical error when evaluated only with the data points on those model levels. We reduce the discretization error by integrating the discrete Abel transform on a higher-resolution grid.
Besides the difficulties discussed so far, there is another issue to consider. Typically, the bending angle observation operator includes the refractivity observation operator-using temperature, pressure, and humidity the observation operator calculates the refractivity from which the bending angle is evaluated. The observed bending angle is provided as a function of the impact parameter, while all other variables are functions of the height. The refractional radius (=nr) is the model equivalent of the impact parameter at the ray's tangent point. The problem is that the simulated refractional radius, let alone the modeled bending angle, contains a model error due to imperfect temperature, pressure, and humidity. In this so-called errors-in-variables problem, the error of the independent variable (i.e., refractional radius) emerges as the error of the dependent variable (bending angle). This extra forward modeling error leads to suboptimal use of bending angle observation [38].
To address this issue, we undertake two separate, sequential minimizations. The first is carried out in the observation space (impact parameter coordinate) and the other in the model space (height coordinate). The first step replaces the Abel inversion used in the Remote Sens. 2022, 14, 5614 7 of 32 standard RO data processing, Equation (1), with a variational inversion of Equation (10). The primary concern regarding Equation (1) is that the standard Abel inversion accumulates observation errors at different altitudes and propagates them downward. For instance, an erroneous spike in observed bending angle severely deteriorates the retrieved refractivity in lower altitudes, regardless of the local quality of the bending angle observation there. Furthermore, the standard Abel inversion builds up systematic observation error, e.g., negative observation bias in the lower troposphere. The variational inversion aims to overcome these difficulties by combining observation error mitigation and refractivity recovery into an estimation problem. It provides the best likelihood refractivity-in light of observed bending angle, a priori refractivity, and their error covariances-as a function of the refractional radius. In simplest terms, the best refractivity profile is the one that fits the observed bending angle profile the most closely. As can be inferred from Equation (10), no information propagates back to temperature, pressure, and water vapor pressure at this stage. This variational inversion offers many advantages over the Abel inversion [38].
The second step is a regular 1D-Var procedure which takes the optimal refractivity acquired from the previous step as observation to give temperature, moisture, and pressure profiles. Here, the prior step's analysis (retrieval) error serves as the observation error, and the observation's height is determined by dividing the refractional radius by the optimal refractive index. The control vector's elements used in this step are a temperature profile (covering the entire height range), a profile of the pseudo (in reference to the background temperature) relative humidity (<30 km), and the pressure at the lowest height. This step constrains the state variables so that the virtual temperature and pressure are in hydrostatic balance, and the moisture remains non-negative and sub-saturated.
The essence of this two-step approach lies in that the data inversion problem is broken down into separate well-defined minimization problems in which the observation and the solution reside in the same coordinate all the time, circumventing the transform between the height and the refractional radius-and the uncertainty involved-during each minimization. Hereafter, the first minimization step is referred to as refractivity inversion and the second as the retrieval step.

A Priori Profiles
At the outset, we provide the 1D-Var with a priori temperature, moisture, and pressure profiles. Typically, we extract these profiles from a trace of NWP model data at the observed time and along the trajectory of rays' tangent points. Because the tangent points drift horizontally with the height, this extraction requires an iterative procedure to accurately determine the intersections between the tangent points' trajectory and the horizontal surfaces of individual model levels. The profile extraction is carried out prior to the 1D-Var and is thus not a part of the observation operator.
The 1D-Var prepares the background (profiles) by interpolating a priori profiles vertically to a computational grid on which the state vector's elements are located. The computational grid is defined in a generalized coordinate η as: where z b and z t indicate the grid's bottom and top, respectively. We set the top height to 150 km for the refractivity inversion, and the non-linear observation operator of this step uses MSIS internally to model the refractivity above the top height. The top height is set to 80 km for the retrieval step. The bottom of the computational grid varies from one RO event to another. It is set to the highest of the following: (1) the lowest tangent point of the observation, (2) the terrain height at the geophysical location, and (3) the top of the highest a priori super-refraction layer. In the rare case that z b is lower than the bottom of a priori profiles, we use a simple extrapolation: the specific humidity is held constant down to z b (assuming a well-mixed boundary layer), Remote Sens. 2022, 14, 5614 8 of 32 a standard lapse rate (6.5 • C km −1 ) is assumed for the temperature, and the pressure is obtained by integrating the hydrostatic equation downward.
The computational grid is a stretched one, meaning that we configure η values in the way that the layer depth increases gradually with increasing height. The stretched grid is similar to the vertical grids of NWP models, which tend to widen at stratospheric heights. If a regular, nonstretched grid were used, the upper part of the computational grid would hold highly redundant information of the a priori profiles. In that case, the a priori profiles would overcontribute to the cost function, preventing the 1D-Var solution from being constrained adequately by the observation. We chose the number of η levels considering the vertical resolution of the observation. The diffraction-limited resolution, which applies to the bending angles obtained by the geometrical optics method, is known to be 100-1500 m [1,2]. However, when wave optics methods are employed, the resolution can be enhanced by a factor of 10 [70] or higher to a few tens of meters [71]. To this end, the 1D-Var uses 860 η levels on which the layer depth increases progressively from 30 m at the bottom (when z b equals the mean sea level) to 3 km at the top for the refractivity inversion step.
Another reason for using such a dense grid, roughly an order of magnitude denser than today's NWP model grids, is that it reduces the discretization error of the Abel transform. While interpolations to different grids result in different errors, the interpolation error is seldom sensitive to the output (computational) grid for smooth input data sets such as NWP profiles. Moreover, univariate data interpolation is a well-defined problem for which many well-tested methods exist. The point is that the interpolation error is generally much smaller than the discretization error of the Abel transform [38]. As a result, using a fine grid improves the accuracy of the discrete Abel transform. For the non-linear observation operator, the 1D-Var uses a monotone piecewise cubic interpolation, which prevents overshooting by preserving the monotonicity of the input data set. The interpolation assumes that the temperature varies linearly with height, while the pressure and the moisture vary log-linearly. Currently, atmPrf provides bending angle profiles of equally spaced data points. The data interval is about 20 m, exceeding the theoretical resolution limit of the observation. We thin the observations so that each layer of the computational grid holds no more than three observation data points. While the specifics of the thinning are somewhat arbitrary, we intend to retain as much observational information as possible. The thinning is insignificant at heights below 30 km.
We use MSIS to specify the atmospheric states above the top of NWP models. In doing so, an abrupt switch to MSIS would cause a discontinuity at the top of the NWP data. To prevent such a discontinuity, we transition temperature from NWP to MSIS as follows: where z * b (z * t ) is the lower (upper) boundary of the transition layer and ∆T is the departure The temperatures below z * b and above z * t are identical to T NWP and T MSIS , respectively. Since MSIS is largely irrelevant to current weather, z * b is currently set to the top of NWP data. The depth of the transition layer, ∆z c = z * t − z * b , is 5 km. The humidity is constrained in such a manner that the specific humidity is required to be not less than one part per million by weight at all heights. After filling in the computational grid with the temperature and the moisture, we reconstruct the pressure by integrating the hydrostatic equation in the bottom-to-top direction. In doing so, we assume a uniform air composition for simplicity's sake. The hydrostatic pressure reconstruction completes the preparation of the background states. The backgrounds consist of hydrostatically balanced temperature, moisture, and pressure profiles on the computational grid.

Error Covariance Matrices
Error Covariance Matrices (ECMs) play vital roles in modern data assimilation and retrieval by determining how the information is spread in space and among the state variables. Accurate specification of ECMs is of prime importance in 1D-Var because the limited (1D) spatial scope disallows 1D-Var the use of some useful resources available to multi-dimensional systems such as other horizontally nearby observations, and model dynamics and balance operators involving 3D atmospheric structures and flows. Accurate ECMs are also important for 1D-Var to tackle the underdetermined retrieval problem.
The observation error used in the 1D-Var varies with the latitude, month of the year, and height. Wee [38] describes the methodology and procedure used for the error estimation. Here, we give a brief summary. We estimated the observation error for bending angle and refractivity by applying the Hollingsworth-Lönnberg method [72] to~1.5 million nearby pairs of RO profiles available for seven years, April 2007-April 2014. This approach uses the innovation statistics and rests on the assumption that background errors are mutually correlated in space, whereas observation errors are uncorrelated with themselves and the background errors. If the assumption holds, the change of innovation covariance with the separation (distance between paired soundings) is exclusively attributable to the spatial correlation of the background error. The error estimation proceeds with a least-squares fitting of distance-binned covariance values to an idealized (e.g., Gaussian) covariance model. The best-fitting covariance model at zero separation gives the background error variance, the subtraction of which from the innovation variance provides the observation error variance. Henceforward the square root of the observation (background) error variance is referred to as the observation (background) error. The observation error has shown significant latitudinal, altitudinal, and seasonal variations. Consideration of these spatiotemporal variations is vital for 1D-Var to specify a statically precise (locally accurate) observation error at the location of individual RO profiles. The observation ECMs are assumed diagonal for the sake of computational simplicity.
We estimated background ECMs using the NMC (National Meteorological Center) method [23], assisted by the Hollingsworth-Lönnberg method. The NMC method relates the background error to the difference between short-and long-term forecasts that are valid at the same time: where B N MC is a provisional background ECM, δf is the forecast difference, and E(·) is the statistical expectation. The square roots of the diagonal elements of B N MC give a measure of the forecast error, σ N MC f . The NMC method is opportune in that it offers the error estimate in all desired areas and times (e.g., on every model grid point) without being restricted by the coverage of observation networks. The resulting-otherwise challenging to obtain-error estimate is continuous in space and consistent among the model's state variables.
The NMC method is mainly empirical and lacks a theoretical basis [73]. A major limitation is that the choice of forecast lead times, which determines the variance of forecast differences, is arbitrary. Consequently, the forecast differences must be adjusted in magnitude to provide a realistic estimate of the background error [59,74]. We used 12 and 24 h operational forecasts from the European Center for Medium-Range Weather Forecasts (ECMWF). We adjust σ N MC f toward σ HL b , the background error of proper magnitude estimated using the Hollingsworth-Lönnberg method. Specifically, σ N MC f is multiplied with a scaling factor, S = σ HL b ·(σ N MC f ) −1 , to give the final background error σ b . We assume that the scaling factor remains constant in the horizontal direction and does not depend on the state variables. Under these assumptions, we determine the scaling factor by comparing refractivity error estimates. We use the resulting scaling factor to adjust σ N MC f in temperature, humidity, and pressure. In the stratosphere, the scaling factor showed a gradual increase with height, which is consistent with the fact that the NMC method underestimates the background error in observation-sparse areas. We use a second-order polynomial fitting to approximate the global-mean height dependence. Figure 1 shows the annual-and zonal-mean σ b in temperature and moisture. The temperature σ b is prominent around the tropical tropopause (~17 km), shows a sudden break over the subtropics, and gradually descends poleward to become~10 km at the poles. The tropical stratosphere shows a significant temperature σ b , which is about 2 • C in the height range of 20-30 km and does not change much with latitude and height. The large stratospheric temperature σ b in the tropics might be due to vertically propagating equatorial waves, which the forecast model cannot fully resolve. On the other hand, in the upper stratosphere and mesosphere, temperature σ b is more prominent over polar regions, which might relate to the strong thermal gradients associated with polar vortices and sudden stratospheric warming events. Specific humidity σ b (Figure 1b) peaks at about 2 km, corresponding to the average top height of cloud-topped marine boundary layers. The specific humidity σ b decreases rapidly with increasing height and becomes insignificant above 10 km. Relative humidity σ b (Figure 1c) shows an entirely different pattern. The moisture σ b in relative humidity is no longer significant in the lower troposphere. Instead, the relative humidity σ b is most prominent in the upper troposphere, a few kilometers below the tropopause. The difference between specific humidity and relative humidity in the background error presents the difficulty in prescribing the moisture σ b . In the troposphere, relative humidity σ b is less variable than specific humidity σ b , which varies by a few orders of magnitude. Hence, the use of relative humidity σ b eases the error specification, which is one of the reasons that the 1D-Var uses the (pseudo) relative humidity as the moisture control variable.
gradual increase with height, which is consistent with the fact that the NMC method underestimates the background error in observation-sparse areas. We use a second-order polynomial fitting to approximate the global-mean height dependence. Figure 1 shows the annual-and zonal-mean in temperature and moisture. The temperature is prominent around the tropical tropopause (~17 km), shows a sudden break over the subtropics, and gradually descends poleward to become ~10 km at the poles. The tropical stratosphere shows a significant temperature , which is about 2 °C in the height range of 20-30 km and does not change much with latitude and height. The large stratospheric temperature in the tropics might be due to vertically propagating equatorial waves, which the forecast model cannot fully resolve. On the other hand, in the upper stratosphere and mesosphere, temperature is more prominent over polar regions, which might relate to the strong thermal gradients associated with polar vortices and sudden stratospheric warming events. Specific humidity ( Figure 1b) peaks at about 2 km, corresponding to the average top height of cloud-topped marine boundary layers. The specific humidity decreases rapidly with increasing height and becomes insignificant above 10 km. Relative humidity ( Figure 1c) shows an entirely different pattern. The moisture in relative humidity is no longer significant in the lower troposphere. Instead, the relative humidity is most prominent in the upper troposphere, a few kilometers below the tropopause. The difference between specific humidity and relative humidity in the background error presents the difficulty in prescribing the moisture . In the troposphere, relative humidity is less variable than specific humidity , which varies by a few orders of magnitude. Hence, the use of relative humidity eases the error specification, which is one of the reasons that the 1D-Var uses the (pseudo) relative humidity as the moisture control variable.  Figure 2 shows the horizontal structure of the annual-mean background error. In the Northern Hemisphere, the temperature at the height of 5 km and the surface pressure are larger over the storm track regions in the Pacific and Atlantic Oceans. Over the Southern Ocean, they are larger along the circumpolar storm track. The specific humidity at 2 km is greater in warm latitudes and largest in the intertropical convergence zone (ITCZ). In contrast, relative humidity is more significant in higher latitudes. The background error exhibits strong geographical dependences related to spatially varying atmospheric structures and processes, observation coverages, and model deficiencies. The  Figure 2 shows the horizontal structure of the annual-mean background error. In the Northern Hemisphere, the temperature σ b at the height of 5 km and the surface pressure σ b are larger over the storm track regions in the Pacific and Atlantic Oceans. Over the Southern Ocean, they are larger along the circumpolar storm track. The specific humidity σ b at 2 km is greater in warm latitudes and largest in the intertropical convergence zone (ITCZ). In contrast, relative humidity σ b is more significant in higher latitudes. The background error exhibits strong geographical dependences related to spatially varying atmospheric structures and processes, observation coverages, and model deficiencies. The 1D-Var uses background errors represented on a 5 • × 5 • (latitude-longitude) grid for each month of the year.
The background error at heights above the model top relates to the uncertainty of MSIS. Following [69], the uncertainty of MSIS is parameterized by the spread of the MSIS states obtainable under diverse assumed conditions. We produced 120 different MSIS realizations for a given geophysical location and month with varying combinations of the key control parameters of the empirical model: hour of the day, day of the month, 10.7 cm solar radio flux (F10.7), and the geomagnetic amplitude index (Ap). The last two parameters, F10.7 and Ap, relate to the strength of solar activity. The spread of these MSIS realizations is scaled to match the NWP-based σ b in the uppermost 10 km of ECMWF forecasts. The size-adjusted spread is the background error in MSIS heights.
The off-diagonal elements of B represent inter-element error correlations. The NMC method furnishes a complete B that includes cross (inter-variable) correlations and auto (self) correlations, which is a virtue of the NMC method considering that the off-diagonal elements (especially those of multivariate correlation) are otherwise difficult to estimate. We used ECMWF forecasts to build B because the model top and vertical resolution are higher than any other model forecasts available to us, permitting us to estimate B at high altitudes. In practical use, however, short-term ECMWF forecasts may not always be the source of a priori profile. In that case, it is desirable to allow the 1D-Var to have the flexibility to adjust σ b , which can be accomplished by constructing error correlation matrix C: where c i,j (b i,j ) is the element of C (B) at i-th column and j-th row, and σ i is the square root of b i,i . We assume that C is generic and not dependent on a priori data source. It is then possible to construct a new B by swapping σ i values with known or desired values while retaining C. 1D-Var uses background errors represented on a 5° × 5° (latitude-longitude) grid for each month of the year. The background error at heights above the model top relates to the uncertainty of MSIS. Following [69], the uncertainty of MSIS is parameterized by the spread of the MSIS states obtainable under diverse assumed conditions. We produced 120 different MSIS realizations for a given geophysical location and month with varying combinations of the key control parameters of the empirical model: hour of the day, day of the month, 10.7 cm solar radio flux (F10.7), and the geomagnetic amplitude index (Ap). The last two parameters, F10.7 and Ap, relate to the strength of solar activity. The spread of these MSIS realizations is scaled to match the NWP-based in the uppermost 10 km of ECMWF forecasts. The size-adjusted spread is the background error in MSIS heights.
The off-diagonal elements of represent inter-element error correlations. The NMC method furnishes a complete that includes cross (inter-variable) correlations and auto (self) correlations, which is a virtue of the NMC method considering that the off-diagonal elements (especially those of multivariate correlation) are otherwise difficult to estimate. We used ECMWF forecasts to build because the model top and vertical resolution are higher than any other model forecasts available to us, permitting us to estimate at high altitudes. In practical use, however, short-term ECMWF forecasts may not always be the source of a priori profile. In that case, it is desirable to allow the 1D-Var to have the flexibility to adjust , which can be accomplished by constructing error correlation matrix C: where , ( , ) is the element of C ( ) at -th column and -th row, and is the square root of , . We assume that C is generic and not dependent on a priori data source. It is then possible to construct a new by swapping values with known or desired values while retaining C. In fact, this is what allowed us earlier to adjust toward .   Figure 3 shows an example of C used in the 1D-Var, at 60 • N for March. The error correlations widen with increasing height and differ significantly, e.g., the pressure error shows much broader autocorrelations than the temperature error (Figure 3b,c). In addition, they change signs with height and exhibit a fair degree of asymmetry with respect to height: e.g., the autocorrelation of temperature errors. These complex aspects make it challenging to model C accurately, especially the cross-correlations. As inadequate approximation or idealization of C results in loss of information, we decided to use the error correlation matrix in its complete form as provided by the NMC method. We believe that doing so allows the 1D-Var to benefit the most from the error correlation matrix to the extent that the NMC method remains realistic.
To specify B 1/2 which the 1D-Var needs, we first decompose B in the control variables' space, B v , as follows: where D is the diagonal matrix whose diagonal elements correspond to the background error. What followed next is an eigen decomposition of C using the method proposed by [75]: where columns of Σ are eigenvectors of C, which are mutually orthogonal (Σ T Σ = I, where I is the identity matrix), and Λ is the diagonal matrix of eigenvalues. Using Equations (16) and (17), we can rewrite B 1/2 v as: Finally, we can obtain B 1/2 as follows: where P is algebraic operations that convert the control vector into the state vector. In the 1D-Var, P involves two physical transforms: conversion between moisture variables and hydrostatic reconstruction of the pressure profile. The 1D-Var uses the pseudo-relative humidity RH * as the default moisture control variable. As a result, it receives a new estimate of the humidity increment δRH * at each iteration of Equation (9). Thereupon, the 1D-Var converts the humidity increment into the specific humidity increment: where q s b is the saturation specific humidity with respect to the background temperature T b and pressure p b . The specific humidity relates to the water vapor pressure as follows: where ε is a constant (=0.622). The saturation water vapor pressure e s is determined in the following manner: where e s w (e s i ) is the saturation water vapor pressure over water (ice), T 0 is 0 • C, and T i is −23 • C. For e s w and e s i , the 1D-Var uses the formula proposed by [76] as the default option among other formulae available. The transition between e s w to e s i in the second line of Equation (22) represents the freezing inhibition effect of supercooled water.
The lowest-level pressure is the default option for the pressure control variable, and the 1D-Var determines upper-level pressure increments through a hydrostatic balance operator. In a hydrostatically balanced atmosphere, the following relation holds: where ρ v is the virtual air density and the prime denotes the perturbation from the base state. Linearization of the ideal gas law leads to: where T v = T (1 + 0.608 q) is the virtual temperature. Accordingly, Equation (23) can be rewritten as: The discrete form of Equation (25), the hydrostatic balance operator, used for the 1D-Var is: where δ indicates the increment, the variables without δ correspond to the background state, the superscript stands for the vertical level (indexed from bottom to top), and the half-level index, k + 1 2 , denotes the arithmetic mean of adjacent full levels, k and k + 1. (25) is implicit (i.e., p appearing on both sides), and that the 1D-Var uses a non-staggered grid on which all state variables (pressure, temperature, and moisture) are collocated. It is worthwhile to mention that g and dz are absent in Equation (26), which allows one to use Equation (26) in arbitrary vertical coordinates, notwithstanding the change of g with the geometric height.

Equation (26) looks a bit complicated because Equation
where is the virtual air density and the prime denotes the perturbation from the base state. Linearization of the ideal gas law leads to: where = (1 + 0.608 ) is the virtual temperature. Accordingly, Equation (23) can be rewritten as: The discrete form of Equation (25), the hydrostatic balance operator, used for the 1D-Var is: where indicates the increment, the variables without correspond to the background state, the superscript stands for the vertical level (indexed from bottom to top), and the half-level index, + 1 2 , denotes the arithmetic mean of adjacent full levels, and + 1 Equation (26) looks a bit complicated because Equation (25) is implicit (i.e., ′ appearing on both sides), and that the 1D-Var uses a non-staggered grid on which all state variables (pressure, temperature, and moisture) are collocated. It is worthwhile to mention that and are absent in Equation (26), which allows one to use Equation (26) in arbitrary vertical coordinates, notwithstanding the change of with the geometric height. Although the 1D-Var does not require any simplification or heuristic approximation for B, there still are some practical difficulties to overcome. When the 1D-Var uses the entire pressure profile (instead of the lowest-level pressure) as the pressure control variable, B can be as large as 2580 × 2580. Computing 1/2 of that size at the runtime for each RO event is impractical, especially for real-time data processing. On that account, instead Although the 1D-Var does not require any simplification or heuristic approximation for B, there still are some practical difficulties to overcome. When the 1D-Var uses the entire pressure profile (instead of the lowest-level pressure) as the pressure control variable, B can be as large as 2580 × 2580. Computing B 1/2 of that size at the runtime for each RO event is impractical, especially for real-time data processing. On that account, instead of computing B 1/2 v on the fly, the 1D-Var ingests a precomputed B 1/2 v , the cost of which is trifling. The preparation of B 1/2 v is through the following procedure. We evaluate C on the a priori model grid, interpolate it to the computational grid, and compute and store C 1/2 . Because C did not strongly depend on the longitude and its temporal variation appear slow, we compute and store C 1/2 for each season and in 20 • latitude bins. In addition, we prepare C 1/2 for four possible bottom heights of the computational grid (0, 5, 10, and 15 km), among which the 1D-Var picks up the one closest to the actual bottom height. What allows the 1D-Var to use precomputed C 1/2 is that the computational grid has a fixed number of vertical levels, regardless of the bottom height, which is indeed the reason for using η coordinates in the 1D-Var. The bottom of the precomputed C 1/2 can differ by up to 2.5 km from the actual bottom of the computational grid. This difference causes a discrepancy in the grid spacing between the computational grid and C 1/2 , but only slightly because the grid's total vertical extent is far greater than the bottom-height difference. We consider this discrepancy a worthy trade-off for the efficiency. That is to say, the inaccuracy of specifying C due to this discrepancy is insignificant compared to the uncertainty of C itself. We stored the diagonal elements of D on a 5 • × 5 • grid for each month of the year, the expense of which is negligible. The 1D-Var interpolates these diagonal elements available on NWP model levels to the exact heights (rather than to η levels) of the computational grid at the runtime. The 1D-Var constructs B 1/2 v using D and C 1/2 based on Equation (18) and acquires B 1/2 through Equation (19).

1D-Var Response to Single-Level Perturbations of the Observation
The analysis vector x a , which minimizes the cost function in the form of Equation (1), is given by the solution of ∇ x J = 0 [57]: Interpretation of Equation (27) is facilitated by considering a special case in which x b and y o each have only one element, x b and y o , and y o is a direct observation of x and collocated with x b . In that case, the solution x a is: where σ 2 o and σ 2 b are observation error variance and the background error variance, respectively. Equation (28) states that the optimal solution is the combination of x b and y o that weighs x b and y o by their reciprocal error variances (accuracies). Hence, the solution x a lies between x b and y o and is closer to the more accurate one. In practice, however, RO bending angle and refractivity are indirect observations of the state variables and have numerous data points which are irregularly distributed in the computational grid. These observational data points lead to different analysis increments, x a − x b , which interact with each other and propagate to remote locations through the observation operator and error covariances. Moreover, the analysis increments can vary significantly depending on the 1D-Var implementation. For instance, multivariate B tends to cause the structure of the analysis increments to be more complicated. These make it challenging to analyze Equation (27) when applied to actual RO events.
An insightful approach in this regard is to see how a 1D-Var responds to a perturbation in bending angle at a single level. The black line in Figure 4a Figure 4b-d show the response of 1D-Var (i.e., change in the solution) to these perturbations. The 1D-Var attempts to fit the perturbed observation by altering the vertical refractivity gradient at the immediate height of the perturbation. Because the perturbations are of positive sign, the refractivity must increase below the height of the perturbation, or vice versa. In this case, the 1D-Var tends to increase lower-level refractivity, compelled by the Abel transform that propagates the observation information downward.
As can be inferred from Equation (2), an increase in refractivity accompanies a pressure rise, moistening, or cooling. In relative humidity, the 1D-Var analysis is wetter than the background by up to 16-24%, depending on the location of perturbed observation. The maximum cooling is 0.3-0.6 • C. It is noteworthy that a single-level observation perturbation influences broad areas through the background error correlation, e.g., the temperature and moisture responses appear in deep layers that are thicker than 1.5 km. In comparison, a diagonal B leads to pointwise responses, i.e., the temperature and moisture changes occur only at the immediate height that the observation is perturbated. The hydrostatic pressure response is far broader. Notably, observation perturbations induce sizable changes to lower-level pressures. For instance, the observation perturbed at 8 km causes 0.9 hPa pressure change at the lowest level. In this case, the 1D-Var tends to increase lower-level refractivity, compelled by the Abel transform that propagates the observation information downward. As can be inferred from Equation (2), an increase in refractivity accompanies a pressure rise, moistening, or cooling. In relative humidity, the 1D-Var analysis is wetter than the background by up to 16-24%, depending on the location of perturbed observation The maximum cooling is 0.3-0.6 °C. It is noteworthy that a single-level observation perturbation influences broad areas through the background error correlation, e.g., the temperature and moisture responses appear in deep layers that are thicker than 1.5 km. In comparison, a diagonal B leads to pointwise responses, i.e., the temperature and moisture changes occur only at the immediate height that the observation is perturbated. The hydrostatic pressure response is far broader. Notably, observation perturbations induce sizable changes to lower-level pressures. For instance, the observation perturbed at 8 km causes 0.9 hPa pressure change at the lowest level. RO observations have numerous data points (samples) that affect the 1D-Var solution differently. In this circumstance, 1D-Var does not determine its solution at a given height based solely on the local observation, but through consensus among many data samples across a considerable height range. The across-sample consensus makes the 1D-Var robust against outlying data samples. Meanwhile, a low-level bending angle contains information about the upper atmosphere. Therefore, it is difficult for a 1D-Var to closely fit a low-level bending angle unless it can accurately recover the upper atmosphere. Hence 1D-Var owes its effectiveness partly to the global observation fitting and the across-sample consensus. The broad and complicated response of the 1D-Var to a perturbed data sample underscores the importance of accurate error covariances. The 1D-Var likely facilitates an effective correction to the background using a complete observation profile through the consensus among overlapping responses to individual data samples. RO observations have numerous data points (samples) that affect the 1D-Var solution differently. In this circumstance, 1D-Var does not determine its solution at a given height based solely on the local observation, but through consensus among many data samples across a considerable height range. The across-sample consensus makes the 1D-Var robust against outlying data samples. Meanwhile, a low-level bending angle contains information about the upper atmosphere. Therefore, it is difficult for a 1D-Var to closely fit a low-level bending angle unless it can accurately recover the upper atmosphere. Hence, 1D-Var owes its effectiveness partly to the global observation fitting and the across-sample consensus. The broad and complicated response of the 1D-Var to a perturbed data sample underscores the importance of accurate error covariances. The 1D-Var likely facilitates an effective correction to the background using a complete observation profile through the consensus among overlapping responses to individual data samples.

Tests Using Synthetic Data
Verification of 1D-Var with real-world data is challenging. One reason is that the required input error covariances are never perfect. Another is that it is hard to acquire verification data that are accurate, of sufficient amount, and independent of and collocated with 1D-Var retrievals, and even if such data were available, they would still not represent "truth." One way to overcome this difficulty is designing and using a controlled system in which the atmosphere, the observation, and a priori data conform perfectly to our current understanding. For this purpose, we generated synthetic observation and a priori profiles using a high vertical resolution radiosonde profile of the Global Climate Observing System (GCOS) Reference Upper-Air Network (GRUAN) [77]. The radiosonde observation took place at the Lamont site (36.61 • S, 97.49 • W) at 08:30 UT on 4 April 2012 (ID: SGP-RS-01_2_RS92-GDP_002_20120904T083000). We used ECMWF analysis and MSIS to specify the atmospheric state above the balloon-burst point. We applied a lowpass filtering to the radiosonde temperature, pressure, and moisture profiles to reduce the radiosonde instrument noise. These smoothed profiles represent the true atmosphere from which a perfect observation is derived.
We simulated the perfect observation (red line in Figure 5a) by interpolating the true refractivity to the data levels of a nearby COSMIC profile (ID: C002.2012.248.07.24.G21). The interpolation furnishes a realistic observation density in the vertical direction. Next, we generated "observed" refractivity profiles by adding zero-mean, random Gaussian noise to the perfect observation at each data level. The noise equals the observation error in magnitude. By repeating this step with different profile sets of random noise, we obtained 100 realizations of the observation (represented by black dots in Figure 5a). SGP-RS-01_2_RS92-GDP_002_20120904T083000). We used ECMWF analysis and MSIS to specify the atmospheric state above the balloon-burst point. We applied a lowpass filtering to the radiosonde temperature, pressure, and moisture profiles to reduce the radiosonde instrument noise. These smoothed profiles represent the true atmosphere from which a perfect observation is derived. We simulated the perfect observation (red line in Figure 5a) by interpolating the true refractivity to the data levels of a nearby COSMIC profile (ID: C002.2012.248.07.24.G21). The interpolation furnishes a realistic observation density in the vertical direction. Next, we generated "observed" refractivity profiles by adding zero-mean, random Gaussian noise to the perfect observation at each data level. The noise equals the observation error in magnitude. By repeating this step with different profile sets of random noise, we obtained 100 realizations of the observation (represented by black dots in Figure 5a). Aside from the observation, we generated a priori profiles consistent with the diagnostic of the background error covariance described in Section 2.2.4. First, we interpolated the true temperature, pressure, and moisture profiles (red lines in Figure 5b) to ECMWF model levels. These model-level profiles are perfect, but only within the model's vertical resolution limit. In other words, the "perfect" a priori profiles cannot describe subgridscale fluctuations in the radiosonde-based "true" atmosphere. We created a vector of a priori error in the control variables' space as follows: Aside from the observation, we generated a priori profiles consistent with the diagnostic of the background error covariance described in Section 2.2.4. First, we interpolated the true temperature, pressure, and moisture profiles (red lines in Figure 5b) to ECMWF model levels. These model-level profiles are perfect, but only within the model's vertical resolution limit. In other words, the "perfect" a priori profiles cannot describe subgrid-scale fluctuations in the radiosonde-based "true" atmosphere. We created a vector of a priori error in the control variables' space as follows: where is the error vector of size M, ε i is the random, Gaussian noise associated with i-th eigenmode of B v whose eigenvalue and eigenvector are Λ i and Σ i , respectively. What followed next was a scaling to satisfy ∑ We applied linearized physical transforms (hydrostatic integration preceded by conversion to specific humidity) to , which yields a priori error profiles for temperature, pressure, and specific humidity. Finally, we attained model-level a priori profiles by combining these error profiles with the "perfect" a priori profiles. In doing so, we constrained the specific humidity to be positive and subsaturated. We repeated this procedure to produce 100 sets of a priori profiles (represented by black dots in Figure 5b). The noise added to the "perfect" a priori profiles tends to incur numerous super-refraction layers. The occurrence frequency and location of the super-refraction layers are so erratic that it was difficult for us to use bending angle as the observation type. Besides, the refractivity observation operator is local, allowing straightforward interpretation of the testing results. Thus, we chose to use refractivity as the observation. We carried out 10,000 1D-Var retrievals by alternating the combination of observed and a priori profiles, which is analogous to testing the 1D-Var under varying atmospheric conditions and various realizations of weather forecasting. Furthermore, the known true atmosphere permits straightforward evaluation of the retrieval error, freeing us from the uncertainty of real-world data. Figure 6 compares a priori profiles (black dashed lines) and 1D-Var retrievals (red solid lines) in random error, i.e., the standard deviation from the true atmosphere. Smallscale undulations, which are more pronounced in a priori error, are primarily due to the limited vertical resolution of a priori profiles. Since the observation and a priori profile are unbiased, systematic error is irrelevant and not shown here. The 1D-Var shows a remarkable reduction of the temperature error (Figure 6a) at heights above 10 km. Here, the error reduction means comparing retrieval error to a priori error. While still discernible, the temperature error reduction is minor at lower heights. The reduced temperature error accompanies a significant reduction of pressure error (Figure 6b), which is the hydrostatic response to the temperature change, even at the lowest height where the temperature error reduction is relatively small. The specific humidity error is reduced dramatically below 10 km (Figure 6c), where water vapor is abundant. In contrast, it is hardly discernible above 8 km, although the relative humidity shows some error reduction there (Figure 6d). The relatively small temperature error reduction in the lower troposphere is due to the dominance of a priori humidity error. of the super-refraction layers are so erratic that it was difficult for us to use bending angle as the observation type. Besides, the refractivity observation operator is local, allowing straightforward interpretation of the testing results. Thus, we chose to use refractivity as the observation. We carried out 10,000 1D-Var retrievals by alternating the combination of observed and a priori profiles, which is analogous to testing the 1D-Var under varying atmospheric conditions and various realizations of weather forecasting. Furthermore, the known true atmosphere permits straightforward evaluation of the retrieval error, freeing us from the uncertainty of real-world data. Figure 6 compares a priori profiles (black dashed lines) and 1D-Var retrievals (red solid lines) in random error, i.e., the standard deviation from the true atmosphere. Smallscale undulations, which are more pronounced in a priori error, are primarily due to the limited vertical resolution of a priori profiles. Since the observation and a priori profile are unbiased, systematic error is irrelevant and not shown here. The 1D-Var shows a remarkable reduction of the temperature error (Figure 6a) at heights above 10 km. Here, the error reduction means comparing retrieval error to a priori error. While still discernible, the temperature error reduction is minor at lower heights. The reduced temperature error accompanies a significant reduction of pressure error (Figure 6b), which is the hydrostatic response to the temperature change, even at the lowest height where the temperature error reduction is relatively small. The specific humidity error is reduced dramatically below 10 km (Figure 6c), where water vapor is abundant. In contrast, it is hardly discernible above 8 km, although the relative humidity shows some error reduction there (Figure 6d). The relatively small temperature error reduction in the lower troposphere is due to the dominance of a priori humidity error.  Figure 7 shows the true and observed refractivity and a priori profiles. Limited by the low temperature, the amount of atmos- We considered an Arctic case to substantiate this last point. We used the GRUAN observation taken place at the Barrow station (71.32 • N, 156.62 • W) at 18:00 UT on 22 January 2011 (ID: BAR-RS-01_2_RS92-GDP_002_20110122T180000), which is collocated with a COSMIC profile (ID: C004.2011.022.18.00.G31). Figure 7 shows the true and observed refractivity and a priori profiles. Limited by the low temperature, the amount of atmospheric water vapor is small, even in the lower troposphere. Under this atmospheric condition, the 1D-Var reduces the temperature error almost by half throughout the atmosphere (Figure 8a). The diminished temperature error induces a great deal of pressure error reduction, about 80% at the surface (Figure 8b). The substantial error reductions in temperature and pressure contrast with the smaller error reductions in a wet atmosphere ( Figure 6). The dry Arctic air restricts a priori humidity error to be small in terms of water vapor pressure and specific humidity, leaving little room for error reduction (Figure 8c,d). The synthetic data tests show that the 1D-Var produces high-quality retrievals whose error is considerably smaller than a priori error. The tests also indicate that the relative error reduction among the state variables depends on the atmospheric condition and respective a priori errors.

Validation Using NWP Model Data
Thanks to the remarkable advancements in numerical modeling, satellite observations, and data assimilation in the past five decades, NWP nowadays provides realistic descriptions of the atmosphere. Data assimilation systems combine model dynamics and physics with vast observational information from various observing platforms. Therefore, comparisons of RO data with NWP analyses greatly complement the comparisons to in-

Validation Using NWP Model Data
Thanks to the remarkable advancements in numerical modeling, satellite observations, and data assimilation in the past five decades, NWP nowadays provides realistic descriptions of the atmosphere. Data assimilation systems combine model dynamics and physics with vast observational information from various observing platforms. Therefore, Figure 8. Errors of a priori profile (black dashed lines) and 1D-Var retrieval (red solid) for the Arctic case in Figure 7 in terms of: (a) temperature ( • C), (b) pressure (hPa), (c) specific humidity (g kg −1 ), and (d) relative humidity (%). The error is defined as the standard deviation from the known truth.

Validation Using NWP Model Data
Thanks to the remarkable advancements in numerical modeling, satellite observations, and data assimilation in the past five decades, NWP nowadays provides realistic descriptions of the atmosphere. Data assimilation systems combine model dynamics and physics with vast observational information from various observing platforms. Therefore, comparisons of RO data with NWP analyses greatly complement the comparisons to individual observing systems. For example, Rieckh et al. [78] used 10 different model data sets to estimate the random error statistics of RO observations using the three-cornered hat method. Because global NWP data are routinely available, one can always compare them with 1D-Var retrievals, which differs from the comparisons to individual observations (e.g., radiosondes) whose distribution in space and time is irregular and often sparse. This widespread availability of NWP data permits comparisons over large samples and across a broad spectrum of spatiotemporal scales. In particular, the comparison to NWP data is beneficial for understanding how the character of 1D-Var varies over different geographical regions and under disparate atmospheric conditions. The tests and comparisons shown in this section make use of NWP model data.

Comparison against the Previous 1D-Var Version
Two decades ago, UCAR developed an atmospheric 1D-Var to overcome the difficulties faced by pre-existent, traditional retrieval methods, and recently the 1D-Var underwent a significant upgrade. Figure 9 compares two 1D-Var versions, the one described in this paper (NEW), which produces "wetPf2" data, and the previous version (OLD) used to produce "wetPrf" data. Shown here is the root-mean-square difference (RMSD) from the operational ECMWF analysis computed over the COSMIC profiles ("atmPrf") available for 2008-2014. A priori profiles, shared by OLD and NEW, are generated from the operational NCEP analysis, which is independent of the ECMWF analysis to a certain extent. In the temperature RMSD (upper panels), OLD varies significantly with the latitude and height and is significantly larger than NEW. The RMSD of OLD exceeds four degrees in the lower troposphere of some southern latitudes, as opposed to NEW which shows smaller, quasiuniform RMSDs across all latitudes. In the specific humidity RMSD (lower panels), OLD is significant in low latitudes with maxima around 15 • S and 20 • N, increasing toward the surface. Again, OLD is notably larger than NEW in all latitudes and heights.
The leading cause of the disparity between OLD and NEW is the difference in the degree to which 1D-Var fits RO observations. NEW determines the fitting based on the observation error relative to the background error, whereas OLD attempts to fit RO refractivity exactly. With the exact fitting, we intended OLD to reduce the dependence of the 1D-Var solution on the background and squeeze out the most information from observations. However, this exact fitting to imperfect observations comes at a cost. One good example is that an observed refractivity in the lower troposphere has significant negative bias under a cold, dry atmospheric condition, while the background is unbiased. In this case, OLD can reproduce the negatively biased refractivity by either reducing the background humidity or increasing the temperature. Of the two options, the first is not possible because OLD cannot dehydrate the already-dry background. Consequently, OLD seeks a solution warmer than the background, which leads to an excessive temperature correction when the observation bias is large. An extreme background correction can also occur when the observation is much larger than its background counterpart in a saturated atmosphere. Although these circumstances are occasional, the subsequent background correction is so significant that OLD becomes much larger than NEW in temperature and moisture RMSDs. We find that the exact fitting to imperfect observations pursued by OLD leads to suboptimal combinations of retrieved temperature, pressure, and moisture, causing OLD to bear much larger RMSDs. On the other hand, NEW effects modest and spatially slow-varying RMSDs (Figure 9b,d). and height and is significantly larger than NEW. The RMSD of OLD exceeds four degrees in the lower troposphere of some southern latitudes, as opposed to NEW which shows smaller, quasi-uniform RMSDs across all latitudes. In the specific humidity RMSD (lower panels), OLD is significant in low latitudes with maxima around 15°S and 20°N, increasing toward the surface. Again, OLD is notably larger than NEW in all latitudes and heights. Figure 9. Root-mean-square differences of 1DVAR retrievals from operational ECMWF analysis: (a,b) temperature (°C) and (c,d) specific humidity (g kg −1 ). Left panels, (a,c), are for the previous 1D-Var version (denoted OLD), while right panels are for the new 1D-Var (denoted NEW) described in this paper.
The leading cause of the disparity between OLD and NEW is the difference in the degree to which 1D-Var fits RO observations. NEW determines the fitting based on the observation error relative to the background error, whereas OLD attempts to fit RO refractivity exactly. With the exact fitting, we intended OLD to reduce the dependence of the 1D-Var solution on the background and squeeze out the most information from observations. However, this exact fitting to imperfect observations comes at a cost. One good example is that an observed refractivity in the lower troposphere has significant negative bias under a cold, dry atmospheric condition, while the background is unbiased. In this case, OLD can reproduce the negatively biased refractivity by either reducing the background humidity or increasing the temperature. Of the two options, the first is not possible because OLD cannot dehydrate the already-dry background. Consequently, OLD seeks a solution warmer than the background, which leads to an excessive temperature correction when the observation bias is large. An extreme background correction can also occur when the observation is much larger than its background counterpart in a saturated atmosphere. Although these circumstances are occasional, the subsequent background correction is so significant that OLD becomes much larger than NEW in temperature and moisture RMSDs. We find that the exact fitting to imperfect observations pursued by OLD Figure 9. Root-mean-square differences of 1DVAR retrievals from operational ECMWF analysis: (a,b) temperature ( • C) and (c,d) specific humidity (g kg −1 ). Left panels, (a,c), are for the previous 1D-Var version (denoted OLD), while right panels are for the new 1D-Var (denoted NEW) described in this paper.

Comparison of Observation Operators
Another crucial factor differentiating OLD and NEW is the observation type and operator used in 1D-Var. Figure 10 shows the differences of 1D-Var temperature (upper panels) and specific humidity (lower panels) to the ECMWF analysis in the mean (left) and standard deviation (right) in the tropics (marked T), middle latitudes (M), and polar latitudes (P). The middle latitudes are 30-60 • in both Hemispheres. OLD uses the standard RO refractivity, derived via the Abel inversion, as the observation (denoted as OLD N in the legend). While NEW can use the bending angle (NEW BA ) and the refractivity (NEW N ), it takes as the default option the two-step approach-variational refractivity inversion followed by the retrieval step. We want to point out that the observation errors of refractivity and bending angle used in this comparison are inter-consistent. We applied the same estimation (Hollingsworth-Lönnberg) method to refractivity and bending angle profiles belonging to an identical data set to ensure the consistency between refractivity and bending angle in the observation error. The consistency is valid between the observation error and the background error. Attaining these consistencies is vital for fair comparisons between observation types and operators.
Compared to OLD N , NEW N shows much better agreements with the ECMWF analysis in both mean difference and standard deviation, evincing the overall improvement over OLD. The comparison of NEW N with NEW BA shows that the standard refractivity is advantageous over the bending angle in the lower troposphere, mainly due to the posterior height determination [38] in the Abel inversion. On the other hand, NEWBA reduces the standard deviation of temperature slightly in the stratosphere. The 1D-Var with the twostep approach (denoted NEW VR ) outperforms NEW N and NEW BA , strongly supporting using the two-step approach in the 1D-Var. Readers are referred to [38] for more details on the approach. advantageous over the bending angle in the lower troposphere, mainly due to the posterior height determination [38] in the Abel inversion. On the other hand, NEWBA reduces the standard deviation of temperature slightly in the staratosphere. The 1D-Var with the two-step approach (denoted NEWVR) outperforms NEWN and NEWBA, strongly supporting using the two-step approach in the 1D-Var. Readers are referred to [38] for more details on the approach.

Dependence on the Background
1D-Var uses background profiles to enhance the RO observations that are neither perfect nor complete. However, a concern arising in this regard is that the 1D-Var solution is affected by the imperfect background. To investigate this effect, we carried out a statistical analysis using two sets of NEW, one of which (denoted COSMIC ERA ) uses the ECMWF's interim reanalysis (ERA) [79] as the source of a priori profiles and the other (COSMIC GFS ) uses the Global Forecast System (GFS) analysis of the NCEP. Figure 11 compares the RMS difference between GFS and ERA analyses in parallel with the difference between COSMIC ERA and COSMIC GFS . The two separate 1D-Var retrievals, COSMIC GFS and COSMIC ERA , are much closer to each other than GFS and ERA are, indicating that the 1D-Var solution is constrained well by the observation and does not strongly depend on the background. We observed the same from the recently launched COSMIC-2 [80,81] (not shown). The little dependence on backgrounds closely relates to the long-term stability of 1D-Var retrieval, which is one of the essential requirements to enable RO-based climate studies. trievals, COSMICGFS and COSMICERA, are much closer to each other than GFS and ERA are, indicating that the 1D-Var solution is constrained well by the observation and does not strongly depend on the background. We observed the same from the recently launched COSMIC-2 [80,81] (not shown). The little dependence on backgrounds closely relates to the long-term stability of 1D-Var retrieval, which is one of the essential requirements to enable RO-based climate studies.   A and B), which coincide with the times that NCEP made major changes in their forecasting system [82,83]. The ECMWF reanalysis (ERA) uses a frozen analysis system and is thus unlikely to cause these sudden changes. When GFS is used as the background (denoted COSMICGFS), the 1D-Var departure from the background (GFS-COS-MICGFS; upper right) resembles GFS-ERA. Thus, the 1D-Var produces a temporally continuous solution even though discontinuous background is given. While using ERA as the background (COSMICERA), on the other hand, the background departure (COSMICERA-ERA; lower left) does not show any abrupt changes, confirming that neither COSMIC nor ERA but GFS is the cause of the sudden changes. The RMSD between COSMICGFS and COSMICERA (lower right) is remarkably small, indicating that the 1D-Var produces almost identical humidity values regardless of the source of the background. The minor break appearing in COSMICERA-ERA at the end of 2013 seems to be caused by ERA, e.g., a change made to the observing systems used by the reanalysis.   A and B), which coincide with the times that NCEP made major changes in their forecasting system [82,83]. The ECMWF reanalysis (ERA) uses a frozen analysis system and is thus unlikely to cause these sudden changes. When GFS is used as the background (denoted COSMIC GFS ), the 1D-Var departure from the background (GFS-COSMIC GFS ; upper right) resembles GFS-ERA. Thus, the 1D-Var produces a temporally continuous solution even though discontinuous background is given. While using ERA as the background (COSMIC ERA ), on the other hand, the background departure (COSMIC ERA -ERA; lower left) does not show any abrupt changes, confirming that neither COSMIC nor ERA but GFS is the cause of the sudden changes. The RMSD between COSMIC GFS and COSMIC ERA (lower right) is remarkably small, indicating that the 1D-Var produces almost identical humidity values regardless of the source of the background. The minor break appearing in COSMIC ERA -ERA at the end of 2013 seems to be caused by ERA, e.g., a change made to the observing systems used by the reanalysis. As shown in Figures 6 and 8, 1D-Var can retrieve value-added profiles compared to the background. Real-world data tests also confirmed this point. Figure 13 shows an example where COSMIC-2 humidities (marked COSMIC2) are compared against short-term GFS forecasts (marked GFSFCST), which served as the background. We made the comparison by taking ECMWF analysis (ECMWFANAL) as the reference data for the period from 16 July 2019 to 1 February 2020. In the standard deviation from the ECMWF analysis, COS-MIC2 (right panels) is notably smaller than the background (left) in both specific humidity (upper) and precipitable water (lower). Besides validating that 1D-Var can produce valueadded retrievals, this presents the potential of COSMIC-2 data for improving atmospheric humidity in weather analysis and forecasting. Figure 13. COSMIC-2 humidity retrievals (marked COSMIC2) versus short-term GFS forecasts (used as the background; marked GFSFCST), with reference to ECMWF analysis (ECMWFANAL) for the period from 16 July 2019 to 1 February 2020. Left panels are standard deviations of GFSFCST-ECMWFANAL and right panels are those of COSMIC2-ECMWFANAL. Upper panels are for the specific humidity (g kg −1 ) and lower panels are for the precipitable water (mm). As shown in Figures 6 and 8, 1D-Var can retrieve value-added profiles compared to the background. Real-world data tests also confirmed this point. Figure 13 shows an example where COSMIC-2 humidities (marked COSMIC2) are compared against short-term GFS forecasts (marked GFS FCST ), which served as the background. We made the comparison by taking ECMWF analysis (ECMWF ANAL ) as the reference data for the period from 16 July 2019 to 1 February 2020. In the standard deviation from the ECMWF analysis, COSMIC2 (right panels) is notably smaller than the background (left) in both specific humidity (upper) and precipitable water (lower). Besides validating that 1D-Var can produce valueadded retrievals, this presents the potential of COSMIC-2 data for improving atmospheric humidity in weather analysis and forecasting. As shown in Figures 6 and 8, 1D-Var can retrieve value-added profiles compared to the background. Real-world data tests also confirmed this point. Figure 13 shows an example where COSMIC-2 humidities (marked COSMIC2) are compared against short-term GFS forecasts (marked GFSFCST), which served as the background. We made the comparison by taking ECMWF analysis (ECMWFANAL) as the reference data for the period from 16 July 2019 to 1 February 2020. In the standard deviation from the ECMWF analysis, COS-MIC2 (right panels) is notably smaller than the background (left) in both specific humidity (upper) and precipitable water (lower). Besides validating that 1D-Var can produce valueadded retrievals, this presents the potential of COSMIC-2 data for improving atmospheric humidity in weather analysis and forecasting. Figure 13. COSMIC-2 humidity retrievals (marked COSMIC2) versus short-term GFS forecasts (used as the background; marked GFSFCST), with reference to ECMWF analysis (ECMWFANAL) for the period from 16 July 2019 to 1 February 2020. Left panels are standard deviations of GFSFCST-ECMWFANAL and right panels are those of COSMIC2-ECMWFANAL. Upper panels are for the specific humidity (g kg −1 ) and lower panels are for the precipitable water (mm). Figure 13. COSMIC-2 humidity retrievals (marked COSMIC2) versus short-term GFS forecasts (used as the background; marked GFS FCST ), with reference to ECMWF analysis (ECMWF ANAL ) for the period from 16 July 2019 to 1 February 2020. Left panels are standard deviations of GFS FCST -ECMWF ANAL and right panels are those of COSMIC2-ECMWF ANAL . Upper panels are for the specific humidity (g kg −1 ) and lower panels are for the precipitable water (mm).

Comparison to High Vertical Resolution Radiosondes
Despite being useful for evaluating the overall quality of 1D-Var retrievals, NWP data-especially the analyses-tend to be correlated with 1D-Var retrievals in their errors mainly because RO observations are used in common by data assimilation and 1D-Var. A solution to this problem is to compare 1D-Var retrievals with independent data, e.g., radiosonde observations. On that ground, we compared 1D-Var retrievals with high vertical resolution (1 s sampling rate;~5 m in depth) radiosonde profiles in close proximity (within 150 km and 90 min) of COSMIC profiles for the period from May 2006 to August 2015. These radiosonde profiles are from the stations operated with the Vaisala RS92 system and by the National Oceanic Atmospheric Administration (NOAA), available online: ftp://ftp.ncdc.noaa.gov/pub/data/ua/rrs-data (accessed on 31 October 2022). The radiosonde observation network spans multiple latitudes, including tropical areas and the Alaska region. This radiosonde data set offers an excellent representation of small-scale atmospheric structures, thanks to the balloon's precise position recorded at every moment of the flight [38]. Figure 14 shows mean differences (lower panels) and standard deviations (upper) from the radiosonde observations. As a benchmark for 1D-Var retrievals, we also compare the dry RO temperature and pressure (denoted DRY). At heights above 12 km where water vapor is scarce, DRY shows a remarkable agreement with the collocated RS92 profiles. The dry temperature is cooler than the radiosonde temperature, but by less than 0.6 degree, with a standard deviation ranging from 1.0 to 2.3 degrees. The dry pressure also exhibits a slight (<0.25%) negative bias and a small (0.3-1.1%) standard deviation. OLD follows DRY very closely in the temperature and pressure in this height range, showing almost identical mean differences and marginally smaller standard deviations. The compatibility with DRY in stratospheric heights is what we intended while designing OLD; the primary purpose of OLD was to extend DRY to lower, moisture-abundant heights.
The agreement between DRY and radiosonde in the lower altitudes deteriorates rapidly with decreasing height. On the contrary, OLD maintains small mean differences and standard deviations from radiosonde temperature and pressure down to the surface. Furthermore, NEW shows significantly better agreements with radiosondes than OLD does, throughout the entire height range, in both measures of the mean difference and the standard deviation, and in all variables compared, confirming the improvement over OLD demonstrated earlier in comparison with ECMWF analysis.
The pressure produced by NEW exhibits no apparent bias up to the height that balloon bursts start deteriorating radiosonde pressures (~27 km). NEW shows significantly smaller pressure standard deviations than OLD by a factor of 3 at the surface. The pressure field, which describes the atmospheric mass distribution, is larger in spatial scales and varies more slowly than the temperature and moisture fields. As a result, the representativeness error is less of a concern when comparing the pressures. In addition, the pressure error corresponds to the temperature and moisture errors accumulated in a relatively deep layer for which the real atmosphere remains close to a hydrostatic balance. Therefore, the pressure error is a good metric for evaluating the overall performance of RO retrieval. It is encouraging that NEW shows noticeably better agreements with radiosonde than OLD in pressure.
Humidity comparison is challenging because atmospheric humidity varies enormously in space and time, e.g., radiosondes flown together often show relative humidity differences of a few tens of percent, e.g., [84]. Despite that, 1D-Var humidities demonstrate good agreements with radiosonde humidities. The specific humidity standard deviations increase toward the surface, and NEW displays persistently smaller standard deviations than OLD at heights below 6 km. The 1D-Var specific humidities are negatively biased with respect to radiosondes, which is consistent with the well-known negative bias of RO observation in the lower troposphere. NEW is cooler than radiosonde in the lower troposphere, although we expect a warm bias from the negative observation bias. While the cause is uncertain, it could be a warm radiosonde or cold background bias. Nevertheless, the temperature bias is minor, with a minimum value of −0.25 • C at about 2.5 km.
In the light of the Hollingsworth-Lönnberg method described in Section 2.2.4, the difference between a nearby RO-radiosonde pair depends on the separation in space and time as well as the errors of RO and radiosonde. When all contributors to the difference are uncorrelated, the RO-radiosonde standard deviation σ t can be approximated as: where σ RS is radiosonde observation error, σ RO is RO retrieval error, and σ d is atmospheric variation across the separation. At zero separation σ d vanishes, leading to σ t = σ 2 RO + σ 2 RS 1/2 . Figure 15 shows NEW-radiosonde standard deviations for different collocation thresholds. The difference in the mean, which is insensitive to the separation, is not shown here. The collocation is pursued on a level-by-level basis (rather than profile-by-profile) to consider the horizontal drifts of RO tangent points and the balloon. Therefore, a NEW-radiosonde pair is partially excluded from the comparison when the separation on each height level exceeds the threshold.
OR PEER REVIEW 24 of 31 cause is uncertain, it could be a warm radiosonde or cold background bias. Nevertheless, the temperature bias is minor, with a minimum value of −0.25 °C at about 2.5 km.
In the light of the Hollingsworth-Lönnberg method described in Section 2.2.4, the difference between a nearby RO-radiosonde pair depends on the separation in space and time as well as the errors of RO and radiosonde. When all contributors to the difference are uncorrelated, the RO-radiosonde standard deviation can be approximated as: where is radiosonde observation error, is RO retrieval error, and is atmospheric variation across the separation. At zero separation vanishes, leading to = ( 2 + 2 ) 1/2 . Figure 15 shows NEW-radiosonde standard deviations for different collocation thresholds. The difference in the mean, which is insensitive to the separation, is not shown here. The collocation is pursued on a level-by-level basis (rather than profile-byprofile) to consider the horizontal drifts of RO tangent points and the balloon. Therefore, a NEW-radiosonde pair is partially excluded from the comparison when the separation on each height level exceeds the threshold.  Figure 15 reveals that the standard deviation decreases rapidly as the separation narrows. We can estimate the hypothetical obtainable at zero separation by extrapolating the standard deviation profiles of different thresholds. The at zero separation comprises observation errors of RO and radiosonde, containing the representativeness error, which is shared by the observation errors in an unknown ratio. The representativeness error in this RO-radiosonde comparison may be large because the two high-vertical-reso-  Figure 15 reveals that the standard deviation decreases rapidly as the separation narrows. We can estimate the hypothetical σ t obtainable at zero separation by extrapolating the standard deviation profiles of different thresholds. The σ t at zero separation comprises observation errors of RO and radiosonde, containing the representativeness error, which is shared by the observation errors in an unknown ratio. The representativeness error in this RO-radiosonde comparison may be large because the two high-vertical-resolution data sets describe small-vertical-scale atmospheric structures very differently. Hence, the σ t extrapolated to zero separation is still a conservative upper bound of the error of NEW. Considering all these factors, we find that the agreement between NEW and radiosonde is excellent and that the 1D-Var provides accurate and precise RO-based profiles.

Discussion
It is worth touching on the similarities and differences between 1D-Var and v tional assimilation of RO data into NWP models (hereafter referred to as data assi tion). The relationship with data assimilation is a question often raised by data user which has not been discussed so far. We find that the comparison to data assimil offers a different new angle that helps us understand 1D-Var better. Data assimilation 1D-Var are based on the same principle. Both methods attempt to combine the com mentary information from observations and a priori data into an optimal analysis mate) of the atmospheric states of relevance. Nonetheless, they have diverging purp and practices, apart from the difference in the problem's dimensionality. The diffe in purpose between 1D-Var and data assimilation can be said that 1D-Var utilizes a p information to enhance and interpret observations; in comparison, data assimilation the observations to improve weather forecasts by bettering the model's initial cond This difference may sound subtle but can lead to profoundly different perspectives outcomes.
A few practical differences between 1D-Var and data assimilation include: (1) assimilation produces global analyses valid at certain fixed times (i.e., snapshots at optic times) using observations irregularly distributed in time and space. Hence, da similation uses observations to obtain an analysis that is continuous and consistent a scales by propagating observational information to data-sparse regions and times. 1D seeks its solution in a very narrow subspace: along the trajectory of rays' tangent p at the exact time of the observation, which is somewhat similar to the ascending path radiosonde balloon. (2) In data assimilation, the analysis at a RO profile location is aff by other nearby observations and is thus less strongly constrained by the RO observa

Discussion
It is worth touching on the similarities and differences between 1D-Var and variational assimilation of RO data into NWP models (hereafter referred to as data assimilation). The relationship with data assimilation is a question often raised by data users but which has not been discussed so far. We find that the comparison to data assimilation offers a different new angle that helps us understand 1D-Var better. Data assimilation and 1D-Var are based on the same principle. Both methods attempt to combine the complementary information from observations and a priori data into an optimal analysis (estimate) of the atmospheric states of relevance. Nonetheless, they have diverging purposes and practices, apart from the difference in the problem's dimensionality. The difference in purpose between 1D-Var and data assimilation can be said that 1D-Var utilizes a priori information to enhance and interpret observations; in comparison, data assimilation uses the observations to improve weather forecasts by bettering the model's initial condition. This difference may sound subtle but can lead to profoundly different perspectives and outcomes.
A few practical differences between 1D-Var and data assimilation include: (1) Data assimilation produces global analyses valid at certain fixed times (i.e., snapshots at synoptic times) using observations irregularly distributed in time and space. Hence, data assimilation uses observations to obtain an analysis that is continuous and consistent across scales by propagating observational information to data-sparse regions and times. 1D-Var seeks its solution in a very narrow subspace: along the trajectory of rays' tangent points at the exact time of the observation, which is somewhat similar to the ascending path of a radiosonde balloon. (2) In data assimilation, the analysis at a RO profile location is affected by other nearby observations and is thus less strongly constrained by the RO observation. Using all platforms' observations is desirable in data assimilation but not necessarily from the standpoint of RO retrieval, especially for RO-based climate studies. In 1D-Var, the atmospheric profiles are retrieved under the direct influence of RO observation and are constrained well by the observation. (3) The optimal analysis in data assimilation is the analysis that leads to the best forecast, not the one achieving local optimality at the location of observations. The optimality of 1D-Var relates to the quality of retrieved profiles, which is definable on its own and assessable in the observation's space. (4) The low cost of 1D-Var permits the specification of precise error covariances, including detailed spatiotemporal variations. Doing so is impractical in data assimilation, which uses a state vector of enormous size and numerous observing platforms. (5) Unlike data assimilation, 1D-Var is not restricted by the model geometry. The 1D-Var grid can be of much higher resolution and is extendable to the thermosphere. The flexibility in setting the computational grid is advantageous to recovering small-scale atmospheric structures and reconstructing the upper atmosphere. (6) The 1D-Var does not apply any internal quality control or bias correction to RO observations, except for the data rejection below the top of the highest a priori super-refraction layer and the slight data thinning at stratospheric heights. The intent is to retain as much observational information as possible, enable retrieval in extreme weather conditions (or events), and embody the aspects poorly represented or undetected by data assimilation systems, e.g., model bias and climate change signals. In addition, we aim to monitor and improve our data products by disabling quality control and bias correction. Data assimilation is more attentive to quality control and bias correction because outlying or inconsistent observations can be detrimental to the quality of analysis and forecast. Furthermore, data assimilation systems usually have many other observations to substitute for rejected data. (7) Data assimilation must consider inter-platform balance to maximize the utilization of all observation platforms, such as determining the data amount to be assimilated for each platform and the relative weighting to give. It is thus challenging for data assimilation to improve the components specific to RO. Focusing exclusively on RO, 1D-Var has excellent potential for further improvement and augmentation. For example, 1D-Var can use dual-frequency observations to improve ionospheric correction and retrieve electron density and atmospheric profiles together [69,85]. 1D-Var can also employ advanced observation operators, which might be too sophisticated for use in data assimilation. As an example, Wee et al. [86] and Wee and Kuo [87] used a curved ray tracing to model phase measurements directly while extending the computational grid to the orbital height of GNSS satellites (~20,000 km). Our future work will focus on implementing these improvement and augmentation.

Conclusions
While RO produces a great deal of data, the observations of bending angles and refractivities do not directly satisfy many data users' needs. Thus, efforts have continued from the early days of RO to retrieve atmospheric profiles of temperature and water vapor from RO observations. UCAR has been using an atmospheric 1D-Var for the last twenty years, which recently underwent a significant upgrade. In this paper, we describe the UCAR 1D-Var and present some of our validation results to answer increasing inquiries about the retrieval method and its data product (wetPf2).
The findings of this study can be summarized as follows. Tests with observations perturbed at a single level (presented in Section 3.1) show that the 1D-Var response to the pointwise perturbation is highly nonlocal and differs considerably from one state variable to another, clearly illustrating the vertical spread of information by the observation operator and error covariances. The broad and complicated response accentuates the importance of accurate and precise error covariances, which is especially true for 1D-Var, lacking some resources and apparatuses available to multi-dimensional systems. The tests also validate that the 1D-Var works as designed and expected, making suitable corrections to the background, provided with an accurate observation profile containing many independent pieces of information, which is largely true for RO.
Tests using synthetic data (Section 3.2) show that the 1D-Var attains a smaller posteriori error than a priori error, given accurate error covariances. In other words, the 1D-Var can produce value-added retrievals that are better than the background. The error reduction (ratio of a posteriori error to a priori error) varies significantly among the state variables and depends on the atmospheric condition. Overall, temperature (and pressure) error reduction is much greater than humidity error reduction in a dry atmosphere, while moisture error reduction is predominant in humid conditions. This dependence on the atmospheric condition is determined by how the state variables' a priori errors compare when converted into refractivity equivalents. The 1D-Var primarily focuses on improving the largest variable in the refractivity-equivalent a priori error, which usually is temperature (and pressure) in the stratosphere and humidity in the lower troposphere. In contrast, the 1D-Var has difficulty reducing a priori humidity error in the stratosphere, which typically is already small and can easily be overshadowed by minor retrieval errors in temperature and pressure. Likewise, temperature error reduction is also challenging in the tropical lower troposphere. In these challenging cases, the 1D-Var is more receptive to systematic observation error.
The results in Sections 3.3 and 3.4 support that the performance of the 1D-Var shown by synthetic data tests remains valid for real-world data sets and confirm that the retrieved profiles are of added value regarding a priori profiles. The 1D-Var shows close agreements with ECMWF analysis in all state variables' mean difference and standard deviation, and the retrieved profiles accord excellently with high vertical resolution profiles of RS92 radiosonde. This comparison to RS92 radiosonde suggests that the random uncertainties of the retrieved profiles are no larger than 1 • C in temperature, 0.7 hPa in pressure, and 1 g kg −1 in specific humidity at heights below 25 km. These random uncertainties are exceedingly small considering the differing characteristics of RO and radiosondes. Another finding of this comparison is that the 1D-Var temperature and pressure agree notably better with RS92 observations than the traditional dry retrieval throughout the entire height range of the comparison in both mean difference and standard deviation. The closer agreement shown by the 1D-Var profiles indicates that the 1D-Var is a valuable complement to the dry retrieval. We find that the performance of 1D-Var depends on the observation operator and how RO observation is used in the retrieval process. A comparison made in this respect evinces that the two-step approach adopted for the 1D-Var (variational refractivity inversion succeeded by retrieval step) outperforms others. We also observe that 1D-Var solutions based on different a priori data approach each other closely as if converging to a unique solution, indicating that the solutions do not depend strongly on the priori data. Along the same lines, a multi-year data record of the 1D-Var has demonstrated exceptional long-term stability, while detecting and correcting abrupt temporal changes of a priori data. The long-term stability has important implications for RO-based climate studies.
Our findings, taken together, point to that the 1D-Var meets the design expectations well and that its data quality is sufficient for most applications. We believe that the 1D-Var profiles could greatly enhance the scientific utility of RO.

Data Availability Statement:
The COSMIC and COSMIC-2 RO data used in this study are available online: http:/cdaac-www.cosmic.ucar.edu/cdaac/products (accessed on 31 October 2022). The ECMWF data are provided to UCAR by the European Centre for Medium-Range Weather Forecasts.