Global 3D Features of Error Variances of GPS Radio Occultation and Radiosonde Observations

: Global Positioning System (GPS) radio occultation (RO) and radiosonde (RS) observations are two major types of observations assimilated in numerical weather prediction (NWP) systems. Observation error variances are required input that determines the weightings given to observations in data assimilation. This study estimates the error variances of global GPS RO refractivity and bending angle and RS temperature and humidity observations at 521 selected RS stations using the three-cornered hat method with additional ERA-Interim reanalysis and Global Forecast System forecast data available from 1 January 2016 to 31 August 2019. The global distributions, of both RO and RS observation error variances, are analyzed in terms of vertical and latitudinal variations. Error variances of RO refractivity and bending angle and RS speciﬁc humidity in the lower troposphere, such as at 850 hPa (3.5 km impact height for the bending angle), all increase with decreasing latitude. The error variances of RO refractivity and bending angle and RS speciﬁc humidity can reach about 30 N-unit 2 , 3 × 10 − 6 rad 2 , and 2 (g kg − 1 ) 2 , respectively. There is also a good symmetry of the error variances of both RO refractivity and bending angle with respect to the equator between the Northern and Southern Hemispheres at all vertical levels. In this study, we provide the mean error variances of refractivity and bending angle in every 5 ◦ -latitude band between the equator and 60 ◦ N, as well as every interval of 10 hPa pressure or 0.2 km impact height. The RS temperature error variance distribution differs from those of refractivity, bending angle, and humidity, which, at low latitudes, are smaller (less than 1 K 2 ) than those in the midlatitudes (more than 3 K 2 ). In the midlatitudes, the RS temperature error variances in North America are larger than those in East Asia and Europe, which may arise from different radiosonde types among the above three regions.


Introduction
Global Positioning System (GPS) radio occultation (RO) and radiosonde (RS) observations are both important in initializing numerical weather models, conducting global and regional reanalysis, and calibrating and validating satellite data. RS soundings are in situ measurements made under all-weather conditions with a long history, often considered a standard for comparisons with satellite observations and reanalysis data [1][2][3]. The RO technique provides vertical bending angle profiles with high resolution, accuracy, and precision, retrievable from atmospheric refractivity, temperature, and water vapor information [4]. There are more than 1000 RS stations worldwide, whose launches generally occur 45 min before 0000 and 1200 UTC. RS stations are distributed more densely in East Asia, Europe, and North America than elsewhere. Satellite missions, such as the Constellation Observing System for Meteorology, Ionosphere, and Climate-1, -2 [5][6][7], provide nearly evenly and densely distributed RO profiles over the world, making up for the lack of conventional observations over the ocean.
Many studies have shown the positive impact of assimilating RO and RS observations in numerical weather prediction: Assimilating GPS RO and RS observations significantly improves the short-range forecast skill [8]; assimilating RO data helps improve the forecasts of geopotential heights, temperatures, winds, and water vapor [9][10][11]; and assimilating radiosonde data helps improve the forecast of central pressure, strong winds, and moisture transport in a typhoon system [12]. In three-dimensional variational data assimilation, the analysis field is generated by minimizing the following cost function [13]: where x 0 represents the model state function at the initial time, x b represents the background model state function, B is background error covariance matrix, O is the error covariance of observations, and F is the error covariance of the forward observation operator H. In most operational assimilation schemes, the combined observational and forward model error variance O + F is a necessary input to determine the weightings of observations [14,15].
Assuming that the errors of RO observations and forecasts are independent, the variance of the apparent error (σ a , the difference between forecast and observation, i.e., O − B) is calculated as where σ 2 o is the observation error variance and σ 2 f is the forecast error variance. Many studies have estimated RO observation error variances according to Equation (2) [16][17][18]. Previous studies have used the differences between RO observations and analyses to estimate the RO observation error variances [19,20]. Anthes and Rieckh (2018) [21] applied the "three-cornered hat" (3CH) method [22] to estimate RO observation error variances by combining multiple datasets at four chosen RS stations. Note that this method needs no estimation of forecast errors. Xu and Zou (2020) [23] applied the 3CH method to estimate the RO refractivity and bending angle errors over China.
Many studies had compared RO with RS observations in terms of the mean differences and mean fractional differences of temperature and specific humidity between RO and RS [2,[24][25][26]. In this study, we estimate the error variances of refractivity and bending angle for both RO and RS observations on a global scale. The 3CH method with biases included in the estimate is applied here to estimate the error variances of both RO refractivity and bending angle (both are often assimilated) by combining four datasets, two of which are observation datasets (RO and RS) and two of which are model datasets (the ERA-Interim reanalysis and Global Forecast System (GFS) forecasts). The error variances of global RS observations of temperature and humidity are also estimated and analyzed. The latitudinal dependence of the error variances of RO and RS observations and their correlations can also be examined. This paper is organized as follows. Section 2 introduces the four datasets, i.e., RO and RS observations and ERA-Interim and GFS model results, and the collocation among them. Section 3 presents the quality control (QC) of observations and the 3CH. Sections 4 and 5, respectively, show and discuss some characteristic features of error variances of RO and RS observations. Section 6 provides the discussion and conclusions.

Data Description
A total of four datasets from 1 January 2016 to 31 August 2019 were used in this study, including two observation datasets (RO and RS profiles) and two model datasets (the ERA-Interim reanalysis and GFS forecasts). Since this study mainly focuses on the troposphere and the impact of water vapor, only the situation below 200 hPa is discussed. Considering the different temporal and spatial distributions between the RO and RS observations, collocation in both time and space is needed.
The RS data used in this study, the Integrated Global Radiosonde Archive dataset (IGRA) [27], are integrated by the National Center for Environmental Information under the auspices of the National Oceanic and Atmospheric Administration. IGRA provides quality-assured RS and pilot balloon observations which pass a sequence of independent checks [27,28] for more than 2700 stations over the globe. In 2014, the IGRA dataset was upgraded to the second version (IGRA2, https://www.ncdc.noaa.gov/data-access/ weather-balloon/integrated-global-radiosonde-archive), also providing refractivity data retrieved by dewpoint temperature, pressure, and water vapor, among others.
Post-processed RO data from the COSMIC Data Analysis and Archive Center (CDAAC) were used in this study. CDAAC provides RO data from many missions and, here, we chose RO profiles observed by the Global Navigation Satellite System Receiver for Atmospheric Sounding (GRAS) onboard the MetOp-A/B satellites from 1 January 2016 to 31 August 2019. All RO data used in this study can be downloaded at https://cdaac-www.cosmic. ucar.edu/cdaac/products.html. RO profiles retrieved under both dry and wet atmospheric conditions are provided by CDAAC, which are called AtmPrf and WetPrf in the data files. AtmPrf data provides the atmospheric bending angle at full resolution, and WetPrf data provides refractivity and one-dimensional variational (1D-VAR) estimated temperature, humidity, and pressure at 100 m vertical resolution. The 12 h aviation real-time model forecasts or ECMWF analysis was used as an a priori background state of the atmosphere in the 1D-Var. The RO temperature, humidity, and pressure errors could be correlated depending on the background error covariance matrix B employed in the RO 1D-Var algorithm.
ERA-Interim is a global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts. Dee et al. (2011) [29] described the processing of ERA-Interim and discussed the performance of the system. The current status, data availability, and update information of ERA-Interim can be found at https://www.ecmwf.int/en/ forecasts/datasets/reanalysis-datasets/era-interim. CDAAC provides ERA-Interim data that are interpolated to the RO profiles, which we used for subsequent calculations.
The GFS is a weather forecast model produced by the National Centers for Environmental Prediction. It provides dozens of atmospheric variables, including temperature, wind, and precipitation data. GFS products and more information are available at https://www.ncdc.noaa.gov/data-access/model-data/model-datasets/global-forcastsystem-gfs. CDAAC also interpolates GFS data to RO profiles, which were used in later calculations.
RS stations generally make observations twice a day at fixed locations and times, while RO observations vary in time and space. A total of 521 RS stations were selected for this study. First, we calculated the number of the neighboring stations (within a 300 km radius) of each station and removed those stations with the largest number of neighboring stations. If there was more than one station with the largest neighboring stations, the station with the shortest sum of distances all other stations was removed. Then, we performed the same step for the remaining stations, and so on, until there were no other stations within a 300 km radius of all the remaining stations. The time period during which MetOp-A/B GRAS RO observations were collocated with these 521 RS stations was from 1 January 2016 to 31 August 2019. Based on the geographic locations and observation times of RS data, we selected those RO profiles that fell within a 300 km radius centered on each RS station and within 3 h of the soundings made at the RS station. With more than 3 years of data from 1 January 2016 to 31 August 2019, most RS stations were collocated, with more than 150 RO profiles, as shown in Figure 1.
Since the RS and RO profiles had different vertical resolutions, we interpolated all data to vertical levels with a 10-hPa interval. We then conducted vertical filtering (see Gilpin et al. (2018) [30] for details) to all collocated profiles to remove structures with very small vertical scales and to decrease the representative error due to the different observation types and vertical resolutions [31].

Methodology
In this section, a quality control (QC) step and the 3CH method are introduced. Before estimating the error variance by combining the 4 datasets, we conducted QC on each dataset, i.e., a biweight check on RO refractivity and bending angle and RS temperature and humidity observations. The error variances of RO and RS observations were then estimated by the 3CH method.
This QC step is a biweight check on the variables participating in the error variance estimation, aimed at removing outliers of the collocated observations. Zou and Zeng (2006) [32] first applied the biweight check method to the QC of RO data, giving a detailed derivation and providing comprehensive analyses and interpretations of the results. Here, we provide a brief description to this QC step.
First, for a given dataset with n observations (x i , where x can be temperature, specific humidity, or refractivity,i = 1, 2, ..., n), the weighting function w i is defined by the median (M) and median absolute deviation (MAD) of the n observations: Then, we use w to calculate the biweight mean (BM) and biweight standard deviation (BSD): Finally, we use BM and BSD to calculate the Z-score: Data with Z-scores greater than the chosen threshold are considered outliers.
Here, we also provide the variation in estimated error variances of RO refractivity (N) and bending angle (α) with respect to Z-score thresholds ( Figure 2). The estimated error variances of both RO refractivity and bending angle decreased as the Z-score threshold decreased at different pressure levels and impact heights. When the Z-score threshold was equal to 2.5, the error variances of both RO refractivity and bending angle were significantly reduced, and only about 2% of the refractivity observations and 3% of the bending angle observations were removed. Thus, we chose the Z-score threshold of 2.5 for the QC so that observations with Z-scores greater than 2.5 were removed, i.e., data deviating from by more than 2.5-times were removed. This method uses M and MAD instead of the mean and standard deviation to define the weight, thus avoiding the effect of extreme outliers in the dataset. All the collocated data from 1 January 2016 to 31 August 2019 that passed QC are used in the following calculation. The 3CH method is used to estimate the observation error variance. No forecast error variances are needed in this method. The 3CH method was proposed by Gray and Allan (1974) [22] to estimate the intensity of the random noise frequency modulation of an individual oscillator. Anthes and Rieckh (2018) [21] first applied the 3CH method to estimate error variances of RO refractivity observations near 4 chosen radiosonde stations. They give a detailed derivation in the Appendix ??. Here, we only provide the final equations. A key assumption of 3CH method is that errors in various datasets are uncorrelated. Considering that both ERA and GFS assimilate RO and RS observations, there might be some correlations between RO/RS observations and ERA/GFS model data. Since a lot of other satellite and conventional observations were also assimilated in ERA and GFS analyses, the correlations among the 4 datasets used here are likely to be neglectable.
In the 3CH method, the error variance of the target dataset is estimated by combining the 2 other datasets. In this study, in addition to the target dataset, there were 3 other datasets, which can have 3 combinations to give 3 estimations of the error variance. Taking the RO dataset as an example, its error variance can be estimated by the following equations: where σ 2 RO is the variance of RO observations, σ 2 A,B (A and B can be RS, RO, ERA, or GFS) represents the mean square difference between datasets A and B, and b A,B is the bias defined as the sample mean of difference between datasets A and B. Note that the effects of the biases among the 4 datasets are considered in (7)-(9) (R. A. Anthes, personal communication). According to Gray and Allan (1974) [22], the 3 estimations (Equations. (7)-(9)) were averaged to obtain a better estimation of σ 2 RO : The latitudinal and height distributions of the 6 bias terms that appear in Equations (7)-(9) are shown in Figure 3. The 6 bias terms had similar magnitudes (e.g., 1-10 N-unit 2 for RO refractivity), and their impact on the error variances was even smaller (e.g.,~1 N-unit 2 for RO refractivity) due to a cancellation of 3 biases in the second bracket on the right side of Equations (7)-(9).

Global Error Variances of RO Observations
Since RO observations can be derived to obtain atmospheric temperature and humidity information with high accuracy and precision, have a high vertical resolution, and are bias-free [6], they are a good complement to satellite microwave and infrared observations. The observation error variance is required when assimilating RO data. Here, the error variances of both RO refractivity and bending angle observations were estimated by the 3CH method (described in Section 3). In this section, we show the horizontal and vertical distributions of the error variances of both RO refractivity and bending angle and their latitudinal dependence. Figure 4 shows the error variances of RO refractivity near 521 radiosonde stations at 500 hPa and 850 hPa. At both 500 hPa and 850 hPa, the refractivity error variance decreased with increasing latitude. The error variances at 500 hPa, with a magnitude of 1 N-unit 2 , were smaller than those at 850 hPa, which had a magnitude of about 10 N-unit 2 . Because the error variance of the refractivity has an obvious latitudinal dependence, we calculated the mean and standard deviation in low-, mid-, and high-latitudinal bands and provided the data counts in each latitudinal band (figure omitted). In all three latitudinal bands, the mean error variance monotonically decreased with decreasing pressure. The mean error variance at low latitudes was the largest, reaching 40 N-unit 2 near the surface, and that at high latitudes was the smallest, at less than 20 N-unit 2 near the surface.

Error Variances of RO Refractivity
To clearly see the latitudinal dependence of the RO refractivity error variance at different pressure levels, the mean and standard deviation (STD) of the RO refractivity error variance were calculated using all the collocated data after QC in every 5 • latitude interval and the results are shown in Figure 5. Figure 5a shows the number of RS stations in every 5 • latitude interval, and only latitude bands with more than 10 RS stations are shown in Figures 5b and 6c, i.e., between 35 • S and 75 • N. The RO refractivity error variances (Figure 5b) in the Southern and Northern Hemispheres were asymmetrical. Below 500 hPa, the maximum error variance occurred near 15 • S/N, and the error variance decreased with increasing latitude when the latitude was greater than 15 • . Above 500 hPa, the error variance monotonically decreased with increasing latitude. The STD distribution (Figure 5b) of the error variances in every 5 • latitude interval was similar to the mean error variances (Figure 5a) except that the error variance STDs near 20 • N were remarkably larger than those near 20 • S below 950 hPa. Table A1 in the Appendix ?? gives the RO refractivity error variances in each 5 • latitude band and at 10 hPa pressure level interval between 0 • and 60 • N for reference.

Error Variances of RO Bending Angle
To estimate the error variances of RO bending angle, the bending angles of the RS, ERA, and GFS datasets are first retrieved by the Abel transformation. Here, we use the one-dimensional forward model in the Radio Occultation Processing Package (ROM SAF, 2019) [33], which is publicly available and can be downloaded from http://www.romsaf. org. In this study the error analysis is restricted to the troposphere so that the errors resulting from an extrapolation implemented in the bending angle forward model of ROPP above the radiosonde observing range of height is likely to be negligible. The retrieved bending angles of the RS, ERA, and GFS datasets are then used to estimate the error variances of RO bending angle observations by the 3CH method (described in Section 3).  The horizontal distributions of the RO bending angle error variances ( Figure 6) were similar to those of refractivity, decreasing with increasing latitude at impact heights of 6.8 km and 3.7 km. The latitude dependence of the RO bending angle error variances at different impact heights were also explored ( Figure 7). Mean bending angle error variances had a similar distribution to that of refractivity, decreasing as the latitude increased. Figure 7a shows that the mean bending angle error variances were symmetrical about the equator between 35 • S and 35 • N. The STD of error variances (Figure 7b) were distributed more chaotically than the mean but their variations with latitude were similar. We also provide the values of the RO bending angle error variances in every 5 • latitudinal band and 0.2 km impact height interval between 0 • and 60 • N ( Table A2 in

Global Error Variances of Radiosonde Observations
Here, the 3CH method was also applied to estimate the error variances of RS observed temperature and specific humidity (q) from 521 radiosonde stations worldwide. We first conducted the QC (described in Section 3) on both RS temperature and humidity observations and then estimated their error variances. Finally, the horizontal and vertical distributions and latitudinal dependence of the RS observation error variances were analyzed. Figure 8 shows global distributions of the error variance and variances of RS temperature observations at 850 hPa. Latitudinal differences were clearly seen in both the error variances and variances, which were relatively small at low latitudes and larger at the midlatitudes. The estimated error variances of RS temperature in East Asia and North America at 850 hPa were remarkably larger than those in other parts of the world. Large differences were also seen among the temperature error variances in North America, Asia, and Europe. We thus selected four regions (framed by dashed lines in Figure 8a (Figure 8b) in East Asia were larger than those in North America, which was the opposite of the error variances (Figure 8a). To sum up, the global distributions of the error variance and variance were consistent except for in the individual regions, which may have been caused by the different radiosonde types used in the different regions [34].
The vertical distributions (Figure 9) of the error variances at RS stations in the four areas are also shown, along with the error variance averages and standard deviations in each area. The error variances of RS temperature were large at lower levels of the atmosphere (below 850 hPa) and smaller between 850 hPa to 300 hPa in all four areas, consistent with results reported by Mitchell et al. (2002) [35]. The RS stations in area A1 had the largest temperature error variance averages and STDs. Also, at the midlatitudes, the stations in A2 had the smallest temperature error variances compared with those in A1 and A3. The error variances in A4 were very small, with magnitudes of about 1 K 2 . This distribution is consistent with that reported by Ho (2017) who concluded that the Vaisala RS92 (mainly distributed in A2 and A4) provides the most accurate temperature observations [26].  Figure 10 shows the error variances and variances of RS q observations at 850 hPa. The error variance of q was larger at low and middle latitudes, whereas water vapor was more abundant than at high latitudes. Comparing Figure 10a,b, the spatial distributions of the error variance and variance were similar, but the variance was about three-times the error variance in magnitude. Unlike the distribution of the error variance, the humidity variance in North America was much smaller than that in East Asia.
Considering the clear latitudinal dependence of the error variance, we plotted the mean and STD of RS q error variances at low, middle, and high latitudes ( Figure 11). The mean error variances at the three latitudinal zones all reached a maximum at 800 hPa, and the error variances all had the largest spread near the surface. Among the three latitudinal zones, the error variances of q at low latitudes had the largest means (0-2.4 (g kg −1 ) 2 ) and the widest spread. Considering that the data count was low below 950 hPa, the large spread of the near-surface error variances of q may be associated with a smaller amount of data at the lower than higher altitudes.

Discussion
The error variances of both RO refractivity and bending angle had clear latitudinal dependencies at 500 hPa and 850 hPa, decreasing with increasing latitude. The error variances of RO refractivity at 500 hPa and 850 hPa and those of the bending angle at corresponding impact heights (6.8 km and 3.5 km) had a good correlation. The error variances of RO refractivity and bending angle in the Northern and Southern Hemispheres were highly symmetrical at each vertical level. For reference purposes, the mean error variances of RO refractivity and bending angle in every 5 • latitude band and at every 10 hPa pressure level or 0.2 km impact height between the equator and 60 • N are listed in Table A1 and A2.
The latitudinal dependence of estimated error variances of RS specific humidity was similar to that of the RO refractivity and bending angle, while that of the RS temperature differed. The RS temperature error variances were the smallest (less than 1 K 2 ) at low latitudes and largest at the midlatitudes (more than 3 K 2 ). At the midlatitudes, the RS temperature error variances differed in different regions. Specifically, the error variances in North America were larger than those in East Asia and Europe, which may have been caused by the different radiosonde types used in the different regions.

Conclusions
In this paper, the 3CH method was applied to estimate the error variances of RO refractivity and bending angle and RS temperature and specific humidity at 521 radiosonde stations around the world. The 3CH method estimated the error variances by combining multiple datasets and does not require estimates of forecast errors. Here, we used four datasets covering the years 1 January 2016 to 31 August 2019, including two observation datasets (radiosonde and RO) and two model datasets (ERA-Interim and GFS). Collocated RO and RS observations passing the QC were used to estimate the observation error variances. The global distributions, vertical changes, and latitudinal dependence of the observation error variances of both RO and RS observations were then analyzed.
The error variances of RO refractivity and bending angle estimated in this study can be used in GPS RO data assimilation, and its impact on numerical weather prediction can then be explored. This is planned for a follow-up study.

Conflicts of Interest:
The authors declare no conflict of interest. Table A1 give the RO refractivity error variances in each 5 • -latitude band and at 10-hPa-pressure-level intervals between 0 • and 60 • N for reference. Considering the good symmetry of the RO refractivity error variances between the Northern and Southern Hemispheres, the error variances in the Southern Hemisphere can be directly determined by referring to those in the corresponding latitudinal band in the Northern Hemisphere.  Table A2 give the values of RO bending angle error variances in every 5 • -latitudinal band and 0.2-km impact height interval between 0 • and 60 • N. Similar to the refractivity error variance, the bending angle error variances in the Southern Hemisphere can be directly determined by referring to those in the corresponding latitudinal band in the Northern Hemisphere.