A Comparison of Sporadic-E Occurrence Rates Using GPS Radio Occultation and Ionosonde Measurements

Sporadic-E (Es) occurrence rates from Global Position Satellite radio occultation (GPS-RO) measurements have shown to vary by a factor of five between studies, motivating the need for a comparison with ground-based measurements. In an attempt to find accurate GPS-RO techniques for detecting Es formation, occurrence rates derived using five previously developed GPS-RO techniques are compared to ionosonde measurements over an eight-year period from 2010–2017. GPS-RO measurements within 170 km of a ionosonde site are used to calculate Es occurrence rates and compared to the ground-truth ionosonde measurements. The techniques are compared individually for each ionosonde site and then combined to determine the most accurate GPS-RO technique for two thresholds on sporadic-E intensity: no lower limit and fbEs ≥3 MHz. Overall, the YuS4 method shows the closest agreement with ionosonde measurements for total Es occurrence rates without a lower limit on intensity, while the phase-based Chu technique shows the closest agreement for fbEs ≥3 MHz. This analysis demonstrates that the variation in GPS-RO derived sporadic-E occurrence rates is due to varying thresholds on the sporadic-E intensities in terms of fbEs.


Introduction
Ionospheric conditions are important to a range of electromagnetic spectrum operations including satellite and High Frequency (HF) communications. Abnormalities such as sporadic-E (E s ) can cause degradation of the communication signals as well reflect it entirely [1,2]. Radar techniques such as over-the-horizon-radar [3] are also significantly affected by the presence of sporadic-E, motivating the need for accurate global occurrence rates (ORs).
Sporadic-E manifests itself as a non-uniform, wavy layer, or a composition of irregular elongated clouds of intense ionization within the lower E-region of the ionosphere [1,4]. There are many methods and instruments used to monitor sporadic-E including incoherent scatter radars (ISRs) [5], ionosondes [6][7][8][9][10], and GPS radio occultation (RO) [11,12]. While ISR measurements provide detailed E s observations, the number of sites across the globe are significantly lacking for global climatological studies. Ionosondes provide robust E s measurements and include a near global coverage [13], but there are many locations with missing data such as the African sector or over the oceans. An updated global climatology for foEs and fbEs ORs was recently provided in Ref. [14], highlighting the capabilities as well as the spatial gaps arising from the strict use of ionosonde data for a global climatology.
GPS-RO provides a unique capacity for developing global E s occurrence rate climatologies. For example, the Constellation Observing Satellite for Meteorology, Ionosphere, and Climate (COSMIC-1) constellation consisted of six satellites with circular orbits at altitudes of 800 km at an inclination of 72 degrees, providing an average of 1500-2000 GPS-RO profiles per day with near global coverage [15,16]. The original COSMIC-1 constellation was recently succeeded by the COSMIC-2 constellation which increased the frequency and range of the sensors activity [17]. Using low Earth orbit (LEO) satellites to receive dual L 1 (1575.42 MHz) and L 2 (1227.60 MHz) GPS signals allows for measurements of amplitude and phase perturbations caused by a sporadic-E layers [12,[18][19][20]. GPS-RO can also be used to find temperature, water vapor pressure, and ionospheric electron density [20,21]. The large number of daily GPS-RO measurements provides a unique opportunity to develop global climatologies for sporadic-E occurrence [22,23] and intensity [12,24].
While GPS-RO measurements provide nearly complete global coverage, there are large discrepancies in ORs between different GPS-RO methods for monitoring sporadic-E. For example, Ref. [22] shows a maximum E s occurrence rate of ∼10% while Ref. [23] presents a ∼50% maximum. Neither study mentions a plasma intensity threshold for the sporadic-E layer, which is likely the primary cause of the significant difference in the ORs. To anchor the different techniques to specific E s intensity thresholds, they should be compared against more direct E s measurements such as those obtained by ionosondes. This is the primary motivation behind the present research effort.
In the search for accurate GPS-RO methods for estimating sporadic-E ORs associated with fbEs ≥ 3 MHz and for all fbEs measurements without a lower limit on intensity, five GPS-RO techniques [22][23][24][25][26] are compared to ionosonde measurements over an eight year period between 2010 and 2017. For this comparison, data from the digital ionosonde brand Digisonde as presented in Ref. [14] are used as the ground-truth for sporadic-E ORs across the globe. Two techniques will be recommended for use in estimating E s ORs for the two intensity thresholds. These techniques can be used for future climatological studies with specific intensity thresholds in mind.

Methods
Here we briefly outline the five GPS-RO techniques used to calculate sporadic-E ORs. To improve readability, we refer to the methods as Chu for Ref. [22], Arras for Ref. [23], Yu for Ref. [24], Niu for Ref. [25], and Gooch for Ref. [26]. Table 1 provides a summary of the criteria used in each technique. The GPS-RO data used in this analysis was obtained from the COSMIC Data Analysis and Archive Center (CDAAC, https://cdaac-www.cosmic.ucar. edu/, accessed 1 September 2020). Tangent point altitudes of 70-130 km, typical for sporadic-E, are analyzed globally to search for binary (present or not-present) E s observations. While the Arras and Chu techniques provide explicit binary criteria for E s presence, the Niu, Gooch, and Yu methods are used to estimate peak E s plasma frequencies. Then, the peak E s plasma frequencies are compared against predefined intensity thresholds used as the binary criteria for the existence of E s . Here, we use a lower intensity threshold of 3 MHz and an upper threshold of 20 MHz to remove unrealistic or noisy data. While a peak E s plasma frequency threshold of 3 MHz was used for the various GPS-RO techniques, the occurrence rate results were compared to two separate ionosonde fbEs thresholds: fbEs ≥ 3 MHz and no lower limit for a total occurrence rate. In this study, we use foEs and fbEs interchangeably for the GPS-RO techniques that provide an foEs estimate through previous comparison with ionosonde measurements. However, it must be noted that the continuous blanketing sporadic-E layers causing large GPS-RO perturbations are most likely related to fbEs rather than patchy foEs measurements [23,27] and fbEs ≤ foEs by definition. Therefore, the 3 MHz foEs thresholds used for some of the GPS-RO methods outlined below correspond to fbEs ≤ 3 MHz. As the ionosonde measurements are able to distinguish between foEs and fbEs, the ionosonde data strictly uses fbEs for this comparison.

Arras Method
The Arras method analyzes 50 Hz L 1 amplitude data (atmPhs files from CDAAC) in the altitude range of 70-130 km following the boundaries of typical sporadic-E. L 1 data is used since the lower frequency L 2 is noisier from elevated index of refraction gradients. To search for sporadic-E, the L 1 signal-to-noise ratio (SNR) is normalized and a rolling standard deviation is calculated using a 2 km window. The binary E s criteria is satisfied if the standard deviation exceeds an empirically determined value of 0.2. With typical sporadic-E vertical thicknesses on the order of 1 km [19], if the altitude extent of the perturbed region with a standard deviation over 0.2 exceeds 10 km, then the occultation is removed for quality control.

Niu Method
Total electron content (TEC) profiles are calculated using L 1 and L 2 excess phase data from CDAAC's 1 Hz ionPhs files. The TEC profiles with an altitude range of 70-130 km are detrended using a Savitzky-Golay [28] 11-point, 3rd order polynomial to obtain the background TEC profile, TEC Background : where TEC r is the raw relative TEC and ∆TEC is the TEC perturbation (or detrended TEC). The S max is the maximum vertical gradient of ∆TEC, as described by where dh is the change in altitude used to calculate the vertical TEC gradient [25]. Averaging the four linear fits of S max as a function of foEs obtained by Ref. [25] provides an estimate for a global linear fit: From this averaged slope, we find a threshold S max of 0.12 TECU/km corresponding to a 3 MHz foEs.

Chu Method
The 1 Hz ionPhs phase and SNR data is used within an altitude range of 70-130 km. This method requires three criteria to be satisfied simultaneously: First, both the L 1 and L 2 SNR and excess phase are smoothed with a 3-point Savitzky-Golay filter. Both the L 1 and L 2 phase perturbation magnitudes must be larger than 5 cm. The excess phase on both the L 1 and L 2 are detrended using a Savitzky-Golay filter with a 25 km window and 3rd order polynomial to remove large scale variations in the E-layer of ionosphere. The phase perturbation magnitudes, ∆L, are inversely proportional to the signal frequency, f , squared: Therefore, the ratio of ∆L 2 /∆L 1 should have a value of 1.65 if the signals share the same group path. From this, the second criteria requires the phase perturbation ratio to lie within the range of [1.2, 1.9]. Note that we extended the phase perturbation ratio from the [1.5, 1.8] range provided by Ref. [22] to increase ORs. Finally, the amplitude of the normalized L 1 SNR perturbation has to be greater than 0.01. The height where the peak of the perturbation occurs is the height of the sporadic-E layer.

Yu Method
From the 50 Hz atmPhs L 1 SNR data, a 50-point window of (25 below and above) is used to calculate a 1 Hz S 4 . This S 4 is calculated as a function of altitude and is converted to an foEs following: where S 4,Max is the maximum of the S 4 index and the foEs is the peak frequency of the sporadic-E cloud. For the current study, we use a perturbation of f oEs − 1.2 = 3.0 MHz, following a foµEs threshold of 3 MHz for an average background E-layer frequency of 1.2 MHz [29]. This corresponds to an S 4,Max threshold of 0.66 to indicate a sporadic-E layer with sufficient intensity for the current comparison. It must be noted that we calculate the S 4 index directly from 50 Hz L 1 SNR data using a rolling 50 point window. This may provide slight differences to the S 4 values obtained directly from CDAAC [30], especially for strongly scintillating layers.

Gooch Method
Using the 50 Hz atmPhs excess phase data, a relative TEC r is calculated using where ∆L 1 is the excess phase for f 1 = 1.57542 GHz and ∆L 2 is the excess phase for f 2 = 1.22760 GHz [11]. Afterward, the TEC is detrended using a Savitzky-Golay filter with a 25 km window to obtain the TEC perturbation ∆TEC. Finally, the ∆TEC is smoothed using another Savitzky-Golay filter with a 1 km window. From this, the electron density is calculated by where the TEC perturbation is divided by an effective path length of 176 km corresponding to an effective path through a cylinder with a vertical thickness of ∆R = 0.6 km located at a radial distance R from the center of Earth [26]. The effective path length is derived from an assumed geometry of a cylinder centered around the tangent point [31]. The binary E s criteria is satisfied if the plasma frequency associated with ∆n e is ≥3 MHz and the detrended TEC perturbation occurs within a 10 km altitude window.

Ionosonde Measurements
Occurrence rates based on ionosonde measurements are used as the ground-truth in this comparison. A detailed explanation of the techniques used to calculate the ionosonde based ORs is outlined in Ref. [14]. In summary, ionogram-derived E s parameters were obtained from the Global Ionospheric Radio Observatory (GIRO) network of Digisondes [13] using the GIRO web portal (http://giro.uml.edu/didbase/scaled.php, accessed 26 September 2020). To ensure the most up-to-date automatic scaling processes used to determine the sporadic-E parameters, only ionograms processed with the ARTIST-5 software [32] are used to calculate ORs.
A vital distinction between foEs and fbEs must be defined when comparing GPS-RO measurements to ionosonde measurements, as the ORs can vary by a factor of five depending on the sporadic-E intensity threshold of interest [14]. Here, we assume that the fbEs layers are most likely to be measured by GPS-RO techniques because of the greater phase contribution from a continuous layer rather than the patchy foEs layers, and we compare the RO measurements to fbEs. Additionally, an intensity threshold for the sporadic-E must also be defined before comparison. In this study, we use the total sporadic-E occurrence rate as measured by ionosondes and a separate comparison for fbEs ≥ 3 MHz. For the total occurrence rate, there is no lower limit to the sporadic-E fbEs intensity, except for the natural lower limit of ∼1.5 MHz for most ionosonde sites due to frequency allocation and sensitivity issues.

Comparison Criteria
Occurrence rates are calculated by dividing the number of sporadic-E measurements by the total number of measurements within a certain time period. For example, the ionosonde based OR for fbEs ≥ 3 MHz is calculated by where OR( f bEs ≥ 3 MHz) is the occurrence rate for fbEs greater than or equal to 3 MHz, N( f bEs ≥ 3 MHz) is the number of fbEs measurements during the time period of interest (seasonal or entire year) above the 3 MHz threshold, and N CS is the total number of vertical ionograms provided by the number of confidence scores [14]. Every ARTIST-5 scaled ionogram has an associated confidence score, so N CS is used as the total number of ionosonde measurements. No threshold is placed on the ionogram-derived confidence scores for this analysis. A similar method is used for the GPS-RO derived ORs, except a measurement area around each ionosonde is used to calculate the rates. The measurement area corresponds to circles centered around the Digisonde sites with radii of 170 km, corresponding to the mean length of sporadic-E [33]. A map of the measurement areas used in this analysis are displayed in Figure 1. Ionosonde and RO based ORs were compared seasonally near the same location to ensure a direct comparison. The comparison is performed for each season with ORs averaged over the years 2010-2017. This time period was selected to ensure sufficient ARTIST-5 scaled Digisonde and COSMIC-1 data was available simultaneously. To maximize the number of data points, ORs were not separated by time of day (TOD) and the values provided here correspond to all-day averages. A quality control check was placed on the data using the technique outlined in Ref. [34], which places an upper limit of 1.5 on the mean deviation of the electron density (or here, TEC) profile. GPS-RO profiles with few data points (less than 400 for the 50 Hz data) were also excluded from the analysis. Ionosonde sites were required to have >4320 ionograms for the season of interest in a given year to be included in the comparison. This corresponds to ionosonde operation roughly half of the season assuming a 15 min ionogram cadence [14]. The seasonal results throughout 2010-2017 are combined to perform the statistical comparison between GPS-RO and ionosonde derived ORs. A lower limit of 10 observations was placed on the GPS-RO data at each site to be included in the combined data set for seasonal comparisons. Temporal gaps in ionosonde operations combined with quality control checks of the GPS-RO data [34] create a variable number of sites with an amount of GPS-RO observations greater than 10 (which is the lower limit for which the comparison is made) in the different seasons (i.e., summer and winter as discussed below).
Instead of directly comparing ionograms and GPS-RO crossings within a short time window (e.g., 30 min between ionogram and crossing as outlined in Ref. [26]), here we compare rates averaged over TOD for the areas surrounding each ionosonde site. A direct comparison of each GPS-RO crossing within 170 km of a site certainly has merit as it ensures the measurements are taken at nearly the same time and location. However, due to the patchy nature of E s clouds, simultaneous ionosonde and GPS-RO measurements can sample different parts of the cloud and provide varying results (see the discussion in Ref. [26]). As an alternative approach, here we use all Digisonde measurements at a particular site over a season to calculate the OR averaged over TOD. Then, as the GPS-RO measurements randomly sample the area surrounding the Digisonde site over all local times, the TOD averaged results from the GPS-RO rates should align with the TOD averaged Digisonde rates if the GPS-RO techniques properly characterize the existence of E s layers. A large number of ionosonde measurements are taken over a season, which provides a robust TOD average over the site of interest. While this statistical comparison would be flawed if the number of GPS-RO crossings is very low and clustered around a particular TOD, a larger number of measurements spread throughout TOD provides a proper sampling around the site such that the rates can be compared with the ionosonde rates. As described later, the average number of GPS-RO observations within 170 km of each site over all seasons and years is greater than 100 (Table 2) and the average number of ionosonde observations is over 200,000, providing a measure of confidence in the statistical results through proper sampling over the TOD for both ionosonde and GPS-RO derived rates.

No Lower fbEs Limit
The GPS-RO and Digisonde derived ORs were compared over all seasons, but here we show only the summer and winter results for brevity. Figure 2 displays the results for the five RO techniques calculated over each Digisonde site in comparison to the ionosonde-derived total sporadic-E ORs with no lower limit on fbEs intensity. The number of ionosonde sites satisfying the threshold of at least 10 GPS-RO measurements varies between seasons with 22 sites for December-February and 25 sites for June-August. This difference in number of sites is due to temporal gaps in ionosonde operations combined with quality control of GPS-RO data, and sites with ≤10 GPS-RO crossings are removed from the comparison. Comparisons over all seasons (entire year) include the 30 sites displayed in Figure 1. The number of GPS-RO crossings within 170 km of each Digisonde site are annotated in Figure 2 to display the combined number of GPS-RO measurements used to calculate E s ORs over each ionosonde site for the season of interest during 2010-2017.
A linear regression of the GPS-RO derived ORs with respect to the ionosonde ORs is calculated for each method using a bootstrapping technique with 10,000 resamples. The bootstrapping technique allows for estimation of a statistical parameter through random resampling of the original dataset with replacement (random selection from the original data points), allowing for a quantification of the uncertainty in a statistical parameter of interest [35]. Here, the mean linear regression coefficients (slope and intercept) and associated variance are estimated for each GPS-RO method with respect to the ionosonde ORs. The linear fit provides a measure of the general trend and should be compared against the one-to-one line (slope = 1, intercept = 0) that indicates a perfect match between data sets with equal ORs for the GPS-RO and ionosonde measurements. Differences between the estimated ORs can manifest as a difference in slope (GPS-RO and ionosonde rates not increasing at the same rate) or in magnitude (GPS-RO underestimates or overestimates the magnitude relative to the ionosonde rates). In the boreal winter season, the Gooch (green) and Niu (cyan) techniques overestimate the ORs. However, the slope of the two methods correlates to the ionosonde ratios. The Arras (red) method slightly overestimates the slope and OR magnitudes. The Chu (blue) methodology underestimates both the ORs and slope. The Yu (yellow) method crosses the one-to-one line, but the slope is less than one which would increase the gap between the ionosonde ratios for larger ORs. In the boreal summer season, the Chu method underestimates and the Arras and Niu methods overestimate the rates. The Gooch technique crosses the one-to-one lines but the slope is much less than one. Here, the Yu technique closely matches both the slope and OR magnitudes.   The ORs as a function of geomagnetic latitude are displayed in Figure 3 for winter and summer. Rolling spatial averages are calculated for each GPS-RO technique and ionosondederived values using a ±10 • window to estimate trends as a function of geomagnetic latitude. Ionosonde error bars are presented at each site from standard deviations calculated over the range of years. Additionally, the shaded region around the ionosonde trendline (dashed black line) represents a rolling spatial standard deviation calculated using a ±10 • latitude window.   The expected latitude dependence of sporadic-E formation is immediately obvious, with a maximum in the local summer hemisphere much larger than the ORs in the local winter hemisphere. This result is in agreement with many previous studies [36][37][38]. In the boreal winter, the Arras and Gooch techniques overestimate rates while the Niu technique overestimates at all latitudes except one. In general, the Yu technique slightly underestimates the values except for a couple of latitude positions, and the Chu technique underestimates for all latitudes. Overall, the Yu technique shows the closest agreement with the ionosonde results for the boreal winter. For the boreal summer season, most GPS-RO techniques align with the ionosonde rates over some range of latitudes. In general, the Arras and Gooch techniques overestimate the ORs, but the Chu, Yu, and Niu techniques all show reasonable agreement overall.
To compare the various RO techniques to the ionosonde results over all seasons and years, a bootstrap calculated mean using 10,000 random resamples with replacement from the original dataset was calculated for each method (Figure 4). These results show that the Yu technique most closely aligns with the mean ionosonde ORs without a lower threshold on fbEs intensity, while the Chu technique underestimates and the Gooch, Arras, and Niu techniques overestimate. Similarly, the ORs averaged over all sites, seasons, and years, as displayed in Table 2, show an average of 14% for the Yu technique and an ionosonde average of 16%. Finally, statistical similarity is calculated as the percentage of overlap between uncertainty intervals calculated for a particular statistical metric. For the occurrence rate means, the statistical similarity is displayed in Table 3. As highlighted by the bootstrapping results (Figure 4), the Yu technique shows the most overlap and has the strongest statistical similarity with the ionosonde means overall. However, the Gooch technique shows the strongest statistical similarity in the boreal summer season, likely due to the close match in the northern midlatitudes during this period and the Yu method underestimation of the rates in the Southern Hemisphere.

fbEs ≥ 3 MHz
The comparison is repeated for fbEs intensities greater than 3 MHz, which in general are reduced from the total sporadic-E ORs by a factor of roughly 2/3 [14]. This distinction between sporadic-E intensity is important as the GPS-RO derived ORs show relatively large variations in magnitudes between the various techniques. While the Chu technique tends to underestimate the ORs for no lower limit on E s intensity, we find a close agreement with the fbEs ≥ 3 MHz rates derived from ionosonde measurements. Figure 5 shows a scatter plot of the winter and summer ORs for each GPS-RO technique separated by Digisonde site. As discussed in Section 3.1, there are 22 sites for December-February and 25 sites for June-August with greater than 10 GPS-RO crossings throughout 2010-2017 over the season of interest. Linear regressions and associated variances of GPS-RO rates as a function of ionosonde rates are calculated through bootstrapping. All other RO techniques overestimate the E s ORs while the Chu method shows agreement in both slope and magnitude.
Overall, the Chu technique shows the closest agreement with ionosonde rates over the range of geomagnetic latitudes, as displayed in Figure 6. While the Chu technique overestimates the OR in the equatorial region in boreal summer, there is reasonable agreement in the midlatitudes during summer and winter. In general, the other techniques tend to overestimate the rates for this fbEs threshold.
Combining the data over all sites, years, and seasons indicates a strong agreement between the bootstrap calculated means for the Chu method and the ionosonde-derived values (Figure 7). This result is also supported by the overall average of 8% taken over all sites, seasons, and years for both the ionosonde and Chu results (Table 2). Finally, the statistical similarity of 88% for the Chu technique over the entire year, as displayed in Table 4, gives confidence in using this method for characterizing sporadic-E ORs of fbEs ≥ 3 MHz.

Conclusions
Five GPS-RO methods for monitoring sporadic-E occurrence rates from Refs. [22][23][24][25][26] are compared against Digisonde results obtained through the GIRO network to find techniques with strong agreement for two limits on sporadic-E intensity: no lower limit and fbEs ≥ 3 MHz. Data over a period of 2010-2017 were analyzed and separated by season to compare the techniques' occurrence rates. For GPS-RO techniques without binary criteria, a threshold E s layer intensity of 3 MHz was implemented to provide a binary threshold for E s occurrence. Results for the five techniques varied depending on the statistical measurement, season, and sporadic-E intensity threshold.
For the total occurrence rates with no lower limit on E s intensity, the Yu S 4 technique showed the closest agreement with ionosonde rates. In general, the Arras, Gooch, and Niu methods overestimated the occurrence rates for this E s intensity, while the Chu method underestimated rates. However, it must be noted that most techniques showed large variations in rates and no technique perfectly aligned with the ionosonde results over all locations and seasons. Occurrence rates calculated for fbEs ≥ 3 MHz showed a strong agreement between the Chu technique and the ionosonde-derived rates. The other techniques overestimated the occurrence rates for this E s intensity threshold, in general.
Overall, the results of this comparison show that the many GPS-RO techniques for monitoring sporadic-E show a wide variation in occurrence rates because they ultimately map to different intensity thresholds for sporadic-E (i.e., fbEs or foEs). While the Yu technique aligns with ionosonde rates without a lower limit on fbEs and the Chu technique aligns with fbEs ≥ 3 MHz, the other techniques likely align to various foEs thresholds with higher occurrence rates as outlined in Ref. [14]. Further research is required for these comparisons, but the key issue of understanding the large difference in GPS-RO derived sporadic-E occurrence rates has been solved.
These various GPS-RO techniques can be used to create updated global climatologies of sporadic-E occurrence for specific E s intensity thresholds. Combining GPS-RO and ionosonde data would provide a synergistic combination with nearly global coverage. Possible improvements to the various techniques, or an ensemble approach combining multiple techniques, may also help to improve the statistical similarity with ionosondederived occurrence rates.