Removal of Covariant Errors from Altimetric Wave Height Data

: The echo waveforms received by conventional radar altimeters are interpreted by retracking algorithms to give estimates of range, wave height, and backscatter strength. However, in response to fading noise on the waveform leading edge, common retrackers, such as MLE-3 and MLE-4, show correlated errors in wave height and range. This correlation is used to develop a correction to the wave height data that reduces the high-frequency variability by ∼ 22%, without affecting the global distribution of values. This correction also results in a closer matchup of Jason-2 and Jason-3 data during their tandem phase. Although the correction is quite straightforward in practice, the appropriate conversion factor has to be determined for each combination of altimeter and retracker. There are also remaining open questions concerning the needed low-pass ﬁltering.


Introduction
There are now more than three decades of near-continuous wave height data from satellites, which have been used to describe the regional wave climate and its trends [1]. Altimeter data can also be used to give the broad scale picture across storms, and for ingestion into wave forecasting models. However, wave height data from altimeters are less frequently analysed at their finer scale because of short-scale noise within the data records. We investigate how these data can be improved to potentially enable studies of wave-current interaction and better comparisons between different altimeters and with wave buoys.
Satellite-borne radar altimeters are principally designed to accurately record range information (in order to determine sea level), but their return signals are also widely interpreted to give significant wave height (SWH) and wind speed, and information on other atmospheric or oceanic conditions [2]. Most radar altimeters to date have been "low resolution measurement" (LRM), with the only information recorded being the time series of power level within the radar echo. Instruments operating in this LRM mode typically produce signals as illustrated in Figure 1a. The expected signal has a simple functional form (Figure 1b), corresponding to the mathematical convolution of three terms [3,4]. However, the signal received is beset by "fading noise" in that the vector summation from multiple small reflecting facets on the sea surface will have a mean power, P, as shown in Figure 1b, but with a random variation governed by χ 2 statistics such that the standard deviation (S.D.) of multiple realizations is P/ √ N, where N is the number of independent pulses averaged. As the instrument sampling permits only a limited number of pulses to be averaged, this fading noise will be present in all signals analysed. Consequently, algorithms designed to fit a model waveform to the observations and to derive its key parameters must be designed to minimise their sensitivity to this noise term. A number of these "retracker" algorithms are commonly used, including MLE-4 [5] and ALES [6].
Several papers have shown that there is "cross-talk" between the estimation of wave height (Hs) and range, as both are derived from information on the leading edge, and thus a particular realization of fading noise can lead to covariant errors in these two terms. Similarly, for those algorithms that fit both amplitude (σ 0 ) and mispointing (ψ 2 ), which are both terms related to the power distribution in the trailing edge (see Figure 1b), there is a strong connection between their errors [7,8]. Quartly [7] showed how this property could be used to significantly improve the σ 0 estimates from the MLE-4 algorithm and, more recently, Zaron and deCarvalho [9] and then Quartly et al. [10] have demonstrated how the Hs data can be utilised to provide smoother (more realistic) variations in range, and thus ultimately in sea level. Here, we take those ideas and explore the potential for the range information to be used instead to improve the quality of wave height estimates. In the majority of waveform-fitting algorithms, the slope of the leading edge is governed by a term σ C , which combines the effect of the width of the emitted pulse, σ P and of the spread of reflecting facets governed by significant wave height, Hs: where σ C and σ P are expressed in travel time and c is the speed of light. On occasions, the random fading noise may result in a steeper leading edge than would be expected for a flat surface, i.e., σ 2 C < σ 2 P . Often in such circumstances, the derived estimate for Hs is set to zero, potentially leading to multiple neighbouring values being the same, resulting in a spike in the probability distribution function, and an unreasonably low estimate of the variability within a 1 Hz record (denoted σ Hs ). Alternate coding of the information is possible, whereby the Hs value reflects the fact that σ C was less than its theoretical minimum.

Covariant Errors of Properties of Leading Edge
In simulation work, involving a retracker algorithm applied to a waveform with appropriate fading noise, Sandwell and Smith [11] had shown there was a very strong correlation between the errors in the derived slope and in the epoch (position of the mid-power point). In their work, the error was well-defined since the true position without noise was known. In a practical implementation neither the true epoch nor the range is known, so we approximate this by the anomaly about the local mean. In a preliminary study, Smith et al. [12] examined the cross-spectra of sea surface height and SWH, noting that the magnitude squared coherence (equivalent to the correlation coefficient, r 2 ) was high (around 0.4) for all scales smaller than 50 km. These results have been corroborated more recently by Tran et al. [13] in a detailed analysis of the MLE-4 retracker (4-parameter Maximum Likelihood Estimator) on Jason-3.
We explore the potential for improving the quality of altimeter wave height data through reducing these covariant errors, applying our ideas to Jason-3 products from both MLE-4 and MLE-3 retrackers (the latter not fitting the 4th term, ψ 2 ). For practical purposes, we use ζ = altitude minus range, ignoring all the geophysical and instrumental corrections that vary on scales >50 km.
To examine the covariant properties, we consider individual 1-second records of data, containing twenty 20 Hz measurements. These correspond to overlapping circular footprints between 2 and 10 km in diameter depending upon wave height [14,15], but spaced at intervals of 330 m along track-thus, there should be minimal real physical change between neighbouring data points. Following Quartly et al. [10], we detrend both the ζ and Hs measurements to emphasise the high-frequency variations which are due to the fading noise. In their work, they determined γ, the slope of the red line in Figure 2a governing the portion of ζ anomaly explained by SWH anomaly. Here, we perform the regression the other way (corresponding to the green line) to give the proportionality value, Γ, indicating how the ζ anomaly affects Hs. (Clearly, if the two parameters were perfectly correlated, then Γ would be the reciprocal of γ.) We calculate the slope of the relationship for all fully marine 1 Hz records having a full complement of 20 observations. (To obtain data on the covariance due to fading noise alone, we avoid contamination by land, ice, and rain by using, respectively, thresholds on proximity to land, latitude outside polar regions, and negligible differential attenuation between its two radar frequencies.) Individual 1-second ensembles will yield a wide range of values for Γ and the associated r 2 statistic detailing the fraction of variance explained. Figure 3a shows that the spread of values for Γ is indicated by the 5th and 95th percentiles of the distribution, but the median value is close to −4 for a wide range of Hs values, although with significant change at low sea states. The median r 2 value is typically greater than 0.40 (Figure 3c), with an increase at low sea states. The analysis of the output of the MLE-3 retracker yields a similar distribution for Γ, as did MLE-4, but the values of explained variance are slightly lower.
As the composite width, σ C , is the variable that best describes the width of the leading edge (rather than Hs), we perform a similar analysis linking anomalies in σ C to those in ζ ( Figure 2b) and noting how its relationship varies with Hs ( Figure 3b). For large wave heights (Hs > 2m), σ C approximates to Hs/2c, and thus the regression statistics show the same r 2 value with the derived Γ values differing by a factor of 1.67 (if Hs is in metres and σ C in nanoseconds). The Γ curves differ at low Hs, where Equation (1) is nonlinear, but the utilization of σ C does not give a much simpler form.

Implementation of SWH Adjustment and Its Evaluation
This understanding of the covariant errors can be used to implement an empirical adjustment to the default Hs estimates: where Hs orig and Hs adj are the original and adjusted estimates of wave height, and δζ is the local anomaly in ζ. To enable a continuous correction over large segments of data, we calculate δζ relative to a running 21-point median, viz: Within our implementation (Figure 4), the appropriate adjustment factor, Γ, could be defined from Figure 3a, based on an average Hs value for that segment, although that approach is slightly circular given that Hs is the parameter to be improved. For simplicity, here we use the mean value (Γ = −4.26) in all circumstances (and Γ = −4.23 for the MLE-3 retracker); the Appendix shows that there is minimal benefit (in this particular case) in implementing a lookup table characterizing Γ as a function of Hs. Figure 5 shows an example of the effective improvement in 20 Hz Hs data-the top panel shows the change for individual values, whilst the middle panel shows that the mean is little affected, since the mean of 20 consecutive records of δζ is close to zero. Figure 5c quantifies how much the variability is reduced for this example.  [13] and discussed in the Appendix of this paper. The global view of this reduction in variability is illustrated in Figure 6. The left-hand panel illustrates the probability distribution function of σ Hs for 300 Jason-3 tracks (∼12 days), with this measure of variability only tallied when present for all four retrackers (MLE-4 and MLE-3 on the GDR, and the "adjusted" versions of each). The distributions for "adjusted MLE-4" and "adjusted MLE-3" almost coincide, but the latter does have more large values (σ Hs > 1.0 m). The median value for each retracker is highlighted by the solid vertical lines: 0.534 m for MLE-4, 0.511 m for MLE-3, and 0.405 m for both adjusted versions, corresponding to changes in σ Hs of 24% and 21% respectively, and thus reductions in variance of about 40%. The dashed lines show the 95th percentiles, as these are sometimes used as a threshold for discarding unreliable data. Variability in Hs estimates is known to be greater in high wave height conditions, suggesting that the flagging of Hs data based on σ Hs should be done using a threshold that varies with Hs- Figure 6b shows this variation to be common to both MLE-4 and MLE-3 retrackers and the adjusted versions of them. (Large values of coherence between range and Hs have been reported at all scales below ∼50 km [12,13], so we have investigated calculating δζ relative to a 101-point running median filter, corresponding to ∼33 km along track. However, the results were not appreciably different.) A further validation of the usefulness of this adjustment is provided in Figure 7, which matches up the 20 Hz records of Jason-2 and Jason-3 data during the tandem phase at the beginning of Jason-3's mission. The two satellites were flying along the same ground track, with Jason-2 between 81 and 83 s ahead of Jason-3, and the match-ups agree in latitude to within 0.001 • . The mean offset between the two (blue line in Figure 6a) shows some variation with Hs (most noticeably below 1.5 m). The spread of values in their difference is shown by the standard deviation (Figure 7b)-the values for MLE-4 are consistent with √ 2 times the variability for one instrument (Figure 6b). Implementing the adjustment for both Jason-2 and Jason-3 has negligible impact on the mean offset between the two instruments, but reduces the S.D. of the differences by ∼22%. Similar to the analysis for improving σ 0 and sea surface heights (ζ) data [10], the pertinent value of Γ will vary with both altimeter and retracker applied. Figure 8 provides a synopsis of similar analysis for MLE-4 processing of data from AltiKa, the Ka-band altimeter on the SARAL spacecraft. That altimeter produces 40 average waveforms and thus 40 separate estimates of the parameters per second [16]-individual 1-second ensembles from 500 passes (∼17.5 days) are used to determine the appropriate value of Γ and the associated r 2 . In this case, the median value of Γ is −5.06, slightly larger in magnitude than for Jason-3. It also has a more marked variation at low Hs, suggesting that implementation using a variable Γ may be beneficial. The median r 2 value is 0.44, slightly higher than that for Jason-3. Implementation of an adjustment, simply using the mean value of Γ and the anomaly relative to a 41-point (1-second) running median, leads to a significant reduction in the σ Hs values, with the median value reducing by 18% from 0.33 m to 0.27 m.

Conclusion and Discussion
Since both estimates of range and wave height are derived from a few waveform bins on the leading edge of an LRM waveform, their estimates are sensitive to the existence of fading noise affecting the power levels within this section of the waveform (Figure 1b). Sandwell and Smith [11] showed that there was a correlation between the noise-induced anomalies in these two parameters, and Zaron and deCarvalho [9] and Quartly et al. [10] have used this empirical connection to reduce the errors in range. The present paper demonstrates that the reverse procedure is equally applicable: using local anomalies in ζ (altitude minus range) to reduce the associated errors in SWH.
The coefficient, Γ, linking the anomalies in these two terms is found to vary slightly with Hs, with the nonuniformity being most marked for Hs < 1.5 m (Figure 3a)-recasting the empirical link so as to be between ζ and σ C (a term better describing the width of the waveform leading edge) did not improve this ( Figure 3c). As noted for the analysis on improving σ 0 and ζ [10], the coefficient to improve Hs depends upon both the altimeter and the retracking algorithm used. The appropriate values of Γ for MLE-4 and MLE-3 are very similar, whereas a different value is needed for ALES applied to Jason data (not shown). In the same way, MLE-4 applied to AltiKa data needs a different value (see Figure 8a). This paper only shows results for LRM altimeters. A brief analysis has been carried out for Sentinel-3 when the altimeter was operating in SAR mode, and the reduction in variance was much smaller than for LRM operations [17]. This is because the SAR waveform has a different shape, with a sharp drop in the trailing edge that depends upon wave height (see Figure 9b of Ardhuin et al. [18])-as more waveform bins are sensitive to Hs, the noise level on its initial estimate is lower, with less cross-talk with the range estimate, which is in effect only extracted from the SAR waveform's leading edge.
In this paper, we demonstrated the implementation of an "adjustment" to Hs using a constant value of Γ as an approximation to the variation shown in Figure 3a. In a similar analysis, which only focused on MLE-4 applied to Jason-3, Tran et al. [13] modelled the variation as a simple linear function of Hs. With their adjustment, they achieved a reduction in σ Hs of 38% at Hs = 2 m, which is larger than the 22% achieved here (see Figure 6, for example). Their analysis used altitude minus range minus mean sea surface, with that field smoothed by a low-pass Lanczos filter of width 140 km, i.e., ∼20 times broader than that implemented here, and also showed a significant reduction in SWH spectra for all wavelengths up to 70 km. However, the results in the present paper showed negligible change when the width of the median filter was changed from 21 to 101 points. Possibly their inclusion of a detailed mean sea surface (MSS) has removed stationary anomalies not associated with the effect of fading noise. Thus, the importance of incorporating MSS may need to be evaluated in the future.
A further demonstration of the efficacy of the proposed adjustment was the improved agreement between two independent (but very similar) altimeter systems. With adjustments to the Hs values for both Jason-2 and Jason-3, the S.D. of the discrepancy between the two reduces by ∼22%, in line with the change in estimated noise for one instrument. Note that as the two follow the same track within a very narrow deadband, the effect of omitting the MSS term should be similar for both instruments.
The advocated procedure removes an error contribution associated with the retracking of fading noise. Other sources of error will remain, so this adjustment may be combined with along-track smoothing or empirical mode decomposition [19] to further reduce errors at the scales of interest. As the implementation uses the anomaly in ζ relative to its local median, there is no impact on the large scale statistics of SWH. Consequently, the accuracy of the adjusted Hs field when compared with buoys or models will be as for the underlying retracker if the analysis uses averages computed over tens of kilometres along track.
There are three main potential lines of application: (i) Comparisons with buoys that only use the few nearest good waveforms; (ii) investigation of change in wave conditions on altimeter tracks running towards the coast, where wave refraction and shoaling bathymetry may be expected to make a big impact; and (iii) studies of wave-current interaction, where the presence of strong current shear is expected to manifest itself in a modulation of the wave field [20]. The method is predicated on the received waveform being a noisy version of that expected by the original retracking algorithm. Therefore, the correction is unlikely to be appropriate if the waveform is significantly affected by small-scale rain events, for example, as these change the overall waveform shape. In such circumstances, information from the relatively unaffected C-band channel may be useful [21], if fitted. Figure A1. Variation of σ h as a function of Hs for two implementations of the adjustment, firstly using a constant value for Γ and then utilising the functional form displayed in Figure 3a. The various percentiles are calculated for bins 0.2 m wide.