1. Introduction
Sea state observations from satellites are increasing in duration, abundance, variety and applications. The long-term continuous altimetry record is particularly significant in this context, having begun in 1992 and affording us the capability to investigate long-term variability on a global scale from remote observations [
1,
2]. The continuity and stability of this record are therefore of great importance. To that end, since the TOPEX/Poseidon mission [
3] launched in August 1992, the Jason series of satellites [
4] has maintained the same reference orbit and ensured a consistency of measurement to the present day. The growing abundance of sea state observations from other missions, spanning a variety of platforms and instruments with heterogeneity in spatiotemporal coverage, further motivates the continuation and maintenance of a consistent long-term record. For example, the European Space Agency (ESA) Sea State Climate Change Initiative (CCI) employs this reference record extensively when intercalibrating missions as part of the production of its multimission sea state Climate Data Records [
5].
With the approaching retirement of Jason-3 (J3), ESA’s Sentinel-6 Michael Freilich (S6-MF) mission [
6] launched in 2020 and formally succeeded J3 as the long-term altimetry reference mission in April 2022. To ensure smooth operational continuity, S6-MF commissioning involved a unique 12-month S6-MF/J3 Tandem Experiment (S6-JTEX) such that S6-MF followed J3 in the reference orbit, lagging by approximately 30 s, providing the opportunity to generate a substantial record of closely collocated altimetry measurements from these two missions, including sea surface significant wave height (Hs). J3, launched in 2016, obtains measurements of Hs derived from the onboard Poseidon 3B altimeter in low-resolution (LR) mode. Carrying a Poseidon 4B altimeter, S6-MF can similarly acquire measurements of Hs in LR mode. However, the newer instrument is a nadir-pointing dual-frequency synthetic aperture radar (SAR) altimeter designed to provide high-resolution (HR) altimetry measurements. As such, measurements of Hs are made concurrently in both LR and HR modes, providing a valuable opportunity to assess their respective characteristics and, for the first time, to evaluate the performance of the new S6-MF SAR interleaved mode [
7] directly against S6-MF LR and J3.
In spite of the advancement in the altimetry measurement record for sea state, questions about uncertainty in observations persist. These stem, in part, from a lack of precise, long-term and quality-controlled direct in situ measurements. While a small number of in situ records date back many decades, their long-term stability remains questionable [
8,
9]. Furthermore, their sparsity limits our knowledge to a small number of locations worldwide, generally far from the coast and in the northern hemisphere. Information about in situ measurement uncertainty is usually entirely absent, and the uncertainty itself varies between platforms owing to differences in operating agency, measurement payload, local conditions and so on. While recent efforts have been made to improve quality control information associated with some important records, such as the U.S. National Data Buoy Center (NDBC) buoy database [
10], our enormous reliance on buoy data persists nonetheless, even as the satellite record continues to expand. Dodet et al. [
11] conducted an in-depth investigation of uncertainty through intercomparison of measurements with a global set of moored buoys, drawn from the Copernicus Marine Service In-situ TAC (
https://www.marineinsitu.eu/dashboard, accessed on 20 April 2024), and model hindcast, using the so-called triple collocation analysis (TCA) [
12]. The results from this study established estimates of random error and biases for a range of altimeters, with the lowest errors stated to be around 5%.
Such studies are facilitated by at least two important factors, but these also lead to various limitations in subsequent analyses. Firstly, the analysis of errors is achieved by exploiting in situ measurements at sites in deep water (>50 m). Areas closer to land (<50 km) are excluded to avoid the possibly deleterious effect of radar back-scatter and the difficulty of analysing over local sea state gradients. Consequently, the analysis of coastal locations, which are often of great interest and unobserved in any other way, is intentionally precluded. Secondly, a large number of collocations improves the statistical robustness of results. Typically, many thousands of collocations are required: Dodet et al. worked with some 250,000 collocated data records from numerous missions, spanning three decades. However, although researchers may be able to eliminate the spurious performance of some in situ platforms through their own quality control schemes, ultimately, individual platforms may exhibit quite different biases and errors (see e.g., [
11] Figure 3), leading to an analysis that necessarily averages over the entire dataset. The exclusion of platforms is also undesirable since it further reduces sample size and statistical robustness.
The challenge of collocating altimetry observations with buoy (or model) data is characterised by the need to reconcile infrequent snapshots of Hs, derived from rapid along-track sampling, with near-continuous time series of Hs at a quasi-stationary position. At 1 Hz observation frequency, the altimeter covers approximately 7 km of the ocean surface between successive measurements. On the other hand, data buoys tend to derive integral parameters by sampling waves for 10 to 30 min. For practicality, fixed spatiotemporal match-up criteria are typically applied to all collocations [
13] because the assessment of sites on a case-by-case basis is onerous. Recently, Campos [
14] made updated recommendations with respect to spatiotemporal sampling criteria. However, inaccuracies in the match-up, such as representativity error, and site-related problems (e.g., platform mal-operation) can lead to increased uncertainty. In order to better address this, Jiang [
15] has proposed a method to explicitly evaluate representativity and environmental and random errors using only altimeter and in situ observations. Using the dataset of the Sea State CCI [
16], they conclude that, over a large aggregate global sample for deep-water buoys, different classes of error are approximately equal, and go on to compute altimeter random error more precisely. J3 was found to have similar error characteristics to the previous Jason satellites, with error increasing with Hs from 0.15 m (at 1 m Hs) to 0.25 m (at 7 m Hs). These errors are somewhat smaller than those found by Dodet et al., although, questionably, errors in older altimeters, such as TOPEX/Poseidon, were found to be smaller still.
The aforementioned studies, and others [
1,
17], limit their analysis to deep-water moorings in order to control for errors that might otherwise be associated with local coastal morphology and resulting gradients in sea state variability. A few studies, however, have attempted to provide analyses of altimeters closer to the coast. Timmermans et al. [
18] assessed the limitations of altimetry information to evaluate extremes near the coast. In that case, only a small number of sites were considered, and the collocation methodology did not account for local sea state gradients. Nencioli and Quartly [
19] used 17 buoys situated up to a few kilometers from the coast, operated by the National Network of Regional Coastal Monitoring Programmes [
20], to validate altimetry observations both from Sentinel-3 SAR and pseudo-LRM modes. They applied a detailed methodology to evaluate how best to compare altimeter and buoy data, concluding that coastal morphology was a hugely important factor in the collocation approach. They investigated sea state gradients in complex coastal regions by computing comparison statistics with buoys from along-track data. Match-up criteria for the final analysis were further informed by the use of a high-resolution coastal hindcast to establish regions of representativity for each buoy. In contrast, using a much larger global set of coastal buoys, Bué et al. [
21] computed agreement statistics with a number of altimeters. They employed a methodology that closely collocated altimetry with buoy data based on the distance to coast and concluded that altimeters slightly overestimate Hs in coastal regions.
The use of coastal buoys is desirable in terms of their abundance, compared to their deep-water counterparts, and also creates new opportunities for coastal research. However, collocation has to be meticulous, and disagreements between altimeter and buoy data are often not readily explained by any single factor [
19]. In the event of disagreement, no studies cited here explicitly attempt to identify whether buoys themselves may be providing erroneous data. Other aspects of disagreement, such as seasonal conditionality, are also overlooked. Furthermore, high-resolution wave hindcast data that can resolve gradients at similar scales to altimeters, particularly near the coast, are expensive to produce and are unlikely to be available generally. And while altimetry data can be exploited directly to examine sea state gradients [
22], to date they have seen little application with respect to understanding and explaining discrepancies and errors with respect to other data sources.
Finally, the aforementioned studies used data from one or more altimeters that did not observe the same sea state concurrently. Data from altimeter–altimeter collocation available in the S6-JTEX tandem phase provides a rare opportunity to compare results from different instruments but under the same conditions. Previously, the Copernicus Sentinel-3A and B satellites were placed in a tandem configuration for six months [
23]. Using TCA, Sentinel-3 SAR mode altimetry was found to have the lowest error by a small margin compared to buoys and ERA5 reanalysis, although the results were limited by the small number of collocations. This experiment also lacked collocated LR and HR data acquisition.
Given the imperative to understand uncertainties in the long-term altimetry record, particularly linked to a transition between reference missions, in this work, we use the collocated S6-MF LR, HR and J3 data from 12 months of the S6-JTEX tandem phase to evaluate uncertainties in observations of Hs. In the locality of the North East Pacific, we undertake a detailed examination of the tandem data, principally via collocation with data from moored buoys located in deep water. After an initial examination of the tandem data in
Section 3.1, TCA is presented in
Section 3.2. We then proceed to evaluate the effect of the altimeter sampling area on Hs mean bias in
Section 3.3. Subsequently, the impact of individual buoys on analyses of Hs mean bias is examined in
Section 3.4, and finally a detailed spatial analysis at selected buoy sites is conducted in
Section 4. The implications of these results, particularly with respect to their application to coastal regions, are discussed in
Section 5.
3. Data Intercomparison and TCA at Offshore Locations
In this section, we intercompare the various Hs observations. In
Section 3.1, using standard statistical methods, we evaluate agreement between the measurements of Hs from the tandem datasets and moored buoys. In
Section 3.2, TCA is applied to each altimeter dataset, together with moored buoys and ERA5 reanalysis. Finally, in
Section 3.3 and
Section 3.4, we focus on Hs mean bias between individual buoys and altimeters to identify how the collocation approach affects agreement with each of the buoys in the ensemble. Discrepancies driven by spatial gradients in particular are identified and evaluated through a detailed analysis in
Section 4.
3.1. Tandem Data Intercomparison with In Situ Data
The tandem data are collocated along-track, with a 30 s lag for S6-MF. For intercomparisons of wave height over distances of 10–100 km, sea state is not expected to vary appreciably during the lagged period. Given our primary interest in intercomparison with in situ data, comparison of the three tandem datasets is initially limited to the ground track sections within 100 km of the nine OS buoy sites (see
Figure 1). Median values of 1 Hz observations are computed, resulting in a single observation per overpass.
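The reduction of 1 Hz records to a single value per overpass can be sketched as follows. This is a minimal illustration, not the processing chain used in this study: the function names, the `pass_id` labelling of overpasses and the spherical-Earth (haversine) distance are all assumptions made for the example.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def overpass_medians(lat, lon, hs, pass_id, buoy_lat, buoy_lon, radius_km=100.0):
    """Median Hs per overpass from 1 Hz records within radius_km of a buoy.

    pass_id is a hypothetical label identifying which overpass each
    1 Hz record belongs to. Overpasses with no valid record inside the
    radius are omitted from the result.
    """
    lat, lon, hs, pass_id = map(np.asarray, (lat, lon, hs, pass_id))
    near = haversine_km(lat, lon, buoy_lat, buoy_lon) <= radius_km
    near &= np.isfinite(hs)  # ignore missing or flagged values stored as NaN
    return {pid: float(np.median(hs[near & (pass_id == pid)]))
            for pid in np.unique(pass_id[near])}
```

Using the median rather than the mean limits the influence of isolated spurious 1 Hz values within an overpass.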
Firstly, LR data from J3 and S6-MF are compared. The anomaly (J3–S6) is shown in
Figure 2A, together with a number of statistical measures. A total of 870 overpasses took place, with the data showing near-perfect correlation. A mean bias of ≈0.01 m and an RMSD of ≈0.06 m were observed. Outliers are few in number. For Hs greater than 5 m, observations become sparse. In contrast, the comparison between J3 and collocated buoy data (
Figure 2B) shows considerable random error and increased mean bias. There also appears to be a negative bias at higher values of Hs. Disagreement is linked to representativity error, local spatial gradients in sea state variability and sampling methodology. These issues are explored in greater detail in
Section 4.
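For reference, the agreement statistics quoted above (mean bias, RMSD and correlation) can be computed from paired per-overpass values as in the sketch below. The function name and the sign convention (first dataset minus second, matching the J3–S6 anomaly) are assumptions for illustration.

```python
import numpy as np


def agreement_stats(x, y):
    """Mean bias (x - y), RMSD and Pearson correlation of paired Hs values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ok = np.isfinite(x) & np.isfinite(y)  # use only complete pairs
    d = x[ok] - y[ok]
    return {
        "n": int(ok.sum()),
        "bias": float(d.mean()),
        "rmsd": float(np.sqrt(np.mean(d ** 2))),
        "r": float(np.corrcoef(x[ok], y[ok])[0, 1]),
    }
```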
A comparison of J3 and S6-MF HR data is shown in
Figure 2C. The effect of the sea state-dependent bias in the HR data is evident, resulting in a bias in the mean of −0.24 m. Also, a number of outliers are apparent and likely linked to inconsistent quality control flagging in the S6-MF data. However, linear regression modelling, based on simple functions of collocated J3 Hs, can be fitted robustly and is found to explain 99.8% of the variance. Residuals from such a model are shown in
Figure 2D. In this case, residual outliers exceeding 3 s.d. of the distribution are assumed to be spurious and have been removed. Agreement between J3 and S6-MF HR is found to be very similar to S6-MF LR. Mean bias is, in fact, found to be close to zero, with an RMSD of 0.06 m. The linear relationship between the J3 and S6-MF HR is given by,
where the coefficients a, b and c are found to be −0.5654, 1.0930 and 0.4390, respectively.
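The fit-then-screen procedure described in this subsection can be sketched generically as below. The functional form of the regression is deliberately left as an input (`basis`, a hypothetical list of functions of the predictor Hs), so the sketch does not assert any particular model; only the least-squares fit and the 3 s.d. residual screening follow the text.

```python
import numpy as np


def fit_and_screen(x, y, basis, n_sd=3.0):
    """Least-squares fit of y on basis functions of x, then flag residual outliers.

    basis : sequence of callables, each mapping the array x to one
            column of the design matrix (illustrative; chosen by the user).
    Returns the coefficients, the residuals, a boolean mask of residuals
    within n_sd standard deviations, and the fraction of variance explained.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([f(x) for f in basis])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coeffs
    keep = np.abs(resid - resid.mean()) <= n_sd * resid.std()
    r2 = 1.0 - resid.var() / y.var()
    return coeffs, resid, keep, r2
```

In practice the model might be refitted on the retained points (`x[keep]`, `y[keep]`) so that the removed outliers no longer influence the coefficients; whether that was done here is not stated.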
3.2. Triple Collocation Analysis
TCA can be applied to the tandem data, but while J3 and S6 offer an abundance of data closely collocated in space and time,
Figure 2 shows that both LR mode data and HR data are highly correlated. It is therefore unrealistic to assume that their observation errors are independent, which is an important assumption of the TCA method (see
Section 2.4). Each of the three tandem datasets was therefore evaluated independently (together with buoy data and ERA5) in the TCA. In this case, altimeters and ERA5 were pre-calibrated using linear regression with respect to the buoy data, assumed to be unbiased with respect to the ground truth. To remove the nonlinear bias component, S6-MF HR was calibrated in a similar way to Equation (
1), but setting buoy data as the response variable in place of J3. Using a 100 km radius, 535 collocations are available. The reduction here with respect to the sample sizes seen in
Figure 2 is due to buoy outage. Coefficients of the regression relationships are shown in
Table 1.
Figure 3 shows the results from TCA using each tandem dataset individually, each together with the OS buoys shown in
Figure 1 and ERA5 hourly (0.5 degree) reanalysis. The estimated errors for the altimeter Hs appear to be the lowest, although the sampling error (black error bars show 1 s.d.) is large compared to the differences between the datasets, so the results must be interpreted with caution. For example, differences in errors between the tandem datasets appear to be very small and are not resolved. To illustrate the impact of data availability, a similar analysis is carried out with the J3 record from 2017 to 2021 (inclusive) and shown in
Figure S3 (Supplementary Material). The improvement in statistical robustness from the larger sample size is apparent (3002 vs. 535), and gives rise to some changes in the relative error magnitudes. However, noting the size of the uncertainty estimates on the error variances, even a few thousand samples appear to be insufficient to completely resolve the differences between the tandem datasets in this case. Note that representativity error affects all collocations and may contribute to the higher error values associated with buoy data and ERA5.
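To make the estimation step concrete, the sketch below uses the widely used covariance form of triple collocation together with a bootstrap estimate of sampling error. It is an illustration under stated assumptions, not necessarily the implementation used for Figure 3: the three inputs are assumed to be already calibrated to a common reference, their errors are assumed to be mutually independent and independent of the true Hs, and the function names are hypothetical.

```python
import numpy as np


def tca_error_variances(x, y, z):
    """Error variances of three collocated datasets (covariance-based TCA).

    Assumes x, y and z are calibrated to a common scale and that their
    errors are zero-mean, mutually independent and independent of the truth.
    """
    c = np.cov(np.vstack([x, y, z]))  # 3 x 3 sample covariance matrix
    return np.array([
        c[0, 0] - c[0, 1] * c[0, 2] / c[1, 2],
        c[1, 1] - c[0, 1] * c[1, 2] / c[0, 2],
        c[2, 2] - c[0, 2] * c[1, 2] / c[0, 1],
    ])


def tca_bootstrap_sd(x, y, z, n_boot=1000, seed=0):
    """Sampling s.d. of the error variances, resampling collocated triplets."""
    rng = np.random.default_rng(seed)
    data = np.vstack([x, y, z])
    n = data.shape[1]
    draws = np.empty((n_boot, 3))
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample triplets with replacement
        draws[i] = tca_error_variances(*data[:, idx])
    return draws.std(axis=0)
```

With small samples the estimated variances can be noisy or even negative, which is one practical symptom of the sample-size limitation discussed above.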
To summarise the TCA results, there are a number of issues that affect the implementation and validity of this type of analysis; however, we emphasise two aspects in particular. Firstly, representativity error obscures the true error variance computed with the TCA, and is linked strongly to the collocation methodology. At deep-water sites, where collocation is conducted over, e.g., <100 km, this is generally assumed to be small owing to a high degree of spatial sea state homogeneity in those areas. Secondly, as already noted, large numbers of collocations are required for statistically robust results, particularly where there is little difference between datasets, as is the case here for the tandem data. As seen in
Figure 1, even within this particular region, many more buoys are potentially available. However, these are mostly located much closer to the coast, which demands a more detailed collocation methodology. Closer to the coast, gradients in sea state variability are likely to be stronger [
19] and consequently have a more dramatic impact on the collocation approach.
In fact, we can examine both of these issues in much the same way. We begin, in
Section 3.3 and
Section 3.4, by evaluating the Hs mean bias between altimeter and buoys to see how individual locations and buoys contribute to the overall bias evaluation. Later, in
Section 4, local sea state gradients are evaluated using the altimetry data directly to show how gradients can affect the mean bias under different sampling methodologies.
The results of the TCA and its potential application for the tandem data are discussed further in
Section 5.2.
3.3. Hs Mean Bias for Tandem Data
In this section, using all OS buoys, we examine Hs mean bias specifically, and evaluate how this is affected by changes in radius using a typical systematic isotropic sampling method.
The Hs mean bias between S6-MF LR and the buoys is shown in
Figure 4. In this case, the sampling radius showing the closest agreement is approximately 60 km (≈−0.008 m), with the bias remaining relatively stable between 35 and 75 km. However, bias with individual buoys can be seen to be highly variable, with only four buoys contributing to the analysis for most sampling radii. The blue shading shows that the altimetry data density diminishes fairly rapidly with decreasing sampling radius. Furthermore, at least two buoys, 46066 and 46059, suffer from data sparsity during the tandem phase (see
Figure S1), which gives rise to large biases that are not clearly shown in the figure. These biases are likely associated with stronger seasonal waves.
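The dependence of mean bias on sampling radius can be explored with a simple sweep such as the one sketched below. It assumes a table of collocated pairs, each with the altimeter-to-buoy distance, and takes bias as altimeter minus buoy; both the data layout and the function name are assumptions for illustration.

```python
import numpy as np


def bias_vs_radius(dist_km, hs_alt, hs_buoy, radii_km):
    """Mean bias (altimeter - buoy) using only pairs within each radius.

    Returns a list of (radius_km, n_pairs, mean_bias); mean_bias is NaN
    when no pairs fall inside the radius.
    """
    dist_km, hs_alt, hs_buoy = (np.asarray(a, dtype=float)
                                for a in (dist_km, hs_alt, hs_buoy))
    out = []
    for r in radii_km:
        sel = dist_km <= r  # isotropic sampling: distance is the only criterion
        n = int(sel.sum())
        bias = float(np.mean(hs_alt[sel] - hs_buoy[sel])) if n else float("nan")
        out.append((float(r), n, bias))
    return out
```

Reporting the number of pairs alongside each bias makes the loss of data density at small radii explicit.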
The Hs mean bias for S6-MF HR is shown in
Figure 5. The overall behaviour with decreasing sampling radius is remarkably similar, although the bias is clearly more positive and, on average, closer to zero. Statistical noise remains high, owing to the small number of samples, so it is difficult to attribute the difference of 0.01 to 0.02 m between the two S6-MF modes. Finally, for comparison, a similar analysis is shown in
Figures S4 and S5 for the one- and five-year J3 records, respectively. For the tandem phase, J3 is extremely similar to S6-MF LR, but shows a slightly increased bias (0.015 m). However, for the five-year record, stability in the mean is much improved and found to lie between 0.01 and 0.02 m.
To conclude, Hs mean bias appears to be fairly stable even up to a 75 km sampling radius. It is clear that bias, calculated using the respective altimeters, shows only small differences of ≈0.01 to 0.02 m, which are considerably smaller than the spread of Hs mean bias for individual buoys. In fact, some buoys, such as 46246, exhibit very large deviations. We explore the reasons for this in the next section.
3.4. Hs Mean Bias at Each Site
The results in the previous section show, even for deep-water sites offshore, that Hs mean bias varies quite widely depending on which buoy is used. While this is clearly a function of different errors, such as representativity or possibly environmental effects, bias remains large even as the sampling radius is reduced to recommended scales of around 25 to 50 km [
14]. This might be anticipated in coastal areas, where sea state gradients can be strong, but offshore it is somewhat surprising. Fortunately, the altimeters can readily be used to interrogate the buoy data for discrepancies.
Away from land interference, satellite altimeters operate in the same way everywhere, so, unlike a collection of buoys, they are not expected to introduce location-specific biases linked to operational variation. This provides a consistent measurement system against which buoy data can be compared. For each of the nine OS buoys, the mean bias with respect to J3 is calculated. To improve statistics beyond the single year of the tandem phase, this analysis exploits the five-year J3 record from 2017 to 2021. Large absolute values of mean bias may indicate the degraded operation of a buoy. In order to assess statistical significance, the sampling distribution of mean bias is obtained using a bootstrap sampling approach. The entire sample from all buoys is used to estimate the population, and data pairs (J3 and buoy) are sampled randomly with replacement. For each buoy,
N data pairs are selected, where
N is the number of samples observed at that particular location, and a mean bias is calculated. This process is repeated 5000 times, yielding an empirical probability distribution from which the probability of occurrence of the observed mean bias can be estimated. Initially, a sampling radius of 100 km is used, with results shown in
Figure 6.
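The bootstrap procedure described above can be sketched as follows. The inputs are assumed to be paired differences (J3 minus buoy), pooled over all buoys and for the single buoy under test. The two-sided definition of the probability, measured as distance from the centre of the bootstrap distribution, is an assumption of this sketch, since the exact tail definition is not specified in the text; the function name is hypothetical.

```python
import numpy as np


def pooled_bootstrap_pvalue(diff_all, diff_site, n_boot=5000, seed=0):
    """Probability of a site's mean bias under the pooled population.

    diff_all  : paired differences (altimeter - buoy) pooled over all buoys
    diff_site : paired differences at the buoy being tested (N values)
    Draws N pairs with replacement from the pool n_boot times and compares
    the resulting mean biases with the observed site mean.
    """
    rng = np.random.default_rng(seed)
    diff_all = np.asarray(diff_all, dtype=float)
    diff_site = np.asarray(diff_site, dtype=float)
    n = diff_site.size
    idx = rng.integers(0, diff_all.size, size=(n_boot, n))
    boot_means = diff_all[idx].mean(axis=1)
    observed = float(diff_site.mean())
    centre = boot_means.mean()
    # Two-sided: fraction of bootstrap means at least as far from the centre
    p = float(np.mean(np.abs(boot_means - centre) >= abs(observed - centre)))
    return observed, boot_means, p
```

Because each buoy is tested with its own N, buoys with few collocations yield a wider bootstrap distribution and therefore less power to detect a discrepancy.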
The observed mean bias for each buoy is denoted by a blue line. In addition, the number of altimeter tracks, their identifiers and their distances of closest approach to the buoy are included in each panel. Note that N for each location is approximately the number of collocated tracks multiplied by 180 (the number of passes in 5 years). The estimated probability of random occurrence of the observed mean bias is also shown.
Given the variety of factors that can contribute to mean bias, results must be carefully interpreted. While the mean bias for most buoys appears small, and consistent with the overall population estimate, the observed mean bias at 46246 is found to be highly improbable. Since, in this case, the minimum buoy–altimeter distances are not large (18 and 51 km), the outlying bias suggests some kind of platform-specific issue. Buoys 46066, 46002 and 46005 all exhibit probabilities around a few percent, also suggesting possible systematic conflict with the altimeter.
However, differences in Hs due to large separations between buoy and altimeter track (up to 89 km in these cases) can contribute to bias, so we expect the results to change with more spatially constrained sampling. The same analysis is therefore conducted for altimeter sampling at 50 km radius, the results of which are shown in
Figure 7.
With a smaller sampling radius, fewer observations are made. In fact, 46002 and 46006 are now excluded from the analysis since no J3 tracks lie within 50 km. Furthermore, having fewer samples per buoy generally increases the variance of the estimated sampling distribution. This reduces our ability to resolve discrepancies. However, with altimetry observations now taken closer to each buoy, we expect to see increased agreement across the collection of buoys, which is in fact the case. The overall mean bias for all buoys is now closer to zero. This can be seen from the sampling distribution shown in each panel, which is shifted by ≈+2 cm. As a result, low bias at some individual buoys, such as 46066 (top left panel), is now entirely consistent with the overall sampling distribution.
However, in spite of the increase in overall uncertainty, a number of buoys still appear problematic. Again, the mean bias at 46246 remains highly improbable. This is consistent with results presented in
Section 4.1 that show Hs mean bias is particularly low everywhere, and largely independent of sampling radius. Furthermore, 46001 still appears to be statistically anomalous; in fact, its mean bias increased from ≈0 to 2 cm. More dramatically, 46085 has swung from a negative mean bias of 4 cm to a positive bias of 3 cm. These changes are perhaps unsurprising because the number of sampled altimeter tracks has been reduced from four to one. Furthermore, a closer look at the site of 46085 reveals a gradient in mean bias, running approximately south to north, that is driven largely by the winter (ONDJFM) conditions. Local spatial gradients are examined in detail in
Section 4.