The Roles of the S3MPC: Monitoring, Validation and Evolution of Sentinel-3 Altimetry Observations

The Sentinel-3 Mission Performance Centre (S3MPC) is tasked by the European Space Agency (ESA) to monitor the health of the Copernicus Sentinel-3 satellites and ensure a high data quality to the users. This paper deals exclusively with the effort devoted to the altimeter and microwave radiometer, both components of the Surface Topography Mission (STM). The altimeters on Sentinel-3A and -3B are the first to operate in delay-Doppler or SAR mode over all Earth surfaces, which enables better spatial resolution of the signal in the along-track direction and improved noise reduction through multi-looking, whilst the radiometer is a two-channel nadir-viewing system. There are regular routine assessments of the instruments through investigation of telemetered housekeeping data, calibrations over selected sites and comparisons of geophysical retrievals with models, in situ data and other satellite systems. These are performed both to monitor the daily production, assessing the uncertainties and errors on the estimates, and also to characterize the long-term performance for climate science applications. This is critical because an undetected drift in performance could be misconstrued as a climate variation. As the data are used by the Copernicus Services (e.g., CMEMS, Global Land Monitoring Services) and by the research community over open ocean, coastal waters, sea ice, land ice, rivers and lakes, the validation activities encompass all these domains, with regular reports openly available. The S3MPC is also in charge of preparing improvements to the processing, and of the development and tuning of algorithms to improve their accuracy. This paper is thus the first refereed publication to bring together the analysis of SAR altimetry across all these different domains to highlight the benefits and existing challenges.


The Sentinel-3 Satellites
The Sentinel satellites are an integral part of the Copernicus programme of the European Space Agency (ESA) to provide long-term climate-quality datasets that can enable the monitoring and investigation of the Earth. The Sentinel-3 satellites in particular provide multi-sensor observations of the Earth's surface, with sensors using the visible, infra-red and microwave bands [1] (see Figure 1a). In terms of their sensor payload, the Sentinel-3 satellites are successors to ERS-1, ERS-2 and Envisat, but with advances in the capabilities of all the instruments. Sentinel-3A (S3A) was launched on 16th Feb. 2016 and Sentinel-3B (S3B) on 25th Apr. 2018, each with an anticipated lifetime in excess of seven years, with the objective that a further two units (S3C and S3D) be launched in time for overlap with S3A and S3B so as to generate a coherent self-consistent climate-quality dataset spanning more than 15 years. Both S3A and S3B were placed in sun-synchronous orbits with a mean altitude above the Earth of 815 km, giving an orbital period of 101 min, such that after 385 complete revolutions (27 days), the ground-track of the satellite repeated to within less than a kilometre. The inclination of the orbits is 98.65°, permitting coverage between 81.3° S and 81.3° N, with data being collected over all surfaces (marine, cryosphere and land). S3A has always been in the same 27-day repeat cycle, with a pattern of ascending and descending tracks 0.94° apart in longitude (104 km at the Equator). The pattern of tracks is such that after 4 days there is quasi-global coverage with a wide (∼7°) spacing of tracks, with successive 4-day periods occupying the set of tracks one step to the right (see Figure 1b). S3B was
Another key development in SRAL, inherited from recent altimeter missions, is the modification of the on-board surface tracking necessary to set the temporal location of the window for recording the radar echo.
In early topography missions, the simple on-board processing had to predict the probable range to the next point on the Earth based on the history of the last few seconds of data [7]. This procedure, called "closed loop" (CL), works well when the range is only varying slowly, e.g., in the open ocean or over central portions of the Antarctic Plateau, but could not adapt to sharply varying topography or the abrupt changes at the coast when encountering highly elevated land surfaces. Thus an alternative prediction scheme was designed that feeds information from a digital elevation model (DEM) into the predictive loop. This is termed "open loop" (OL) operation and was first used with the Jason-2 altimeter [8]. It is applied not only near coasts to guide the instrument to track the sea surface, but can also be used in areas of complex topography to position the window to detect the waveforms from chosen rivers and lakes. The table of locations for open loop operation can be modified in flight to encompass new water bodies of interest.
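As a quick cross-check of the orbit figures quoted in this section, the repeat-cycle arithmetic can be reproduced in a few lines. This is only an illustrative sketch: the function name and the equatorial circumference of 40,075 km are assumptions introduced here, not values taken from the mission documentation.

```python
# Illustrative cross-check of the repeat-orbit figures quoted in the text.
EQUATOR_KM = 40075.0  # assumed equatorial circumference of the Earth


def repeat_orbit_summary(revolutions=385, cycle_days=27):
    """Return (orbital period in min, track spacing in deg, track spacing in km)."""
    period_min = cycle_days * 24 * 60 / revolutions
    spacing_deg = 360.0 / revolutions
    spacing_km = EQUATOR_KM / revolutions
    return period_min, spacing_deg, spacing_km


period_min, spacing_deg, spacing_km = repeat_orbit_summary()
print(f"{period_min:.1f} min, {spacing_deg:.2f} deg, {spacing_km:.0f} km")
```

With 385 revolutions in 27 days, this gives a period of about 101 min and a track spacing of about 0.94° (104 km at the Equator), in agreement with the values stated above.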

Data Availability and Access
Once downlinked from Sentinel-3, SRAL observations are processed to successive levels of complexity and distributed with three different latencies (see Table 1). The lowest level of data (Level 0) is that telemetered from the satellite. These are in turn processed by the "ground segment", which first applies various calibrations to generate the Level 1 products. It then combines these with complementary data from other sources that provide relevant corrections to produce an along-track Level 2 product consisting of estimates of various geophysical parameters. Near Real-Time (NRT) data are provided to operational users within 3 h of satellite overpass, to enable them to run forecast models and data assimilation. Their requirements are for a quick dissemination of data with clear information on data quality, but accepting that not all corrections will be optimised. On the other hand, users interested in climate applications need the most accurate and consistent version of the data. These data are referred to as "Non-Time Critical" (NTC) and are generally available one month later to allow for the retrieval of the best orbital information and atmospheric and tidal corrections. For such climate studies it is important to understand any long-term changes in the instrument as components age with exposure to space, because an uncorrected instrument drift could be misconstrued as some global aspect of climate change. In between these two types of delivery, there are also the "Short Time Critical" (STC) data available within 2 days, which are also used in operational models as they contain preliminary values for all the corrections. During the course of the mission there have been many changes in the Processing Baseline (PB), due to tuning of parameters or corrections for observed artefacts within the ground processing. Consequently, data collected via the NRT, STC or NTC route may have small changes with time.
Thus, to enable long-term climate studies, there is periodically a Full Mission Reprocessing (FMR) to produce a more fully self-consistent dataset of the highest standard possible. The most recent of these was in Jan 2020.
The dissemination of Level 1 and Level 2 data has been provided by ESA for LAND services through the ESA Open Access Hub [9], and by EUMETSAT for MARINE areas through the Copernicus Online Data Access (CODA), EUMETCast and EUM Data Centre [10], with some areas of overlap. From mid 2020 onwards their respective realms of coverage are expected to be those shown in Figure 2. To date the processing has been identical in both agencies; when there is an agreed evolution of the processing code to incorporate an improved correction or retuning of a model, this is indicated by an increment in the associated PB, with the changes detailed in Product Notices [11]. Some of the analyses shown in this paper illustrate the step changes in performance that can occur when the PB changes. This is particularly relevant for operational users who require NRT data for weather forecasting or planning vessel movement. However, the majority of the results in this paper are for data from the most recent FMR, corresponding to PB 2.61 for S3A and PB 1.33 for S3B.

Table 1. Levels and latencies of Sentinel-3 STM processing. The NRT data are provided in segments as downloaded from the satellite; data for the other latencies are collated in pole-to-pole passes (half orbits).

Level 0: Raw telemetered data
Level 1A: Geolocated and fully calibrated
Level 1BS: L1A + fully beamformed → "stack"
Level 1B: Multi-look processed → SAR waveforms
Level 2: Retracked → Geophysical estimates + corrections

Figure 2. Land/Sea mask showing the regions of responsibility for data production for mid 2020 onwards (expected). Data over land (light blue) are produced and disseminated by ESA; those over the ocean (dark blue) by EUMETSAT; some regions (white) are common to both, including the polar regions of maximal sea-ice extent and major inland water bodies (e.g., Caspian Sea, Great Lakes, Lake Victoria). As users for terrestrial data may benefit from the records just offshore, and those working in the coastal zone benefit from having the data just inland, these regions extend 25 km outward/inward from the coast to avoid researchers needing to gather data from two sources.
It has been recognised that the current processing set-up may not be optimal for altimetry applications across all surfaces. Thus, plans for "branches" of the processing tailored specifically for inland waters or land ice have been designed. These are not implemented yet, so all results shown here are from identical processing chains implemented by ESA and EUMETSAT.

Sentinel-3 Mission Performance Centre
There are many complicated challenges in assessing the quality of Sentinel-3 STM data over a variety of surfaces, operating at two frequencies (Ku- and C-band) and using two different tracking modes (OL and CL), and in assessing the different latencies. The European Community, through its Copernicus programme, funded the development of the Sentinel-3 Mission Performance Centre (S3MPC). The main goal of the S3MPC is to provide focussed effort on the assessment of the whole mission performance, thus including all measurements recorded by the satellites as well as the derived geophysical values emanating from the processing chains on the ground. The S3MPC has different components covering the various sensors. Each component has many Expert Support Laboratories (ESLs) that provide dedicated expertise in specific areas. The groups involved in the STM component are shown in Figure 3, grouped according to their areas of work. The S3MPC activities, though an ESA contract, are actually jointly managed by ESA and EUMETSAT for all Level 1 processing. The S3MPC has significant interactions with two other groups. Firstly, there is the Sentinel-3 Validation Team (S3VT, [12]), which is a broad and open forum of those interested in S3 validation work, with several members of S3MPC participating in its yearly workshops. Its meetings help bring to light any problems or new potential applications of S3 data. Secondly, there is the S3 Quality Working Group (QWG), which includes selected scientific experts from both within and outside S3MPC. This body considers options and makes recommendations to ESA and EUMETSAT on potential evolutions of the PB.
The S3MPC monitors and explores the calibration of all S3A and S3B instrumentation, and of how they compare with other satellites, with models and with in situ instrumentation. The S3MPC is responsible for the quality of all data disseminated to users, for all levels and latencies (see Table 1) and through all evolutions of the PB. Data quality is regularly assessed and summarised in Cyclic Reports provided to ESA and EUMETSAT [13], with a longer term perspective provided by Annual Reports.
This paper provides an overview of the activities of the S3MPC in relation to the Surface Topography Mission, which focus on the operation of the SRAL and MWR and on the products derived from their data. Later papers will focus on some of those areas in greater detail. (Note, the precise orbits are a common challenge for all missions, and are thus the responsibility of another entity, the Copernicus POD Service.) Section 2 provides details on the health and performance of the two main instruments (SRAL and MWR), using data from their various internal calibration modes. Section 3 then assesses the wave height and wind speed data derived over the ocean, via comparisons with buoys, models and other satellites. Section 4 highlights three different approaches to determining the range bias, which is essential for recovery of absolute sea surface height, with Section 5 comparing Sentinel-3 SSH data with those from Jason-3. Section 6 details the work on assessing Sentinel-3 performance in the cryosphere, with Section 7 showing the expanding work over inland waters. The paper then finishes with a summary and a list of the main abbreviations used.

Internal Monitoring of Instrument Performance
The Sentinel-3 spacecraft has a number of internal systems to monitor performance of both the altimeter (SRAL) and the microwave radiometer (MWR). This is to enable the S3MPC to assess to what extent calibrations have changed since initial pre-launch measurements. Such calibrations are necessary to make adjustments to the processing to correct for how the instrument response differs from the ideal assumed by models. Regular monitoring of these many parameters also helps us understand the health of the instruments.

SRAL
The ground processing starts from the telemetered data, and some of the first processing steps are based on the instrumental calibration data collected on-board. For instance, at the Level-1B processing stage, radar echoes are corrected by the instrument transfer function (to compensate for distortions of the science signal spectra due to altimeter system characteristics). The radar signal delay within the instrument calibration path is also measured, so that it can be removed from the final range, which would otherwise be too long (giving a lower surface elevation). Thus, an increasing calibration delay during the mission could be interpreted as decreasing sea surface height, if there was no compensation for it. The range is also impacted by the on-board clock, an Ultra-Stable Oscillator (USO) responsible for measuring the round trip time of the echo. Similarly, the recorded power of the instrument's transmitted signal is crucial for correcting the σ⁰ computation.
Not only are the absolute values important for correcting the different biases caused by the instrument behaviour, but assessment of their drift is essential for avoiding misinterpretations of the geophysical variables retrieved at Level-2. For instance, some features observed in the ocean surface height or in the winds over the sea can be caused by orbital oscillations of instrumental parameters due to thermal conditions on-board. The algorithms of the ground Level-1B processing are designed to compensate for such effects, and the Expert Support Laboratories are required to assess any relation between suspicious behaviour of the final geophysical retrievals and the performance of the calibration variables.
To achieve all this, the instrument routinely runs a number of calibration modes, but to avoid disruption to the key Earth observation goals of the Sentinel-3 mission, many of these calibration modes are implemented over deserts. CAL1 sends the emitted chirp signal directly into the reception chain, bypassing the duplexer-to-antenna path, in order to ascertain the Point Target Response (PTR) and parameters derived from it. CAL2 records the instrument transfer function by listening to the thermal noise to note the variation of hardware response with frequency. AutoCal tests the performance of the two on-board attenuators, which are used together to optimize the power dynamic range of the waveform. Below we list the most important variables considered in the Level-1B processing, together with their specific impact on the final geophysical variables.
• CAL1 Delay: Covering the signal path delay within the transmission and reception chains. It impacts the final range measurement in an additive way.
• CAL1 Power: Measuring the changes in the on-board chirp power by bridging the duplexer to antenna path. It impacts the ocean wind retrieval.
• CAL1 PTR width: The PTR is sinc-shaped. The PTR main lobe width at half power is calculated and used when retrieving the Significant Wave Height.
• CAL1 Burst Power and Phase: In SAR operational mode, the altimeter emits pulses in bursts of 64, but the power and phase will change during a burst. Failure to correct for these intra-burst differences could result in noisier geophysical results and a bias in σ⁰.
• CAL2: The CAL2 measurement is computed by accumulating a large number of calibration waveforms while the receiving window is located at an altitude where no surface return is expected, forcing the instrument to receive only thermal noise.
• USO frequency: For measuring the two-way travel time of the chirp, a clock is needed. This clock is built with an exceptional precision, and is therefore called the Ultra-Stable Oscillator (USO). However, anomalies in its performance can occur: the Envisat RA2 USO clock period unexpectedly generated a jump equivalent to 5.6 m and oscillations of 30 cm around the orbit. A specific solution [14] was designed to cope with this anomaly and correct the range measurements. A drift in the clock frequency corresponding to several millimetres per year is expected, with its impact on range being multiplicative.
• AutoCal: The difference between the real attenuation and that expected from the attenuators is needed for the σ⁰ computation. Hence, potential errors could impact the estimation of wind speed.
• Thermistors: As mentioned earlier, the thermal conditions on-board are to be monitored. The correlation between these data and the calibration parameters is important. The potential thermal sensitivity of a particular parameter is to be followed and studied to discard undesired oscillations or excursions.
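The multiplicative effect of a USO frequency offset on range, noted in the list above, can be illustrated with a minimal sketch. It assumes that the round-trip time is obtained by counting oscillator cycles at the nominal frequency, so that a fractional frequency offset ε scales the true range by 1/(1 + ε). The function name, the sign convention and the example numbers are hypothetical and are not taken from the Level-1B processor.

```python
def uso_range_correction(measured_range_m, frac_freq_offset):
    """Correction (m) to add to a range computed with the nominal clock frequency.

    Assumption: true_range = measured_range / (1 + frac_freq_offset), i.e. a
    clock running fast over-counts cycles and makes the measured range too long.
    """
    return -measured_range_m * frac_freq_offset / (1.0 + frac_freq_offset)


# Example: a range of about 815 km (the mean altitude quoted earlier) and a
# hypothetical fractional frequency offset of 5 parts in 10^9.
correction_m = uso_range_correction(815e3, 5e-9)
print(f"{correction_m * 1e3:.2f} mm")
```

Because the error scales with the range itself, an offset of only a few parts in 10⁹ is enough to produce a range error of several millimetres at this altitude, which is why the oscillator frequency has to be tracked throughout the mission.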
Apart from the above parameters, there are a number of secondary calibration variables that are also monitored and regularly assessed. These variables are not compensated for in the on-ground processing of the science data, and therefore any non-ideal performance needs to be carefully analysed, such as the power distribution within the PTR secondary lobes.
Five different areas are devoted to the calibration measurements, where the CAL1, CAL2 and AutoCal modes are run for the Ku and C bands, in LRM and SAR operational modes. The figures below show calibration parameters of the SRAL instruments on both S3A and S3B. The figures exclude the records from the very beginning of the mission (BOM), when the parameters were impacted by several instrumental mode changes; this allows us to better extrapolate and predict the instrument behaviour. S3A data were collected from May 2016 to March 2020; S3B data cover July 2018 to February 2020. The figures show the Ku-band, which is the main science band, in SAR mode, which is the operational mode currently used globally over all surfaces.
Figure 4 shows the calibration parameters that are accounted for in the science range measurements: CAL1 Delay (a) and USO frequency (b). For the CAL1 time delay, we observe similar absolute values, close to 1 m of internal delay, and opposite drifts: −0.3 mm/year for S3A and 0.7 mm/year for S3B. For comparison, the delay drifts for the CryoSat-2 and Envisat missions are about 0.3 and 1 mm/year respectively. The USO frequency correction in terms of range is of the same order of magnitude for both missions: 4.4 mm/year for S3A and 1.8 mm/year for S3B.
Figure 5a depicts the CAL1 power drift of both missions. At the beginning of the S3A mission, the power was dropping at a high rate of −1 dB/year. This caused concern, because the mission requirements permit a maximum drop of −3.5 dB for ice applications. Nevertheless, the drop is becoming less steep, with an overall mission slope of −0.34 dB/year. Fitting a power model to the current S3A power series indicates that the −3.5 dB limit should not be reached until 2034. For S3B the whole-mission power trend is about −0.36 dB/year. The power drops for the CryoSat-2 and Envisat missions are around 0.2 dB/year. Figure 5b shows the CAL1 PTR width behaviour. For S3A, a clear negative slope (−0.36 mm/year) is noted, while the S3B behaviour is more erratic (the absolute values of both the annual slope and the standard deviation are about 0.16 mm). For the CryoSat-2 and Envisat altimeters, we see long-term PTR width drifts of 0.01 mm/year and 0.1 mm/year respectively.
In SAR operational mode, within each burst, the 64 pulses are emitted at a very high pulse repetition frequency (PRF). Such a high PRF stresses the instrument, and causes the pulse power to decrease and the phase to vary across the 64 transmissions of the burst. Burst corrections are arrays of 64 records for the power and phase compensation, as shown in Figure 6a, where similarities between the two missions can be seen. These corrections evolve throughout the mission, as illustrated in Figure 6b,c for S3A and S3B, showing a stable behaviour in both missions: excursions below 0.02 radians and 0.06 dB. Annual oscillations are observed in the burst power for S3A, whilst S3B displays long-term variations that are higher than those for S3A, although within the same order of magnitude.
The transfer function depicts the altimeter signal distortions within the observation window, caused by the imperfect behaviour of the altimeter electronics and on-board processing. The ideal CAL2 shape is a rectangle, with unit amplitude in the band of frequencies of interest, but it actually shows ripples, caused by the cut-off anti-aliasing filtering stage [7,15]. The CAL2 ripples, which are commonly observed in other altimeters, are shown in Figure 7a. The stability of this shape during the mission is shown in Figure 7b,c for S3A and S3B. The CAL2 waveform features, such as side slopes and standard deviations, are maintained in both missions. The correction is continuously updated, and the science measurements are compensated by a CAL2 averaged over a number of previous days.
The AutoCal calibration mode produces an array that is used for correcting σ⁰ for the difference between the commanded attenuation and the actual attenuation applied to the signal on-board. Moreover, this correction can drift in the long term, and an updated in-flight mission-averaged AutoCal table will be used from 2020. Figure 8a depicts the averaged attenuation correction tables for the two bands and missions, where the curves show the delta between the corrected and reference attenuations. Different schemes can be developed for achieving the different desired attenuations, which is why the two missions have completely different shapes, although the shapes for the two bands of a given mission are more similar. Figure 8b,c illustrate the progression of the correction for S3A and S3B respectively: S3B shows larger differences during the mission.
The SRAL thermal environment is constantly monitored. As expected, there are variations due to annual oscillations of the on-board temperatures, and also spikes caused by instrument restarts or occasional events caused by other instruments on the spacecraft. There is little impact on the calibration parameters. For instance, the burst power correction is sensitive to the on-board thermal conditions, but burst absolute values are not affected by more than 0.02 dB.
All the above information is included in the Cyclic Reports [13], with summary tables and analysis of the current cycle and whole-mission periods. However, it is important to note that the behaviours described here do not imply erroneous geophysical retrievals. Rather, they show the corrections that are applied during Level-1B processing, so that the geophysical result at Level-2 is not impacted by the instrument behaviour.
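The importance of the slowing CAL1 power decay for S3A can be seen from a simple constant-rate extrapolation. The sketch below is not the power model fitted by the S3MPC; it assumes a fixed slope and that the −3.5 dB limit is counted from the start of the series, and the function name is hypothetical.

```python
def years_to_power_limit(slope_db_per_year, limit_db=-3.5):
    """Years for a linearly decaying power to reach limit_db (both negative)."""
    if slope_db_per_year >= 0:
        raise ValueError("slope must be negative for a decaying power")
    return limit_db / slope_db_per_year


# Early S3A rate and overall S3A mission slope, as quoted in the text.
for slope in (-1.0, -0.34):
    print(f"{slope:+.2f} dB/year -> {years_to_power_limit(slope):.1f} years")
```

At the initial rate the limit would be reached after 3.5 years, and at the overall mission slope after about 10 years. A model that follows the continuing flattening of the curve pushes the date later still, which is consistent with the 2034 estimate quoted above.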

S3 Microwave Radiometer
Coupled with the altimeter, the Microwave Radiometer (MWR) is an essential part of the payload for surface topography, as it is used to estimate the excess path delay (PD) on the altimeter range due to humidity in the troposphere. The S3 MWR measures at two frequencies: around 23.8 GHz, a frequency sensitive to water vapour burden, and around 36.5 GHz, which responds to the liquid water in clouds. The wet troposphere correction (WTC) is the negative of the path delay. Today WTC is still considered one of the main contributors to the error budget of the Mean Sea Level (MSL) [16]. Performance of the retrieval algorithm is of course important as it drives the performance of the range estimation, and thus of the mission itself. But the stability of the MWR is also a key point in ensuring the stability of the WTC, and thus in correcting the MSL without introducing an instrumental drift. The system requirement document provides a target for the radiometric stability, which is that it shall be smaller than 0.6 K over the lifetime for all channels. The S3MPC is tasked with monitoring the MWR instrument continuously since its switch-on, and throughout its entire lifetime.

Monitoring of MWR Internal Parameters and Brightness Temperatures
The MWRs on-board the S3A and S3B missions are identical in design. They are dual-frequency instruments measuring in K-band (23.8 GHz) and Ka-band (36.5 GHz), bringing information about water vapour content and cloud liquid water content respectively. The Sentinel-3 MWR operates as a Noise Injection Radiometer (i.e., a Dicke balanced radiometer) for observing temperatures below the reference temperature (the so-called NIR mode), with a noise diode adding power to the observation to achieve the balance with respect to the internal reference hot load. This mode is the main observation mode for the Sentinel-3 MWRs. For temperatures higher than the reference temperature, the MWR operates as a traditional non-balanced Dicke radiometer (the so-called DNB mode). Two different kinds of raw count are therefore provided by the instrument: the noise injection pulse length for NIR and the error voltage for DNB. This means two kinds of calibration parameters are required by the ground processing to convert raw counts to brightness temperatures. Brightness temperatures at the MWR sampling rate (∼7 Hz) are the output products of MWR L1B processing. In the Level 2 processing, the brightness temperatures are averaged to the altimeter time tag at 1 Hz. The altimeter backscatter coefficient (σ⁰) is used in addition to the two brightness temperatures to retrieve geophysical parameters. The two most important are the wet troposphere correction (WTC) and the Ku-band atmospheric attenuation, which are used as corrections to the altimeter range and σ⁰ respectively. Water vapour content and cloud liquid water content are also derived from the MWR data and assessed by the S3MPC through comparison with models.
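The averaging of the ∼7 Hz brightness temperatures to the 1 Hz altimeter time tags can be sketched as follows. This is a minimal illustration, assuming a simple unweighted mean of all samples within ±0.5 s of each time tag; the window actually used in the Level 2 processing is not stated here, and the function and variable names are hypothetical.

```python
import numpy as np


def average_to_time_tags(t_mwr, tb, t_alt, half_window=0.5):
    """Mean brightness temperature within +/- half_window (s) of each time tag.

    Returns NaN for time tags that have no MWR sample in their window.
    """
    t_mwr = np.asarray(t_mwr, dtype=float)
    tb = np.asarray(tb, dtype=float)
    t_alt = np.asarray(t_alt, dtype=float)
    out = np.full(t_alt.shape, np.nan)
    for i, t0 in enumerate(t_alt):
        sel = np.abs(t_mwr - t0) <= half_window
        if sel.any():
            out[i] = tb[sel].mean()
    return out


# Synthetic example: 3 s of 7 Hz samples with a slow linear trend in TB (K).
t_mwr = np.arange(0.0, 3.0, 1.0 / 7.0)
tb = 150.0 + 0.2 * t_mwr
print(average_to_time_tags(t_mwr, tb, [0.5, 1.5, 2.5]))
```

Returning NaN for empty windows keeps gaps in the radiometer record visible at Level 2 instead of silently filling them.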
Monitoring of the MWR starts with the monitoring of the internal thermal environment, which is very important for a radiometer. Orbital and annual oscillations exist, but there are also spikes during events associated with other on-board instruments. However, there is no impact on MWR data quality. The second step concerns the raw counts and the internal calibration parameters. Internal calibrations are performed every 30 s and are averaged within a window of 10-15 min, allowing us to account for the small evolution of these parameters along the orbit. But monitoring of the calibration parameters is not sufficient to ensure the stability of the MWR products. This is because the antenna path is not exactly the same as the calibration path within the instrument, so a change in the antenna path would not be seen by the calibration. Conversely, the calibration parameters may evolve during the instrument lifetime (due to aging, for instance) without any impact on the brightness temperatures, since the role of the internal calibration is precisely to follow the evolution of the instrument and account for it. Thus brightness temperatures also need to be monitored.
The plots of the calibration parameters in Figures 9 and 10 show a marked change for S3A in March 2018. This is due to the update of the calibration timeline for S3A. During the first two years in flight, S3A internal calibrations were measured only three times per orbit with long calibration sequences. This methodology caused data gaps in the Level 2 altimeter dataset. A new setting was thus defined to avoid these gaps by using shorter sequences with a faster repetition (every 30 s). For the first setting, averaging was performed for each calibration sequence, during the calibration processing. For the second method, the averaging is performed at a later stage of the processing, when processing antenna counts. This explains why the second procedure seems to show noisier calibrations. Figure 9 shows daily statistics of the noise injection temperature, one of the calibration parameters for the main observation mode. The S3A dataset analysed here is composed of reprocessed data for the beginning of the time series and operational data for the end, while the S3B dataset is composed only of operational data. This explains the relative smoothness of the S3A dataset with respect to S3B, for which a significant jump can be observed at the end of 2018 due to the update of the MWR characterisation file. While the S3A parameter is rather constant and smoothly evolving, the S3B 23.8 GHz injection temperature increased during one day in April 2019 without any related incident. After a period of very gradual increase, it has stabilized. For the 36.5 GHz channel, the two instruments show different behaviour: S3A follows an annual cycle with a slow increase while S3B has a faster increase.
In Figure 10, the receiver gain is monitored through daily statistics, again for both instruments. For the 23.8 GHz channel, both instruments have very similar behaviour during the common period. Two steps in the S3B dataset are observed: the first one due to the update of the MWR characterisation file for that instrument, the second one (in common with S3A) due to a particular update of the ground processing. For the 36.5 GHz channel, things are quite different. The S3B receiver gain estimation looks noisier than for S3A, but has been stable since the beginning of 2019. For S3A, radio frequency interference on 24th November 2018 by the American military radar facility on Kwajalein atoll (9°23'47" N, 167°28'50" E) caused stress to the instrument and changed its operating characteristics. The impact on the receiver gain is large, but investigations have shown that there is negligible impact on MWR data quality.
Apart from calibration, brightness temperatures are also monitored using statistical selections. Statistics over the ocean are used to check the consistency of the brightness temperatures with other missions operating in the same band of frequencies. For calibration purposes and stability assessment, statistics over specific targets are used. Statistical selection of the coldest temperatures over the ocean is commonly used for long-term monitoring [17][18][19]. Multi-mission consistency allows us to discriminate instrument stability from reference stability. If the reference is drifting or changing, several instruments will be impacted; while if one instrument is drifting it is unlikely that another one will drift in the same way.
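One simple way to build a cold-ocean statistic of the kind described above is to average the coldest fraction of the ocean brightness temperatures in each time bin. The sketch below illustrates the idea only; it is not necessarily the selection used in [17][18][19], and the function name and the default fraction are assumptions.

```python
import numpy as np


def coldest_mean(tb_ocean, fraction=0.001):
    """Mean of the coldest `fraction` of the valid brightness temperatures (K)."""
    tb = np.asarray(tb_ocean, dtype=float)
    tb = tb[np.isfinite(tb)]  # drop NaN / infinite samples
    if tb.size == 0:
        return np.nan
    n = max(1, int(fraction * tb.size))  # always keep at least one sample
    return np.sort(tb)[:n].mean()


# Synthetic example: 100 samples from 100 K to 199 K; coldest 10% are 100..109 K.
sample = np.arange(100.0, 200.0)
print(coldest_mean(sample, fraction=0.1))
```

Tracking this statistic for several missions side by side is what allows a drift common to all instruments (a change in the reference) to be separated from a drift seen in only one of them.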
The Figure 11 presents the vicarious calibration monitoring for the MWRs on S3A and S3B and those on three other satellites: AltiKa, Jason-3 and Metop-A. For Metop-A, the two pixels of smallest incidence (closest to nadir) are averaged. Four of these missions are sun-synchronous, with the local time of ascending node (LTAN) being 22:00 for S3A/B, 21:30 for Metop-A and 06:00 for AltiKa. As Metop-A and S3A/B are close in local time, they will see similar situations of surface temperature, which is a parameter impacting the coldest temperature that can be observed. Figure 11a shows the monitoring of coldest ocean points for these four missions. We can see how close are S3A, Metop-A and S3B (after the update of the MWR characterisation file). A small bias is observed with AltiKa and a larger one with Jason-3. It shows the stability of S3A MWR so far.
At the other end of the range of observable brightness temperatures, the hottest temperatures are selected over the Amazon forest and monitored as shown in Figure 11b. The Amazon forest is the natural body closest to a black body for microwave radiometry. The annual signal is stronger for the hottest temperatures over the Amazon due to the annual cycle of water vapour. Events such as El Niño will also have a signature on the hottest brightness temperatures due to the excess of water vapour over South America. S3A and S3B evolve very similarly to the other missions, which demonstrates their stability.
The same monitoring is performed for the second channel of S3A/B (see Figure 12) with the difference that the liquid water channel frequency is not the same for the four missions: 31.4 GHz for Metop-A, 34 GHz for Jason-3, 36.5 GHz for S3A/B, and 37 GHz for AltiKa. The lower the frequency, the lower the coldest temperature. These differences in frequencies have an impact mainly on the coldest ocean points as seen in Figure 12a, the Amazon being close to a black body at microwave frequencies. These diagnostics show the very good consistency of the liquid water channel of the S3A and S3B MWRs with other missions, and their stability.
The results of this monitoring for both channels are shown and discussed in the relevant cyclic report [13].
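The coldest-point selection used for this monitoring can be illustrated with a minimal sketch. It assumes a simple low-percentile average of the ocean brightness temperatures; the methods actually used are those described in [17][18][19], and the function name and the 1 % threshold below are hypothetical choices made only for illustration.

```python
import numpy as np

def coldest_tb_statistic(tb_kelvin, fraction=0.01):
    """Mean of the coldest `fraction` of ocean brightness temperatures (K).

    Hypothetical helper: the default 1 % threshold is an illustrative
    assumption, not the value used operationally.
    """
    tb = np.asarray(tb_kelvin, dtype=float)
    tb = tb[np.isfinite(tb)]                 # discard missing values
    if tb.size == 0:
        return float("nan")
    n_cold = max(1, int(np.ceil(fraction * tb.size)))
    # np.partition places the n_cold smallest values in the first n_cold slots
    coldest = np.partition(tb, n_cold - 1)[:n_cold]
    return float(coldest.mean())
```

Applying such a statistic to each day or cycle of data for every mission gives time series that can be compared in the manner of Figures 11a and 12a.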

Monitoring of MWR-Derived Level 2 Products
For performance assessment at Level-2, we consider the four MWR-derived geophysical parameters: the wet troposphere correction (WTC), the atmospheric attenuation, the water vapour content, and the cloud liquid water content. These are usually computed from three measurements (two MWR brightness temperatures and the altimeter backscatter coefficient, σ0). There is also a 5-parameter algorithm that incorporates two extra fields (sea surface temperature and the atmospheric temperature lapse rate).
The consistency of the atmospheric corrections with other missions cannot be achieved only through crossover points. Indeed, the atmosphere evolves much faster than the ocean and so a very short interval would be required for the selection of the crossover between two different missions. This selection reduces drastically the number of crossover points, and, moreover, when computing crossovers between two sun-synchronous missions, the points will mainly be at the highest latitudes. Thus a reference non-sun-synchronous mission, such as Jason-3, shall be used. Another dataset can be used for consistency analysis of different missions. For this, we use output from the ECMWF operational model as the data are available at short latency; however, there are roughly annual updates to the model, which complicate the interpretation, so long-period analyses are also performed using a stable model (not shown here). ECMWF provides, every 6 h, analyzed profile and surface variables that allow the evaluation of model parameters colocated with the measurement and the computation of the difference between the model and the MWR parameter. Since the model is the same for all missions, only the geographical sampling can bring some differences, as well as the calibration of the instruments and residual biases in the retrieval algorithm. In the absence of any drift, it is expected that missions will show the same evolution with respect to the model. S3MPC is routinely performing the monitoring of the geophysical parameters for several missions: Jason-3, AltiKa, S3A and S3B. Jason-3 has a three-channel radiometer (18.7 GHz, 23.8 GHz, 34 GHz) while the three others only have two frequencies (23.8 GHz; 36.5 GHz for S3; 37 GHz for AltiKa). Figure 13a shows the mean difference between MWR-derived corrections and ECMWF for the wet troposphere correction. For the monitoring presented here, S3A/B corrections are derived from both the 3-parameter and the 5-parameter algorithms. 
There is a bias between the model and the MWR-derived correction for all these missions, but this bias is rather small: less than 1 cm for all missions and less than 0.5 cm for S3A and S3B. AltiKa and S3 show the same evolution of the difference, including the same oscillations and slopes over various periods. This may indicate geophysical events not correctly accounted for by the model. Jason-3 does not show the same evolution, possibly due to its non-sun-synchronous orbit. Figure 13b shows the standard deviation of the difference, which is an indicator of the retrieval performance. Jason-3 has the smallest S.D. of ∼1.2 cm, because it has the extra radiometer frequency. The 5-parameter (5P) retrievals for S3A and S3B have a small bias with respect to the 3-parameter (3P) ones. For S3A, the mean bias is 0.14 cm for the 3P retrieval and 0.04 cm for the 5P retrieval. The S.D. is slightly smaller for 5P (1.36 cm) than for 3P (1.42 cm), indicating slightly better performance for this diagnostic. However, the geophysical assessment of the difference of variance of SLA at crossover points has shown similar performance for 3P and 5P, although the 5P retrieval is expected to bring an improvement close to 0.5 cm² [20] with respect to the 3P. The S3MPC is working on a new algorithm to resolve this issue.
Figure 14 presents maps of the WTC difference between MWR and model for S3A (3-parameter retrieval) and Jason-3 over the same 27-day period. We can see that the biases relative to the model are not homogeneous over the ocean. Jason-3 and S3A show the same patterns, indicating areas where the model is not accurate, but the patterns for S3A have a higher amplitude because its radiometer has only two channels, whereas that of Jason-3 has three.
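The crossover diagnostic mentioned for the 3P and 5P retrievals compares the variance of SLA differences at crossover points when one correction is substituted for another. A minimal sketch follows, assuming that the crossover SLA differences have already been computed with each correction; the function and argument names are hypothetical.

```python
import numpy as np

def crossover_variance_gain(dsla_ref_cm, dsla_new_cm):
    """Variance of crossover SLA differences using the new correction
    minus the variance using the reference correction, in cm^2.

    A negative value means the new correction reduces the crossover variance.
    Inputs are SLA differences (ascending minus descending pass) in cm.
    """
    ref = np.asarray(dsla_ref_cm, dtype=float)
    new = np.asarray(dsla_new_cm, dtype=float)
    ok = np.isfinite(ref) & np.isfinite(new)   # keep crossovers valid in both sets
    return float(np.var(new[ok], ddof=1) - np.var(ref[ok], ddof=1))
```

With the 3P retrieval as the reference and the 5P retrieval as the new correction, an expected improvement close to 0.5 cm² would appear as a result near −0.5.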

Metocean Observations from S3A and S3B
Both wind speed and wave data are routinely observed by Sentinel-3A and -3B, with estimates being provided from both the SAR and PLRM processing. Here, we concentrate on the findings concerning the S3A SAR-derived estimates.

Procedure
Radar backscatter (σ0), surface wind speed (WS) and significant wave height (SWH) from the marine Level-2 product S3A_SR_2_WAT are monitored and validated using the procedure successfully applied to equivalent products from earlier altimeters (Jason-2 [21], SARAL-AltiKa [22] and CryoSat-2 [23]). The procedure, which is based on Abdalla and Hersbach [24], is described in Appendix A of any of the "Wind and Wave" issues of Sentinel-3 Cyclic Reports [13]. It consists of a set of self-consistency checks and comparisons against other sources of data. Model equivalent products from the ECMWF Integrated Forecasting System (IFS) and in situ measurements available in Near Real-Time (NRT) through the Global Telecommunication System (GTS) are used for the validation.
The validation is based on the NRT operational data from both Sentinel-3A and -3B, using the STM product distributed by EUMETSAT in netCDF through their Online Data Access (ODA) system. [For consistency with other meteorological data, validation analysis will be converted to work with the formal BUFR (Binary Universal Form for the Representation of meteorological data) format whenever that becomes available.] The raw data product is collected for 6-hourly time windows centred at synoptic times (00, 06, 12 and 18 UTC).
The data are then averaged along the track to form super-observations with scales compatible with the model scales of around 75 km (being typically 4 to 8 times the model grid spacing). This corresponds to 11 individual (1 Hz) Sentinel-3 observations at ∼7 km spacing.
To achieve this, the stream of altimeter data is split into short observation sequences, each consisting of 11 individual (1 Hz) observations. A quality control procedure is performed on each short sequence. Erratic and suspicious individual observations are removed and the remaining data in each sequence are averaged to form a representative super-observation, provided that the sequence has a sufficient number of "good" individual observations (at least 7). The super-observations are collocated with the model and the in situ data (if applicable). The raw altimeter data that pass the quality control and the collocated model data are then investigated to derive the conclusions regarding the data quality.
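The formation of super-observations can be sketched as follows. The sequence length (11) and the minimum number of good observations (7) are taken from the text; the quality control shown here, a median-based outlier test, is only an assumed stand-in for the actual procedure of Abdalla and Hersbach [24], and the function name and the `n_sigma` parameter are hypothetical.

```python
import numpy as np

def super_observations(values, seq_len=11, min_good=7, n_sigma=3.0):
    """Average consecutive 1 Hz values into super-observations.

    Returns one value per complete sequence of `seq_len` observations;
    sequences with fewer than `min_good` accepted values yield NaN.
    """
    values = np.asarray(values, dtype=float)
    out = []
    for start in range(0, len(values) - seq_len + 1, seq_len):
        seq = values[start:start + seq_len]
        seq = seq[np.isfinite(seq)]            # drop missing observations
        if seq.size >= min_good:
            med = np.median(seq)
            # robust spread estimate from the median absolute deviation
            mad = 1.4826 * np.median(np.abs(seq - med))
            if mad > 0:
                seq = seq[np.abs(seq - med) <= n_sigma * mad]
        out.append(seq.mean() if seq.size >= min_good else np.nan)
    return np.array(out)
```

At ∼7 km spacing, each accepted sequence of 11 observations spans roughly 75 km, matching the model scales quoted above.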

Routine Assessment
Routine assessment of wind and wave products on a global scale comprises cyclic assessments, as documented in the Cyclic Reports [13], and longer-term assessments such as annual assessments. The main results from the latest annual assessment (2019) are presented here. Furthermore, assessments of reprocessed data sets are carried out whenever such reprocessed data are made available.
The scatterplots (Figure 15) show the direct comparison of Sentinel-3A significant wave height measurements against in situ measurements and model output. The corresponding plots for Sentinel-3B are very similar and, therefore, not shown. They are based on the data derived from the SAR waveforms for the whole of 2019, using the operational NRT data stream. The statistics for each comparison are displayed within each panel, but the key aspects to notice are that the data lie almost symmetrically about the 1:1 (diagonal) line, so in each case the bias is small and the slope close to unity, indicating that the algorithms are working well. The in situ data provide an absolute calibration of the altimeter's metocean fields, but are limited in the number and geographical distribution of match-ups in that most of the available measurements are from buoys located in the coastal areas around Europe, Japan and North America. The use of model output in the assessment provides a better global comparison, including more representation of high wave and long swell conditions such as found in the middle of the oceans. The ECMWF wave model (WAM) does assimilate altimeter observations, but the evaluation shown in Figure 15c is using the "First Guess", i.e., the model forecast at a given time step prior to the incorporation of data.
The assessment of altimeter SWH using in situ observations shows a standard deviation of the differences (SDD) of 0.34 m, with the altimeter biased slightly high for SWH > 7 m. The SDD is 0.27 m for the comparison with the model; this lower value is partially attributable to the fact that the altimeter match-ups with the model are mainly offshore, whereas the in situ match-ups are closer to the coasts, with greater uncertainties involved in the measurements. A display of the spatial pattern of the altimeter bias (Figure 16a) shows that regional variations are present but generally small. In the low latitudes, where mean SWH is small, the altimeter appears to be underestimating relative to the model by ∼0.1 m; in the mid to high latitudes, where larger waves are prevalent, S3A returns values ∼0.1 m higher on average. There are some problems at the edge of the Antarctic ice pack that are likely to be due to imperfect flagging of ice-contaminated altimeter data and/or imperfect model sea-ice description.
The wind speed comparisons in Figure 15b,d show a wider spread because wind speed varies on shorter time and space scales than waves. The SDD for the comparison with in situ measurements is 1.37 m/s, which reduces to 1.07 m/s for the comparison with models (again due to the impact of local coastal conditions on wind measurements). There is also slightly less agreement at low wind speeds, where altimeters cannot easily differentiate among speeds less than about 1.5 m/s [25,26], and also at high wind speeds, where large storms may also be associated with rain that affects the σ0 measurements [27,28] used in deriving wind speed. The geographical distribution of bias shows the altimeter recording slightly higher values (0.2-0.4 m/s) than the model for the region of the Intertropical Convergence Zone and South Pacific Convergence Zone (both of which may be affected by rain), and also in the storm-dominated areas of the NW Atlantic and NW Pacific. Again there are potentially some problems with data flagging along the Antarctic ice edge. Very similar results are found for S3B (not shown). Although the levels are different, similar geographical bias patterns can be seen when comparing wind and waves from other altimeters (e.g., Jason-3). It is not easy to conclude whether these patterns are due to issues in the altimeter (e.g., rain effects) or the model (e.g., ocean current effects).
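The headline statistics quoted in this subsection (bias, SDD and slope) can be computed from collocated pairs as in the sketch below. It assumes the bias is defined as altimeter minus reference and uses an ordinary least-squares slope; the exact definitions used in the Cyclic Reports [13] may differ, and the function name is hypothetical.

```python
import numpy as np

def comparison_stats(altimeter, reference):
    """Bias, standard deviation of differences (SDD) and regression slope
    for collocated altimeter and reference (buoy or model) values."""
    alt = np.asarray(altimeter, dtype=float)
    ref = np.asarray(reference, dtype=float)
    ok = np.isfinite(alt) & np.isfinite(ref)   # use only valid match-ups
    alt, ref = alt[ok], ref[ok]
    diff = alt - ref                           # assumed sign convention
    slope, _intercept = np.polyfit(ref, alt, 1)
    return {
        "n": int(alt.size),
        "bias": float(diff.mean()),
        "sdd": float(diff.std(ddof=1)),
        "slope": float(slope),
    }
```

Applying the same function separately to buoy and model match-ups gives the two sets of values compared above, for example the SDD of 0.34 m against buoys and 0.27 m against the model for SWH.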

Effect of Changes in Processing
The results shown in the previous subsection were for the operational NRT Sentinel-3 data which witnessed a processing baseline change (PB 2.45) in February 2019. Furthermore, there was a change in the ECMWF model used for the assessment in June 2019 (CY46R1). This is illustrated in Figure 17, which shows the comparisons for the NRT data, as would be acquired by those needing them for day-to-day operations. Over the course of three and a half years there have been many changes in the Processing Baseline (PB), guided by the S3MPC's ongoing evaluations, as well as changes in the ECMWF model (roughly once a year for the time being). Note that many of the PB changes were not intended to affect wave height or wind speed but are provided in the plots to show their neutral impact.
Two of the PB changes impacted the global mean bias of wave height from S3A (Figure 17a), but regional variations have persisted, with the bias in the tropics always being ∼0.1 m below that for the global comparison, whereas the northern and southern hemispheres are biased high in their respective winters (when the waves are larger). However, these PB changes had no clear impact on the standard deviation of the difference (SDD), which is related to the random errors. The model change (CY46R1), which brought in a better physical representation of the swell dissipation, improved the agreement, as can be inferred from the lower SDD values. The variations in SDD for each hemisphere are greatest for their winters.
The recalibration of S3A's σ0 values at the end of 2016 brought the altimeter wind speeds in line with those from the model, zeroing the bias and reducing the SDD (Figure 17). Since then there does seem to be a slight increasing trend in the bias, which can also be seen for other altimeters such as Jason-3; the seasonal variations for the northern and southern hemispheres are as for wave height, except that the amplitude of the variations is greater for the northern one. Note that CY46R1 incorporated a significant change in the wave model, while the change in the atmospheric model had no major impact on surface wind speed.
An up-to-date record of the biases of wave height and wind speed with respect to the model is provided in the cyclic reports [13].

Consistency with other Altimetry Datasets
As noted earlier, Sentinel-3A is the first altimeter to operate in SAR mode over all the ocean, and thus there are concerns about the consistency of its results with preceding LRM instruments. Moreau et al. [29] had suggested that the narrower along-track footprint could lead to an underestimation of the height of very long period swell, such as may sometimes be found at the eastern side of large ocean basins. In Figure 18 we compare data from Sentinel-3A (the latest reprocessing using PB 2.61) with those from Jason-3 for the same period. Although the two datasets are not exactly collocated (in both space and time), the interest here is in assessing any relative bias between differently derived climatologies rather than in absolute accuracy (as indicated by the wave buoys in the previous section). The mapped data are shown on a grid of 3° in longitude by 2° in latitude, so there are typically 75 1-Hz measurements each month in a given open ocean cell. Figure 18a shows the mean SWH for 2019 from the Sentinel-3 PLRM data, using the standard MLE-4 algorithm [30]. Figure 18b shows the mean bias of SAR mode data relative to PLRM for the same period. The plots show clear geographical patterns associated with the varying atmosphere/ocean conditions and indicate that the SAR mode estimates are higher than PLRM by an amount that increases with mean SWH conditions. This is further evidenced by Figure 18e, which shows a scatter plot of the annual mean bias from Figure 18b as a function of the mean SWH from Figure 18a. Figure 18c shows the bias of SAR mode data relative to Jason-3 data averaged over 2019. The map is more speckled than Figure 18b, as the two altimeters take measurements at different times for each box, and thus randomly sample different events. The annual means for Sentinel-3's PLRM and Jason-3 differ by only small amounts in the tropical regions, where SWH is typically low.
However, in the high latitudes there are larger variations, both positive and negative, but with little clear pattern. Figure 18f shows the corresponding scatter plot, revealing that, compared with Jason-3, the PLRM data on average deliver slightly lower values in low SWH regions and larger values in high SWH regions. Understanding these slight differences between different altimetric datasets is crucial when aiming to construct a homogeneous dataset across multiple missions [31].
A very different comparison of these 3 altimetric records is provided in Figure 18d, which shows the probability distribution function of 1 Hz data from the different datasets. The PLRM distribution is close to that of Jason-3 near the peak (frequency of observations at the mode), but the median and 90th percentile for the SAR mode estimates more closely tally with the Jason-3 values. Other than a slight feature at 2.6 m, the distributions of SAR mode and Jason-3 concur very well at SWH > 2.2 m. However, both SAR and PLRM measurements from S3A show many more very low values. This may be due to data quality flagging issues, with the anomalous numbers below 0.5 m for S3A possibly being erroneously not flagged. It is worth noting that, in any case, altimetric records of low SWH are problematic because the waveform bin cannot adequately resolve such steep leading edges.
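The gridded comparison used for the maps in this subsection can be sketched as below: 1 Hz values are averaged into cells of 3° in longitude by 2° in latitude, and a bias map is then the difference between two such gridded means. This is only an illustrative sketch; the function name is hypothetical and the exact gridding used for Figure 18 is not specified in the text.

```python
import numpy as np

def grid_mean(lon_deg, lat_deg, values, dlon=3.0, dlat=2.0):
    """Mean of `values` in each dlon x dlat cell.

    Returns (mean, count), both of shape (360/dlon, 180/dlat);
    cells with no data are NaN in `mean`.
    """
    lon = np.mod(np.asarray(lon_deg, dtype=float), 360.0)   # wrap to [0, 360)
    lat = np.asarray(lat_deg, dtype=float)
    val = np.asarray(values, dtype=float)
    ok = np.isfinite(lon) & np.isfinite(lat) & np.isfinite(val)
    lon, lat, val = lon[ok], lat[ok], val[ok]
    lon_edges = np.arange(0.0, 360.0 + dlon, dlon)
    lat_edges = np.arange(-90.0, 90.0 + dlat, dlat)
    bins = [lon_edges, lat_edges]
    total, _, _ = np.histogram2d(lon, lat, bins=bins, weights=val)
    count, _, _ = np.histogram2d(lon, lat, bins=bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = np.where(count > 0, total / count, np.nan)
    return mean, count
```

For example, a map in the style of Figure 18b would follow from `grid_mean(lon, lat, swh_sar)[0] - grid_mean(lon, lat, swh_plrm)[0]`, where `swh_sar` and `swh_plrm` are hypothetical arrays of the two SWH estimates.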

Validation of Wave Heights in the Coastal Zone
The preceding material in this section has focussed on SWH data in the open ocean; that region dominates any global analysis, and the best absolute calibrations are achieved using wave buoys far from coastal effects. However, many of the key societal applications of SWH data are in the coastal zone, including effects on harbours and shoreline infrastructure as well as the impact on near-shore fishing fleets. For traditional altimeters, errors due to land returns or the effect of sheltered bays [32] typically become manifest within 15 km of the coast. One of the expected benefits of SAR mode altimetry is greater resilience to extraneous returns from off-nadir surfaces, potentially allowing meaningful SWH measurements closer to the coast.
Nencioli and Quartly [33] have compared the different SWH records from SAR and PLRM mode in the coastal zone with the in situ observations from a network of wave buoys close to the southwest coast of England. Not all available coastal buoys were adequately positioned for validation of satellite data. However, using results from a realistic wave model, they were able to demonstrate which buoys were suitably located for this task (Figure 19a). Figure 19b shows that, for the selected tracks, the root mean square difference between SAR mode estimates and buoy observations decreases along the track as the coast is approached, with a mean value of 0.46 m in the last 15 km. On the other hand, it is within this range that the errors in the PLRM estimates increase markedly, with a mean value of 0.84 m (Figure 19c). Note that the careful selection of data for comparison and its flagging is an important component of such an assessment. A global comparison against buoys by Schlembach et al. [34] returned much larger errors for SAMOSA, with better results for new retrackers, probably partially due to their improved data editing.

Assessment of Altimeter Range
Post-launch calibration and validation of the satellite measurements, such as range and datation (i.e., the precise timing of measurements), is a prerequisite to achieving the desired level of accuracy and ensuring the return on the investment [15]. Calibration/Validation (Cal/Val) is supported by several independent external facilities that determine the error in satellite measurements, using accurately monitored surfaces or known and controlled signal inputs from the ground.

Radar Transponders
A radar transponder (TRP) is an active reflector on the ground that receives the altimeter pulses and returns a copy of them with a well-known delay and amplification. One of the transponders used for the Sentinel-3 calibration activity is located in Crete, Greece [35]. It was developed at the Technical University of Crete for ESA's Copernicus Earth Observation programme. This site has been named CDN1 Cal/Val and is located at 35.3379302808° N, 23.7795182869° E and 1048.8184 m height with respect to the WGS84 reference system, in the western mountains of Crete. Since June 2019 the Svalbard transponder has also been used to calibrate the Sentinel-3 satellites. It was developed in 1987, having been designed, built and tested for the ERS-1 satellite, and was later used for the CryoSat-2 calibration activity. The transponder is located at 78.23052306° N, 15.39376997° E and 492.772 m height with respect to the WGS84 reference system. During the Sentinel-3 operations phase, the transponder acquisitions are repeated every 27 days, with a maximum across-track displacement of ±1 km.
The Level 0 complex echoes (in-phase and quadrature components) are processed to Level 1A for ingestion by the DeDop toolbox [36]. To understand the timings of individual returns, the range cell migration (slant range correction) is undone. The parabolic range evolution of the TRP signal is shown in Figure 20. In order to improve the precision of the measured range, an FFT is performed in the range direction, using a zero-padding factor of 512. From the position of the maximum value for each beam, we can compute the uncorrected range, Range_uncorr, using the following quantities: window_corr, the 2-way time between the pulse transmission and the reference point in the range window; tracking_ref_sample, the reference sample in the range window; BW, the bandwidth of the Ku-band; c, the speed of light; and max_sample, the position of the maximum value for each beam. Range_meas is then computed by applying the geophysical corrections and the internal delay to Range_uncorr.
The range to the transponder will naturally trace out a parabola as the spacecraft approaches and recedes from the TRP site, but the positioning of that parabola enables the determination of errors in the internal timing (datation) and range (see Figure 21). To increase the accuracy in the recovery of this information, the resolution of the waveforms is increased, and parabolae are fitted to both the theoretical and the measured range curves. First the vertical shift is determined, corresponding to the range bias (Figure 21b); then, as shown in Figure 21a, a datation bias can be computed to shift the curve forwards (yellow parabola) or backwards (red parabola) relative to the theoretical reference range (black parabola). The results for range bias are shown in Figure 22. S3A over Crete shows a mean bias of 6.79 mm (larger range than expected), while the results for S3B show a negative range bias of 6.59 mm (shorter than expected), resulting in a difference of ∼13 mm between their range bias measurements.
A negative trend of 0.57 mm per year, with a low coefficient of determination (r²), is obtained for S3A [37]. The results for S3A over Svalbard show a negative range bias, 109.92 mm shorter than expected, and S3B also shows a negative range bias, 68.44 mm shorter than expected. This disparity is partly due to the Svalbard site not providing accurate in situ estimates of the dry, wet and ionospheric corrections (which are available for the Crete site), and so the analysis relies on the model corrections within the supplied L2 product. The datation bias corresponds to an error in the timing of echoes along track (see Figure 21a). The datation for S3A has a standard deviation of 19.95 microseconds and an average of −125.95 microseconds [37]. The first acquisitions of S3B show similar values to S3A (passes in tandem orbit following S3A), but it is interesting to note that the first successful acquisitions of S3B in the new orbit show that the datation bias has been reduced from −114.60 microseconds to residual values (average of −25.7 microseconds), as illustrated in Figure 23. A related property is the stack alignment, which describes how the different Doppler components from a point target are arranged. This value is calculated by analysing the central beams of the stack: it is obtained by measuring the slope of the position of the maximum value for the 100 central beams. The alignment results are correlated with the datation ones. Since the S3B orbit change, its stack alignment has improved and is currently ∼0.02 mm.
The range noise is calculated as the standard deviation of the TRP range errors; it includes the noise from the instrumental and geophysical corrections, with no pass exceeding 2 mm and the majority of passes below 1 mm. Results over Svalbard present an average noise worse than in Crete: S3A results show ∼15 mm stack noise and S3B ∼7 mm. Taking as a reference the results obtained in Svalbard with CryoSat-2 [35], it is observed that the noise computed is about 7 mm.
A summary of the main results for S3A and S3B over Crete and Svalbard is shown in Table 2.
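The recovery of range and datation biases from the transponder pass can be illustrated with a minimal sketch, assuming that both the measured and the theoretical ranges are available as functions of the same time axis and that each is well approximated by a parabola near closest approach. The function names are hypothetical, and the sign conventions (measured minus theoretical) are an assumption rather than the convention of [37].

```python
import numpy as np

def parabola_vertex(t, r):
    """Fit r(t) = a*t**2 + b*t + c and return (t_vertex, r_vertex)."""
    a, b, c = np.polyfit(t, r, 2)
    t0 = -b / (2.0 * a)                   # time of closest approach
    return t0, c - b**2 / (4.0 * a)       # range at closest approach

def range_and_datation_bias(t, range_meas, range_theo):
    """Vertical (range) and horizontal (time) offsets between the
    parabolas fitted to the measured and theoretical range curves."""
    t = np.asarray(t, dtype=float)
    t_m, r_m = parabola_vertex(t, np.asarray(range_meas, dtype=float))
    t_t, r_t = parabola_vertex(t, np.asarray(range_theo, dtype=float))
    range_bias = r_m - r_t                # > 0: measured range longer than expected
    datation_bias = t_m - t_t             # > 0: measured curve later than expected
    return range_bias, datation_bias
```

A synthetic check: shifting a theoretical parabola by 7 mm in range and 100 microseconds in time and passing both curves to `range_and_datation_bias` should return those two offsets.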

Dedicated Altimeter Calibration Site off Corsica
The Corsica calibration/validation site was initially implemented in Senetosa with three tide gauge instruments, in order to monitor the performance of TOPEX/Poseidon and the follow-on Jason legacy satellite altimeters [38]. In 2005, the Corsica site was extended to the Ajaccio tide gauge, which enabled the monitoring of the Envisat and ERS missions, CryoSat-2 and, more recently, the SARAL/AltiKa mission and the Sentinel-3A and -3B satellites [39,40].
As shown in Figure 24, the Sentinel-3A (S3A) configuration in Corsica is of particular interest, as the same track (741) flies close to both sites, which gives the opportunity to estimate the absolute SSH bias of the altimeter range twice within a few seconds. During the tandem phase, Sentinel-3B flew on the same orbit with a 30-second time lag between the two satellites. It was thus possible to compare the SSH measurements of both satellites at the two calibration sites in Corsica within a very short time. The absolute sea surface height (SSH) bias of Sentinel-3A was estimated using the L2 NTC products (PB 2.33 to 2.45) from cycle 1 to cycle 47 (Sept. 2019). That for S3B was computed with the same products (PB 1.0 to 1.17) for cycles 9 to 13 (tandem phase period corresponding to cycles 32 to 36 for S3A). The computation of the SSH bias estimates was performed both for the SAR data and the PLRM data. As S3A cycle 1 and S3B cycle 9 were operated in LRM, they were not considered in the SSH bias statistics for either SAR mode or PLRM. For both missions, the altimeter sea surface heights are corrected using the GIM ionosphere correction and the ECMWF model wet troposphere correction.
As a first step, Paris Observatory (OBSPM) and NOVELTIS computed the absolute SSH bias estimates for the Sentinel-3A and Sentinel-3B (tandem phase) missions on track 741 in Senetosa and Ajaccio, using their own slightly different methods described in Bonnefond et al. [40] and Cancet et al. [42], respectively. The OBSPM and NOVELTIS methods are very close and give the same results on equivalent subsets of data. The two techniques mainly differ in the selection of the altimetry data and parameters in the comparison area (editing criteria) and in the management of some corrections (extrapolation of offshore information to reduce the land contamination in the case of the OBSPM technique). Figure 25 shows the time series of the absolute SSH bias estimates obtained with the two methods, for both missions and both modes, and averaged over the two sites. Some cycles are missing in the time series for various reasons: missing in situ data, lack of altimetry data in the comparison area, very strong σ0 values ("σ0 bloom") and very strong or very low waves. Table 3 gives the mean SSH bias estimates obtained with both processing methods, for all the configurations. Whichever computation method is used, the Sentinel-3A and Sentinel-3B SSH bias estimates are very consistent, with differences within a few millimetres. The mean SSH bias estimates of Sentinel-3A at Ajaccio show slightly larger differences between the two methods that are well above the bias uncertainty (7-8 mm, see Table 3), especially for PLRM data (1.7 cm for bias differences). This may be due to the selection of the 20 Hz data in the comparison area, which is specific to each processing technique, and to the fact that the land contamination affects the PLRM processing more in Ajaccio. As expected, the SSH bias estimates computed on the PLRM range are in general more variable than the estimates on the SAR range.
It should be noted that the Sea State Bias (SSB) correction available in the Sentinel-3 altimetry products is not yet optimal for either the SAR mode observations or the PLRM data, which may increase the noise in the SSH bias estimates. Indeed, the SSB model currently used is that for Jason-2, as many years of data are needed to develop a robust non-parametric model for SSB. Although the 5 cycles (4 in SAR/PLRM mode) of Sentinel-3B measurements during the tandem phase are not sufficient to compute robust statistics on the SSH bias, the time series in Figure 25 show very good consistency between the two Sentinel-3 missions for both types of products. As a second step, in order to increase the number of Sentinel-3A SSH bias estimates around the calibration sites in Corsica, additional estimates were computed utilising the offshore crossover points between S3A and the Jason-2 and Envisat mean sea surface height profiles (green dots in Figure 26b). These were for S3A track 741 with Jason-2 track 085, and for Sentinel-3A track 044 with Jason-2 track 222 and with Envisat track 887. The offshore bias was estimated using the regional calibration technique developed by NOVELTIS [42] (shown in Figure 26a). The method uses the mean SSH profile determined along prior altimeter tracks to accurately transfer the reference datum offshore. In this paper, we only present the regional SSH bias estimates computed in Senetosa; the analysis of the results in Ajaccio has shown some discrepancies between the high-frequency signals (especially the ocean tides) that are observed by the tide gauge and the signals that are provided by the models, which require further investigation.
The along-track mean sea surface profiles computed along the Jason-2 and Envisat nominal tracks were used to link the offshore Sentinel-3A SSH with the area around the Senetosa tide gauge observations. Then, as for the direct local comparisons, a high-resolution mean sea surface (Figure 24) was used to link the altimetry and the tide gauge observations at the comparison point (point C in Figure 26a). Furthermore, in order to take into account the differences in the ocean dynamics between the offshore altimeter crossover points and the tide gauge stations at the coast, ocean tide and atmospheric corrections were applied to the altimetry and tide gauge SSH using the same models for both types of observations. The ocean tide signal was removed using a regional ocean tide model developed by NOVELTIS, with a high-resolution grid in the Corsica region. A global simulation from the TUGO-m model (ex-MOG2D model) provided by LEGOS was used to remove the effects of the wind and atmospheric pressure on the sea surface heights at all time scales (inverse barometer and high-frequency signals). Because this Dynamic Atmospheric Correction (DAC) solution is available only until December 2017, the Sentinel-3A offshore SSH bias estimates were computed until cycle 25 for track 741 and cycle 26 for track 044.
Table 4 gives the Sentinel-3A regional SSH bias estimates in Senetosa, both for the SAR and the PLRM products, at the three crossover points considered in Figure 26b (green dots). For comparison purposes, the first line in the table gives the results obtained for the direct comparison between the S3A SSH and the Senetosa tide gauge SSH on track 741, when no ocean dynamics correction is applied, over the period of availability of the DAC correction (cycles 1 to 26). The second line shows the results of the direct comparison on track 741 when using the ocean dynamics corrections (DAC and regional tidal model).
For all the other lines in the table (crossover points), the ocean dynamics corrections were also applied. The regional mean is then the average of all these estimates (local and offshore).
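The bias computation described above can be summarised in a short sketch. This is only an illustration under stated assumptions, not the NOVELTIS implementation: the function names, argument names and sign conventions are hypothetical, the mean sea surface is reduced to a single height difference between the altimeter point and the gauge, and the regional mean is taken as the plain average of the per-site mean biases.

```python
import numpy as np

def ssh_bias(alt_ssh, tg_ssh, tide_alt, dac_alt, tide_tg, dac_tg, mss_alt, mss_tg):
    """Per-cycle SSH bias (altimeter minus tide gauge), in metres.

    All names are hypothetical. Tide and DAC values are assumed to come
    from the same models for both the altimeter and the gauge.
    """
    # Remove the ocean dynamics (tide + DAC) from both records
    alt = np.asarray(alt_ssh, float) - np.asarray(tide_alt, float) - np.asarray(dac_alt, float)
    tg = np.asarray(tg_ssh, float) - np.asarray(tide_tg, float) - np.asarray(dac_tg, float)
    # Transfer the altimeter height to the gauge using the mean sea surface difference
    alt_at_gauge = alt - (mss_alt - mss_tg)
    return alt_at_gauge - tg

def regional_mean(bias_by_site):
    """Mean bias per site, and the average of those site means."""
    means = {site: float(np.mean(b)) for site, b in bias_by_site.items()}
    return means, float(np.mean(list(means.values())))
```

Whether the regional mean should weight each site equally or by its number of cycles is a design choice that the text does not specify; the sketch assumes equal weights.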
The results show stable bias from one crossover point to another, generally within less than 1 cm. The variability of the SSH bias estimates is larger at the crossover point with the Envisat track 887. In the SAR product, this is mainly explained by rather strong SSH bias values for cycles 4 and 14. In the PLRM product, this is intensified by the strong SSH bias value on cycle 13. The Sentinel-3A regional SSH bias at Senetosa is quite consistent with the local estimates, both in terms of mean and variability. All these results are, in general, in good agreement with the global cal/val analysis and the Crete transponder results (Section 4.1). The SSH bias estimates are close to 2 cm for both modes, with a variability of about 3 cm in SAR mode and 4 cm in PLRM mode. However, these results are still very dependent on the quality of the sea state bias (SSB) correction that is provided in the products. Several cycles are removed from the computation because of very strong SWH events that are not accurately managed in the current SSB correction. As a consequence, there is a strong need for a dedicated Sentinel-3 SSB correction (for SAR and PLRM).

Lake Issykkul
An alternative and complementary way of assessing the precision of measurements and the range biases associated with particular retrackers is to analyse altimeter passes across well-instrumented inland water sites. Over the last decade, calibration and validation activities over continental waters (such as lakes and rivers) have taken on a central role. Compared with analogous sites over the ocean, lakes present some specific beneficial characteristics. In particular, many correction terms (such as the inverse barometer, the ocean tide, the dynamic variability and the sea state bias) are almost negligible. Seiches may occur with amplitudes of a few centimetres, but careful experimental design can mitigate this issue. The comparison of results from such inland sites with those in the open ocean thus provides some indication of the validity of those correction terms. Moreover, inland water calibration/validation sites offer an easier and cheaper opportunity to increase the density of such activities over the globe.
One particular study site is Lake Issykkul (42°10'-42°40'N, 76°-78°E) in Kyrgyzstan, Central Asia. It is the second largest mountain lake on the Earth and is located in the Tien Shan and Ala Too Mountains. It has the advantage of being overflown by all past, present and future altimetry missions. In terms of in situ equipment, permanent GPS receivers and a weather station have been installed along the shoreline of the lake since the first field campaign in 2004. A radar measuring water height every 5 min was added in 2011. The weather station is used to estimate independent zenith tropospheric corrections, discriminating between the dry and wet contributions (see [43]).
In addition to the permanent instrumentation, between 2004 and 2019, a total of 18 campaigns have been performed for the calibration/validation of several satellite altimeters (TOPEX/Poseidon, Jason-1, Jason-2, Jason-3, GFO, Envisat, Saral/AltiKa and Sentinel-3A). During each field campaign, the height of the lake surface relative to the ellipsoid is estimated from boat measurements collected along the altimeter tracks at the time of the satellite overpass. The boat is equipped with a GNSS station at the bow as well as a microwave radar measuring the height of the GNSS antenna above the lake surface every 30 s. The GNSS data are processed using the GINS software in the Precise Point Positioning (PPP) kinematic mode [44]. The accuracy reached by this system is at the level of 1-2 cm RMS [43]. The absolute range bias of the altimeter is then determined by calculating the average difference between the altimeter and GNSS-derived records of the water surface. Further details on the experimental design and concept are given in [43,45-47].
From October 2016 to October 2018, three campaigns over Lake Issykkul were dedicated to Sentinel-3A. Two tracks (number 666 and 707; see Figure 27) were selected for this purpose. The water heights along these two tracks have been measured from the boat using the PPP GNSS data processing and then averaged along the track during the pass of Sentinel-3A for comparison with the satellite. Two retrackers, namely, ocean and OCOG, have been considered in this study. The resulting altimeter bias with the ocean retracker is −1.4 ± 2.5 cm, and with OCOG it is 28.4 ± 2.0 cm ([43] and Figure 27). This large bias for OCOG simply denotes an offset, possibly due to the choice of the tracking point within the waveform; what is more important is the better repeatability of the OCOG estimates, evidenced by the lower uncertainty.
Potential errors in this calculation lie in the PPP data processing, but also in the effect of the geoid slope, which may reach several centimetres per kilometre on Lake Issykkul. The current geoid models used to convert ellipsoid height into orthometric height are not accurate enough to measure such a slope. To mitigate this error, it is therefore very important that the boat follows exactly the same track as the satellite. However, this track is not known exactly in advance, but only to within the ±1 km orbital cross-track shift. Thus, the effect of the geoid slope can represent a source of error of up to a few centimetres. It is because of this error that, among the several passes collected within the 3 campaigns, only the ones where the cruise track was <300 m from the satellite pass were used for S3A calibration/validation. Furthermore, because over Lake Issykkul the geoid slope is much higher close to the shoreline than in the centre of the lake, only shipboard measurements >5 km from the coast were used in the analysis.
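The data selection and bias estimate described above can be illustrated with a minimal sketch. It assumes that altimeter and GNSS-derived water heights have already been paired along the track; the function and argument names are hypothetical, and the 300 m criterion, which the text applies to whole passes, is applied here point by point for simplicity.

```python
import numpy as np

def lake_range_bias(alt_h, gnss_h, cross_track_m, dist_coast_km,
                    max_cross_track_m=300.0, min_dist_coast_km=5.0):
    """Mean altimeter-minus-GNSS height difference after data selection.

    Returns (mean difference, sample standard deviation, number of points kept).
    """
    alt_h = np.asarray(alt_h, float)
    gnss_h = np.asarray(gnss_h, float)
    cross = np.abs(np.asarray(cross_track_m, float))
    coast = np.asarray(dist_coast_km, float)
    # Keep points close to the satellite ground track and far from the shoreline,
    # where the unmodelled geoid slope is expected to matter least
    keep = (cross < max_cross_track_m) & (coast > min_dist_coast_km)
    diff = alt_h[keep] - gnss_h[keep]
    return float(diff.mean()), float(diff.std(ddof=1)), int(keep.sum())
```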

Consistency of Oceanographic Data with Other Missions
Sentinel-3A and -3B provide high quality ocean data with very good stability. The data availability for both S3A and S3B is excellent, and the percentage of detected outliers is below 4%, which is similar to other altimetric missions like Jason-3.
S3A and S3B provide similar outputs, as proven during S3B initial phase (flying tandem with S3A). In particular, the Sea Level Anomaly structures observed by the two satellites are consistent as shown on the along-track sea level anomaly maps of Figure 28. Sea surface height (SSH) data are also consistent with other satellites such as Jason-3. This consistency can be shown through multi-mission crossover analysis, allowing cross calibration between different satellites. In such an analysis, the SSH difference is estimated between two measurements (from Sentinel-3A and Jason-3, for example), over the same location. If the difference in sampling time is small, then we consider the oceanic signal constant during this period. To fulfill this last condition, criteria are applied to select crossovers with a time lag of less than 10 days and located in the areas of lowest oceanic variability. Figure 29 shows the resulting map of the mean SSH difference between Sentinel-3A SARM and Jason-3. A very good consistency is found between the two datasets with mean differences ranging between −2 and 2 cm (once the mean bias of 2.0 cm is removed), and a standard deviation of 5.6 cm. These small discrepancies are investigated further in a subsequent paper.
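As a sketch of the crossover selection described above, the following keeps only crossovers with a short time lag in regions of low ocean variability and then computes the mean and standard deviation of the SSH differences. The 10-day limit comes from the text; the variability threshold and all names are hypothetical placeholders.

```python
import numpy as np

def crossover_stats(ssh_a, ssh_b, t_a_days, t_b_days, variability_cm,
                    max_lag_days=10.0, max_variability_cm=20.0):
    """Mean and standard deviation of SSH differences at selected crossovers.

    max_variability_cm is an assumed value, not one taken from the paper.
    """
    ssh_a = np.asarray(ssh_a, float)
    ssh_b = np.asarray(ssh_b, float)
    lag = np.abs(np.asarray(t_a_days, float) - np.asarray(t_b_days, float))
    var = np.asarray(variability_cm, float)
    # Short time lag and quiet ocean, so the oceanic signal can be treated as constant
    keep = (lag < max_lag_days) & (var < max_variability_cm)
    d = ssh_a[keep] - ssh_b[keep]
    return float(d.mean()), float(d.std()), int(keep.sum())
```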

Sentinel-3 Observations of the Cryosphere
Although Sentinel-3 is a multi-disciplinary mission (following the work of ERS-1, ERS-2 and Envisat), for studies over land and sea ice, its data are most usefully compared with those from CryoSat-2, launched in April 2010 and still operating. CryoSat-2 was the first successful application of spaceborne SAR altimetry over the cryosphere, but it differed in a number of aspects. First, it was in a very different orbit; it is non-sun-synchronous with an inclination of 92° (implying a much smaller "polar hole" where there are no nadir observations) and with a repeat period of 369 days (but a much finer spatial mesh of observations). Second, it had an extra receive antenna, and so was operated in an across-track interferometric mode over the steep ice sheet margins but in LRM mode over the plateaux in Antarctica and Greenland (see Figure 21 of [48]). Third, there are differences in the processing by the ground segment (use of Hamming filter and zero-padding to expand the waveforms) that will be discussed later.
Sentinel-3A is the first altimetry mission to operate in SAR mode everywhere, and currently shares a single ground segment between all surface types (ocean, sea ice, land ice, ice shelves, inland waters, coastal zones), although these all have very different properties and processing requirements at each ground processor level. The initial optimization of the Sentinel-3 L1 ground processing was for ocean surfaces (whereas CryoSat-2 was for ice surfaces) and hence the commissioning and tuning of the Sentinel-3 L2 processors for land and sea ice has been a more complex task than a simple retuning of the CryoSat-2 derived algorithms. In particular, for optimal performance and coverage over sea ice and the ice sheet margins (containing high slopes and complex terrain), Sentinel-3 requires a different L1 processing to that for ocean in order to provide the specialised L2 processors with the correctly windowed and filtered radar echoes to process over these complex surfaces.
Land ice is a permanent feature, built up by many years of snowfall and partial melt, and thus the radar signal reflects not only from the surface but also from interfaces within it. There are also significant variations in topography such that the local slope may exceed 1° and thus the signal from the "point of closest approach" is not necessarily from nadir, and so will need special processing to allocate the correct ground co-ordinates of the measurement. Sea ice is much flatter, albeit with rafting and leads (cracks within the ice), and radar returns within this domain may be just from ice pack or be dominated by the water surface within the leads. As land ice and sea ice generate different waveform shapes, they need different retrackers and ground processing, and so are discussed separately here.

Land Ice
Here we look at three aspects of Sentinel-3 performance over land ice: firstly the ability of the various retrackers to locate and interpret the echo, secondly the precision of those derived ice heights, and thirdly the accuracy of the measurements compared with available validation data and other missions.

Monitoring Retracker Statistics over Land Ice
Monitoring retracker failure provides a good first indication of the performance of the altimeter and the ground segment processing baselines over the mission lifetime. From previous missions we have a good understanding of typical retracker failure rates and their spatial distribution over land ice surfaces. Higher than expected failure, changes in failure rate or unexpected spatial distributions are a key indicator of problems with the altimeter or ground segment processing baselines. For land ice surfaces, the S3 Level-2 product currently provides two separate elevation measurements: one derived from an ice sheet physical model retracker and the other from an OCOG empirical retracker. The OCOG retracker maintains continuity with previous ESA ground segment mission products and, as it is less sensitive to waveform shape over complex terrain, provides the highest measurement availability. The ice sheet physical model retracker is inherited from an analogous model from CryoSat-2, and is currently optimised for accuracy over relatively smooth areas of low slope.
Both retrackers perform a series of quality tests (low power, noise power, peakiness, variance, position of the leading edge, model fit statistics) on each echo in order to establish the suitability of the waveform for the retracking algorithm. Statistics and spatial maps of percentage failure rates of each retracker are monitored for each 27-day repeat cycle [13]. This analysis is performed over the whole Antarctic and Greenland ice sheets and over specific test areas with different slope, roughness and complexity, which characterise typical ice sheet terrain. Trends or steps in the statistics are then investigated further.
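The per-cycle monitoring described above could be sketched as follows. This is a minimal illustration assuming that each echo carries a cycle number and a pass/fail flag from the retracker quality tests; the function names and the 1% step threshold are hypothetical.

```python
import numpy as np

def failure_rate_by_cycle(cycle, failed):
    """Percentage of echoes rejected by a retracker in each repeat cycle."""
    cycle = np.asarray(cycle)
    failed = np.asarray(failed, bool)
    cycles = np.unique(cycle)
    rates = np.array([100.0 * failed[cycle == c].mean() for c in cycles])
    return cycles, rates

def flag_steps(rates, threshold_pct=1.0):
    """Indices of cycles whose failure rate jumps by more than threshold_pct
    relative to the previous cycle (an assumed, deliberately simple criterion)."""
    rates = np.asarray(rates, float)
    return np.flatnonzero(np.abs(np.diff(rates)) > threshold_pct) + 1
```

Flagged cycles are only candidates for further investigation; a jump may reflect a processing baseline change rather than an instrument problem.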
As shown in Figure 30, the highest failure rates from both retrackers are over areas of high slope and complex terrain, such as the ice sheet margins of Antarctica and Greenland. The ice sheet retracker has significantly higher failure rates over the ice sheet margins where complex echoes do not fit the classical ice sheet waveform model. Monitoring the changes in retracker failure statistics for each test area over the mission lifetime is also an essential S3MPC task. Figure 31 shows an example of the time series of OCOG retracker failure rates over the Antarctic ice sheet during three product baseline changes indicating a step change between baselines caused by a change in waveform quality thresholds. The OCOG retracker has been used in all previous altimetry missions and hence it is useful to compare the order of magnitude and spatial distribution of their retracker statistics with S3 SAR. Retracker algorithm rejection criteria and thresholds differ between ground segments, and no other missions have the same operational SAR closed loop mode over the ice sheets as Sentinel-3. Thus, we do not expect identical performance. Figure 32 shows typical patterns and failure statistics of OCOG for Level-2 products from Envisat RA-2 (GDR v3), ERS-2 (REAPER v1), and CryoSat-2 (Baseline-D, LRM mode, where the OCOG retracker is used). Sentinel-3's current OCOG retracker percentage failure rate over ice sheets is around 3% and the range of typical failure rates found in other missions is between 2 and 5% with similar spatial distributions.

Measurement Precision over Land Ice
Two important aspects of the evaluation of any instrument's performance are its precision and its accuracy. Precision, i.e., the consistency of near-simultaneous measurements, is dealt with in this subsection, whilst accuracy, i.e., the error with respect to some independent estimate of truth, is covered in subsequent subsections. The precision of the Sentinel-3 SRAL altimeter over ice sheet surfaces is monitored by assessing the repeatability of elevation measurements in space and time. For this purpose we performed two sets of analysis: (1) single-cycle continent-wide and localised area crossover analysis to evaluate the repeatability of measurements at locations where ascending and descending satellite passes intersect [49,50], and (2) an evaluation of repeated profiles acquired over successive orbit cycles.
Elevation differences at orbital crossovers are commonly used as a metric for measurement precision and integrate a number of factors, including spatially uncorrelated orbit errors, retracker imprecision, the impact of radar speckle, echo relocation errors, and any sensitivity to anisotropic scattering within the near-surface snowpack [51-53].
Our objective is to monitor the precision of the SRAL instrument itself, and specifically to understand the impact of radar speckle, small-scale variations in the backscattering properties of the firn (partially compacted snow), and the influence of retracker imprecision on the SAR altimeter measurements over land ice. To achieve this, we need to minimise the influence of topography and surface roughness, and therefore choose a smooth, flat, and stable ice surface: Lake Vostok (Figure 33), a sub-glacial lake in the central region of Antarctica. It is a site that provides a stable and low-slope surface that has been used in many previous validation studies [54-56]. Monitoring the statistics and distribution of crossover differences over the whole Greenland and Antarctic ice sheets gives a measure of how precision varies with ice sheet terrain and surface slope (Figure 34). The magnitude of these crossover differences increases with surface slope, with the largest differences occurring in regions with steep and complex coastal topography. In these regions, the processes of locating the echoing point within the beam footprint and of retracking complex multi-peaked waveforms become more challenging. By measuring the median crossover difference, the bias between ascending and descending tracks can be monitored, and for both S3A and S3B this bias is less than 1 cm. Note that the new slope model introduced in PB 2.61 at cycle 54 leads to a marked reduction in crossover variability, especially for Antarctica.
Crossover analysis of SAR-derived ranges from both S3A and S3B indicates a precision <10 cm over smooth areas of very low slope, <1 m for slopes of 0.5°, and <5 m in areas of very high slope (up to 2°). Precision over sloping and complex topography is likely to improve over the mission lifetime as techniques of slope correction and waveform retracking improve. A second approach to monitoring mission precision is to study repeated profiles, over 6 or 12 month periods, for areas of very low slope. To do this, we compute (1) the mean elevation profile, (2) the residual elevations from the mean profile, and (3) the standard deviations of all elevation measurements within 400 m intervals along the satellite track, as per McMillan et al. [57]. There is good agreement between the many S3A elevation profiles within a year, with the mean profile being 25-30 cm above the reference DEM from CryoSat-2. Using this approach we find a mean shot-to-shot precision of better than 10 cm (Figure 35c) over both S3A's and S3B's mission lifetime.
To conduct an independent evaluation of the accuracy of Sentinel-3 ice sheet elevation measurements, we routinely compare them with contemporaneous measurements from (1) Operation IceBridge Airborne Topographic Mapper (ATM) and Riegl Laser Altimeter (RLA) instruments, (2) ICESat-2 ATLAS lidar, and (3) CryoSat-2, over ice sheet study sites selected to represent a range of topographic regimes. Lake Vostok is chosen to study accuracy over smooth low slope topography, Dome-C for a rougher low slope surface, and Wilkes Land and the SPIRIT zone in East Antarctica to represent regions of steeper and more complex topography. A detailed analysis of the S3MPC methods used and results obtained over these test areas when validating S3 with Operation IceBridge is available in McMillan et al. [57].
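The three-step repeated-profile analysis used for the precision estimate (mean profile, residuals, and standard deviation within 400 m along-track intervals) can be sketched as below. This is an illustrative reading of the procedure, not the code of McMillan et al. [57]; it assumes that measurements from all repeat cycles of one track have been pooled, with a common along-track distance coordinate, and the function name is hypothetical.

```python
import numpy as np

def repeat_profile_precision(along_track_m, elevation_m, bin_m=400.0):
    """Mean profile, per-bin standard deviation and residuals for pooled repeats.

    Returns (bin start distances, mean profile, std per bin, residual per point).
    """
    x = np.asarray(along_track_m, float)
    h = np.asarray(elevation_m, float)
    idx = np.floor(x / bin_m).astype(int)   # 400 m interval of each measurement
    bins = np.unique(idx)
    mean_profile = np.array([h[idx == b].mean() for b in bins])
    # Population standard deviation (ddof=0) of all elevations in each interval
    std_profile = np.array([h[idx == b].std() for b in bins])
    # Residual of each measurement from the mean of its own interval
    residuals = h - mean_profile[np.searchsorted(bins, idx)]
    return bins * bin_m, mean_profile, std_profile, residuals
```

Averaging `std_profile` over a low-slope area gives one possible summary of shot-to-shot precision.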
ICESat-2's ATLAS lidar (launched in 2018) has replaced Operation IceBridge and offers similar accuracy (<10 cm for smooth surfaces, <1 m for rough and complex surfaces [58]), but much greater spatial and temporal coverage. Evaluation of S3A and S3B using crossover differences with ICESat-2's ATLAS lidar ATL-06 height measurements over Lake Vostok (Figure 36) shows a mean difference of 0.0 ± 0.08 m, indicating both very good agreement and that S3 in Ku-band SAR mode is measuring very close to the ice surface with minimal penetration of the near surface firn layer. Using ICESat-2 we can also compare elevation over the whole ice sheets of Antarctica and Greenland (Figure 37a). Here, as expected, we see an increase in mean elevation difference with slope and topographic complexity, but mean differences remain below 1 m. The positive bias over the margins may be explained firstly by the much larger footprint of S3 (compared to ICESat-2) which favours mapping topographic highs within the footprint over small scale local topography [59] and secondly by a known issue with the search radius used by the S3 slope model. This is under investigation and will be corrected in the next processing baseline. Data selection (removing those with large disparities) shows the bias between the different sensors is small and does not vary seasonally (Figure 37b).
Routine comparisons with CryoSat-2 are performed to compare performance with a dedicated Ku-band polar altimetry mission, fully optimised for the cryosphere. Because CryoSat-2 operates in a different orbit inclination and in different modes (pulse-limited LRM mode over the interior of the ice sheets and an interferometric SARin mode over the margins), we do not expect identical performance to S3. Comparisons with CryoSat-2's LRM mode OCOG retracked elevations over Lake Vostok (Figure 38) indicate that CryoSat-2 is measuring about 26 cm below S3 (which has been shown to measure close to the surface). This is consistent with the DEM comparison shown in Figure 35a, and is expected, as the leading edge of CryoSat-2's LRM echoes is more sensitive to volume scattering than that of S3's SAR mode echoes. Over the more complex topography and higher slopes covered by CryoSat-2's SARin mask, the CryoSat-2 altimeter is designed to detect the location of the point of closest approach, whereas S3 will not be able to determine the origin of the signal it is tracking. In these areas, approximately 50% of S3 measurements are within 2 m of those from CryoSat-2 (Figure 39). The S3MPC are working to evolve S3 slope models using improved DEMs and relocation algorithms, which will increase the future performance of S3 over these areas. The L1 operational processing for S3 is also currently not optimised over the ice sheet margins, and significant improvements are expected to measurement coverage and accuracy once a dedicated land ice processing chain becomes operational in 2021.

Sea Ice
Radar freeboard is the primary L2 sea ice parameter in the S3 products and is derived by first discriminating between Ku-band SAR echoes from open ocean, from ice floes, and from the leads between the sea ice (Figure 40). Different retrackers are applied to these different surfaces to calculate their surface height. The sea level anomaly is calculated by subtracting the mean sea surface from the height within leads, and then the instantaneous elevation of the ocean surface beneath the sea ice floes is obtained by interpolating between lead tie points. Radar freeboard is calculated by subtracting the ocean surface elevation from the floe elevations. Freeboard is, for the most part, only successfully retrieved over the Arctic during the winter months of Oct-Apr. Due to the formation of melt ponds on the sea ice during the Arctic summer (May-Sept), freeboard cannot be measured successfully, and in Antarctica, measurement is difficult due to heavier snow loading that can depress the ice below the ocean surface.
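The freeboard derivation above reduces to a few array operations once the echoes have been classified and retracked. The sketch below assumes linear interpolation of the sea level anomaly between lead tie points (the text does not state which interpolation is used) and a monotonically increasing along-track time; all names are hypothetical.

```python
import numpy as np

def radar_freeboard(t, height, mss, is_lead, is_floe):
    """Radar freeboard along a track; NaN where the echo is not from a floe."""
    t = np.asarray(t, float)
    height = np.asarray(height, float)
    mss = np.asarray(mss, float)
    is_lead = np.asarray(is_lead, bool)
    is_floe = np.asarray(is_floe, bool)
    # Sea level anomaly at the leads: retracked height minus mean sea surface
    sla_leads = height[is_lead] - mss[is_lead]
    # Interpolate the anomaly between lead tie points (t must be increasing)
    sla = np.interp(t, t[is_lead], sla_leads)
    ocean_surface = mss + sla
    freeboard = np.full(t.shape, np.nan)
    freeboard[is_floe] = height[is_floe] - ocean_surface[is_floe]
    return freeboard
```

Because the leads supply the only ocean reference, any height error at the leads propagates directly into the freeboard of the neighbouring floes.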
The S3MPC monitors the statistics and spatial distribution of all L2 parameters that contribute towards the formation of radar freeboard. For sea ice, unlike land ice, we can directly compare statistics with those obtained from CryoSat-2 as it operates in an identical SAR mode over the majority of sea ice areas (Figure 41). SARin mode is only used over some coastal areas. Although Sentinel-3's altimeter is very similar to that on CryoSat-2, the results obtained are not of the same quality. This is because the SAR data processing chain currently used for S3 applies the same standard treatment to all surfaces, both global ocean and continental regions, whereas that for CryoSat-2 was primarily dedicated to the polar zones. The ice pack has extremely specular surfaces in the leads within the ice, resulting in very peaky waveforms. With standard processing, these waveforms may be sampled at only 2 or 3 bins (see Figure 40d and Quartly et al. [48]). This is insufficient to accurately derive the height in the leads, which is needed as the reference in the freeboard calculation. To overcome this problem, the CryoSat-2 processing chain uses zero-padding prior to the Fourier transform, which enables a doubling of the temporal resolution through frequency-domain interpolation of complex waveforms.
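The effect of zero-padding on waveform sampling can be demonstrated with a generic FFT example. This is not the operational L1 processor, which also involves beam forming, range alignment and multi-looking; it only shows that padding the complex echo before the Fourier transform yields a power waveform sampled twice as densely, with the original samples preserved at every second bin. The function name and the 64-sample test signal are assumptions.

```python
import numpy as np

def range_compress(echo, zp_factor=1):
    """Power waveform from a complex echo, optionally with zero-padding.

    zp_factor=2 appends as many zeros as there are samples, so the output
    has twice as many range bins covering the same range window.
    """
    echo = np.asarray(echo, complex)
    n_out = zp_factor * echo.size
    # np.fft.fft pads the input with trailing zeros when n exceeds its length
    spectrum = np.fft.fft(echo, n=n_out)
    return np.abs(spectrum) ** 2

# Usage: a synthetic 64-sample complex echo (purely illustrative)
rng = np.random.default_rng(0)
x = rng.standard_normal(64) + 1j * rng.standard_normal(64)
wf_standard = range_compress(x, zp_factor=1)   # 64 bins
wf_padded = range_compress(x, zp_factor=2)     # 128 bins
```

The extra bins are interpolated values, so zero-padding adds no new information; its benefit is that a very narrow specular peak is described by more samples for the retracker to fit.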
The strong backscattering of leads among the ice pack leads to two other problems: snagging and azimuth ambiguity [48]. The former phenomenon is due to off-nadir bright targets, resulting in secondary peaks after the waveform's leading edge, and is also present in LRM mode. The latter is specific to SAR mode, with power received through the sidelobes leading to peaks in the SAR waveforms ahead of the leading edge. While this phenomenon has relatively little impact on physical retrackers that can model it, it can affect heuristic retrackers that look for the first rising edge. In the CryoSat-2 processing chain this phenomenon is mitigated by application of a Hamming filter in the along-track direction. However, this filter has side effects on the main rising edge [60].
ESA's GPOD cluster [61] allows users to reprocess SAR altimeter data using different options within the processing chain. We reprocessed CryoSat-2 and Sentinel-3A data for Dec. 2016, applying both second-order zero-padding and the SAMOSA+ version of the SAR waveform retracker [62]. Figure 42 shows that the results from the two missions are now very similar. The lower row of the figure shows the results when using the heuristic TFMRA50 retracker, which looks for the first waveform bin with half the maximum power of the waveform [63]. Again, the results for the two altimeters agree, but are markedly different from those using the SAMOSA+ retracker. These results show that the SRAL altimeter on board the S3A and S3B satellites offers many of the same capabilities as the instrument on CryoSat-2. In order to assess more precisely the measurement quality of SRAL over pack ice, a dedicated processing chain would have to be set up, with zero-padding of at least order 2. (It is likely that this would have little impact on the open ocean.) The use of a Hamming filter needs further evaluation, as it may have some detrimental impacts on heuristic retrackers.
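To illustrate the kind of heuristic retracker referred to above, the sketch below finds the first position at which a waveform reaches half of its maximum power, using linear interpolation between bins. It is a deliberately simplified stand-in: the actual TFMRA [63] involves further steps, such as smoothing and identification of the first local maximum, which are not reproduced here, and the function name is hypothetical.

```python
import numpy as np

def threshold_first_bin(waveform, fraction=0.5):
    """Fractional bin position where the waveform first reaches
    fraction * max(waveform). Simplified threshold retracker sketch."""
    wf = np.asarray(waveform, float)
    level = fraction * wf.max()
    i = int(np.argmax(wf >= level))   # index of the first bin at or above the level
    if i == 0:
        return 0.0
    # Linear interpolation between the bins either side of the crossing;
    # wf[i-1] < level <= wf[i], so the denominator is strictly positive
    return (i - 1) + (level - wf[i - 1]) / (wf[i] - wf[i - 1])
```

A strong peak ahead of the true leading edge, as produced by azimuth ambiguities, would be picked up by such a first-crossing search, which is consistent with the sensitivity of heuristic retrackers noted in the text.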

Sentinel-3 Observations over Inland Waters
Section 4.3 showed the use of a large inland water body (Lake Issykkul) for determining the range bias within the altimeter processing chain. In this section we consider the efficacy of the whole system in determining the water surface height (WSH) of a myriad of smaller lakes. This assessment naturally relies on the highest resolution data (20 Hz waveforms), with a number of retrackers and processing changes being considered. We first present the results from reference sites around Lake Issykkul, then show a global view, and finally look in detail at the results for a system of reservoirs and lakes in northeast Spain.

Accuracy Assessment at Lake Issykkul
Data from Lake Issykkul are also used to assess the accuracy of the altimetry measurements for inland waters. The lake has two sources of routine in situ measurements: a radar at Karabulun on the southeast coast, which was installed in 2011, and a historical gauge installed on the north coast at Cholpon Ata (see Figure 27). The radar measures the lake height every 5 min, while the historical gauge provides water height twice per day. Water heights measured by Sentinel-3A (averaged per cycle) during its first three years (from mid 2016 to mid 2019) have been compared with both in situ instruments, and the results are illustrated in Figures 43 and 44. The correlation between in situ and altimetry is very high (99%, see the scatter plot in Figure 43) and the Root Mean Square (RMS) of the differences between altimetry and in situ is 1.50 cm (see Figures 43 and 44). Using LRM data from the Jason satellites, Envisat or Saral/AltiKa, the error was never less than 3 cm over Lake Issykkul. This marked improvement is a little surprising, since for large lakes such as Issykkul the smaller footprint of SAR compared with LRM should not be a critical factor, so it may be due to the better noise characteristics of S3A.

Global Perspective
Data quality is assessed over the 700 largest lakes of the Global Lakes and Wetlands Database (GLWD, [64]). Further editing is applied to ensure the data are associated with inland water bodies. This consists of threshold values for various parameters (backscatter coefficient σ0, wet and dry tropospheric corrections, and the ionospheric correction). A 20 Hz measurement is rejected if at least one parameter is found to be outside those thresholds. The σ0 criterion ensures the data being studied are from the intended water surface, rather than the onboard tracker being hooked on the signal from some land surface (see Section 3.4.1 of [48]).
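The editing rule above, in which a 20 Hz measurement is rejected as soon as one parameter falls outside its thresholds, can be written as a simple mask. The threshold values and parameter names below are hypothetical placeholders; the actual values used by the S3MPC are not given in the text.

```python
import numpy as np

# Hypothetical (min, max) limits, for illustration only; not the S3MPC values
THRESHOLDS = {
    "sigma0_db": (10.0, 70.0),
    "wet_tropo_m": (-0.5, 0.0),
    "dry_tropo_m": (-2.5, -1.0),
    "iono_m": (-0.4, 0.04),
}

def valid_mask(params, thresholds=THRESHOLDS):
    """True where every parameter lies within its (min, max) limits."""
    n = len(next(iter(params.values())))
    ok = np.ones(n, dtype=bool)
    for name, (lo, hi) in thresholds.items():
        v = np.asarray(params[name], float)
        # Comparisons with NaN are False, so missing values are rejected too
        ok &= (v >= lo) & (v <= hi)
    return ok
```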
The OLTC update, by imposing adapted elevation priors to position the altimeter tracker, allows better tracking of the water surfaces. Figure 46 presents the distribution of the OCOG σ0 values for about 17,000 river crossings over which OLTC v5.0 was tuned. With OLTC v4.0 some targets were already observed in open loop mode and some others in closed loop mode. The distributions show that the low σ0 observations (<40 dB) obtained with OLTC v4.2 have disappeared from the OLTC v5.0 measurements for the targets provided in v5.0; this means that the water surfaces are better tracked by the altimeter at positions where the commands have been updated.
Figure 48a shows the estimated WSH across Lake Ladoga over S3A cycle 53 and S3B cycle 34, indicating the good agreement between the two missions. In Figure 48b, the median value along each lake crossing is estimated and represented as a function of time for Sentinel-3A (blue lines) and Sentinel-3B (red lines). This shows that the long-term water surface height variations are consistent between the two altimeters. The biases between the different tracks mainly result from the geoid differences across the lake.

Ebro River Basin
The preceding analysis showed the consistency of the Sentinel-3 data, both in terms of its precision and the relatively smooth evolution between successive transects. For a detailed study of the accuracy, we consider the Ebro River basin in the Iberian Peninsula [4.5°W-2°E, 40°-43°N] (Figure 49), as that offers a diversity of lakes in a challenging environment for which we have plenty of in situ information including reservoir water levels. S3A and S3B L2 data have been validated with in situ measurements over all these inland water bodies, and, when possible, results from S3A and S3B have also been intercompared. Over the western part of the basin, S3A was in Closed Loop (CL) tracking mode until 9 March 2019, and changed to Open Loop (OL) afterwards with the update of the DEM to v5. The eastern part of the basin has always been in OL tracking mode. These factors permit an interesting demonstration of the importance of OL mode. Water bodies in the Ebro River basin have different sizes, with widths ranging from 130 m to 4.5 km. The Pyrenees mountains (with heights over 3000 m) are located in the northeast part of the basin, making range measurements challenging over some reservoirs (e.g., Cavallers and Irabia). In situ validation data for all reservoirs come from the Automatic Hydrological Information System (SAIH Ebro).
Both the Sentinel-3 L2 ocean retracker and the OCOG retracker have been used to calculate water levels from L2 products, together with the L2 geophysical corrections (including the wet troposphere, dry troposphere, ionosphere, solid earth tide, geocentric pole tide and ocean loading tide) and the geoid correction. The time series of the water levels are calculated using a strict water mask polygon: only the altimeter footprints falling within the mask are selected, and their water levels are averaged for each date.
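A minimal sketch of this water level computation is given below. It assumes the common convention that the geophysical corrections are added to the retracked range before it is subtracted from the satellite altitude, and that the point-in-polygon test against the water mask has already been done; the function and variable names are hypothetical.

```python
import numpy as np
from collections import defaultdict

def water_surface_height(altitude, rng, corrections, geoid):
    """Water level above the geoid (m).

    corrections is the sum of the range corrections (troposphere, ionosphere,
    tides), assumed here to be added to the range.
    """
    return (np.asarray(altitude, float)
            - (np.asarray(rng, float) + np.asarray(corrections, float))
            - np.asarray(geoid, float))

def level_time_series(dates, wsh, in_mask):
    """Average the water levels of footprints inside the water mask, per date."""
    by_date = defaultdict(list)
    for d, h, m in zip(dates, wsh, in_mask):
        if m:   # keep only footprints that fall within the water mask polygon
            by_date[d].append(h)
    return {d: float(np.mean(v)) for d, v in sorted(by_date.items())}
```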
Three reservoirs have been monitored for a long period with S3A: Sotonera, Ebro and Ribarroja. The results, including RMSE, bias and Mean Absolute Difference (MD) for the different reservoirs (Table 5), show good general agreement between water levels derived from S3A L2 and in situ data. There is a consistent negative mean bias between in situ and S3-derived water levels. This bias accounts for an important share of the observed differences, and it is similar whether the water level is estimated using the L2 ocean or the OCOG retracker. That it is independent of the processing could indicate that it is partly attributable to geoid inaccuracies, as already shown in Section 7.2. OCOG shows slightly lower RMSEs than L2 ocean. Waveforms over inland water can easily be contaminated by surrounding land; if there is more than one peak in the waveform, OCOG will be more robust than the ocean retracker.
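The three comparison statistics quoted above can be computed as in the following sketch, which assumes that the altimetry and in situ series have already been matched in time and that differences are taken as altimetry minus in situ (the sign convention is an assumption, as is the function name).

```python
import numpy as np

def validation_stats(alt_level, insitu_level):
    """Bias, RMSE and mean absolute difference of altimetry minus in situ (m)."""
    d = np.asarray(alt_level, float) - np.asarray(insitu_level, float)
    return {
        "bias": float(d.mean()),
        "rmse": float(np.sqrt(np.mean(d ** 2))),
        "md": float(np.mean(np.abs(d))),
        # Scatter remaining once the mean bias is removed
        "std": float(d.std()),
    }
```

Since the squared RMSE equals the squared bias plus the variance of the differences, comparing `rmse` with `std` shows how much of the total difference is due to a constant offset, which is the point made in the text about the bias being a major share of the differences.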
Furthermore, no accuracy changes were observed over the two lakes (Ebro and Sotonera) that changed from closed-loop to open-loop tracking with the update of the OLTC DEM. Two reservoirs, Cavallers and San Salvador, are tracked with neither DEM v4.2 nor DEM v5. Since S3B measurements became available on its interleaved tracks, we have been able to monitor 3 new reservoirs (Ullibarri, Canelles and Santa Ana) and also to cross-validate S3A and S3B measurements over 2 reservoirs (Ribarroja and Mequinenza). The comparison results with in situ data for S3B are shown in Table 6. The results for the Canelles and Santa Ana reservoirs are very similar to those shown in Table 5. However, the Ullibarri reservoir shows a lower bias and MD when the water level is estimated from the L2 ocean retracker. This lake is very small and has a very complex shape, so the only Sentinel-3 footprint within the lake surface is easily contaminated by land, which explains the poorer result with the OCOG retracker. It is also noteworthy for that lake that the bias is not consistent between the two retrackers. However, the S3B time series is still relatively short (one year) and these results need to be confirmed by a longer time series.
Figure 50 and Table 7 show the comparison of S3A and S3B water levels over Ribarroja and Mequinenza respectively. Consistent with previous results, the comparison between S3A and S3B over Ribarroja and Mequinenza shows a negative bias between in situ and S3-derived water levels. Again, the bias is a major part of the observed differences and has a similar value for both L2 ocean and OCOG retrackers. The comparison shows better performance for S3B. As can be seen in Figure 50, S3B presents no outliers and consequently its comparison with in situ data shows a much lower RMSE than that of S3A. Note that the S3B passes are closer to the in situ measuring point, which could also explain the better agreement with in situ data.
In summary, both S3A and S3B water level estimates follow the in situ variations well, despite a consistent negative bias with respect to the observations. Over relatively larger water bodies, as in Table 5, the OCOG retracker shows slightly better performance, but the accuracy of L2 ocean is better over smaller water bodies. Finally, S3B seems to perform better, but this result needs to be confirmed by a longer time series. Note that, as these reservoirs are only of order hundreds of metres to a few kilometres in length, fetch is limited and thus waves and sea state bias are unlikely to be significant in this evaluation. Better results (in the decimetre range) over inland water bodies can be obtained when filtering the waveform to consider only the portion returned from nadir, as in Gao et al. [65].

Summary and Future Challenges
Although radar altimetry is a very mature discipline, the Sentinel-3A and -3B instruments are only the second and third to function in delay-Doppler (or SAR) mode, and they operate in this manner over all surfaces. This method of operation should provide greater robustness to off-nadir reflections and enable a better signal-to-noise ratio. At the same time, it creates its own challenges in terms of monitoring the health of the instrument, assessing the efficacy and accuracy of retrackers, and understanding any differences with respect to conventional LRM altimetry.
As shown in Section 2.1, some of the internal health checks for the SRAL instruments (e.g., CAL1 and CAL2) used to produce the Level 1 products follow a long tradition of checks used on prior instruments. On the other hand, the monitoring of changes in phase and power within a burst (Figure 6) is specific to SAR processing. Whilst some of the changes noted are large (e.g., the power drift, Figure 5a), the important thing is that they are all continuously monitored and fully accounted for in the ground-processed products seen by the users. The same applies to the changes in the MWR behaviour described in Section 2.2. Some of those changes are little understood and, thus, still under investigation (e.g., the variations in noise injection temperature, Figure 9b), while others that were understood led to changes in the operation of the satellite (e.g., the RFI event, Figure 10b). Overall, the monitoring of all the internal instrument characteristics enables the operational agencies to correct for these effects and deliver homogeneous and self-consistent Level 1 datasets, which are used to derive the Level 2 geophysical parameters.
The S3MPC activities presented in Sections 3 to 7 focused on assessing the accuracy of the derived Level 2 products through comparison with in situ data as well as through intercomparison with other satellite datasets. These comparisons were performed over all the key products and over all surfaces of interest.
The assessment of wind speed and wave height data from Sentinel-3A is performed in a number of ways. This paper first looks at the routine assessment using in situ measurements and models and then explores the errors in wave height in more detail. The scatterplots in Figure 15 show the r.m.s. error of wind speed to be 1.07 m/s when compared with the WAM model, but 1.37 m/s against in situ data, because of the errors also associated with the reference measurements. For wave height, the r.m.s. error compared with the model is 0.27 m, and 0.34 m when compared with the in situ data. Maps of the regional variation in bias (Figure 16) are consistent with the altimeter SWH estimates being slightly high at high wave heights and the wind speed values being slightly too high at low wind speeds. There is also an occasional problem with inadequate flagging of data contaminated by sea ice. Section 3.4 then brought in comparisons with another altimeter mission, showing that the mapped PLRM estimates of wave height from Sentinel-3 gave very much the same values as Jason-3, but that use of SAR mode gave slightly higher values (Figure 18b). In this analysis the bias between the two estimates seems to be principally a function of SWH, rather than of particular swell-dominated regions. Finally, Nencioli and Quartly [33] have made a special evaluation of the wave data in the coastal zone (Section 3.5), where SAR mode does deliver the expected improved performance compared with PLRM (Figure 19).
Section 4 focussed on the absolute range bias estimation, bringing together three complementary ways of assessing it. The range bias is critical for estimation of absolute water height, and thus for the estimation of long-term trends when combining multiple altimeter missions. The first assessment is based on the comparison with transponders, active point targets that return a narrow pulse similar to that originally emitted by the satellite (Section 4.1). The analysis returned an estimate of the range bias of S3A at Crete of 0.7 cm. The second assessment is based on tide gauge observations from dedicated cal/val sites off Corsica (Section 4.2). Such an assessment provides range bias estimates for ocean-like returns, with the data selection focussing on low wave heights but not mirror-like conditions that can give anomalous waveforms. The results give a value for the range bias of S3A of −1.7 cm for SAR mode and −1.9 cm for PLRM (corresponding to SSH biases of 1.7 cm and 1.9 cm respectively). Finally, the last type of evaluation was based on ship-based observations over Lake Issykkul (Section 4.3). The assessment yields a range bias of 1.4 cm (SSH bias of −1.4 cm) for SAR signals using the ocean retracker and −28.4 cm when using OCOG. Part of the difference with respect to the results from the transponder could be due to the differently shaped waveform echoes and thus the retracker model used. Although some improvements are still necessary (e.g., an improved SSB model for the Corsica site and better GNSS data processing for the Lake Issykkul experiment), the coherence observed is very encouraging. Furthermore, despite this current observed consistency, validation of S3 altimeter range needs to be ongoing to assess any drift after the corrections for known instrumental ageing are applied (Figure 4).
Section 5 shows a global comparison of S3 sea level anomalies with those for Jason-3, with a mean offset of 2.0 cm. There is a very good agreement, but the small residual patterns are still being investigated.
As Jason-3 is limited by a turning latitude of 66°, the comparisons for land and sea ice in Section 6 are mainly with CryoSat-2 and ICESat-2. Early on in the Sentinel-3A mission there were particular challenges with the on-board tracking to capture the waveform over the ice sheet margins, and Level 1 operational processing is not yet fully optimised to maximise measurement density over areas of high slope and complex terrain. However, the current losses in data over the Antarctic ice sheet are ∼3%, which is similar to that of other altimeters (Figure 32) and will be improved when thematic operational processing is implemented in 2021. The precision of height retrievals over Lake Vostok (Figure 34) is better than 10 cm, which is similar to results for CryoSat-2 and ICESat-2. Evolutions in the method of slope correction have been shown to improve Sentinel-3 precision over the ice sheet margins. Currently an identical Level 1 processing chain is used for S3 data over all surfaces, which is not optimal for quantitative derivation of sea level height within leads in the sea ice. A custom processing of the data, using options consistent with those applied to CryoSat-2 data, showed that S3 and CryoSat-2 estimates of freeboard were fully consistent.
Altimeters have long been used to monitor the surface level of major lakes and rivers, e.g., [66]. However, as shown in Section 7, a major extension of the realm of interest to smaller lakes has been achieved with Sentinel-3. Firstly, its smaller footprint in SAR mode reduces the impact of nearby land reflections, but this is also greatly helped by the on-board DEM guiding the open-loop tracker as to the expected timing for the return waveform. Of course, retracking performance is best over the biggest lakes (e.g., Ladoga and Issykkul), but Figure 46 shows that Sentinel-3 now records good altimetric data over more than 17,000 river crossings.
In the future, new branches of dedicated ground processing will be developed, each specific to a different land surface. These should enable a greater accuracy than a uniform processing line for all surfaces. At the same time, such new ground processing branches will require extra effort to monitor, validate and further improve the surface-specific S3 observations. Furthermore, in a few years' time, Sentinel-3C and -3D will be launched and it will be important to tie their measurements into the same rigorous standards as already implemented for S3A and S3B. There is still further work to be done to fully understand and characterise the small residual differences that remain between estimates from SAR and PLRM/LRM processing, as this is crucial for the integrity of the long-term altimetric record across many different missions.
Author Contributions: All named authors have contributed by writing a part of the document (including production of illustrative figures) or by reviewing the draft of their colleagues. G.Q. was responsible for the initial concept of the paper and its layout; the text was reviewed and edited by G.Q. and F.N. All authors have read and agreed to the published version of the manuscript.

Abbreviations
The following abbreviations are used in this manuscript: