Sea Surface Temperature (SST) is an essential climate variable used to monitor, detect, predict and characterize Earth’s climate and its variations [1]. Several long-term global SST records are available, based on observations acquired from sailing vessels in earlier decades and, in more modern times, from in situ measurements (from drifters, moored buoys, Argo floats, etc.) and from space and airborne sensors (on satellites and aircraft) [2]. The advantage of satellite-derived SSTs over other sources is their vast coverage at high resolution. However, they also have inherent inaccuracies due to errors associated with spacecraft navigation, sensor calibration and noise, retrieval algorithms, and residual clouds. As a result, clear error estimates must accompany satellite SSTs if they are to yield the desired results in their intended applications. Validation and cross-comparison of different satellite-derived SSTs are critical to understanding these products and to assessing their relative merit and performance. However, differences of several degrees can appear between various products due to inconsistencies in retrieval schemes and in cloud detection algorithms [4]. Therefore, confidence in the reported accuracy of any retrieval depends on the validation procedure.
For validation purposes, most satellite-based SSTs are compared against collocated in situ measurements, which, although treated as ‘true’ values for comparison purposes, also have errors associated with them. The reported inaccuracies partly originate from the spatiotemporal mismatch between the in situ and satellite observations, and the standard deviation of their differences has contributions from both sources. Thus, a direct comparison cannot isolate the error associated with the satellite-derived SSTs alone: the result is ambiguous because it combines the errors of both datasets with the collocation differences. A real validation of any geophysical target variable requires an accurate characterization of the associated errors. The situation is further complicated by the fact that buoy data are not uniformly distributed in space and time over the global oceans and have varying performance owing to different origins (cf. [5]), and that the quality of in situ drifter measurements cannot easily be verified once a drifter has been deployed at sea (cf. [6]). Also, these measurements are collected at depths ranging from 0.1 m to 2 m below the sea surface and, therefore, may not be fully consistent with satellite infrared SST measurements, which are representative only of depths of approximately the channel wavelength (mostly near-surface, in the micrometer to millimeter range).
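The ambiguity of a direct comparison can be illustrated with a minimal sketch on synthetic data. The error magnitudes (0.30 K and 0.20 K), the variable names, and the function `difference_stats` are assumptions chosen for illustration only; they are not values or code from this study.

```python
import numpy as np

def difference_stats(satellite, in_situ):
    """Return (bias, standard deviation) of satellite minus in situ matchups."""
    diff = np.asarray(satellite, dtype=float) - np.asarray(in_situ, dtype=float)
    return diff.mean(), diff.std(ddof=1)

# Synthetic matchups; all magnitudes below are illustrative assumptions.
rng = np.random.default_rng(0)
n = 200_000
truth = 290.0 + 5.0 * rng.standard_normal(n)   # hypothetical 'true' SST (K)
sat = truth + 0.30 * rng.standard_normal(n)    # satellite random error, SD 0.30 K
buoy = truth + 0.20 * rng.standard_normal(n)   # in situ random error, SD 0.20 K

bias, sd = difference_stats(sat, buoy)

# For uncorrelated errors the difference SD is the quadrature sum of both
# error SDs, so it cannot by itself reveal the satellite error alone.
expected_sd = np.hypot(0.30, 0.20)
print(f"bias = {bias:.3f} K, SD = {sd:.3f} K, quadrature sum = {expected_sd:.3f} K")
```

In this sketch the matchup standard deviation approaches the quadrature sum of the two error terms rather than the satellite error, which is the limitation the paragraph above describes.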
Direct validation of SSTs from satellite infrared radiometry is possible using coincident ship-borne skin measurements made below the intervening atmosphere (cf. [7]). However, such data have long been recognized as sparse (e.g., [11]) and remain rather limited [6]. In addition to algorithmic and reference-related differences between retrieved SSTs, some differences are due to practicalities and the lack of a consensus on validation approaches. These include (a) different criteria for matchup between the product and the reference, (b) different treatment of outliers in retrievals, references, or both, (c) using hard cutoffs to exclude tail-end elements from the matchup probability density function, (d) averaging satellite retrievals, which may smooth the noise, and (e) reporting only robust statistics. While no particular approach can be proclaimed as the ‘best practice’, since all are driven by the purpose of validation (cf. [15]), the situation makes it difficult for users to assess product performance because of the lack of a common platform. These challenges have been recognized by the Group for High-Resolution Sea Surface Temperature (GHRSST), leading to the formation of the Satellite Sea-Surface Temperature Working Group (ST-VAL) (https://www.ghrsst.org/about-ghrsst/tags-and-wgs/), with the objective of facilitating best practices for validation in the international SST community.
Validation against in situ data is performed primarily to assess the accuracy (bias) and precision (standard deviation) of the target products. Also, most products are generated by regression techniques, against in situ data or based on radiative transfer simulations, and may be empirically tied to the ‘reference response variable’, e.g., drifters. To investigate the independence of the various products, the correlations between the residuals should also be analyzed. Additionally, in situ data, like those from any physical measurement system, contain inherent measurement error, which will affect the validation statistics. This limitation can be overcome by applying a triple collocation method (TCM) to triplets of collocated matchups.
A triple collocation (TC) three-way error analysis of three mutually independent measurements can be used to estimate the root-mean-square error (RMSE) of each of these measurements with a high level of accuracy. As mentioned earlier, knowledge of the ‘true’ value of SST is desirable to estimate the error with high accuracy, but the ‘true’ observations are themselves imperfect due to inherent errors. Using a TC error analysis, it is possible to estimate the RMSE without treating any one system as perfectly observed ‘truth’ [16], thus estimating only the random error associated with the target variable. TC has been used widely in oceanography to estimate errors in SST [17], wind speed and wind stress [20], and wave height [22]. This standard TC approach provides the RMSE of each measurement system, which represents the variability of the measurement error.
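As a rough illustration of how TC can recover the random error of each system without a ‘truth’ reference, the sketch below uses the covariance form of the estimator. It assumes a linear error model with zero-mean errors that are mutually uncorrelated and uncorrelated with the truth. The function name `triple_collocation_rmse` and all synthetic values are hypothetical, not the implementation or results of this study.

```python
import numpy as np

def triple_collocation_rmse(x1, x2, x3):
    """Estimate the random-error RMSE of three collocated measurement systems.

    Assumed error model: x_i = a_i + b_i * t + e_i, where t is the unknown
    truth and the errors e_i are zero-mean, mutually uncorrelated, and
    uncorrelated with t. Each RMSE is in the units of its own system.
    """
    q = np.cov(np.vstack([x1, x2, x3]))  # 3x3 sample covariance matrix
    var_e = np.array([
        q[0, 0] - q[0, 1] * q[0, 2] / q[1, 2],
        q[1, 1] - q[0, 1] * q[1, 2] / q[0, 2],
        q[2, 2] - q[0, 2] * q[1, 2] / q[0, 1],
    ])
    # Sampling noise can push a variance estimate slightly below zero.
    return np.sqrt(np.clip(var_e, 0.0, None))

# Synthetic triplets; the error SDs are illustrative assumptions.
rng = np.random.default_rng(1)
n = 200_000
truth = 290.0 + 3.0 * rng.standard_normal(n)         # hypothetical 'true' SST (K)
x1 = truth + 0.20 * rng.standard_normal(n)           # system 1, error SD 0.20 K
x2 = truth + 0.35 * rng.standard_normal(n)           # system 2, error SD 0.35 K
x3 = truth + 0.10 + 0.25 * rng.standard_normal(n)    # system 3, offset plus SD 0.25 K

rmse = triple_collocation_rmse(x1, x2, x3)
print(np.round(rmse, 3))
```

Because only covariances enter the estimate, a constant offset in one system (as in `x3` here) does not affect the recovered random error.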
In this study, the concept of TC is extended to estimate the correlation coefficient of each measurement system with respect to the unknown target variable (the ‘true’ SST), based on the work of McColl et al. [24]. Thus, we estimate not only the errors associated with our target variable but also the sensitivity of each measurement system to the ‘true’ SST. In this extended triple collocation (ETC) analysis, the correlation coefficient is estimated without any assumptions beyond those already used in TC analysis. Using ETC, we determine the RMSEs and unbiased signal-to-noise ratios (obtained from the correlation coefficients) for the Pathfinder Version 5.3 Level-3C SST product (PF53) using 14 years (1998–2011) of the Climate Data Record, along with the in situ SST data and the Advanced Along Track Scanning Radiometer (AATSR) Reprocessing for Climate (ARC) dataset for the corresponding period. These three SST observations are collocated, and statistics of the difference between each pair are estimated. The variances of these differences are then used to derive the RMSE of each observation type independently (assuming uncorrelated errors). The next section provides a brief review of TC along with an overview of the ETC for this analysis. The implementation of ETC and the results are discussed in the final sections.
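The two steps described above, deriving RMSEs from the variances of pairwise differences and estimating the correlation of each system with the unknown truth, can be sketched as follows. This is a minimal illustration under stated assumptions: the difference-based step assumes the three systems share a common calibration (unit gain), the correlation step assumes each system is positively correlated with the truth, and the function names, array names, and synthetic error levels are all hypothetical rather than taken from this study.

```python
import numpy as np

def rmse_from_pairwise_differences(x1, x2, x3):
    """RMSE of each system from the variances of the three pairwise differences.

    Assumes unit gain for all systems and uncorrelated errors, so that
    var(x_i - x_j) = sigma_i**2 + sigma_j**2.
    """
    x1, x2, x3 = (np.asarray(x, dtype=float) for x in (x1, x2, x3))
    v12 = np.var(x1 - x2, ddof=1)
    v13 = np.var(x1 - x3, ddof=1)
    v23 = np.var(x2 - x3, ddof=1)
    var_e = np.array([v12 + v13 - v23, v12 + v23 - v13, v13 + v23 - v12]) / 2.0
    return np.sqrt(np.clip(var_e, 0.0, None))

def etc_correlation(x1, x2, x3):
    """Correlation of each system with the unknown truth (ETC-style estimate).

    Uses only the sample covariances; the positive root is taken, i.e. each
    system is assumed to be positively correlated with the truth.
    """
    q = np.cov(np.vstack([x1, x2, x3]))
    rho_sq = np.array([
        q[0, 1] * q[0, 2] / (q[0, 0] * q[1, 2]),
        q[0, 1] * q[1, 2] / (q[1, 1] * q[0, 2]),
        q[0, 2] * q[1, 2] / (q[2, 2] * q[0, 1]),
    ])
    return np.sqrt(np.clip(rho_sq, 0.0, 1.0))

# Synthetic triplets standing in for in situ, PF53, and ARC matchups.
# The error SDs are illustrative assumptions, not results of this study.
rng = np.random.default_rng(2)
n = 200_000
sigma_t = 3.0
sigma_e = np.array([0.20, 0.35, 0.25])
truth = 290.0 + sigma_t * rng.standard_normal(n)
in_situ = truth + sigma_e[0] * rng.standard_normal(n)
pf53 = truth + sigma_e[1] * rng.standard_normal(n)
arc = truth + sigma_e[2] * rng.standard_normal(n)

rmse = rmse_from_pairwise_differences(in_situ, pf53, arc)
rho = etc_correlation(in_situ, pf53, arc)
expected_rho = sigma_t / np.sqrt(sigma_t**2 + sigma_e**2)
print(np.round(rmse, 3), np.round(rho, 4))
```

The squared coefficients `rho**2` lie between 0 and 1 and approach 1 when the error variance is small relative to the variance of the underlying signal.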
Validation results of satellite-derived SST products against in situ SSTs have inherent inaccuracies resulting from spatiotemporal inhomogeneity between the satellite and the point measurements. In addition, such validation requires treating the reference data (in this case in situ SSTs) as the ‘true’ value of SST, thereby neglecting the error in the in situ data. A triple collocation based three-way error analysis using three mutually independent error-prone measurements can be used to calculate the RMSEs associated with each of the measurements without treating any one of them as the ‘truth’. In this study, we estimated the RMSEs associated with the Pathfinder Version 5.3 Level-3C SST product (PF53). The other two data sources used for this analysis are the iQuam2 in situ SSTs and the AATSR-based ARC dataset for the corresponding period. First, a triple matchup of the datasets was created; subsequently, the RMSEs and corresponding unbiased SNRs for each data source were estimated by employing the Extended Triple Collocation (ETC) method. The RMSE (true variability) ranged from 0.31 to 0.37 K for PF53, and from 0.18 to 0.33 K for the ARC data. These values were reasonable, as was evident from the very high unbiased SNR values (~0.98). The ETC method used to estimate the random error for the Pathfinder SST has some inherent limitations. The results depend heavily on three main assumptions: (1) the error model, (2) independent errors between the in situ data, PF53, and ARC, and (3) independence of the error from the true value of the variable. If any of these assumptions fails, the estimated RMSEs may be inaccurate. However, ETC is a powerful technique and is easy to implement. In the future, as an extension of this study, we will work towards mapping the spatial distribution of the error associated with the Pathfinder SST.