1. Introduction
The ocean covers 71% of the Earth’s surface, absorbs the majority of incoming solar radiation, and plays a vital role in atmospheric energy transfer [
1]. As the interface for air–sea interactions, the ocean's surface layer has a thermal state whose variations significantly affect atmospheric dynamics. Sea surface temperature (SST) is the main parameter describing this thermal state and is crucial for understanding the marine environment, global climate change, and disaster prevention and mitigation [
2]. Accurate, high-coverage SST data are essential for various applications. Traditional methods, such as using ships and buoys, offer high precision but suffer from sparse and uneven spatial distribution [
3,
4,
5]. SST derived from thermal infrared data is often limited by gaps caused by clouds and atmospheric aerosols, resulting in coverage that fails to meet application needs [
6]. While microwave sensors can provide full coverage, their spatial resolution is low, and they are sensitive to heavy rainfall [
7]. Polar-orbiting satellites provide global SST coverage due to their orbital characteristics and observation times. However, their temporal resolution is limited, typically offering only two observations per day [
8]. In contrast, geostationary satellites deliver higher temporal resolution—ranging from one hour to as frequent as every ten minutes—but their spatial coverage is more restricted [
9]. Therefore, integrating multi-source SST data is crucial for generating products that combine high spatial and temporal resolution, extensive coverage, and improved accuracy.
Existing studies on global SST analysis are mostly confined to daily, monthly, or seasonal scales and therefore miss critical information contained in hourly variations. Seasonal scales can reflect long-term trends, such as monsoon changes, which are important for the fishing industry [10]. However, they cannot capture the abrupt hourly temperature changes associated with events such as marine heatwaves and typhoon passages. The diurnal variation of SST significantly affects ocean–atmosphere thermal interactions [11], and hourly data can fill research gaps in short-term climate processes, for example by supporting analysis of how hourly SST variation affects atmospheric circulation during strong storms and by improving the accuracy of numerical weather prediction for real-time disaster warnings in coastal areas [12]. Therefore, fusing multi-source data into an hourly, high-coverage, high-precision SST product can not only compensate for the shortcomings of any single data source but also establish a refined time-series baseline for climate research and move related fields toward real-time dynamic monitoring.
Multi-source data fusion refers to integrating information from different sensors—whether of varying types or from the same sensor at different times—into a unified representation of the same entity [
13]. Common data fusion methods include objective analysis, optimal interpolation (OI), and Kalman filtering [
14]. Objective analysis, first applied to data fusion in the United States in 1965, has since become a well-established technique for solving practical data fusion challenges. Guan, L. and Kawamura, H. [
15] applied objective analysis to fuse SST data from microwave and infrared sensors, resulting in a global SST product with a temporal resolution of one day and a spatial resolution of 25 km. OI, introduced by Eliassen in 1954 as part of the objective analysis methods, was employed by Reynolds, R.W. et al. [16] to combine AVHRR SST data with observational SST data, producing a fused product with a temporal resolution of one day and a spatial resolution of 0.25°. Additionally, Li, Y. et al. [17] used the data interpolating empirical orthogonal function (DINEOF) method alongside OI to fuse microwave and infrared SST data in the Arctic region. Kalman filtering, first introduced to meteorology and oceanography by Ghil, M. and Malanotte-Rizzoli, P., iteratively calculates optimal Kalman coefficients to provide the best estimate of a system’s state [18]. Xi, M. [
19] applied OI to fuse infrared (AVHRR) and microwave (AMSR-E) SST data in the mid-latitude southern Indian Ocean, producing a high-resolution SST product that successfully captured the structure of oceanic fronts. One of the key advantages of Kalman filtering over optimal interpolation is its ability to enhance spatial coverage while preserving more detailed information, making it a more effective tool for SST data fusion. Wang, Y. [
20] also used AVHRR and AMSR-E SST data, exploring the application of Kalman filtering in SST fusion. In 2008, Ding, R. [
21] developed an SST product with a 1-day temporal resolution and approximately 2 km spatial resolution. He applied optimal interpolation and Bayesian maximum entropy methods to fuse SST data from MODIS infrared, AMSR-2, and HY-2A microwave radiometers. In contrast, Liao, Z. [
22] focused on SST data from the FY-3C satellite, using both optimal interpolation and Kalman filtering methods for data fusion. He further employed the radial basis function network (RBFN) to enhance the representation of SST details in the fused results. Additionally, Wang, M. et al. [
23] achieved multi-source fusion of conventional observations, satellite data, and 3D analysis using a deep learning-based ocean data fusion network (ODF network).
The TC fusion algorithm presented in this paper extends the Triple-Collocation method, first introduced in 1998 [
24]. The TC algorithm is widely used for evaluating and analyzing data from multiple sources [
25]. It operates by using three independent datasets, each containing observations of the same geophysical variable but differing in error characteristics. By comparing discrepancies among these datasets, the error variance of each can be estimated [
26]. Christoforos, T. [
27] conducted a study comparing four TC algorithms using buoy-reported real SST, ATSRs/SST, and AMSR-E/SST data. He also determined the minimum sample size required for analysis. Xu, F. et al. [
28] applied the TC algorithm to assess SSTs from major platforms in the NOAA In Situ Quality Monitoring (iQuam) system, using data from NOAA-17 AVHRR and Envisat AATSR, both produced by ESA under the Climate Change Initiative (CCI) program. The study revealed the standard deviations of SST errors for various platforms as analyzed using the TC algorithm. For ships, the error was 0.75 K, while for buoys and Argo floats, it ranged from 0.21 to 0.22 K. Tropical moorings showed an error of 0.17 K, and coastal moorings had an error of 0.40 K. The errors for AVHRR and AATSR were between 0.35–0.38 K and 0.15–0.30 K, respectively.
Li, C. et al. [
29] assessed the applicability of multi-source precipitation products in China from 2013 to 2015 using the TC algorithm. Their results showed a relative Bias of 4.5% in the southern region and a root mean square error (RMSE) of 0.61 for the Tibetan Plateau. These findings indicate that the TC algorithm provides reliable precipitation estimates for China. Wang, S. et al. [
30] applied the TC algorithm to fuse three independent soil moisture products at two spatial scales in the Nagqu region, demonstrating that the fused data were more comprehensive and accurate. Zhou, L. et al. [
31] utilized surface temperature products from FY-3C/VIRR, MODIS, and ground observations for multi-source fusion in the Huaihe River Basin. In summary, while the TC algorithm has been widely applied to SST analysis, precipitation evaluation, and soil moisture fusion, no research has yet focused specifically on the fusion of multi-source SST data using this algorithm.
This paper presents a novel approach to SST fusion using the TC fusion algorithm, integrating multi-source SST data from both polar-orbiting and geostationary satellites. This fusion enhances global SST spatial coverage and increases temporal resolution from twice daily to hourly. The performance of the resulting global hourly SST product is evaluated using key metrics, including spatial coverage, Bias relative to measured data, RMSE, and R².
  2. Data Presentation
The accuracy and diversity of SST data are essential for achieving high-precision data fusion. This study uses heterogeneous data from multiple sources, including polar-orbiting satellites, geostationary satellites, and measured data, to ensure comprehensive spatial and temporal coverage. However, due to satellite orbit limitations and sea ice, effective data are concentrated in mid-latitude and low-latitude sea areas and coastal regions, with less coverage in polar regions [
32]. The technical parameters of each data source are detailed in the next section. Polar-orbiting satellite data provide the foundational global coverage, while geostationary satellites offer high-frequency observations at mid and low latitudes. Measured data are used for calibration and validation. By organizing the temporal resolution, spatial resolution, data level, and coverage of each source, a solid foundation is laid for applying the TC fusion algorithm and analyzing the results.
  2.1. Polar-Orbiting Satellite Data
The polar-orbiting satellite data used in this study include global SST products from AMSR2, MODIS, and AVHRR.
AMSR2 (Advanced Microwave Scanning Radiometer 2) is onboard the GCOM-W1 satellite, developed by the Japan Aerospace Exploration Agency (JAXA), Tsukuba, Japan. It is designed to detect microwave radiation signals from the Earth’s surface and atmosphere [
33]. The data used here are Group for High Resolution Sea Surface Temperature (GHRSST) products, specifically fusion-processed daily L3 data, covering the global ocean from 1 to 2 September 2023. The spatial resolution is 0.25°, with coverage extending from 89.875°N to 89.875°S and from −179.875° to 179.875° longitude, providing near-global ocean coverage with the exception of small polar regions.
The Moderate Resolution Imaging Spectroradiometer (MODIS, National Aeronautics and Space Administration (NASA), Washington, DC, USA) is a key instrument for observing global biological and physical processes, providing twice-daily observations across 16 thermal infrared bands [
34]. The MODIS SST data used in this study have a temporal resolution of twice per day, a spatial resolution of 4 km, and cover the period from 1 to 2 September 2023, matching the AMSR2 data. The coverage spans from 90°N to 90°S latitude and 180°E to 180°W longitude, providing comprehensive global ocean coverage.
The Advanced Very High Resolution Radiometer (AVHRR, National Oceanic and Atmospheric Administration (NOAA), Washington, DC, USA) is a multi-spectral channel scanning radiometer, playing a crucial role in global ocean research and meteorological monitoring applications [
35]. The SST data used in this study are derived from the AVHRR thermal infrared band, based on L3 level datasets compiled by GHRSST. These data have a temporal resolution of twice per day, a spatial resolution of 0.05°, and cover the period from 1 to 2 September 2023. The coverage spans from 90°N to 90°S latitude and 180°E to 180°W longitude, ensuring full global ocean coverage.
  2.2. Geostationary Satellite Data
The specific temporal resolution, spatial resolution, data level, and spatial coverage of the geostationary satellite SST data used in this paper are summarized in 
Table 1.
Himawari-9, launched in November 2016, is equipped with the Advanced Himawari Imager (AHI), a multispectral imager featuring 16 observation bands [36]. The satellite provides full-disc observations every 10 min. The SST data used in this study are L3C products, aggregated from these 10 min intervals to achieve a temporal resolution of 1 h and a spatial resolution of 2 km. The data cover latitudes from 60°S to 60°N and longitudes from 180°W to 160°W and from 80°E to 180°E.
Meteosat, operated by the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT), offers SST data from the Meteosat-9 (Met09/SST) and Meteosat-10 (Met10/SST) satellites. Met09/SST covers the Indian Ocean, with a latitudinal range of 60°S to 60°N and longitudinal range of 101.5°E to 18.5°W. It provides L3C-level SST data on a 0.05° grid, aggregated from 15 min interval data to create 1 h temporal resolution products. Met10_SEVIRI, which covers the eastern Atlantic, also provides high-resolution SST data, remapped to a regular grid with 1 h temporal resolution.
The GOES satellites, operated by NOAA, continuously monitor Earth’s atmosphere, oceans, and surface, providing ocean surface temperature measurements that are processed by ACSPO to generate SST data. This paper uses SST data from G16-ABI and G18-ABI. G16-ABI has a temporal resolution of 1 h, with 0.2 GB of daily data, covering 59°S to 59°N and 135°W to 15°W at a 2 km resolution. G18-ABI also has 1 h temporal resolution, with 0.7 GB of daily data, covering 60°S to 60°N and 180°W to 77°W/163°E to 180°E, also at 2 km resolution.
  2.3. Measured Data
The iQuam product, developed by NOAA’s Center for Satellite Applications and Research (STAR), integrates buoy, ship, and other in situ data for SST analysis [3]. The data are archived monthly, with updates every 12 h and a 2 h delay. All data processing for the previous month is completed by the 15th of the following month. The system performs three primary functions in near real time: quality control (QC) of in situ SST, online monitoring of in situ SST after QC, and the provision of reformatted in situ SST data with quality grades and flags. The data are stored in point form and are available for download at 
http://www.star.nesdis.noaa.gov/sod/sst/iquam/index.html (accessed on 20 November 2023).
  3. Methods
In this paper, we apply the TC algorithm proposed by Tan, S. et al. [
26] to SST data fusion for the first time, offering an innovative solution for high-precision global SST fusion. The TC fusion algorithm consists of two steps: First, the error variance of various datasets is estimated using the TC algorithm. Second, fusion weights for the data are determined based on the estimated error variance, followed by the data fusion process. The TC algorithm relies on four assumptions:
- (1) The three SST products are linearly related to the true value;
- (2) The errors of the three SST products do not vary with time;
- (3) The errors of the three SST products are independent of each other, i.e., the covariance between the errors of different products is zero;
- (4) The errors of the three SST products are independent of the true value, i.e., the covariance between the error of each product and the true value is zero.
Based on the four assumptions, the relational equations for the three different data values are established as shown below:
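$$X = \alpha_X + \beta_X T + \varepsilon_X, \qquad Y = \alpha_Y + \beta_Y T + \varepsilon_Y, \qquad Z = \alpha_Z + \beta_Z T + \varepsilon_Z \tag{1}$$
where $X$, $Y$, and $Z$ are the three data values corresponding to the same image element; $T$ is the true value of the SST; $\alpha_X$, $\alpha_Y$, $\alpha_Z$ are the additive deviation coefficients of the three SST products $X$, $Y$, $Z$ relative to the true value $T$; $\beta_X$, $\beta_Y$, $\beta_Z$ are the multiplicative deviation coefficients of the three SST products relative to $T$; and $\varepsilon_X$, $\varepsilon_Y$, $\varepsilon_Z$ are the zero-mean errors of the three SST products. Any of $X$, $Y$, $Z$ can be selected as the reference sample; in this paper, $X$ is selected as the reference to construct the scaling coefficients $\beta_Y^*$, $\beta_Z^*$ and two new sequences $Y^*$, $Z^*$:
$$\beta_Y^* = \frac{\langle X' Z' \rangle}{\langle Y' Z' \rangle}, \qquad \beta_Z^* = \frac{\langle X' Y' \rangle}{\langle Z' Y' \rangle} \tag{2}$$
$$Y^* = \beta_Y^* \,(Y - \bar{Y}) + \bar{X}, \qquad Z^* = \beta_Z^* \,(Z - \bar{Z}) + \bar{X} \tag{3}$$
where a prime denotes the deviation of a sample from its mean (e.g., $X' = X - \bar{X}$) and $\langle \cdot \rangle$ is the mean of the element-wise products of two samples. The construction of the scaling coefficients $\beta_Y^*$, $\beta_Z^*$ uses the two independence assumptions (Assumptions (3) and (4)): on the one hand, each coefficient can be computed directly from the samples and therefore has a definite value; on the other hand, it reduces to the ratio of the multiplicative deviation coefficients of two products. For example, from $E(\varepsilon) = 0$ it follows that:
$$X' = \beta_X T' + \varepsilon_X, \qquad Y' = \beta_Y T' + \varepsilon_Y, \qquad Z' = \beta_Z T' + \varepsilon_Z, \qquad T' = T - \bar{T} \tag{4}$$
Substituting Equation (4) into Equation (2) and eliminating the terms that are zero according to Assumptions (3) and (4) yields:
$$\beta_Y^* = \frac{\beta_X \beta_Z \langle T'^2 \rangle}{\beta_Y \beta_Z \langle T'^2 \rangle} = \frac{\beta_X}{\beta_Y} \tag{5}$$
The same reasoning gives $\beta_Z^*$:
$$\beta_Z^* = \frac{\beta_X \beta_Y \langle T'^2 \rangle}{\beta_Z \beta_Y \langle T'^2 \rangle} = \frac{\beta_X}{\beta_Z} \tag{6}$$
From Equations (5) and (6), the new sequences $Y^*$, $Z^*$ are those in which $\beta_Y$ and $\beta_Z$ are transformed into $\beta_X$:
$$Y^* = \alpha_X + \beta_X T + \beta_Y^* \varepsilon_Y, \qquad Z^* = \alpha_X + \beta_X T + \beta_Z^* \varepsilon_Z \tag{7}$$
As a result, the term containing the true value $T$ can be eliminated by pairwise subtraction:
$$X - Y^* = \varepsilon_X - \beta_Y^* \varepsilon_Y, \qquad X - Z^* = \varepsilon_X - \beta_Z^* \varepsilon_Z, \qquad Y^* - Z^* = \beta_Y^* \varepsilon_Y - \beta_Z^* \varepsilon_Z \tag{8}$$
According to the two independence assumptions (Assumptions (3) and (4)), it follows that
$$\langle (X - Y^*)(X - Z^*) \rangle = \sigma^2(\varepsilon_X), \qquad \langle (Y^* - X)(Y^* - Z^*) \rangle = \sigma^2(\beta_Y^* \varepsilon_Y), \qquad \langle (Z^* - X)(Z^* - Y^*) \rangle = \sigma^2(\beta_Z^* \varepsilon_Z) \tag{9}$$
where $\sigma^2(\cdot)$ is the variance of a random variable. The left-hand sides can all be computed from the samples $X$, $Y^*$, $Z^*$, so the error variances can be expressed as $\sigma_X^2$, $\sigma_{Y^*}^2$, $\sigma_{Z^*}^2$:
$$\sigma_X^2 = \langle (X - Y^*)(X - Z^*) \rangle, \qquad \sigma_{Y^*}^2 = \langle (Y^* - X)(Y^* - Z^*) \rangle, \qquad \sigma_{Z^*}^2 = \langle (Z^* - X)(Z^* - Y^*) \rangle \tag{10}$$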
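The error-variance estimation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `tc_error_variances`, the use of mean-removed anomalies in the scaling coefficients, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np

def tc_error_variances(x, y, z):
    """Triple-collocation sketch: x is the reference, y and z are rescaled to it.

    Returns (var_x, var_y_star, var_z_star, beta_y, beta_z), where the
    variances are those of the errors after rescaling to the reference.
    """
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    xa, ya, za = x - x.mean(), y - y.mean(), z - z.mean()

    # Scaling coefficients from mean cross-products of anomalies (assumed form).
    beta_y = np.mean(xa * za) / np.mean(ya * za)
    beta_z = np.mean(xa * ya) / np.mean(za * ya)

    # Rescaled sequences share the additive and multiplicative bias of x.
    y_star = beta_y * ya + x.mean()
    z_star = beta_z * za + x.mean()

    # Pairwise differences cancel the true signal; independent errors
    # leave only the error variance of one product in each mean product.
    var_x = np.mean((x - y_star) * (x - z_star))
    var_y = np.mean((y_star - x) * (y_star - z_star))
    var_z = np.mean((z_star - x) * (z_star - y_star))
    return var_x, var_y, var_z, beta_y, beta_z

# Synthetic check: a hypothetical "true" SST plus independent zero-mean errors.
rng = np.random.default_rng(0)
truth = 20.0 + 5.0 * rng.standard_normal(200_000)
x = truth + 0.3 * rng.standard_normal(truth.size)                # error std 0.3
y = 1.0 + 0.9 * truth + 0.5 * rng.standard_normal(truth.size)    # biased, std 0.5
z = -0.5 + 1.1 * truth + 0.4 * rng.standard_normal(truth.size)   # biased, std 0.4

var_x, var_y, var_z, beta_y, beta_z = tc_error_variances(x, y, z)
print(var_x, var_y, var_z, beta_y, beta_z)
```

By construction, `var_x` should come out near 0.3² = 0.09 and `beta_y` near 1/0.9, while `var_y` and `var_z` are error variances in the units of the reference product, that is, after rescaling.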
Then, the TC algorithm was used to determine the error variance of the three data values relative to the true value. Subsequently, the fusion weight for each SST product was calculated pixel by pixel, ultimately enabling the fusion of multi-source data. Based on the number of valid data points from the three SST products for the same image element, after temporal and spatial matching, the fusion of these products is categorized into three scenarios:
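(1) When valid data exist for all three SST products for the same image element on the same day, the fusion weights for each SST product are calculated separately using Equation (11):
$$w_X = \frac{\sigma_{Y^*}^2 \sigma_{Z^*}^2}{D}, \qquad w_Y = \frac{\sigma_X^2 \sigma_{Z^*}^2}{D}, \qquad w_Z = \frac{\sigma_X^2 \sigma_{Y^*}^2}{D}, \qquad D = \sigma_X^2 \sigma_{Y^*}^2 + \sigma_X^2 \sigma_{Z^*}^2 + \sigma_{Y^*}^2 \sigma_{Z^*}^2 \tag{11}$$
where $w_X$, $w_Y$, and $w_Z$ represent the fusion weights of the three SST products, respectively, and $\sigma_X^2$, $\sigma_{Y^*}^2$, $\sigma_{Z^*}^2$ are their error variances estimated by the TC algorithm. The fusion weights of the three SST products are then used to calculate the fused SST:
$$SST_f = w_X X + w_Y Y^* + w_Z Z^* \tag{12}$$
where $SST_f$ is the post-fusion SST, $X$ is the reference product, and $Y^*$, $Z^*$ are the other two products rescaled to the reference;
(2) When only two SST products exist with valid data for the same image element on the same day (for example, $X$ and $Y^*$), Equation (13) is used to calculate the fusion weights of these two SST products:
$$w_X = \frac{\sigma_{Y^*}^2}{\sigma_X^2 + \sigma_{Y^*}^2}, \qquad w_Y = \frac{\sigma_X^2}{\sigma_X^2 + \sigma_{Y^*}^2} \tag{13}$$
The fused SST is calculated using the fusion weights of the two SST products:
$$SST_f = w_X X + w_Y Y^* \tag{14}$$
(3) When only one SST product exists with valid data for the same image element on the same day (for example, $X$), the SST of that product is directly assigned to the fused SST:
$$SST_f = X \tag{15}$$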
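The three scenarios can be handled by a single rule if the weights are taken to be inverse error variances renormalized over whichever products are valid at a pixel. The sketch below assumes that weighting scheme; the function name `tc_fuse` and the toy values are hypothetical, and invalid pixels are assumed to be stored as NaN.

```python
import numpy as np

def tc_fuse(products, error_vars):
    """Fuse collocated SST grids with inverse-variance weights (assumed scheme).

    products   : sequence of same-shape arrays, NaN where a product is invalid
    error_vars : one TC error variance per product (scalar or same-shape array)
    Returns the fused grid; NaN where no product is valid.
    """
    stack = np.stack([np.asarray(p, dtype=float) for p in products])
    var = np.stack([
        np.broadcast_to(np.asarray(v, dtype=float), stack.shape[1:])
        for v in error_vars
    ])
    valid = np.isfinite(stack)

    # Invalid products get zero weight, so the remaining weights renormalize
    # automatically: three valid -> scenario (1), two -> (2), one -> (3).
    inv_var = np.where(valid, 1.0 / var, 0.0)
    weighted = np.where(valid, stack, 0.0) * inv_var
    with np.errstate(invalid="ignore", divide="ignore"):
        fused = weighted.sum(axis=0) / inv_var.sum(axis=0)
    return fused

nan = np.nan
x = [20.0, 21.0, nan, nan]
y = [20.6, nan, 19.0, nan]
z = [19.7, 21.4, nan, nan]
fused = tc_fuse([x, y, z], error_vars=(0.09, 0.36, 0.18))
print(fused)
```

In this toy case the first pixel uses all three products, the second uses two, the third copies the only valid product, and the fourth stays NaN.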
In this paper, three commonly used evaluation metrics are selected to measure the accuracy of the fused data: Bias, root mean square error (RMSE), and coefficient of determination (R²). The formulas for each index are as follows:
$$\mathrm{Bias} = \frac{1}{n} \sum_{i=1}^{n} (F_i - M_i) \tag{16}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (F_i - M_i)^2} \tag{17}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (F_i - M_i)^2}{\sum_{i=1}^{n} (M_i - \bar{M})^2} \tag{18}$$
where $n$ is the number of samples, $F_i$ represents the fusion result data, $M_i$ represents the actual measurement obtained from buoys, ships, etc., and $\bar{M}$ is the mean of the measurements. The closer the Bias is to zero and the smaller the RMSE, the better the fused SST agrees with the measurements and the higher the overall accuracy. Conversely, larger values indicate that the fusion is poor and the accuracy is low.
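As an illustration of the three metrics, the following sketch computes them for matched pairs of fused and measured SST. The function name `evaluate_fusion` is hypothetical, and R² is assumed here to be one minus the ratio of residual to total sum of squares; a squared correlation coefficient would be another common reading.

```python
import numpy as np

def evaluate_fusion(fused, measured):
    """Return (bias, rmse, r2) over pairs where both values are finite."""
    fused = np.asarray(fused, dtype=float)
    measured = np.asarray(measured, dtype=float)
    ok = np.isfinite(fused) & np.isfinite(measured)
    f, m = fused[ok], measured[ok]

    diff = f - m
    bias = diff.mean()                     # mean signed difference
    rmse = np.sqrt(np.mean(diff ** 2))     # root mean square error
    # Assumed definition: 1 - SS_res / SS_tot with respect to the measurements.
    r2 = 1.0 - np.sum(diff ** 2) / np.sum((m - m.mean()) ** 2)
    return bias, rmse, r2

bias, rmse, r2 = evaluate_fusion(
    fused=[20.5, 21.0, 19.5, 22.0],
    measured=[20.0, 21.0, 20.0, 22.0],
)
print(bias, rmse, r2)
```

The toy example shows why Bias alone is not sufficient: the two errors of opposite sign cancel in the Bias but still contribute to the RMSE.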
  4. Experiments and Results
The experimental flow in this paper, shown in 
Figure 1, covers three core aspects: data preprocessing, fusion experiment, and data evaluation, outlining the complete process from raw data processing to result verification.
  4.1. Preprocessing
To ensure the quality and reliability of each SST product and facilitate comparison on the same spatial scale, this paper performs thorough preprocessing before data fusion. The main preprocessing steps include depth conversion, removal of invalid values, quality control, and linear interpolation, which lay the foundation for the subsequent fusion analysis.
During the data preprocessing stage, this study first standardized the measurement depths of the data. As indicated by the fundamental information accompanying the datasets, while the Himawari-9 SST data were measured at the skin depth, all other datasets used in this study were measured at the subskin depth. To achieve depth consistency, we employed the physical conversion model to adjust the Himawari-9 SST data to the subskin level:
Key parameters:   −0.15 (maximum temperature difference coefficient),  = 0.3 (wind speed response coefficient);  is the 10-m wind speed.
The conversion was applied only to the skin depth SST data from Himawari-9. Data from other sources, originally at the subskin depth, were used directly. Consequently, all merged data products were standardized to the subskin depth, ensuring a consistent physical basis for the fused dataset.
Subsequently, invalid values were removed: infinite and NaN values were eliminated so that abnormal data would not interfere with the results. Quality control was then applied. According to the data descriptions on the download webpages, the AMSR2, AVHRR, and geostationary satellite SST data are labelled with quality levels from 0 to 5, where 0 represents no data, 1 represents data that should not be used under any circumstances because of clouds, rain, etc., 2 represents the worst-quality usable data, and the quality increases stepwise up to 5, the best quality. Balancing data quality against data quantity, this paper selected SST data with quality levels 4 and 5. The MODIS SST product also uses quality levels from 0 to 5, but in the reverse order to AMSR2, with lower values indicating better quality; therefore, MODIS SST data with quality levels 0 and 1 were selected. After the above processing, linear interpolation was used to unify the data grids, yielding a standardized preprocessed dataset with a spatial resolution of 4 km.
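The invalid-value removal and quality screening can be sketched as follows. This is an illustrative example only: the helper `mask_by_quality` and the sample values are hypothetical, and the regridding to 4 km is not shown.

```python
import numpy as np

def mask_by_quality(sst, quality, keep_levels):
    """Set SST to NaN where it is non-finite or its quality level is not kept."""
    sst = np.asarray(sst, dtype=float)
    quality = np.asarray(quality)
    good = np.isfinite(sst) & np.isin(quality, list(keep_levels))
    return np.where(good, sst, np.nan)

# Products where 5 is best: keep levels 4 and 5.
ghrsst_like = mask_by_quality(
    sst=[290.1, np.inf, 288.4, 291.0, np.nan],
    quality=[5, 5, 3, 4, 4],
    keep_levels={4, 5},
)

# MODIS-style flags where 0 is best: keep levels 0 and 1.
modis_like = mask_by_quality(
    sst=[289.9, 290.3, 287.0],
    quality=[0, 1, 3],
    keep_levels={0, 1},
)
print(ghrsst_like, modis_like)
```

Passing the accepted levels as a parameter keeps the reversed MODIS convention explicit at the call site rather than hidden inside the function.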
In processing the measured data, each file contains SST data for the entire month. During the screening process, hourly SST measurements for 1 September were extracted and evaluated based on their quality identifiers. Only data with quality levels 4 or 5 were selected.
  4.2. Experiments
As shown in 
Figure 1, this study follows a rigorous processing procedure for SST data fusion. Although the five geostationary satellite datasets selected for the study together cover the global ocean at low and middle latitudes, they overlap spatially. This study therefore follows a priority strategy, processing the data in the order Himawari-9, Met09_SEVIRI, Met10_SEVIRI, G16_ABI, and G18_ABI to generate the first geostationary satellite observation dataset (named HMMGG after the initials of its sources in that order). The data are then re-read in the reverse order, G18_ABI, G16_ABI, Met10_SEVIRI, Met09_SEVIRI, and Himawari-9, to generate the second geostationary satellite observation dataset (named GGMMH). Through these two different combination orders, HMMGG and GGMMH provide complementary geostationary SST inputs for the subsequent multi-source fusion, improving data diversity and coverage.
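One way to read the priority strategy is that, where sources overlap, the first source in the reading order that has a valid value wins. The sketch below assumes that interpretation; the function `priority_mosaic` and the three toy grids (standing in for the five real sources) are hypothetical.

```python
import numpy as np

def priority_mosaic(grids_in_priority_order):
    """Build a mosaic in which earlier grids take precedence where valid."""
    grids = [np.asarray(g, dtype=float) for g in grids_in_priority_order]
    mosaic = np.full(grids[0].shape, np.nan)
    for grid in grids:
        # Fill only the cells that are still empty and valid in this grid.
        fill = np.isnan(mosaic) & np.isfinite(grid)
        mosaic[fill] = grid[fill]
    return mosaic

nan = np.nan
himawari = [1.0, nan, nan, 4.0]
meteosat = [nan, 2.5, nan, 4.4]
goes = [nan, 2.0, 3.0, nan]

forward = priority_mosaic([himawari, meteosat, goes])   # HMMGG-like order
reverse = priority_mosaic([goes, meteosat, himawari])   # GGMMH-like order
print(forward, reverse)
```

The two mosaics agree wherever only one source is valid and differ only in the overlap cells, which is why the two resulting datasets share most of their inputs.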
For the fusion of polar-orbiting satellite SST data, we applied the TC fusion algorithm to combine three datasets: AMSR2, AVHRR, and MODIS, resulting in SST data with a 12-h temporal resolution. Given the temporal coverage of polar-orbiting satellites, we processed the data across different time windows, generating three SST datasets centered on 00:00, 12:00, and 24:00 UTC on 1 September, each with a 6 h window before and after the central time point.
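The time-window selection can be sketched with the standard library. The names below are hypothetical, and treating the window boundaries as inclusive is an assumption, since the text does not say which window an observation exactly 6 h from two centers belongs to.

```python
from datetime import datetime, timedelta, timezone

HALF_WIDTH = timedelta(hours=6)
CENTERS = [
    datetime(2023, 9, 1, 0, tzinfo=timezone.utc),
    datetime(2023, 9, 1, 12, tzinfo=timezone.utc),
    datetime(2023, 9, 2, 0, tzinfo=timezone.utc),  # "24:00 UTC on 1 September"
]

def windows_containing(obs_time, centers=CENTERS, half_width=HALF_WIDTH):
    """Return the central times whose +/- half_width window contains obs_time."""
    return [c for c in centers if abs(obs_time - c) <= half_width]

# An overpass at 04:30 UTC falls only in the window centered on 00:00 UTC.
hits = windows_containing(datetime(2023, 9, 1, 4, 30, tzinfo=timezone.utc))

# With inclusive boundaries, 06:00 UTC belongs to two adjacent windows.
edge_hits = windows_containing(datetime(2023, 9, 1, 6, 0, tzinfo=timezone.utc))
print(hits, edge_hits)
```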
Finally, the TC fusion algorithm was used to integrate the polar-orbiting satellite data with two geostationary satellite datasets, producing global time-dependent SST data after processing data from various time periods.
  4.3. Results
This study uses data from 00:00 UTC on 1 September 2023 as a representative case to visualize the SST data fusion process and stage-wise results. 
Figure 2a–c display the preprocessed SST images from three types of polar-orbiting satellite data: AMSR2, AVHRR, and MODIS, highlighting the characteristics of the raw data. 
Figure 2d presents the preliminary results after processing these datasets using the TC fusion algorithm, illustrating the integration of the polar-orbiting satellite data.
Figure 3 further outlines the subsequent stages of the fusion process. 
Figure 3a shows the preliminary fusion result of the polar-orbiting satellite data (TC-AAM); 
Figure 3b,c show the preliminary fusion results of the HMMGG and GGMMH datasets generated from the geostationary satellite data, using different data reading sequences; and 
Figure 3d shows the final fusion result, TC-AHG, representing the complete global time-series SST data. The two sets of images systematically demonstrate the entire process, from raw data preprocessing and initial fusion of polar-orbiting and geostationary satellite data to the generation of the final global SST dataset.
   6. Conclusions
This study introduces the application of the TC fusion algorithm to effectively integrate SST data from both polar-orbiting and geostationary satellites. A notable advantage of the TC algorithm lies in its ability to autonomously evaluate and fuse multi-source data without the need for prior knowledge. The main objective is to leverage the wide spatial coverage of polar-orbiting satellites and the high temporal resolution of geostationary satellites to generate globally consistent, time-dependent, and high-precision SST datasets. The fused products show clear improvements, notably enhancing the temporal resolution to an hourly scale.
Through careful comparison of the fusion results with buoy measurements, we observe that the SST data produced by the TC fusion algorithm are highly accurate. The spatial coverage of the fused data increased by 6 percentage points (32 per cent, compared with 26 per cent before fusion), with a daily mean Bias below 0.0427 °C, an RMSE ranging from about 0.5938 °C to 0.6965 °C, and an R² exceeding 0.9879. These experimental findings support the reliability of the TC fusion algorithm in integrating polar-orbiting and geostationary satellite data and showcase the considerable potential of this method for SST data fusion.
This study lays a solid foundation for future research. To enhance practical application value, we plan to integrate the fused SST dataset into numerical weather prediction or climate models to validate its utility in typhoon forecasting and air–sea interaction research. At the methodological level, a key challenge lies in addressing the partial independence limitation of the HMMGG and GGMMH datasets arising from their shared input sources. Concurrently, one of the core priorities for future work is to conduct a comprehensive quantitative comparison to assess the performance of the TC method relative to alternative fusion schemes like Kalman filtering and optimal interpolation. Moreover, to further improve spatial coverage and accuracy—especially in high-latitude regions with limited GEO satellite coverage—future work could integrate VIIRS or other polar-optimized sensors, with careful attention to temporal alignment. These enhancements are expected to further refine the methodology and advance the quality of SST data products.