Revealing Spatial–Temporal Patterns of Sea Surface Temperature in the South China Sea Based on Spatial–Temporal Co-Clustering

Abstract: To discover the spatial–temporal patterns of sea surface temperature (SST) in the South China Sea (SCS), this paper proposes a spatial–temporal co-clustering algorithm optimized by information divergence. This method clusters SST data simultaneously across the temporal and spatial dimensions and is adaptable to large volumes of data and to data containing anomalies. First, the SST data are initially clustered using the co-clustering algorithm. Second, we use information divergence as the loss function to refine the clustering results iteratively. During the iterative optimization of the spatial clustering results, we treat the temporal dimension as a constraint; similarly, during the iterative optimization of the temporal clustering results, we treat the spatial dimension as a constraint. This improves the robustness of the algorithm. Finally, this paper conducts experiments in the SCS to verify our algorithm. From the analysis of the experimental results, we draw the following conclusions. First, the spatial–temporal co-clustering algorithm reveals that the SST in the SCS exhibits strong seasonal patterns in the temporal clustering results. The spatial distribution of SST varies significantly between seasons. The difference in SST between the northern and southern regions of the SCS is slight in winter and largest in summer. Second, during ocean anomalies, our proposed algorithm can identify the corresponding abnormal patterns. When ENSO occurs, the seasonal distribution pattern of SST in the SCS is destroyed and replaced by an abnormal temporal pattern. The results indicate that during ENSO events, the SST in specific months in the SCS exhibits a correlation with the SST observed 4–5 months afterward.


Introduction and Related Work

Introduction
The rapid development of ocean observation technology has resulted in the accumulation of a large amount of data. SST has been recognized as an essential climate variable and one of the leading indicators of climate change. SST directly impacts the water vapor exchange between the atmosphere and the ocean [1,2]. Revealing its concealed spatial-temporal patterns benefits marine forecasting and marine ecological environment protection [3]. The SCS is situated between East Asia's most typical monsoon region and the area affected by the local Hadley circulation, and it plays a significant role in the global climate system due to its large heat storage capacity. The SST in the SCS is an important variable to monitor, as it has significant impacts on regional circulation patterns, typhoon activity, marine ecosystems, and fisheries. The climate of the SCS has been anomalous in recent years, with frequent El Niño-Southern Oscillation (ENSO) events [4,5]. Discovering the spatial-temporal patterns and distribution characteristics of the SST in the SCS can provide a reference for predicting natural disasters in the SCS region and neighboring countries [6]. The spatial-temporal pattern mentioned here refers to the interdependent pattern of time and space, rather than a separate temporal or spatial pattern.

Appl. Sci. 2024, 14, 4289
SST data are typical spatial-temporal data, as SST shows strong continuity in space and time [7]. In terms of geographical space, the SST in adjacent areas is usually similar due to ocean heat diffusion and ocean flow [8]. In terms of temporal characteristics, SST usually reflects seasonality. Variations in SST are usually accompanied by complex nonlinear processes, including seasonal changes, interannual changes, and anomalous variations caused by natural phenomena such as ENSO [9].
Therefore, this paper combines co-clustering with information divergence to propose a novel spatial-temporal co-clustering algorithm that can be used to analyze spatial-temporal data. Information divergence is a way of quantifying the difference between two matrices. It measures the information lost when one matrix is used to approximate another. The advantage of using information divergence as a loss function, compared to distance measures such as Euclidean distance and Manhattan distance, is that it pays more attention to the data distributions, whereas distance-based loss functions often focus only on the distance values between data points. For spatiotemporal data such as SST, using information divergence as the loss function of the spatial-temporal co-clustering algorithm can better capture the spatiotemporal distribution characteristics of the data. The spatial-temporal co-clustering algorithm can discover not only the seasonal characteristics of SST, but also the abnormal temporal patterns of SST when ENSO events occur. It can also discover different spatial patterns in different temporal clusters. We tested our method on the SST dataset of the SCS. First, we selected SST data from December 2003 to November 2004 to obtain the spatial-temporal patterns of the SST in the SCS. Next, we conducted experiments on a period in which ENSO events occurred, demonstrating that our proposed algorithm can detect abnormal patterns. Then, the correlation of the corresponding SST time series was verified to support the conclusions of this paper.

Related Work
Currently, the widely used methodologies for discovering spatial-temporal patterns include traditional matrix decomposition methods, such as Singular Value Decomposition (SVD) [10], Empirical Orthogonal Function (EOF) [11], Dynamic Mode Decomposition (DMD) [12], and Principal Component Analysis (PCA) [13,14]. However, traditional matrix decomposition methods are suited to situations in which the data volume is not massive and the analysis scenario is not complex [15]. In addition, machine learning methods are frequently employed in spatial-temporal pattern discovery [16]. Clustering is the most widely used family of machine learning algorithms for this purpose, and includes spatial-temporal density-based spatial clustering of applications with noise (ST-DBSCAN) [17] and the clustering-based approach for discovering interesting places in a single trajectory (CB-SMoT) [18]. However, ST-DBSCAN suffers from a density peak problem. To address it, Wang et al. proposed spatial-temporal clustering by fast search and find of density peaks (ST-CFSFDP) [19]. This method solved the density peak problem and could distinguish and identify clusters at the same location at different times. However, it cannot identify different spatial distributions within the same temporal cluster. Moreover, both methods are density-based clustering algorithms, and their ability to capture spatiotemporal characteristics is not as good as that of methods based on information divergence. Finally, the clustering algorithms currently in use (such as k-means) cannot connect the dependencies between time and space, which leads to unreasonable results in time or space [20].
Some scholars used remote sensing data of the SST in the Yangtze River Estuary from 1982 to 2017 and a matrix decomposition method to investigate the seasonal and interannual variation characteristics of the SST in the Yangtze Estuary, China [21]. In experiments using EOF to explore the temporal evolution and spatial distribution of the SST in the Yellow Sea and the East China Sea, the results showed that the two seas exhibited different warming trends, and the spatial pattern was consistent with previous research results [22]. Kyung-Ae et al. used EOF to examine the spatial and temporal variations of SST in the Yellow Sea over 29 years [23]. They determined that the growth rate of the SST in the Yellow Sea's littoral regions is significantly higher than in the deep regions. Vertical stratification of the water column reveals long-term changes that result in varying surface heating. Cyclostationary Empirical Orthogonal Function (CSEOF) analysis has been used to explore interannual and decadal variability in terrestrial water storage associated with ENSO [24]. These methods study changes in time or in spatial distribution independently. Moreover, with the explosive growth in data volume, matrix decomposition methods are gradually revealing their limitations, while machine learning methods are becoming increasingly popular.
Machine learning and visual analysis methods have also given great impetus to the field of spatial-temporal data discovery. Ruela et al. utilized the k-means clustering algorithm to identify the most suitable CMIP5 models and to calculate the SST trends for the 21st century [25]. In 2023, Peng and colleagues proposed a new spatial-temporal clustering algorithm to explore the spatiotemporal patterns of SST [26]. This clustering algorithm considers spatiotemporal dependencies. However, its network is complex and lacks interpretability. Chen et al. created a visual analysis system for geospatial data based on a Bayesian network, allowing users to interactively investigate anomalous patterns in geographic data [27]. Wu used co-clustering for the first time in 2020 to investigate the temporal and spatial differentiation of spring phenology in China [28]; Rohanap et al. used a multidimensional clustering technique to produce a spatial and temporal wave map for the detection of ocean energy potential [29]. Similarly, most of these methods analyze spatiotemporal data along a single dimension, either time or space, without considering spatial-temporal dependencies.
In summary, traditional matrix decomposition methods cannot adapt to the current huge data volumes, and the previously used clustering algorithms do not capture the spatial-temporal dependency in spatiotemporal data well. Meanwhile, co-clustering algorithms are currently used in other fields, such as gene analysis. This paper aims to introduce a novel spatial-temporal co-clustering algorithm based on information divergence into the marine field to explore the spatial-temporal patterns of SST.

Materials
This paper uses NOAA's Optimum Interpolation SST (OISST v2) data. The OISST datasets provide global SST data with high spatial resolution. OISST combines data from multiple sources, including satellite observations, buoy records, and ship measurements. This fusion approach improves the accuracy and reliability of the data. Furthermore, the datasets provide long-term time series, often covering time spans of several decades. This is critical for analyzing issues such as climate change, long-term trends, and seasonal changes. The OISST datasets have undergone careful processing and quality control, including the handling of outliers and correction for cloud coverage [30]. This ensures data accuracy and availability. For our experiments, we compute the monthly average SST. Our study area encompasses the SCS and its adjacent waters and is shown in Figure 1.
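The monthly averaging step can be sketched as follows. This is a minimal illustration, assuming the daily SST fields have already been loaded into a NumPy array of shape (days, lat, lon) with land cells stored as NaN; the function names `monthly_mean` and `to_location_time_matrix` and the array layout are hypothetical and are not taken from the paper.

```python
import numpy as np

def monthly_mean(daily_sst, month_ids):
    """Average daily SST fields into one field per month.

    daily_sst : array of shape (days, lat, lon); NaN marks missing cells (assumption).
    month_ids : integer array of shape (days,) giving the month index of each day.
    Returns the sorted unique month ids and an array of shape (months, lat, lon).
    """
    months = np.unique(month_ids)
    fields = [np.nanmean(daily_sst[month_ids == m], axis=0) for m in months]
    return months, np.stack(fields)

def to_location_time_matrix(monthly_sst):
    """Flatten (months, lat, lon) into a (locations, months) matrix."""
    n_months = monthly_sst.shape[0]
    return monthly_sst.reshape(n_months, -1).T

# Tiny synthetic example: 4 days on a 2 x 2 grid, two days per month.
daily = np.arange(16, dtype=float).reshape(4, 2, 2)
month_of_day = np.array([1, 1, 2, 2])
months, monthly = monthly_mean(daily, month_of_day)
matrix = to_location_time_matrix(monthly)  # rows: grid cells, columns: months
```

The resulting matrix has one row per grid cell and one column per month, which is the locations-by-timestamps layout that a co-clustering algorithm takes as input.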

Method

Basic Concepts of Spatial-Temporal Co-Clustering
We use Figure 2 to intuitively explain the spatial-temporal co-clustering algorithm and its difference from traditional clustering algorithms. Figure 2a represents the original SST data. Spatial clustering (Figure 2b) groups locations into clusters with comparable attribute values across all timestamps, whereas temporal clustering (Figure 2c) groups timestamps into clusters with comparable attribute values across all locations. Many traditional clustering methods, such as k-means and hierarchical clustering, can achieve this goal. However, such cluster analysis considers only the spatial or the temporal behavior of the data. In contrast, spatial-temporal co-clustering (Figure 2d) simultaneously organizes locations into location clusters and timestamps into timestamp clusters and identifies spatial-temporal co-clusters in which attribute values are comparable along both the location and timestamp dimensions. Since they were first proposed in the early 1970s [31], co-clustering methods have received significant attention. However, most previous research has concentrated on other fields, particularly bioinformatics [32], and only a small amount of recent research has focused on spatial-temporal data [33].
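To make the notion of a co-cluster concrete, the short sketch below shows how a location labeling and a timestamp labeling together pick out one block of the data matrix. The toy values and the labels are invented for illustration only, and the helper name `co_cluster_values` is hypothetical.

```python
import numpy as np

# Toy SST-like matrix: 4 locations (rows) x 6 timestamps (columns).
sst = np.array([
    [24.0, 24.2, 28.9, 29.1, 24.1, 29.0],
    [24.1, 23.9, 29.0, 28.8, 24.0, 29.2],
    [27.0, 27.1, 30.1, 30.0, 26.9, 30.2],
    [26.9, 27.2, 29.9, 30.1, 27.1, 30.0],
])
location_labels = np.array([0, 0, 1, 1])          # hypothetical spatial clusters
timestamp_labels = np.array([0, 0, 1, 1, 0, 1])   # hypothetical temporal clusters

def co_cluster_values(data, row_labels, col_labels, row_cluster, col_cluster):
    """Return the sub-matrix belonging to one (location cluster, time cluster) pair."""
    rows = np.flatnonzero(row_labels == row_cluster)
    cols = np.flatnonzero(col_labels == col_cluster)
    return data[np.ix_(rows, cols)]

# Values of the co-cluster formed by location cluster 0 and time cluster 1.
block = co_cluster_values(sst, location_labels, timestamp_labels, 0, 1)
```

One-way spatial clustering would use only `location_labels`, and one-way temporal clustering only `timestamp_labels`; a co-cluster is the intersection of the two, so each block can be summarized and interpreted on its own.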

Introduction of Information Divergence
Information divergence is an important concept in information science, commonly used to measure the difference or similarity between two probability distributions.
In information theory, probability distributions are typically represented as either a vector or a matrix, with each dimension of the vector or each element of the matrix representing the probability of an event. The definition of information divergence leverages the concepts of information entropy and Kullback-Leibler (KL) divergence. KL divergence, also known as relative entropy, measures the difference between two probability distributions, with a smaller value indicating greater similarity between the distributions. Information divergence is defined as the relative entropy between two probability distributions minus the difference in their entropies. Here, relative entropy quantifies the difference between the two probability distributions, while entropy is used as a measure of uncertainty. Information divergence can be expressed as Equation (1), where A and B are two matrices of the same dimensions. For the problem of spatial-temporal co-clustering, we set the original matrix as A and seek an approximate clustering matrix B that minimizes the information divergence between A and B. This process can be carried out through iterative optimization.
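For reference, one widely used form of the information divergence between two positive matrices is the generalized Kullback-Leibler divergence, also called the I-divergence. The expression below is a sketch under that assumption and is not a reproduction of Equation (1); the element symbols $a_{ij}$ and $b_{ij}$ are introduced here for illustration.

```latex
D_I(A \,\|\, B) \;=\; \sum_{i}\sum_{j}
  \left( a_{ij}\,\log\frac{a_{ij}}{b_{ij}} \;-\; a_{ij} \;+\; b_{ij} \right),
\qquad a_{ij} > 0,\; b_{ij} > 0 .
```

This quantity is non-negative and equals zero only when A and B are identical. When A and B have the same total sum, the last two terms cancel and the expression reduces to the KL form $\sum_{i}\sum_{j} a_{ij}\log(a_{ij}/b_{ij})$.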
In spatial-temporal co-clustering algorithms, employing information divergence as the optimization function is an attractive choice, particularly because the input for spatial-temporal co-clustering is a matrix. First, a significant advantage of information divergence is its adaptability: it does not depend on the specific distribution form of the data, allowing the algorithm to adapt flexibly to various types of data. This adaptability enables spatial-temporal co-clustering algorithms based on information divergence to be applied to text data, spatial-temporal data, and other complex data types, enhancing the algorithm's universality. Second, information divergence preserves the informativeness of the data. During the clustering process, it optimizes the clustering results by minimizing differences within clusters and maximizing differences between clusters, ensuring that the clustered data retain as much information from the original data as possible. This property is particularly valuable in the field of pattern recognition. Lastly, strong interpretability is also a key advantage of information divergence. By quantifying the difference between two probability distributions, information divergence provides an intuitive way to understand clustering results.
In summary, using information divergence as the optimization function in spatial-temporal co-clustering algorithms offers strong adaptability, preserves the informativeness of the data, provides good interpretability, and captures spatiotemporal distribution characteristics. These advantages make it an effective tool for addressing complex data clustering challenges.

Details of Spatial-Temporal Co-Clustering Algorithm
The spatial-temporal co-clustering algorithm views the SST data as a joint matrix of timestamps and locations, denoted by $O(S, T)$, where $S$ takes its values from the site set $\{s_1, s_2, \ldots, s_m\}$, $m$ is the total number of sites, $T$ takes its values from the timestamp set $\{t_1, t_2, \ldots, t_n\}$, and $n$ is the number of timestamps. The output of the algorithm is $O(\hat{S}, \hat{T})$, where $\hat{S}$ is derived from the site clustering outcomes $\{\hat{s}_1, \hat{s}_2, \ldots, \hat{s}_k\}$, $k$ is the number of spatial clusters ($k < m$), $\hat{T}$ is derived from the timestamp clustering outcomes $\{\hat{t}_1, \hat{t}_2, \ldots, \hat{t}_l\}$, and $l$ is the number of temporal clusters ($l < n$). The goal of the algorithm is to minimize the information divergence between $O(S, T)$ and $O(\hat{S}, \hat{T})$. The algorithm flowchart is shown in Figure 3. Step 3 is the most crucial part of the spatial-temporal co-clustering algorithm, illustrating how we consider time when performing spatial clustering and how we consider space when performing temporal clustering.

Step 1: Randomly initialize the mapping from stations to station clusters and the mapping from timestamps to timestamp clusters. This initialization is the basis for the subsequent calculation of the information divergence loss. The matrix after random initialization is $A(S, T)$, as shown in Equation (2), where $R$ and $C$ are matrices of size $m \times k$ and $n \times l$, respectively, representing the cluster memberships in space and time.
Step 2: For the two matrices before and after mapping, compute the loss of information divergence. The algorithm determines the information loss by calculating the information divergence between the original data matrix $O(S, T)$ and the approximated matrix $O(\hat{S}, \hat{T})$, as shown in Equation (3), where $D_I(\cdot \,\|\, \cdot)$ denotes the information divergence between two matrices, $O(S, T)$ is the original data matrix, and $A(S, T)$ is the approximate matrix of the original matrix. According to the definition of information divergence, Equation (3) can be written as Equation (4), wherein $o(s, t)$ originates from $O(S, T)$ and $a(s, t)$ originates from $A(S, T)$.
Step 3: This step is the key to spatial-temporal co-clustering, where we consider the spatial dimension when mapping new temporal clusters, and the temporal dimension when mapping new spatial clusters. Equation (4) can be decomposed into the information divergence of the mapping from stations to station clusters, as shown in Equation (5). Equation (5) considers the temporal dimension when mapping new station clusters, which helps us understand how the spatial patterns of SST change over time, as shown in step 3a of Figure 3.
The information divergence of the mapping from timestamps to timestamp clusters is shown in Equation (6). Similarly, Equation (6) considers the spatial dimension when mapping new temporal clusters, which helps us understand the impact of the spatial dimension on the temporal patterns of SST, as shown in step 3b of Figure 3.
Step 4: As stated previously, the optimal co-clustering minimizes the loss of information. Now that Equation (4) has been decomposed into information divergence in terms of station clustering and timestamp clustering, step 4 is to find, for each station, a new mapping to a station cluster that minimizes Equation (7). Similarly, find, for each timestamp, a new mapping to a timestamp cluster that minimizes Equation (8).
Step 5: Recalculate the loss of information using the revised mapping and Equation (5). If the change in information loss is less than a predetermined threshold, the new approximate matrix obtained in step 4 is the optimal co-clustering result $O(\hat{S}, \hat{T})$; otherwise, return to step 2 to begin a new cycle.
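The five steps above can be sketched in NumPy as follows. This is a simplified illustration and not the implementation used in the experiments: it assumes a strictly positive data matrix, approximates each co-cluster by its block mean, and uses the generalized KL (I-)divergence as the loss. The function names `i_divergence`, `block_means`, and `co_cluster`, and the parameters `max_iter`, `tol`, and `seed`, are hypothetical.

```python
import numpy as np

def i_divergence(O, A):
    """Generalized KL (I-)divergence between positive matrices O and A."""
    return float(np.sum(O * np.log(O / A) - O + A))

def block_means(O, rows, cols, k, l, eps=1e-12):
    """Mean of O inside every (row cluster, column cluster) block, shape (k, l)."""
    R = np.eye(k)[rows]                        # m x k one-hot row memberships
    C = np.eye(l)[cols]                        # n x l one-hot column memberships
    sums = R.T @ O @ C
    counts = np.outer(R.sum(axis=0), C.sum(axis=0))
    # Empty blocks get a tiny positive value so that the logarithm stays finite.
    return np.maximum(sums / np.maximum(counts, 1), eps)

def co_cluster(O, k, l, max_iter=100, tol=1e-6, seed=0):
    """Alternate row and column reassignments until the loss stops decreasing."""
    rng = np.random.default_rng(seed)
    m, n = O.shape
    rows = rng.integers(k, size=m)             # Step 1: random initial mappings
    cols = rng.integers(l, size=n)
    M = block_means(O, rows, cols, k, l)
    loss = i_divergence(O, M[np.ix_(rows, cols)])   # Step 2: initial loss
    for _ in range(max_iter):
        # Steps 3a/4: reassign each row, summing the divergence over all columns.
        # Terms that do not depend on the candidate cluster are dropped.
        Mc = M[:, cols]                                      # k x n
        rows = (Mc.sum(axis=1)[None, :] - O @ np.log(Mc).T).argmin(axis=1)
        M = block_means(O, rows, cols, k, l)
        # Steps 3b/4: reassign each column, summing the divergence over all rows.
        Mr = M[rows, :]                                      # m x l
        cols = (Mr.sum(axis=0)[None, :] - O.T @ np.log(Mr)).argmin(axis=1)
        M = block_means(O, rows, cols, k, l)
        # Step 5: recompute the loss and stop when the improvement is below tol.
        new_loss = i_divergence(O, M[np.ix_(rows, cols)])
        converged = loss - new_loss < tol
        loss = new_loss
        if converged:
            break
    return rows, cols, loss
```

For a locations-by-months matrix `O`, a call such as `co_cluster(O, k=5, l=4)` would return one spatial label per location, one temporal label per month, and the final loss. Because the initialization is random, different seeds can converge to different local minima, so running the sketch several times and keeping the lowest loss is a reasonable precaution.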
The software packages we use to implement the algorithm include Python 3.6 (programming language), CGC version 0.7.0 (a clustering tool for geospatial applications that implements co-clustering), netcdf4 version 1.7.0 (a package for processing NetCDF (Network Common Data Format) files), and basemap version 1.4.1 (a visualization tool). In addition, we use some general data processing packages, such as NumPy version 1.26.0 and Pandas version 2.2.0.

Spatial-Temporal Co-Clustering Result Analysis
First, we selected the SST data of the SCS from December 2003 to November 2004. The results are shown in Figure 4. The temporal patterns show obvious seasonality. Time_cluster0 includes December 2003 and January and February 2004 (DJF). Time_cluster1 includes March, April, and May 2004 (MAM). Time_cluster2 includes June, July, and August 2004 (JJA). Time_cluster3 includes September, October, and November 2004 (SON). This conclusion is consistent with previous research [34]. We can observe that the spatial patterns of the SST in the SCS differ across the temporal clusters. This distinctive feature of the algorithm allows it to identify different spatial patterns in different temporal clusters in a single run. The spatial patterns of the SST in the SCS are clustered into two categories in winter (time_cluster0); the spatial patterns in spring and autumn (time_cluster1, time_cluster3) are similar, and both are divided into three categories. Because spring and autumn are transition seasons, they have similar spatial patterns. This point is also mentioned in [34]. Due to the changeable climate in summer, the SST spatial clustering results of the SCS in summer are clustered into five categories (time_cluster2). Additionally, the spatial patterns reveal that category boundaries near the land are more inclined, indicating that SST close to land is more affected by terrestrial factors, including differences in heat capacity between the land and sea, as well as human activities [35].
In winter, the spatial patterns of the SST in the SCS are usually divided into two categories. We call these two types of areas cold water areas and warm water areas. The central and northern parts of the SCS and the sea areas close to mainland China are usually cold water areas. In the cold water areas, the SCS is usually affected by the northeast monsoon circulation in winter. The northeast monsoon is a seasonal wind system that usually blows from the northeastern part of mainland China to the SCS in winter [36]. As a result, the northern and western areas of the SCS are affected by this cold wind. The northeast monsoon is cold and dry as it blows across mainland China, bringing cold air and low humidity. This cold and dry air causes the SST to drop as it blows across the SCS. Warm water areas are typically affected by tropical ocean currents and tropical climate systems during the winter. These areas are typically affected by tropical air currents, which bring relatively warm air. This helps maintain a higher SST [37]. Due to the warmer SST and relatively abundant nutrients, these areas often support rich marine biodiversity, including coral reefs, tropical fish, and other marine life. These factors also contribute to a higher SST. In addition, these areas are usually affected by tropical currents, which bring relatively warm water and help maintain a higher SST.
The division of the SST in the SCS into three categories in spring and autumn involves different climate and ocean dynamic processes. We divide the SCS into a cold water area, a transition area, and a warm water area. The reasons for this division are probably as follows. In spring and autumn, the northern part of the SCS and the sea areas close to the land are still affected by the northeast monsoon, which leads to a lower SST in these areas [38,39]. These areas remain cold water areas. The cold water area may also be affected by ocean circulation, such as tropical currents. These currents may flow in the northern part of the SCS, bringing relatively cold water and strengthening the SST characteristics of the cold water area. Equatorial waters in the SCS tend to be warmer in spring and autumn than in winter. The higher temperatures in these areas are primarily attributed to the influence of the southeast monsoon and tropical currents [40]. The southeast monsoon blows from the Pacific to the SCS, bringing relatively warm air. Tropical currents flow from the Pacific to the SCS, bringing relatively warm water. Both help maintain a higher SST. The third category may be a transition zone between the cold and warm water regions, possibly affected by both sets of climatic and oceanographic processes, and therefore has a moderate SST.
In summer, the SST in the SCS is divided into five categories. The SST difference between the southernmost and northernmost parts of the study area exceeds 5 °C. We have summarized the following reasons for this. First, the geographical position of the SCS within the monsoon-impacted zone of Southeast Asia exposes it to the warm and humid southwest monsoon, predominantly during the summer. Consequently, the southwestern part of the SCS exhibits a higher SST. Conversely, the northern maritime region, which is more exposed to the cooler northeast monsoon, features a relatively lower SST [41]. Second, ocean circulation profoundly influences the SST disparities across the different areas within the SCS. The dominant ocean currents within the SCS include the West Drift, the North Equatorial Stream, and the SCS Vortex. The strength and direction of these currents change continually over time and space, resulting in a marked contrast in SST across the different sea areas within the SCS. In conclusion, the pronounced disparity in the summer SST distribution in the SCS is attributed to several factors, such as the monsoon, ocean circulation, depth, topography, and monsoon precipitation.
The seasonal characteristics of the SST in the SCS have direct and indirect effects on global climate patterns. The monsoon system affects not only the climate in Asia but also distant climate systems through air-sea interactions. Seasonal variations in heat and humidity in the SCS can spread through atmospheric circulation and affect global climate patterns. In addition, the seasonal characteristics of the SST in the SCS have an important impact on the distribution and diversity of marine life. SST and salinity changes in different seasons affect the reproduction, migration, and distribution of marine life. SST changes in the SCS are particularly important for coral reef ecosystems: excessively high water temperatures can cause coral bleaching and damage these ecosystems. Understanding these influences improves our understanding of the SST variation within different regions of the SCS, which is crucial for unraveling the intricate marine climatology of the region.

Spatial-Temporal Co-Clustering Results Analysis during the ENSO Period
We also ran experiments on SST data in the SCS from a period when the ENSO phenomenon occurred. The selected SST data span June 2020 to November 2022, during which a La Niña event occurred. Based on the clustering results for this period, we analyzed the similarities to and differences from the spatial-temporal co-clustering of SST in the non-ENSO period. The temporal clustering results and the spatial pattern of each temporal cluster are shown in Figure 5. We find that even when the ENSO phenomenon occurs, the spatial distribution of SST in the SCS does not change greatly. The spatial distribution of time_cluster0 corresponds to the winter of the non-ENSO period, the spatial distributions of time_cluster1 and time_cluster2 correspond to the spring and autumn of the non-ENSO period, respectively, and the spatial distribution of time_cluster3 corresponds to the summer of the non-ENSO period. However, the seasonal characteristics of the temporal clustering results are disrupted. Abnormal temporal clustering is seen in time_cluster1 and time_cluster2. Time_cluster1 includes November 2020 and 2021 as well as June 2021 and 2022. Time_cluster2 includes December 2020 and 2021 as well as May 2021 and 2022. We postulate that these irregularities reflect the influence of La Niña on the SST in the SCS, an impact that resurfaces periodically. Similar clustering results appear when El Niño occurs. This abnormal clustering result has also been found in the Yellow Sea and the Bohai Sea [42]. To verify this abnormal pattern, we select ENSO events of different intensities: June 1998 to June 2000 (strong La Niña event), June 2010 to June 2012 (mild La Niña event), February 1997 to April 1998 (strong El Niño event), and February 2014 to April 2015 (mild El Niño event).
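One way to make the notion of a "disrupted seasonal characteristic" concrete is to measure how far apart, on the calendar, the months inside one temporal cluster lie. The sketch below is only an illustration: the helper names and the `max_gap=2` threshold (a three-month season spans a circular gap of at most two months) are assumptions introduced here, not part of the paper's method, and the `winter` list is a hypothetical seasonal cluster used for contrast.

```python
from itertools import combinations


def circular_month_gap(a, b):
    """Distance between two calendar months on a 12-month circle (0..6)."""
    d = abs(a - b) % 12
    return min(d, 12 - d)


def month_span(labels):
    """Largest circular gap between any two months in a cluster.

    labels are strings such as "2020-11"; only the month part is used.
    """
    months = {int(label.split("-")[1]) for label in labels}
    return max(
        (circular_month_gap(a, b) for a, b in combinations(months, 2)),
        default=0,
    )


def is_seasonally_abnormal(labels, max_gap=2):
    """Flag a cluster whose months do not fit inside one season.

    max_gap=2 is a hypothetical threshold chosen for this illustration.
    """
    return month_span(labels) > max_gap


# Months reported in the text for the June 2020 to November 2022 La Niña period.
time_cluster1 = ["2020-11", "2021-06", "2021-11", "2022-06"]
time_cluster2 = ["2020-12", "2021-05", "2021-12", "2022-05"]
# Hypothetical seasonal cluster for comparison (assumed winter months).
winter = ["2003-12", "2004-01", "2004-02"]

for name, cluster in [("time_cluster1", time_cluster1),
                      ("time_cluster2", time_cluster2),
                      ("winter", winter)]:
    print(name, month_span(cluster), is_seasonally_abnormal(cluster))
```

Under this criterion, both abnormal clusters mix months that are five calendar months apart, whereas a purely seasonal cluster stays within a two-month gap.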

Temporal Clustering Results under ENSO Events of Different Intensities
Here, we set the number of time clusters to three and select two La Niña events for the experiment: June 1998 to June 2000, when an intense La Niña event was observed, and June 2010 to June 2012, when a mild La Niña event was observed. The clustering results of these two periods mirror those of the experiment above, as illustrated in Figures 6 and 7. Figure 6 shows the temporal clustering result under the intense La Niña event. We can see that the abnormal category is time_cluster1, which includes 1998-06, 1998-11, 1998-12, 1999-04, 1999-05, 1999-06, 1999-11, 1999-12, 2000-04, 2000-05, and 2000-06. Figure 7 shows the corresponding result under the mild La Niña event.
Similarly, we selected data from El Niño periods for experiments, specifically February 1997 to April 1998, when an intense El Niño event was observed, and February 2014 to April 2015, when a mild El Niño event was observed. The experimental results are shown in Figures 8 and 9. In both figures, the abnormal category is cluster 1 (months-cluster1 in the figure captions).

Correlation Analysis for Abnormal Clusters
To verify the lagged correlation, we performed correlation studies on the SST sequences of the abnormal clusters. We used the Pearson correlation coefficient, as shown in Equation (9). The verification results are shown in Table 1.

$$ r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \tag{9} $$

where $X$ and $Y$ represent two time series of equal length $n$, $\bar{X}$ is the mean value of $X$, and $\bar{Y}$ is the mean value of $Y$. In this article, $X$ and $Y$ represent the corresponding SST time series.
For the two high-intensity ENSO events, the correlations of the SST time series in the abnormal temporal clusters are 83.5% and 80.1%, respectively. For the two lower-intensity events, the correlations are 68.2% and 63.9%, respectively. This indicates that the intensity of ENSO events also affects the temporal clustering results of SST. As mentioned above, during strong ENSO events, the SST in the South China Sea affected by ENSO influences the SST in the South China Sea again four months later, while during weak ENSO events, it does so five months later.
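The lagged correlation check can be sketched in a few lines. The example below is a minimal illustration of Equation (9) applied to a series and a shifted copy of itself; the function names and the choice to correlate a single series with its own lagged version are assumptions made here, since the text does not spell out how the two SST series were paired. The synthetic sine wave stands in for an SST sequence and is not real data.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length 1-D series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D series of equal length")
    dx = x - x.mean()
    dy = y - y.mean()
    denom = np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant series")
    return float(np.sum(dx * dy) / denom)


def lagged_pearson(series, lag):
    """Correlate a series with itself shifted forward by `lag` steps.

    Hypothetical helper: pairs value t with value t + lag.
    """
    series = np.asarray(series, dtype=float)
    if not 0 < lag < len(series) - 1:
        raise ValueError("lag must leave at least two overlapping samples")
    return pearson(series[:-lag], series[lag:])


# Synthetic monthly series with an 8-step cycle (placeholder, not SST data).
t = np.arange(48)
synthetic = 25.0 + 2.0 * np.sin(2 * np.pi * t / 8)

for lag in (4, 8):
    print(lag, round(lagged_pearson(synthetic, lag), 3))
```

With monthly SST values, `lag=4` or `lag=5` would correspond to the four- and five-month lags discussed in the text.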

Discussion
This paper studies the spatial-temporal distribution of the SST in the SCS using the proposed algorithm. By comparing experiments on the SST in the SCS over different periods, we make the following findings:

1. The spatial-temporal co-clustering results show that the SST distribution in the SCS and its coastal waters changes greatly with time. In winter, the SST distribution is relatively uniform, and the difference in SST across the entire sea area is small. In summer, due to the influence of circulation and monsoons, the average SST of the entire sea area is higher, and the difference between the north and the south is also large. Spring and autumn have the same spatial distribution. In years when the ENSO phenomenon occurs, the SST is significantly lower or higher than in normal years, but this does not affect the overall spatial distribution of SST in the SCS.
2. The cluster boundaries usually run in a northeast-southwest direction, and the closer a boundary is to the continent, the more inclined it is. This indicates that the SST distribution in the SCS and its adjacent seas is also affected by the land. Because the heat capacities of land and ocean differ, the SST near land is more susceptible to regulation by the land.
3. The SST clustering results for the SCS during ENSO events show that ENSO affects the temporal clustering of the SST in the SCS. Specifically, the impact of an ENSO event on the SST in the SCS reappears after a lag, and the lag is related to the intensity of the event: the greater the intensity, the shorter the lag.
Studying the spatial-temporal patterns of the SST in the SCS is significant in the following respects:

1. The discovered patterns can be encoded as input to deep learning models to improve their accuracy and interpretability, which helps us model and predict regional and even global climate change more accurately. In subsequent work, we will encode the spatial-temporal distribution identified in this article into the feature input of a deep learning algorithm to improve the accuracy of SST prediction.
2. The SCS is often affected by extreme weather such as typhoons [43], and SST is one of the important factors affecting typhoon intensity [44]. Therefore, studying the spatial-temporal patterns of the SST in the SCS can improve our ability to predict and respond to these disasters.
3. Regarding marine ecological protection, SST has a direct impact on the health of marine ecosystems, especially sensitive ecosystems such as coral reefs. Studying spatial-temporal patterns can help us understand and protect these ecosystems [45]. Marine heatwaves are extreme ocean phenomena often caused by persistent SST anomalies. We are also conducting research on marine heatwaves, which has already yielded results [46], and the SST spatial-temporal patterns we discovered can help predict marine heatwaves more accurately.

Conclusions
This paper proposes a spatial-temporal co-clustering algorithm based on information divergence. Its advantages are as follows.
Unlike traditional clustering algorithms that cluster only rows or only columns, the proposed algorithm clusters both the rows and the columns of a matrix. This makes it suitable not only for spatial-temporal pattern discovery but also for other fields such as text mining and genetic engineering. It is an iterative optimization strategy that can be initialized randomly or in other effective ways and then performs row clustering and column clustering in a loop. Because these clustering steps can be performed separately and in parallel, the algorithm is efficient. In addition, the algorithm uses information divergence as a similarity measure, which has advantages when dealing with non-negative data, such as count data or probability distributions, and gives the algorithm better performance on such data.
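The alternating row and column loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a strictly positive data matrix (rows as locations, columns as months), approximates each co-cluster by its block mean, uses the generalized information divergence x log(x/y) - x + y as the loss, and starts from a random assignment. The function names and the synthetic test matrix are hypothetical.

```python
import numpy as np


def block_means(Z, rows, cols, k, l):
    """Mean of Z inside every (row cluster, column cluster) block."""
    R = np.eye(k)[rows]  # one-hot row memberships, shape (m, k)
    C = np.eye(l)[cols]  # one-hot column memberships, shape (n, l)
    sums = R.T @ Z @ C
    counts = np.outer(R.sum(axis=0), C.sum(axis=0))
    # Empty blocks fall back to the global mean so the logs stay finite.
    return np.where(counts > 0, sums / np.maximum(counts, 1), Z.mean())


def divergence_costs(Z, approx):
    """Information divergence between each row of Z and each row of approx.

    Z has shape (m, n), approx has shape (k, n); the result has shape (m, k).
    """
    x = Z[:, None, :]
    y = approx[None, :, :]
    return np.sum(x * np.log(x / y) - x + y, axis=2)


def co_cluster(Z, k, l, n_iter=20, seed=0):
    """Alternate row and column reassignment; return labels and loss history."""
    Z = np.asarray(Z, dtype=float)
    if np.any(Z <= 0):
        raise ValueError("this sketch assumes strictly positive data")
    rng = np.random.default_rng(seed)
    rows = rng.integers(k, size=Z.shape[0])  # random initialization (assumption)
    cols = rng.integers(l, size=Z.shape[1])
    history = []
    for _ in range(n_iter):
        # Row step: column clusters are held fixed as the constraint.
        M = block_means(Z, rows, cols, k, l)
        rows = divergence_costs(Z, M[:, cols]).argmin(axis=1)
        # Column step: row clusters are held fixed as the constraint.
        M = block_means(Z, rows, cols, k, l)
        cols = divergence_costs(Z.T, M[rows, :].T).argmin(axis=1)
        # Record the total divergence between Z and its block approximation.
        M = block_means(Z, rows, cols, k, l)
        approx = M[rows][:, cols]
        history.append(float(np.sum(Z * np.log(Z / approx) - Z + approx)))
    return rows, cols, history


# Synthetic 6 x 8 matrix with a 2 x 2 block structure (placeholder values).
rng = np.random.default_rng(1)
base = np.array([[20.0, 28.0], [24.0, 30.0]])
true_rows = np.array([0, 0, 0, 1, 1, 1])
true_cols = np.array([0, 0, 0, 0, 1, 1, 1, 1])
Z = base[true_rows][:, true_cols] + rng.uniform(0.0, 0.1, size=(6, 8))

rows, cols, history = co_cluster(Z, k=2, l=2)
print(rows, cols, history[-1])
```

Each step either reassigns rows or columns to their lowest-divergence cluster or replaces block values with block means, so the recorded loss should not increase from one iteration to the next. A random start can still settle in a local minimum, which is one reason an informed initialization may be preferable.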
This paper demonstrates that the spatial-temporal co-clustering algorithm holds considerable promise for discovering spatial-temporal patterns in the ocean domain. Under the assumption of mutual constraints in time and space, it can uncover latent patterns in spatial-temporal data. Our next steps will follow the two directions listed below:

1. Our spatial-temporal co-clustering algorithm has so far been applied only in the South China Sea, owing to our limited computing resources. We are exploring parallel and distributed computing techniques such as Dask. In the future, we will offer an implementation of the algorithm based on Dask that is suited to distributed systems such as compute clusters, and apply it globally to discover the spatial-temporal patterns of global SST.
2. The marine environment comprises diverse elements that interact with and affect each other. In future studies, we aim to extend our analysis beyond the spatial and temporal constraints by incorporating critical oceanic elements such as salinity and sea surface height into the co-clustering. This will help us investigate the potential of multi-element co-clustering in marine research.

Figure 1. Our study area is the area marked in red in the figure.

Figure 3. Flowchart of the spatial-temporal co-clustering algorithm for discovering spatial-temporal patterns of SST. "**" indicates the specific SST value.

Figure 4. Spatial-temporal co-clustering results for the SST data from December 2003 to November 2004. The four time clusters correspond to the four seasons, and each season displays a unique spatial pattern represented by colors.

Figure 5. Spatial-temporal co-clustering results of SST in the SCS during the La Niña event from June 2020 to November 2022. Seasonal characteristics are destroyed, and the co-clustering result exhibits abnormal temporal patterns.

Figure 6. Temporal clustering results of SST in the SCS during the intense La Niña event from June 1998 to June 2000. Months-cluster1 is the anomaly category.

Figure 7. Temporal clustering results of SST in the SCS during the mild La Niña event from June 2010 to June 2012. Months-cluster1 is the anomaly category.

Figure 8. Temporal clustering results of SST in the SCS during the intense El Niño event from February 1997 to April 1998. Months-cluster1 is the anomaly category.

Figure 9. Temporal clustering results of SST in the SCS during the mild El Niño event from February 2014 to April 2015. Months-cluster1 is the anomaly category.

Table 1. Verification of the lagged correlation of SST in the SCS during the selected ENSO periods. It shows the lagged correlation of the SST time series in the abnormal time category under ENSO events of different intensities.