A Non-Uniform Grid Graph Convolutional Network for Sea Surface Temperature Prediction

Lou, Ge; Zhang, Jiabao; Zhao, Xiaofeng; Zhou, Xuan; Li, Qian

doi:10.3390/rs16173216

by Ge Lou ¹,², Jiabao Zhang ¹,², Xiaofeng Zhao ¹,², Xuan Zhou ³ and Qian Li ¹,²,*

¹ The College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410005, China
² Key Laboratory of High Impact Weather (Special), China Meteorological Administration, Changsha 410005, China
³ Key Laboratory of Smart Planet, Beijing 100094, China
* Author to whom correspondence should be addressed.

Remote Sens. 2024, 16(17), 3216; https://doi.org/10.3390/rs16173216

Submission received: 15 July 2024 / Revised: 24 August 2024 / Accepted: 29 August 2024 / Published: 30 August 2024

(This article belongs to the Special Issue Artificial Intelligence and Big Data for Oceanography)

Abstract

Sea surface temperature (SST) is an important factor in the marine environment and has significant impacts on climate, ecology, and maritime activities. Most existing SST prediction methods consider the ocean as a uniform field and use a uniform grid to predict SST. However, the marine environment is a complex system, and factors such as solar radiation, differences in land and sea thermal properties, and ocean circulation lead to uneven spatial distributions of SSTs. We propose a non-uniform grid construction method based on the SST spatial gradient to encode SST data, as well as a Non-uniform Grid Graph Convolutional Network (NGGCN) model. The NGGCN consists of two spatiotemporal modules, each of which extracts spatial features with a GCN module, captures temporal correlations with a GRU module, and restores features and outputs results through a fully connected module. We selected data from the Yellow Sea and Bohai Sea to validate the effectiveness of the NGGCN in predicting SST at different time scales and prediction steps. The results indicate that our model shows a significant improvement in prediction performance compared to the baseline models.

Keywords: sea surface temperature; non-uniform grid; spatiotemporal prediction; graph neural network

1. Introduction

Sea surface temperature (SST) is a crucial physical and chemical indicator of seawater. It plays a significant role in the interaction between the Earth’s surface and the atmosphere, with a major impact on global ecological environments and climate [1]. Therefore, the prediction of SST has critical guiding significance for large and medium-scale marine physical phenomena [2,3]. It also plays a fundamental role in many application fields, such as marine weather forecasting [4,5], marine activities including fisheries [6,7,8] and mining, marine environmental protection, and military operations [9,10]. Consequently, SST prediction is a critical issue in marine science and has been widely studied in recent years.

Existing SST prediction methods can be divided into three major categories: numerical forecasting methods, traditional machine learning methods, and deep learning methods. The three categories are discussed below.

Numerical forecasting methods [11] simulate marine physical processes and combine observational data to forecast SST changes. These methods have been widely applied in practical operations, and with the rapid development of assimilation techniques and model improvements, the accuracy of global and regional numerical forecasts of SST has been significantly enhanced to varying degrees [12]. The Coupled Model Intercomparison Project (CMIP) [13] is an international collaborative project aimed at sharing, analyzing, and comparing simulation results from the latest global climate models. These climate prediction models are based on numerical forecasting methods and improve the accuracy of SST prediction by integrating multiple mathematical models. The project has been updated to CMIP6. Peng et al. [14] studied the seasonal prediction of SST in the nearshore areas of China based on CMIP6. Research on correction methods and assimilation techniques has led to some improvements in the prediction results of numerical methods. However, due to the complexity of the marine environment, models’ descriptions and simulations of marine physical processes are still not accurate enough, and the issues of initial field uncertainty and numerical solution errors remain unresolved. Therefore, there are still accuracy issues in SST prediction through numerical methods.

Traditional machine learning methods [15] directly learn the rules of SST changes from massive historical databases and use them for prediction. Common machine learning methods include linear regression [16], the Support Vector Machine (SVM) models [17], and Artificial Neural Networks (ANNs) [18]. For example, Kug [16] established a linear regression model based on the lag relationship between the SST in the Indian Ocean and NINO3 SST. Lins [19] proposed a tropical Atlantic SST prediction method combining SVM models using data provided by the PIRATA project buoys. Tangang [18] used an ANN model to predict seasonal SST changes in a selected region of the tropical Pacific. Wu [20] utilized an ANN to predict the five principal components of SST in the tropical Pacific using sea level pressure and SST anomalies as inputs. Garcia-Gorriz [21] analyzed the ability of neural networks to estimate seasonal and interannual SSTs in the western Mediterranean from 1960 to 2005 using monthly averages of meteorological parameters such as mean sea level pressure, wind, temperature, and cloud cover. Aparna [22] developed an ANN model for predicting SST and SST fronts in the north-eastern Arabian Sea. Although traditional machine learning methods have achieved certain results in SST prediction, they have limitations in terms of computing power and model refinement when facing increasingly refined SST data, making it difficult to fully learn deep features from a large amount of SST data for prediction.

Deep learning methods use deep neural networks to model and predict SST, which can more effectively process massive amounts of data. Compared to traditional machine learning methods, deep learning models have powerful nonlinear fitting capabilities and can learn deep features from the data. Currently, deep learning methods such as Convolutional Neural Networks (CNN) [23,24], Long Short-Term Memory (LSTM) [25,26], and Graph Neural Networks (GNN) [27,28] have been widely applied to SST prediction, achieving excellent predictive performance. Zhang [25] first treated SST prediction as a time series regression problem, using a fully connected Long Short-Term Memory (FC-LSTM) to model the sequential relationship and predict SST. Yang [29] proposed a CFCC-LSTM model consisting of FC-LSTM layers and convolutional layers. The model takes the temporal and spatial information of SST data as a set of three-dimensional grids, with the FC-LSTM layer extracting temporal features and the convolutional layer extracting spatial features to produce the prediction. The outstanding performance of the ConvLSTM network in precipitation forecasting has attracted the attention of researchers [30]. Xiao [31] conducted a joint forecast experiment for the future 10 days of SST in the East China Sea using the ConvLSTM network. Zha [32] introduced a multi-granularity spatiotemporal network (MGSN), constructing a multi-branch network to extract SST time features at different granularities. This model has richer time granularity features, can simulate more complex SST changes, and considers the influence of other locations in the spatial domain, thereby improving the prediction accuracy to some extent. With increasing prediction steps, the errors between each step also increase. Qiao [33] introduced an attention mechanism to address this issue. 
The attention mechanism can assign different weights to different parts of the model, allowing the model to focus more on task-relevant parts, thereby improving the model’s quality. Xie [34] proposed a Gated Recurrent Unit (GRU) encoder-decoder (GED) that implements a dynamic impact chain (DIL) between historical and future SST values. Xin [35] used a three-channel convolutional LSTM for real-time SST forecasting. Because the ocean has an irregular shape, valid SST data are missing in certain areas, such as land and islands. In such cases, regular grid networks such as CNNs struggle to fully encode the spatial variations of SST.

In recent years, Graph Neural Networks (GNNs) [36] have rapidly developed and have been applied in various scenarios of deep learning. GNNs can handle graph-structured data representing features in Euclidean space as irregular networks. Zhang [26] introduced the Memory Graph Convolutional Network (MGCN) to address the inability of regular grid neural networks to fully encode SST. Liang [37] proposed an SST prediction method based on the Graph Memory Neural Network (GMNN), using a graph to adequately represent the spatial information of incomplete areas in SST data. This model uses distance thresholds and Pearson correlation coefficients to establish the graph representation of SST in order to fully express the spatial information of irregular areas. By applying edge construction methods to each node, node and edge representations of SST data are obtained, and edge information is updated through edge updates and edge aggregation functions. Although the methods based on GNNs have solved the problem of difficult encoding in irregular regions compared to traditional deep learning methods, there are still some issues that need to be addressed.

The methods above view the marine environment as a uniformly changing field and use a uniform grid to model and predict SST. However, the marine environment is a complex system, influenced by various meteorological and hydrological elements such as solar radiation, atmospheric temperature, ocean currents, and wind, resulting in a highly uneven spatial distribution of SSTs. Firstly, due to the differences in the thermal properties of land and sea, the middle of the ocean usually contains areas where SST is the same or changes very little over tens of square kilometers, whereas closer to the edge of the ocean the SST gradient becomes larger, especially in coastal areas, where the spatial variations in SST are very evident. Secondly, influenced by the variation of solar radiation with latitude, the spatial distribution of SST is relatively uniform in the low-latitude region of 0–20°, while in the mid-to-high latitude region, especially in the 30°–50° region, the spatial variation of SST is more drastic. A uniform grid cannot adapt to this complexity: a grid that is too sparse limits prediction accuracy in regions with large spatial variations in SST, while a grid that is too dense wastes computation in regions with a relatively uniform spatial distribution of SST.

To address the above issues, we propose a Non-uniform Grid Graph Convolutional Network (NGGCN) for the precise prediction of SST. The NGGCN first calculates the spatial gradient of SST in the sea area and designs a threshold weight function based on the gradient. We first established a uniform grid, and then decided whether to select or discard the data points adjacent to each node in the original data field according to the value of the spatial gradient at each node. Then we subdivided and merged the grid to finally obtain a non-uniform grid based on the spatial gradient. Considering the irregularity of SST data distribution, in order to fully encode the spatial variation of SST, we used a GNN to model SST. We converted the SST data into a graph representation based on the generated non-uniform grid and obtained spatial correlations through graph convolution. Then, the output of the graph convolution was constructed into a time series input to the Gated Recurrent Unit (GRU) to obtain temporal correlation, and the final prediction result was output to achieve the effect of fine-grained prediction of SST.

In summary, we have made the following contributions in this work:

We identified the problem brought by the uneven spatial distribution of SST to the prediction work, and proposed the NGGCN method that can achieve the precise prediction of SST.
We designed a threshold weight function based on the spatial gradient of SSTs within the region, generated a non-uniform grid topology for the current region, and captured the spatial correlation of SSTs through graph convolution. We designed a time encoder GRU to capture the trend of SST over time.
We conducted extensive experiments on SST datasets in representative regions, and the results demonstrated the effectiveness and superiority of our model in predicting SSTs in representative regions.

The rest of this paper is organized as follows. We will describe the technical details of our method in Section 2. Section 3 presents the experimental part. Section 4 provides the results of validation and evaluation. Finally, Section 5 will conclude this paper.

2. Methods and Materials

2.1. Problem Statement

For SST prediction, we typically divide the study area into grids based on longitude and latitude, with the data at grid points representing SST. Assuming there are $L$ grids along the latitude and $W$ grids along the longitude, we obtain a total of $L \times W$ grid regions. For example, Figure 1 shows the grid division of a portion of the Northwest Pacific. The target area is within [120°E–130°E, 30°N–40°N], divided into 100 grids of size 1° × 1° each. For each time interval $t$, the SST data of all grid points in the entire area form a matrix $X_{t} \in \mathbb{R}^{L \times W}$. Equation (1) shows the principle of SST sequence prediction, where $F$ represents a prediction model. It involves using SST data from the previous $u$ days, $X_{t-u+1}, X_{t-u+2}, \dots, X_{t}$, to predict the SST of the following $v$ days, $X_{t+1}, X_{t+2}, \dots, X_{t+v}$.

$$X_{t+1}, X_{t+2}, \dots, X_{t+v} = F(X_{t-u+1}, X_{t-u+2}, \dots, X_{t}) \tag{1}$$

In most existing SST prediction studies, the same uniform grid is used to predict a certain study area. In reality, even within the same area, factors such as ocean currents and differences in ocean–land thermal properties lead to an uneven spatial distribution of SST, with significant variations in SST spatial gradients within the region. Therefore, predicting SST within a study area on a single uniform grid has limitations, because regions with strong gradients demand a fine grid resolution.
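The input/target pairing in Equation (1) can be illustrated with a small sliding-window sketch. It is not the authors' data pipeline: the function name `make_windows`, the `(T, L, W)` array layout, and the synthetic values are assumptions made for illustration.

```python
import numpy as np

def make_windows(series, u, v):
    """Split a (T, L, W) SST series into (input, target) pairs.

    Each input holds u consecutive days and each target holds the
    v days that immediately follow, mirroring Equation (1).
    """
    T = series.shape[0]
    inputs, targets = [], []
    for t in range(u, T - v + 1):
        inputs.append(series[t - u:t])    # the u most recent days
        targets.append(series[t:t + v])   # the next v days
    return np.stack(inputs), np.stack(targets)

# Synthetic example: 20 days on a 10 x 10 grid.
sst = np.arange(20 * 10 * 10, dtype=float).reshape(20, 10, 10)
x, y = make_windows(sst, u=7, v=1)
print(x.shape, y.shape)
```

A series of T days yields T - u - v + 1 such pairs, so the 20-day toy series with u = 7 and v = 1 gives 13 samples.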

2.2. Framework of NGGCN

Figure 2 illustrates the framework of the NGGCN model, which consists of two spatiotemporal modules. Each spatiotemporal module comprises a GCN module, an FC module, and a GRU module. The GCN and GRU modules are employed for extracting spatial and temporal features, respectively, while the FC module is used for feature decoding and prediction result output. First, the SST grid data are converted into a graph representation and constructed into a time series input to the GCN module. After the graph convolution operation, the extracted spatial feature matrix is obtained. This matrix is then restored to the features of each node by the FC module. The graph sequence with the extracted features is subsequently input to the GRU module to extract temporal features, which are then restored again through the FC module. Following the first spatiotemporal module, the node information is updated and then input into the second spatiotemporal module for a second round of feature extraction, ultimately leading to the output of the prediction results.

2.3. Spatial-Gradient-Based Non-Uniform Grid Construction

Due to the influence of solar radiation and the thermal property differences between land and sea, the spatial distribution of SST is uneven. In low-latitude areas and the central ocean, SST typically changes more uniformly, whereas in mid-to-high latitude areas and coastal regions, the changes are more dramatic.

First, we calculated the annual average gradient of SSTs in the Northwest Pacific, as shown in Figure 3. The numbers in the figure represent the annual average gradient of SST within each 10° × 10° grid area. Figure 4 shows the visualization results of the annual average gradient of SSTs in the Northwest Pacific Ocean. It can be observed that in the low-latitude area of 0–20°, the spatial gradient of SSTs is relatively small, indicating stable SST variations in this region, which allows for the construction of a relatively sparse grid. As the latitude increases, especially in the 30°–50° region, the spatial gradient of SSTs significantly increases. The redder areas in Figure 4 represent regions with larger SST spatial gradients, indicating more dramatic changes in SSTs and accordingly necessitating the construction of denser grids. Therefore, we constructed a non-uniform grid based on the calculated annual average gradient of SSTs. The core concept is to decide whether to create nodes based on the gradient size. The specific scheme is as follows.

The first step refines the uniform grid. The initial grid has a resolution of 0.25°, and its points are treated as the uniform grid points. A threshold function was designed based on the calculated gradient. Since most areas with uniform SST changes have gradients below 0.1, and areas with dramatic SST changes have gradients above 0.2, 0.1 and 0.2 were used as boundary points. For gradients between 0 and 0.1, indicating smooth SST changes, only the node itself is considered. For gradients between 0.1 and 0.2, indicating relatively dramatic SST changes, the nearest 8 points with a resolution of 0.15° are added. For gradients greater than 0.2, indicating very dramatic SST changes, the nearest 8 points with a resolution of 0.05° are added on top of the 0.15° resolution. The second step merges the uniform grid: if the current uniform grid point has no fine grid points generated around it, the uniform grid point is merged, retaining the current uniform grid point. Together, the two steps yield all non-uniform grid points.
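The refinement rule can be sketched as a small threshold function. This is one possible reading of the scheme rather than the authors' implementation: the names `refinement_level`, `ring`, and `nodes_for_point` are hypothetical, the "resolution" of the added points is interpreted as their offset from the central node, and gradients exactly equal to 0.1 or 0.2 are assigned to the lower class, which the text does not specify. The merging step is not covered.

```python
def refinement_level(gradient, low=0.1, high=0.2):
    """Map an annual mean SST gradient to a refinement level (0, 1, or 2)."""
    if gradient > high:
        return 2
    if gradient > low:
        return 1
    return 0

def ring(lat, lon, step):
    """The 8 neighbours of (lat, lon) at an offset of `step` degrees (assumed layout)."""
    return [(round(lat + i * step, 2), round(lon + j * step, 2))
            for i in (-1, 0, 1) for j in (-1, 0, 1) if (i, j) != (0, 0)]

def nodes_for_point(lat, lon, gradient):
    """Nodes generated around one 0.25-degree uniform grid point."""
    level = refinement_level(gradient)
    nodes = [(lat, lon)]                     # the node itself is always kept
    if level >= 1:
        nodes += ring(lat, lon, 0.15)        # moderately varying SST
    if level == 2:
        nodes += ring(lat, lon, 0.05)        # strongly varying SST
    return nodes

for g in (0.05, 0.15, 0.30):
    print(g, len(nodes_for_point(35.0, 125.0, g)))
```

Under these assumptions a grid point contributes 1, 9, or 17 nodes depending on its gradient class.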

For edge construction, we employed a distance threshold-based method. After generating non-uniform grid nodes, the effective SST data we obtained are stored in a matrix of the same size as our original data, categorized into coarse grid point data, uniform grid point data, and fine grid point data. Coarse grid point data are the merged data. Uniform grid point data are the initial 0.25° resolution data, which are also used for comparative experiments. Fine grid point data are the refined data. Our strategy is to use full connectivity within the same category, connecting each point to nearby points, while using distance threshold-based connections between different category data. Additionally, we introduce two new concepts: direct connection and indirect connection. A direct connection refers to a situation where no other data, such as SST data or land data, exist between two valid data points within the spatial range, allowing for an effective edge between two SST data points. An indirect connection refers to the presence of other data between two valid sea temperature data points within the spatial range, implying no strong correlation, and requiring consideration of the distance between them to decide whether to create an edge based on the distance threshold. Figure 5 shows the construction of uniform and non-uniform grids within the same area.
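The distance-threshold part of the edge construction can be sketched as follows. This is a deliberate simplification: it does not distinguish direct from indirect connections or the three node categories, it measures distance as plain Euclidean distance in degree space, and the function name `threshold_adjacency` and the threshold of 0.3° are assumptions made for illustration.

```python
import numpy as np

def threshold_adjacency(coords, max_dist):
    """Symmetric 0/1 adjacency linking nodes no farther apart than max_dist.

    coords is an (N, 2) array of (lat, lon) in degrees.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adj = (dist <= max_dist).astype(float)
    np.fill_diagonal(adj, 0.0)   # self-loops are added later, in Equation (3)
    return adj

# Three uniform grid points along a parallel plus one refined point.
coords = np.array([[35.0, 125.0], [35.0, 125.25], [35.0, 125.5], [35.15, 125.15]])
adj = threshold_adjacency(coords, max_dist=0.3)
print(adj)
```

In this toy layout the first and third uniform points are 0.5° apart and stay unconnected, while the refined point links to its two nearest uniform neighbours.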

Figure 6 and Figure 7 show, respectively, the annual average gradient of SST in the study area and the non-uniform grid construction based on the gradient. It can be observed that the regions with dense node distribution in Figure 7 correspond to the high-gradient red bands in Figure 6, which supports the rationality of our non-uniform grid construction method.

2.4. Spatiotemporal Module

The spatiotemporal module consists of GCN modules, FC modules, and GRU modules. The GCN module performs graph convolution operations to extract spatial distribution features of SST, the FC module restores the features extracted by the GCN module to each node, and the GRU module extracts temporal features from the graph sequence after spatial feature extraction.

Figure 8 illustrates the structure of the GCN module. The core of the GCN module lies in the graph convolution operation, which aggregates and transforms information from each node and its neighboring nodes in the graph structure to update node information and extract features. In this study, the number of features is 1, representing the SST data. Given an undirected graph $G$ with $N$ nodes, the feature vector $S$ is of size $N \times 1$. We use spectral graph convolution to represent the spatial variation of SST, with the normalized graph Laplacian matrix $L_{sym}$ defined as follows:

$$L_{sym} = I_{N} - D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \tag{2}$$

where $I_{N}$ is the identity matrix, $A$ is the adjacency matrix of graph $G$, and $D$ is the degree matrix, with each diagonal element $D_{ii}$ representing the degree of node $i$, i.e., the number of nodes connected to node $i$. Considering the influence of the node’s own information, we introduce the self-loop and normalized Laplacian matrix $\tilde{L}_{sym}$:

$$\tilde{L}_{sym} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} \tag{3}$$

where $\tilde{A} = A + I$ is the adjacency matrix with added self-loops, and $\tilde{D} = D + I$ is the degree matrix with added self-loops, calculated from $\tilde{A}$. Since each node has an added edge to itself, the degree of each node increases by 1 accordingly, i.e., $\tilde{D}_{ii} = D_{ii} + 1$.
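The matrix in Equation (3) can be computed directly from an adjacency matrix. The sketch below uses NumPy and a three-node path graph as a toy example; the function name `normalized_adjacency` is an assumption, not a name from the paper.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D~^{-1/2} (A + I) D~^{-1/2}, the matrix of Equation (3)."""
    a_tilde = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_tilde.sum(axis=1)              # D~_ii = D_ii + 1
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Path graph 0 - 1 - 2.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L_tilde = normalized_adjacency(A)
print(L_tilde)
```

For this graph the self-loop degrees are 2, 3, and 2, so the entry linking nodes 0 and 1 is 1/sqrt(2 · 3) and the diagonal entries are 1/2, 1/3, and 1/2.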

Given the feature matrix $X^{l}$ at layer $l$, a single graph convolution operation is represented as:

$$X^{l+1} = \sigma(\tilde{L}_{sym} X^{l} W + b) \tag{4}$$

where $W$ is the weight matrix and $b$ is the bias, both of which are trainable parameters, and $\sigma$ is the activation function, set to $\tanh$ in this case. Since we are dealing with time series data, we not only consider the spatial structure of the graph, i.e., the relationships between nodes, but also need to process the node features of each time step. Therefore, we adopt a GCN module that integrates time series data [38]. Considering the node features and the hidden state of each time step, the combined graph convolution operation is specifically expressed as follows:

$$O = \tanh(\tilde{L}_{sym} [X, h] W + b) \tag{5}$$

where $X$ represents the input feature matrix, $h$ represents the hidden state, and $[X, h]$ combines the two along a specific dimension. $O$ represents the output of the graph convolution operation, encompassing the comprehensive features of graph structure information and time series dynamic information.
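Equation (5) can be written out in a few lines of NumPy. The sketch assumes that $[X, h]$ is a concatenation along the feature axis (the text only says "a specific dimension"), uses an identity matrix as a stand-in for $\tilde{L}_{sym}$, and draws random weights; the hidden size of 4 and the function name `graph_conv` are illustrative choices.

```python
import numpy as np

def graph_conv(l_tilde, x, h, weight, bias):
    """O = tanh(L~ [X, h] W + b), following Equation (5)."""
    xh = np.concatenate([x, h], axis=1)        # [X, h] with shape (N, F + H)
    return np.tanh(l_tilde @ xh @ weight + bias)

rng = np.random.default_rng(0)
N, F, H = 3, 1, 4                  # nodes, input features (SST only), hidden size
l_tilde = np.eye(N)                # placeholder for the matrix of Equation (3)
x = rng.normal(size=(N, F))
h = np.zeros((N, H))
W = rng.normal(size=(F + H, H))
b = np.zeros(H)
out = graph_conv(l_tilde, x, h, W, b)
print(out.shape)
```

The output has one row per node and one column per hidden unit, with every value bounded by the tanh activation.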

The first FC module uses $\mathrm{softmax}$ to map the extracted spatial features back to each node. We then use the restored features to extract temporal features. The second FC module similarly restores features. In the first spatiotemporal module, the features are used to update the nodes for input into the next spatiotemporal module. In the second spatiotemporal module, the features are used to output the prediction results.

Figure 9 illustrates the structure of the GRU module. In the GRU module, we combine the graph convolution operation and the gated recurrent unit (GRU) [39]. The update mechanism of the GRU is adapted to handle graph-structured data, processing data related to each node through graph convolution operations. The operation at each time step $t$ can be summarized as follows:

$$r_{t} = \mathrm{sigmoid}(\tilde{L}_{sym} W_{r} [h_{t-1}, X_{t}] + b_{r}) \tag{6}$$

$$u_{t} = \mathrm{sigmoid}(\tilde{L}_{sym} W_{u} [h_{t-1}, X_{t}] + b_{u}) \tag{7}$$

$$\tilde{h}_{t} = \tanh(\tilde{L}_{sym} W_{h} [r_{t} \odot h_{t-1}, X_{t}] + b_{h}) \tag{8}$$

$$h_{t} = u_{t} \odot h_{t-1} + (1 - u_{t}) \odot \tilde{h}_{t} \tag{9}$$

where $X_{t}$ represents the feature vector matrix of time step $t$, $h_{t-1}$ represents the hidden state of the previous time step $t-1$, the sigmoid activation function is used to ensure that the gate signal is within the range of [0, 1], $\tilde{h}_{t}$ represents the candidate hidden state, used to update the current hidden state, $\tanh$ represents the hyperbolic tangent activation function, $h_{t}$ represents the final hidden state, $\odot$ represents the Hadamard product, $W$ and $b$ represent the weight matrix and bias term, with subscripts $r$, $u$, and $h$ representing the reset gate, update gate, and candidate hidden state, respectively, and $\tilde{L}_{sym}$ represents the normalized Laplacian matrix with added self-loops, used for the graph convolution operation. The graph convolution operation $\tilde{L}_{sym}$ is applied to inputs $[h_{t-1}, X_{t}]$ and $[r_{t} \odot h_{t-1}, X_{t}]$, which allows information to propagate in the spatial dimension of the graph. In this way, the GRU module can consider the graph-structured neighborhood information of nodes while updating their state. This design allows the GRU module to simultaneously capture the spatial relationships of graph data and the temporal dependencies of time series data, providing an effective modeling method for graph-structured time series data.
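One step of Equations (6)–(9) can be sketched in NumPy as below. This is an illustration under stated assumptions, not the authors' PyTorch code: the weights are applied on the right of the concatenated input (as in Equation (5)) so that the matrix shapes agree, the concatenation is along the feature axis, $\tilde{L}_{sym}$ is replaced by an identity placeholder, and all sizes and parameter names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def graph_gru_step(l_tilde, x_t, h_prev, p):
    """One update of Equations (6)-(9) for all N nodes at once."""
    hx = np.concatenate([h_prev, x_t], axis=1)              # [h_{t-1}, X_t]
    r = sigmoid(l_tilde @ hx @ p["W_r"] + p["b_r"])         # reset gate, Eq. (6)
    u = sigmoid(l_tilde @ hx @ p["W_u"] + p["b_u"])         # update gate, Eq. (7)
    rhx = np.concatenate([r * h_prev, x_t], axis=1)         # [r_t * h_{t-1}, X_t]
    h_cand = np.tanh(l_tilde @ rhx @ p["W_h"] + p["b_h"])   # candidate, Eq. (8)
    return u * h_prev + (1.0 - u) * h_cand                  # new state, Eq. (9)

rng = np.random.default_rng(0)
N, F, H = 3, 1, 4                       # nodes, input features, hidden size
params = {k: rng.normal(size=(H + F, H)) for k in ("W_r", "W_u", "W_h")}
params.update({k: np.zeros(H) for k in ("b_r", "b_u", "b_h")})
l_tilde = np.eye(N)                     # placeholder for the matrix of Equation (3)

h = np.zeros((N, H))
for x_t in rng.normal(size=(7, N, F)):  # seven synthetic daily inputs
    h = graph_gru_step(l_tilde, x_t, h, params)
print(h.shape)
```

Because the new state is a gate-weighted blend of the previous state and a tanh output, it stays within [-1, 1] when the initial state is zero.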

We designed two spatiotemporal modules in the experiment to perform two-layer feature extraction. Finally, we output the sequence of extracted spatial and temporal features through an FC layer to obtain the final prediction result.

2.5. Output and Loss Function

In this paper, the feature vector of the data is one-dimensional, which is the SST data. Therefore, the second FC layer of the second spatiotemporal module serves as the output layer of the model. After the feature restoration through this FC layer, the updated node information from the two spatiotemporal modules is mapped back to each corresponding node, resulting in the final output result.

We use Mean Squared Error (MSE) as the loss function for this study, as shown in Equation (10):

$$L_{MSE} = \frac{1}{T} \sum_{t=1}^{T} (y_{t} - \hat{y}_{t})^{2} \tag{10}$$

where $T$ represents the total prediction steps, $t$ represents each time step, $y_{t}$ represents the true value at time $t$, and $\hat{y}_{t}$ represents the predicted value at time $t$.

3. Experiments

3.1. Dataset

In the experiment, we used the global Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA) dataset from the UK Met Office, which combines in situ data with satellite data from infrared and microwave radiometers. It provides daily average SST at a spatial resolution of 0.05°.

Specifically, we selected SST data from the representative area of the Yellow Sea and Bohai Sea [30°N–40°N, 120°E–130°E], which included 40,000 grid regions with a resolution of 0.05°. For the comparative model, we conducted experiments using grid data with a resolution of 0.25°. We used a total of 10,585 daily SST fields from 1 January 1992 to 31 December 2020 to train and test the model.
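The sizes quoted above can be checked with a few lines of arithmetic. The sketch assumes a square 10° × 10° region and counts cells as extent divided by resolution. It also shows that 29 × 365 reproduces the stated 10,585 samples, whereas a calendar count over the same dates, which includes eight leap days, gives 10,593; how the leap days were handled is not stated, so the sketch simply reports both.

```python
from datetime import date

extent_deg = 10.0                              # 30N-40N and 120E-130E
fine_cells = round(extent_deg / 0.05) ** 2     # cells at 0.05 degrees
coarse_cells = round(extent_deg / 0.25) ** 2   # cells at 0.25 degrees

# Inclusive calendar count versus 29 years of 365 days each.
calendar_days = (date(2020, 12, 31) - date(1992, 1, 1)).days + 1
days_without_leap = 29 * 365

print(fine_cells, coarse_cells, calendar_days, days_without_leap)
```

The 0.25° baseline grid therefore has 25 times fewer cells than the full 0.05° field.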

3.2. Experimental Settings

We divided the dataset into training set, validation set, and testing set in a ratio of 8:1:1. We used Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the coefficient of determination (R-squared) as metrics to evaluate the performance of different SST prediction models. The specific formulas are as follows:

$$\mathrm{MAE} = \frac{1}{N} \sum_{n=1}^{N} |X_{n} - \hat{X}_{n}| \tag{11}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} (X_{n} - \hat{X}_{n})^{2}} \tag{12}$$

$$R^{2} = 1 - \frac{\sum_{n=1}^{N} (X_{n} - \hat{X}_{n})^{2}}{\sum_{n=1}^{N} (X_{n} - \bar{X})^{2}} \tag{13}$$

where $X_{n}$ represents the true value, $\hat{X}_{n}$ represents the predicted value, and $\bar{X}$ represents the mean of the true values. We conducted SST predictions on daily, weekly, and monthly scales, using a cyclic prediction method in which 7 days of data predict 1 day, 30 days predict 7 days, and 120 days predict 30 days. We optimized the model using the Adam optimizer for a maximum of 1000 epochs and used an early stopping strategy with a patience of 20. The learning rate was set at 0.001 and the batch size was 32.
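The three metrics in Equations (11)–(13) translate directly into NumPy. The sample values below are synthetic and chosen only so that the results are easy to verify by hand.

```python
import numpy as np

def mae(x, x_hat):
    """Mean Absolute Error, Equation (11)."""
    return float(np.mean(np.abs(x - x_hat)))

def rmse(x, x_hat):
    """Root Mean Square Error, Equation (12)."""
    return float(np.sqrt(np.mean((x - x_hat) ** 2)))

def r_squared(x, x_hat):
    """Coefficient of determination, Equation (13)."""
    ss_res = np.sum((x - x_hat) ** 2)
    ss_tot = np.sum((x - np.mean(x)) ** 2)
    return float(1.0 - ss_res / ss_tot)

x = np.array([20.0, 21.0, 22.0, 23.0])        # synthetic true SST values
x_hat = np.array([20.5, 21.0, 21.5, 23.0])    # synthetic predictions
print(mae(x, x_hat), rmse(x, x_hat), r_squared(x, x_hat))
```

For these values the residual sum of squares is 0.5 and the total sum of squares is 5, so MAE is 0.25, RMSE is sqrt(0.125), and R-squared is 0.9.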

Our experiments were implemented in PyTorch and ran on an NVIDIA A100 80 GB PCIe GPU.
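The early stopping rule with a patience of 20 and a cap of 1000 epochs can be sketched in plain Python. The monitored quantity is assumed here to be the validation loss, which the text does not state, and the function name `early_stopping_epoch` and the loss sequence are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=20, max_epochs=1000):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, since_best = loss, 0       # improvement resets the counter
        else:
            since_best += 1
            if since_best >= patience:       # no improvement for `patience` epochs
                return epoch
    return min(len(val_losses), max_epochs)

# Synthetic losses: ten improving epochs, then a plateau.
losses = [1.0 / e for e in range(1, 11)] + [0.2] * 50
stop = early_stopping_epoch(losses)
print(stop)
```

With this sequence the best loss occurs at epoch 10, so training halts 20 epochs later, at epoch 30.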

3.3. Results

To demonstrate the superiority of the NGGCN model, we selected several advanced GNN SST prediction models as baselines, including the GCN model, GRU model, and GCN-LSTM model. The baseline methods used a uniform grid with a resolution of 0.25°. Table 1 shows the prediction results of these models and the NGGCN model in the Yellow Sea and Bohai Sea areas. The results indicate that the NGGCN is significantly superior to the baseline methods.

To explore the improvement effect of non-uniform grid construction on the model, we ran non-uniform grid data on all baseline models and compared the results with data from a 0.25° uniform grid. Table 2 shows the comparison results. The results indicate that non-uniform grid structure data had significant improvement effects on various baseline models.

4. Validation and Evaluation

4.1. Overall Evaluation

To verify the accuracy of the NGGCN at different time scales and prediction steps, we conducted sequence predictions for the next 1 day, 7 days, and 30 days into the future on the selected sea dataset. Figure 10 shows the comparison results of evaluation metrics for each model, indicating that the NGGCN outperforms the selected baseline models across different time scales. Compared to the GCN-LSTM model, which performed the best out of the baseline models, the NGGCN showed an RMSE improvement of 23.7%, an MAE improvement of 24.9%, and an R-squared increase from 0.990 to 0.995 on the daily scale. On the weekly scale, RMSE improved by 7.3%, MAE by 9.1%, and R-squared increased from 0.985 to 0.986. On the monthly scale, RMSE improved by 4.9%, MAE by 4.1%, and R-squared increased from 0.972 to 0.974. Figure 11 presents the visual comparison of the prediction results and true values for each model. It can be observed that our model shows improvements across all metrics compared to the selected baseline models at different time scales, with the most significant improvements on the daily scale, followed by the weekly scale, and relatively smaller improvements on the monthly scale.

The enhancement effect of our model diminishes as the time scale and prediction steps increase. This is primarily because our proposed non-uniform grid construction method addresses the complexity of SST spatial distribution and is modeled based on the spatial distribution of SST. As the time scale and prediction steps increase, especially for monthly scale predictions where we predict the SST for the next 30 days based on the SST data from the previous 120 days, the data span seasonal changes and present significant differences. Therefore, the temporal features become more complex. Although our model includes a spatiotemporal module that combines graph convolution and GRU to extract spatiotemporal correlations, the GRU has certain limitations in extracting long-term temporal features, resulting in less significant improvements for long-term monthly scale predictions. This will be a direction for future improvements in our work.

4.2. Ablation Study

To verify the improvement effect of the non-uniform grid construction method on the model, we conducted experiments on three baseline graph neural network models using non-uniform grid data, and compared the experimental results with those using 0.25° uniform grid data. At different time scales and prediction steps, all baseline models showed improvements in all metrics. Figure 12 shows the improvement of three baseline models in various indicators after adding the non-uniform grid method. On a daily scale, RMSE improved by 9.0%, MAE improved by 9.1%, and R-squared increased from 0.967 to 0.975. On a weekly scale, RMSE improved by 9.9%, MAE improved by 10.9%, and R-squared increased from 0.958 to 0.963. On a monthly scale, RMSE improved by 11.7%, MAE improved by 9.6%, and R-squared increased from 0.972 to 0.976.

Figure 13 shows the visualization results of the three baseline models, GCN, GRU, and GCN-LSTM, after incorporating the non-uniform grid method. For all three models, the non-uniform grid construction method improves predictive performance over the uniform grid, which demonstrates its effectiveness.

5. Conclusions

In this paper, we propose the NGGCN model for sequence prediction of SST. The core idea of the model is a non-uniform grid construction method driven by spatial gradients. We first compute the annual mean spatial gradient of SST within the region and design a threshold function based on the computed gradient values. To convert SST data into a graph representation, we first create a sparse, uniform grid. We then decide whether to refine or merge the cells of this initial grid according to the spatial gradient and the threshold function, which completes the creation of nodes. For the construction of edges, we introduce the concepts of direct connection and indirect connection: for directly connected points, we construct edges directly, while for indirectly connected points, we use a distance-threshold-based method to construct edges. Together, the node and edge construction steps form the non-uniform grid construction method based on spatial gradients.

We also propose a spatiotemporal module for sequence prediction, which consists of a GCN module, a fully connected (FC) module, and a GRU module. The SST data are first converted into a graph representation and arranged into a sequence. The sequence is input into the GCN module to extract spatial features, which are then restored to each node through an FC layer. Finally, temporal features are extracted through the GRU module. In the spatiotemporal module, the GCN module contains sequence information, while the GRU module contains the graph convolution process. Therefore, both spatial and temporal features are extracted in each module. To verify the effectiveness of the NGGCN, we compared it with three baseline models: GCN, GRU, and GCN-LSTM.
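The node and edge construction summarized in this section can be illustrated with a simplified sketch. It rests on several assumptions that are ours and not the paper's: the threshold is taken as the mean gradient of the region (a stand-in for the designed threshold function), cells are only refined once into four sub-cells (merging is omitted), and all edges come from a single distance threshold (the direct/indirect distinction is not reproduced). The function names and the synthetic SST field are hypothetical.

```python
import numpy as np

def gradient_magnitude(sst):
    """Magnitude of the horizontal gradient of a 2D SST field (unit grid spacing)."""
    d_row, d_col = np.gradient(sst)
    return np.hypot(d_row, d_col)

def build_nodes(sst, grad, coarse=4, threshold=None):
    """Create one node per coarse cell, splitting high-gradient cells into four.

    Assumes the field dimensions are multiples of `coarse` and `coarse` is even.
    Returns an array with rows (row_center, col_center, mean_sst).
    """
    if threshold is None:
        threshold = grad.mean()  # assumed rule, not the paper's threshold function
    half = coarse // 2
    nodes = []
    for r in range(0, sst.shape[0], coarse):
        for c in range(0, sst.shape[1], coarse):
            if grad[r:r + coarse, c:c + coarse].mean() > threshold:
                # High-gradient cell: refine into four sub-cells.
                blocks = [(r + dr, c + dc, half) for dr in (0, half) for dc in (0, half)]
            else:
                # Low-gradient cell: keep the coarse cell as a single node.
                blocks = [(r, c, coarse)]
            for r0, c0, size in blocks:
                cell = sst[r0:r0 + size, c0:c0 + size]
                nodes.append((r0 + size / 2, c0 + size / 2, cell.mean()))
    return np.array(nodes)

def build_edges(nodes, max_dist):
    """Adjacency matrix linking distinct nodes whose centers lie within max_dist."""
    coords = nodes[:, :2]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return ((dist <= max_dist) & (dist > 0)).astype(float)

# Synthetic 16 x 16 field with a sharp front near column 10 (not real SST data).
_, cols = np.mgrid[0:16, 0:16]
sst = 15.0 + 5.0 * np.tanh((cols - 10) / 1.5)

grad = gradient_magnitude(sst)
nodes = build_nodes(sst, grad, coarse=4)
adj = build_edges(nodes, max_dist=4.0)
print(nodes.shape, int(adj.sum()) // 2)  # number of nodes (with features) and of undirected edges
```

In this toy field, only the cells that contain the front are refined, so node density follows the gradient, which is the behavior the non-uniform grid is meant to achieve.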

We selected the representative Yellow Sea and Bohai Sea region for the experiments. The region lies in the mid-latitudes, far from the ocean center, and its SST changes are relatively dramatic, so it is well suited to verifying the effectiveness of the non-uniform grid construction method based on spatial gradients. We conducted experiments on daily, weekly, and monthly scales, and the results show that our NGGCN model improves on the three baseline models in all evaluation metrics.

To validate the effectiveness of our proposed non-uniform grid construction method, we ran all baseline models on non-uniform grid data and compared the results with those obtained on uniform grid data. The experimental results indicate that all evaluation indicators improve with non-uniform grid data, demonstrating the effectiveness of the non-uniform grid construction method.

In addition, we found that the model's improvement in SST prediction is more significant on the daily scale. As the time scale and prediction steps increase, especially on the monthly scale, the improvement becomes less pronounced. We attribute this to the obvious seasonal and periodic changes in SST: as the number of prediction steps increases, the model's ability to capture the complex temporal dependencies and nonlinear relationships in the data becomes insufficient. Moreover, SST gradients show significant spatial differences between seasons, whereas our work mainly considered the overall spatial features of SST at different time scales. We will study the seasonal variation of SST in future work. Finally, SST is influenced by various factors such as solar radiation, atmospheric temperature, and ocean current advection, while our model makes predictions based on SST data alone, which is a limitation. In the future, we will incorporate more meteorological and oceanic elements into the model as additional input features to further improve SST prediction performance.

Author Contributions

Conceptualization, G.L. and Q.L.; methodology, G.L.; software, G.L.; validation, G.L., J.Z., X.Z. (Xiaofeng Zhao), Q.L. and X.Z. (Xuan Zhou); formal analysis, G.L.; investigation, G.L.; resources, Q.L.; data curation, G.L.; writing—original draft preparation, G.L.; writing—review and editing, Q.L. and X.Z. (Xiaofeng Zhao); visualization, G.L.; supervision, X.Z. (Xiaofeng Zhao), Q.L. and X.Z. (Xuan Zhou); project administration, Q.L.; funding acquisition, Q.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 42075139, U2242201, 41305138), the China Postdoctoral Science Foundation (Grant No. 2017M621700), the Hunan Province Natural Science Foundation (Grant No. 2021JC0009, 2021JJ30773), and the Fengyun Application Pioneering Project (FY-APP 2022.0605).

Data Availability Statement

The SST data can be downloaded from https://data.marine.copernicus.eu/product/SST_GLO_SST_L4_NRT_OBSERVATIONS_010_001/description (accessed on 26 December 2021).

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. The Yellow Sea and Bohai Sea region is divided into a grid system, with the area [120°E–130°E, 30°N–40°N] segmented into 100 grids, including 10 grids along the latitude and 10 grids along the longitude.

Figure 2. Framework of the NGGCN model. The core of the model is the spatiotemporal block (ST block), which consists of the GCN module, the FC module, and the GRU module. ST block 1 is used for node update and ST block 2 is used for output. The blue nodes represent the graph representation obtained by converting SST data. The red nodes represent the updated nodes after spatial and temporal feature extraction.

Figure 3. Annual average SST gradient within each 10° × 10° grid area in the Northwest Pacific Ocean. The red color indicates higher temperatures and the blue color indicates lower temperatures.

Figure 4. Visualization of the SST annual average gradient in the Northwest Pacific Ocean. Darker colors indicate larger gradients. There is an obvious high-gradient zone between 30°N and 50°N, and the gradient in coastal areas is significantly higher than that in the center of the ocean.

Figure 5. Constructions of uniform and non-uniform grids within the same area: (a) uniform grids; (b) non-uniform grids.

Figure 6. Annual average gradient of SSTs within the region. Darker colors indicate larger gradients.

Figure 7. Construction of a non-uniform grid. A denser grid indicates a larger SST gradient in that area. The color of the grid represents the SST value (red indicates higher temperatures and blue indicates lower temperatures).

Figure 8. Architecture of the GCN block, where ⨂ represents matrix multiplication and ⨁ represents matrix addition.

Figure 9. Architecture of the GRU block, where ⨂ represents matrix multiplication, ⨁ represents matrix addition, and ⨀ represents the Hadamard product.

Figure 10. Comparison of prediction results at different time scales and prediction steps: (a) RMSE; (b) MAE; (c) R-squared. Each subfigure shows the performance of all models on evaluation metrics at the daily, weekly, and monthly scales.

Figure 11. Comparison of the predicted results of each model with the ground truth. The four columns on the left show the predicted results of each model, and the rightmost column shows the true values. The rows show the results on the daily, weekly, and monthly scales, respectively.

Figure 12. Improvement of three baseline models in various indicators after adding non-uniform grid method: (a) RMSE; (b) MAE; (c) R-squared.

Figure 13. Visualization results of comparing the baseline models GCN, GRU, and GCN-LSTM with the true values after incorporating non-uniform grid methods: (a) GCN; (b) GRU; (c) GCN-LSTM.

Table 1. Prediction results on a daily, weekly, and monthly scale.

Model	Metrics	Daily	Weekly	Monthly
GCN	RMSE	1.182	1.335	1.553
	MAE	0.793	0.956	1.209
	R-squared	0.967	0.958	0.946
GRU	RMSE	0.684	1.120	2.382
	MAE	0.516	0.843	1.828
	R-squared	0.989	0.970	0.872
GCN-LSTM	RMSE	0.587	0.824	1.120
	MAE	0.413	0.615	0.828
	R-squared	0.990	0.985	0.972
NGGCN (ours)	RMSE	0.448	0.764	1.065
	MAE	0.310	0.559	0.794
	R-squared	0.995	0.986	0.974

Table 2. Prediction results of the baseline models with the non-uniform grid structure on a daily, weekly, and monthly scale.

Model	Metrics	Daily	Weekly	Monthly
GCN	RMSE	1.075	1.202	1.351
	MAE	0.721	0.851	1.052
	R-squared	0.975	0.963	0.959
GRU	RMSE	0.629	1.016	2.004
	MAE	0.474	0.759	1.609
	R-squared	0.991	0.974	0.906
GCN-LSTM	RMSE	0.552	0.768	0.989
	MAE	0.395	0.574	0.748
	R-squared	0.991	0.986	0.976

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
