Multivariate Sea Surface Prediction in the Bohai Sea Using a Data-Driven Model

Hu, Song; Shao, Qi; Li, Wei; Han, Guijun; Zheng, Qingyu; Wang, Ru; Liu, Hanyu

doi:10.3390/jmse11112096


by Song Hu ¹, Qi Shao ¹,²,³,*, Wei Li ¹,*, Guijun Han ¹, Qingyu Zheng ¹, Ru Wang ¹ and Hanyu Liu ¹

¹ Tianjin Key Laboratory for Marine Environmental Research and Service, School of Marine Science and Technology, Tianjin University, Tianjin 300072, China

² Institute of Oceanography, College of Geography and Oceanography, Minjiang University, Fuzhou 350108, China

³ Fujian Key Laboratory on Conservation and Sustainable Utilization of Marine Biodiversity, Minjiang University, Fuzhou 350108, China

* Authors to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2023, 11(11), 2096; https://doi.org/10.3390/jmse11112096

Submission received: 25 September 2023 / Revised: 29 October 2023 / Accepted: 30 October 2023 / Published: 1 November 2023

(This article belongs to the Special Issue Sea Surface Temperature: From Observation to Applications II)


Abstract

Data-driven predictions of marine environmental variables typically focus on single variables. However, in real marine environments, different oceanic variables are correlated, and sea–air interactions play a significant role in the evolution of the marine environment, so both internal dynamics and external drivers contribute to these changes. In this study, a data-driven model is proposed using sea surface height anomaly (SSHA), sea surface temperature (SST), and sea surface wind (SSW) in the Bohai Sea. The model combines multivariate empirical orthogonal functions (MEOFs) with a long short-term memory (LSTM) network. MEOF analysis is applied to the multivariate dataset of SSHA and SST to account for the correlation among sea surface variables, and SSW is introduced as a predictor to enhance the predictability of the multivariate sea surface model. For the Bohai Sea, comparative tests were carried out among a model without the wind field effect, a fully coupled model, and the proposed prediction model. MEOF analysis is employed in these comparative experiments for oceanic variables, atmospheric variables, and combined atmospheric and oceanic variables. The results demonstrate that using the wind field as a predictor improves the forecast accuracy of SSHA and SST in the Bohai Sea. The root mean square errors (RMSEs) of SSHA and SST for a 7-day forecast are 0.016 m and 0.3200 °C, respectively.

Keywords:

data-driven model; sea surface multivariate prediction; Bohai Sea; sea surface wind

1. Introduction

Changes in SSHA and SST may lead to some climate extremes [1,2]. Therefore, accurate prediction of SSHA and SST is of great scientific significance [3]. Traditional oceanic numerical models often rely on complex physical parameters and require significant computational resources, which may limit the accuracy of predictions. However, with the continuous enrichment of ocean data types and volumes, data-driven prediction has gradually become a research hotspot [4,5,6]. Data-driven approaches are primarily based on objective data itself, free from the constraints of physical equations, and they greatly reduce the influence of parameter uncertainties. This allows for more accurate predictions and less reliance on complex physical modeling.

Data-driven methods can effectively use historical data to extract valuable information from them and make forecasts. There are two primary classifications of data-driven forecasting techniques: conventional statistical forecasting techniques and neural network prediction methods. Common statistical forecasting techniques include empirical orthogonal function (EOF) decomposition [7], multivariate empirical orthogonal functions (MEOF) decomposition [8], linear regression (LR) techniques [9,10], conventional correlation analysis methods [11], and support vector machines (SVM) [12,13]. Neural network prediction methods encompass various types, including artificial neural networks (ANNs) [14,15], gated recurrent unit (GRU) neural network [16], memory in memory (MIM) neural network [17], deep neural networks (DNNs) [18], LSTM network [19,20,21], and back propagation (BP) neural networks [22]. Additionally, there is the transfer learning model [23].

Recent studies have used data-driven approaches to predict various oceanic variables. For example, Zhang et al. used a long short-term memory (LSTM) network to predict the SST in the Bohai Sea [24]; Wei et al. used a multilayer perceptron (MLP) model to predict the SST in the South China Sea [25]; Xie et al. combined deep learning with the attention mechanism to construct an adaptive model for SST prediction in the Bohai Sea region [26]; Xie et al. proposed a method to predict the SST in the Bohai Sea by combining the convolutional gated recurrent unit (GRU) and the multilayer perceptron (CGMP) [27]; Zhang et al. developed a GRU-based model to predict SST over the medium and long term, with experiments on multiple time-scale datasets in the Bohai Sea [28]; and Gao et al. proposed a global spatio-temporal graph attention network (GSTGAT) in combination with a graph neural network (GNN), with experiments on multiple time-scale datasets in the Bohai Sea [29].

However, most of the above studies construct prediction models for single variables at a single scale and ignore the influence of external drivers, which may limit improvements in forecast accuracy. To enhance the consistency between data-driven models and actual physical processes, it is important to consider the correlation between different ocean variables. In this study, we adopt a data-driven forecasting framework similar to that of previous studies [30,31,32,33]. However, our previous study was based on the South China Sea (SCS) region, which has a mean water depth of 1200 m. In that region, internal dynamical mechanisms dominate the oceanic evolution, and the influence of external drivers is relatively small; hence, the combined forecasting technique for oceanic variables proves efficient in the South China Sea. The study area here, the Bohai Sea, has an average depth of only 18 m, and meteorological driving factors and internal dynamic mechanisms play comparable roles in its evolution. Thus, it is necessary to introduce external drivers into the coupled model and examine the results obtained through different coupling methods. More generally, the interaction between the atmosphere and the ocean raises the question of how to introduce atmospheric information into data-driven models.

A data-driven forecasting model for SSHA and SST in the Bohai Sea is constructed by using MEOF analysis and an LSTM neural network. In particular, the sea surface wind field is introduced into the model as a predictor to improve the forecast accuracy. This method not only considers the dynamical coordination relationship between different variables in the ocean but also takes into account the role of the atmosphere on the ocean, which can improve the forecast accuracy of marine environmental variables.

The remainder of this paper is structured as follows: Section 2 describes the data and methods used in this research. Section 3 presents the model prediction experiments and results for SSHA and SST in the Bohai Sea. Finally, Section 4 gives the conclusions.

2. Data and Methodology

2.1. Study Area and Data Collection

In this study, the performance of the proposed model is assessed using long-term satellite remote sensing data for SSHA, SST, and SSW (U and V components) in the Bohai Sea of China, within 116° to 124° E and 36° to 42° N. SSHA and SST are two important variables of the ocean environment that directly reflect changes in the ocean. SSW is one of the pathways for heat exchange between the ocean and the atmosphere and acts as a medium for their interaction. In the subsequent construction of the prediction model, SSW represents an external driver, enabling the model to better simulate the real evolution of the ocean environment.

The SSHA data are daily 1/4° fields provided by the Copernicus Marine Environment Monitoring Service (CMEMS). The SST data are the daily 1/4° Optimum Interpolation Sea Surface Temperature (OISST) from the National Oceanic and Atmospheric Administration (NOAA). The SSW data (U and V components) are the Cross-Calibrated Multi-Platform (CCMP) wind field, obtained from the NASA Earth Science Enterprise (ESE). The data used in this study span 28 years, from January 1993 to December 2020. The training dataset covers 1993 to 2015, while model validation uses independent samples from 2016 to 2020, a span of 5 years.

2.2. Proposed Model

This study presents a forecasting model for predicting SSHA and SST in the Bohai Sea, utilizing MEOF analysis and an LSTM neural network. Figure 1 provides a visual representation of the model, highlighting its three primary components: the MEOF analysis phase, the LSTM neural network prediction phase, and the data reconstruction phase.

In the stage of MEOF analysis, the preprocessed satellite data variables are separated into a training set and a testing set. The orthogonal spatial patterns and principal components (PCs) of the training set are obtained by MEOF decomposition. The orthogonal spatial patterns are called EOFs. The PCs of the testing set can be obtained by projecting onto the EOFs. Section 2.3 summarizes the MEOF analysis process.

In the prediction stage of the LSTM neural network, PCs with a certain variance ratio are selected from the previous step and used as the input of the LSTM neural network. The predictive value of PCs is obtained by using an LSTM neural network. Section 2.4 describes the LSTM network used in this study.

During the data reconstruction stage, the reconstructed field is achieved by combining the prediction values of the PCs with the EOFs.

2.3. MEOF Analysis

MEOF analysis is a valuable tool for examining the spatial and temporal distribution characteristics of variables within the ocean and atmospheric domains. In this study, MEOF analysis is adopted to decompose the multivariate sample matrix composed of SSHA and SST. The U and V components of SSW (referred to as Uwind and Vwind) are also subjected to the same decomposition. Additionally, the MEOF analysis is employed to decompose the multivariate sample matrix consisting of SSHA, SST, Uwind, and Vwind. The specific expressions are as follows.

X_{1} = \begin{bmatrix} X_{\mathrm{SSHA}}^{1} & \dots & X_{\mathrm{SSHA}}^{m} & \dots & X_{\mathrm{SSHA}}^{M} \\ X_{\mathrm{SST}}^{1} & \dots & X_{\mathrm{SST}}^{m} & \dots & X_{\mathrm{SST}}^{M} \end{bmatrix}

(1)

X_{2} = \begin{bmatrix} X_{\mathrm{Uwind}}^{1} & \dots & X_{\mathrm{Uwind}}^{m} & \dots & X_{\mathrm{Uwind}}^{M} \\ X_{\mathrm{Vwind}}^{1} & \dots & X_{\mathrm{Vwind}}^{m} & \dots & X_{\mathrm{Vwind}}^{M} \end{bmatrix}

(2)

X_{3} = \begin{bmatrix} X_{\mathrm{SSHA}}^{1} & \dots & X_{\mathrm{SSHA}}^{m} & \dots & X_{\mathrm{SSHA}}^{M} \\ X_{\mathrm{SST}}^{1} & \dots & X_{\mathrm{SST}}^{m} & \dots & X_{\mathrm{SST}}^{M} \\ X_{\mathrm{Uwind}}^{1} & \dots & X_{\mathrm{Uwind}}^{m} & \dots & X_{\mathrm{Uwind}}^{M} \\ X_{\mathrm{Vwind}}^{1} & \dots & X_{\mathrm{Vwind}}^{m} & \dots & X_{\mathrm{Vwind}}^{M} \end{bmatrix}

(3)

where X_{\mathrm{SSHA}}^{m} is the SSHA sample on the m-th day; SST, Uwind, and Vwind are expressed in the same way. The spatial dimension of each variable is denoted by N, and the time dimension of each variable by M. The spatial dimension of the four variables is uniform, so the sample matrix of each variable has dimension N \times M. In this study, N and M are 257 points and 8395 days, respectively.
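As a quick arithmetic check on M, the sketch below counts the days in the 1993 to 2015 training period. The value 8395 equals 23 × 365, which would be consistent with 29 February being dropped from the daily series; the source does not state this, so the `drop_leap_day` option and the function name `daily_dates` are assumptions made purely for illustration.

```python
from datetime import date, timedelta


def daily_dates(start, end, drop_leap_day=True):
    """List every calendar day from start to end (inclusive).

    drop_leap_day is an assumption: the source does not say how leap days
    were handled, only that M = 8395 for the training period.
    """
    days = []
    current = start
    while current <= end:
        is_leap_day = current.month == 2 and current.day == 29
        if not (drop_leap_day and is_leap_day):
            days.append(current)
        current += timedelta(days=1)
    return days


train_no_leap = daily_dates(date(1993, 1, 1), date(2015, 12, 31))
train_all = daily_dates(date(1993, 1, 1), date(2015, 12, 31), drop_leap_day=False)
print(len(train_no_leap), len(train_all))
```

Under this assumption the first count matches the stated M, while keeping the five leap days of 1996 to 2012 gives five more samples.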

In this study, we are more concerned with the variation of the anomalies of variables, which are constructed by subtracting the climatology.

X_{1}^{'} = X_{1} - {\bar{X}}_{1}

(4)

where X_{1}^{'} denotes the anomaly sample matrix, and \bar{X}_{1} denotes the climatological mean.
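Equation (4) can be illustrated for a single grid point. This is a minimal sketch that assumes the climatology is a day-of-year mean; the source does not specify how the climatological mean is constructed, and the function name `remove_climatology` and the toy series are hypothetical.

```python
from collections import defaultdict


def remove_climatology(days_of_year, values):
    """Subtract a day-of-year mean from a 1-D series (one grid point).

    days_of_year: calendar-day index of each sample.
    values: the observed values, same length.
    Returns (anomalies, climatology), where climatology maps day -> mean.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in zip(days_of_year, values):
        sums[day] += value
        counts[day] += 1
    climatology = {day: sums[day] / counts[day] for day in sums}
    anomalies = [value - climatology[day] for day, value in zip(days_of_year, values)]
    return anomalies, climatology


# Toy example: two "years" of three calendar days each.
doy = [1, 2, 3, 1, 2, 3]
sst = [10.0, 11.0, 12.0, 12.0, 13.0, 14.0]
anom, clim = remove_climatology(doy, sst)
```

In this toy case the second year is uniformly 2 °C warmer than the first, so the anomalies are -1 for the first year and +1 for the second.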

The covariance matrix C_{2N \times 2N} of matrix X_{1}^{'} can be expressed as follows:

C_{2 N \times 2 N} = \frac{1}{M} X_{1}^{'} \times X_{1}^{' T}

(5)

Note that C_{2N \times 2N} accounts for the relationships among the different variables. Its eigenvalues and eigenvectors satisfy:

C_{2 N \times 2 N} \times V_{2 N \times 2 N} = V_{2 N \times 2 N} \times E_{2 N \times 2 N}

(6)

The eigenvalues on the diagonal of E_{2N \times 2N} are arranged in descending order, and each non-zero eigenvalue corresponds to an eigenvector, i.e., a column of V_{2N \times 2N}. These eigenvectors are the orthogonal spatial patterns, called EOFs. Projecting the anomaly sample matrix onto the EOFs yields the principal components (PC), expressed as:

P C_{2 N \times M} = V_{2 N \times 2 N}^{T} \times X_{1}^{'}

(7)

This study uses a limited number of orthogonal spatial patterns with significant variance to reconstruct the primary spatial structure of each variable. Each row of PC_{2N \times M} is the PC time series corresponding to one eigenvector (one column of V_{2N \times 2N}).

We retained the top 15 EOFs, accounting for 89% of the total variance. The remaining task is to analyze and forecast these time series.
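Equations (5) to (7), together with the truncation to the leading modes, can be sketched with NumPy. This is an illustration on synthetic data, not the authors' code: the function names, the `var_ratio` parameter, and the toy field are assumptions, and no per-variable scaling is applied because the source does not mention one.

```python
import numpy as np


def meof_fit(x_anom, var_ratio=0.89):
    """Fit EOFs to a stacked anomaly matrix of shape (n_features, n_times).

    For the SSHA+SST case the rows would be the N SSHA grid points followed
    by the N SST grid points, so n_features = 2N.
    """
    n_times = x_anom.shape[1]
    cov = x_anom @ x_anom.T / n_times            # Equation (5)
    eigvals, eigvecs = np.linalg.eigh(cov)       # Equation (6); eigh sorts ascending
    order = np.argsort(eigvals)[::-1]            # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = np.cumsum(eigvals) / np.sum(eigvals)
    # Smallest number of leading modes whose cumulative variance reaches var_ratio.
    k = min(int(np.searchsorted(explained, var_ratio)) + 1, len(eigvals))
    return eigvecs[:, :k], explained[:k]


def meof_project(eofs, x_anom):
    """Principal components, Equation (7): PC = V^T X'."""
    return eofs.T @ x_anom


def meof_reconstruct(eofs, pcs):
    """Map (predicted) PCs back to the stacked anomaly field."""
    return eofs @ pcs


# Synthetic stand-in for [SSHA; SST] anomalies: 2 grid points per variable, 300 days.
rng = np.random.default_rng(0)
time = np.arange(300)
signal = np.outer([1.0, 0.8, 2.0, 1.5], np.sin(2 * np.pi * time / 50))
x = signal + 0.1 * rng.standard_normal((4, 300))
x_anom = x - x.mean(axis=1, keepdims=True)       # remove the time mean of each row

eofs, explained = meof_fit(x_anom, var_ratio=0.89)
pcs = meof_project(eofs, x_anom)
x_rec = meof_reconstruct(eofs, pcs)
```

The same `eofs` can be reused to project a testing set and to rebuild a field from forecast PCs. The fraction of variance lost in `x_rec` equals one minus the last entry of `explained`, which is the information loss that truncation to 89% implies.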

2.4. LSTM

In this study, we utilize LSTM networks, which are an improved type of recurrent neural network (RNN). While traditional neural networks are unable to retain information over time, RNNs with a recurrent structure can do so. However, they face the challenge of vanishing gradients when the information is too distant from the current prediction task. This results in the loss of previous information and the inability to handle long-term dependencies. All recurrent neural networks have a chain-like structure consisting of repeating neural network modules. Standard RNNs, such as single-layer RNNs, have a simple repeating module structure, typically a single tanh layer. However, LSTM networks have a special chaining structure that enables information to be looped. By storing both relevant and long-term information, LSTM networks can effectively address the problem of long-term dependencies and predict longer time series [34].

LSTM networks can store both short- and long-term information and can selectively add or delete information. This is achieved through gates that regulate the flow of information. Each gate consists of a sigmoid neural network layer, which determines how much information is passed on or discarded, and a pointwise multiplication operation. An LSTM network uses three gates to protect and control the cell state, as depicted in Figure 2.

The formula of the forget gate is expressed as Formula (8), the formula of the input gate is expressed as Formulas (9) and (10), the formula of the cell state is expressed as Formula (11), and the formula of the output gate is expressed as Formulas (12) and (13).

f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})

(8)

i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})

(9)

{\tilde{C}}_{t} = \tanh (W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C})

(10)

C_{t} = f_{t} * C_{t - 1} + i_{t} * {\tilde{C}}_{t}

(11)

o_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o})

(12)

h_{t} = o_{t} * \tanh (C_{t})

(13)

In these formulas, the forget gate f_{t} combines the previous hidden state h_{t-1} with the current input x_{t} and decides, through the sigmoid function σ, which of the original information to discard. The input gate i_{t} and the \tanh layer determine which information in h_{t-1} and x_{t} to save and produce the candidate cell state \tilde{C}_{t}. The cell state C_{t} combines the retained part of the previous state with the newly stored information. Finally, the output gate o_{t}, together with \tanh, determines which information in h_{t-1}, x_{t}, and C_{t} is output as the hidden state h_{t} at the current time step. W and b denote the weights and biases, respectively.
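The gate equations (8) to (13) can be traced step by step in a few lines. The following is a minimal pure-Python sketch of a single LSTM time step, not the authors' implementation; the function names (`lstm_step`, `affine`), the layout of `params`, and the toy weights are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vec):
    """Return W . vec + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Equations (8)-(13).

    params maps a gate name ("f", "i", "c", "o") to a (W, b) pair, where W has
    one row per hidden unit and len(h_prev) + len(x_t) columns.
    """
    z = h_prev + x_t                                          # [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(*params["f"], z)]       # Equation (8)
    i_t = [sigmoid(a) for a in affine(*params["i"], z)]       # Equation (9)
    c_hat = [math.tanh(a) for a in affine(*params["c"], z)]   # Equation (10)
    c_t = [f * c + i * g
           for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]    # Equation (11)
    o_t = [sigmoid(a) for a in affine(*params["o"], z)]       # Equation (12)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]        # Equation (13)
    return h_t, c_t


# Toy check: 2 hidden units, 1 input, all weights and biases zero.
# Every gate then equals sigmoid(0) = 0.5 and the candidate state is tanh(0) = 0.
params = {gate: ([[0.0] * 3 for _ in range(2)], [0.0, 0.0]) for gate in "fico"}
h_t, c_t = lstm_step([0.3], [0.0, 0.0], [1.0, -2.0], params)
```

With all gates at 0.5 and a zero candidate, the new cell state is simply half of the previous one, which makes the role of the forget gate in Equation (11) easy to see.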

In this research, the LSTM model consists of multiple layers: a convolutional layer, a pooling layer, a bidirectional LSTM layer, a concatenate layer, three dropout layers, and three dense layers. The one-dimensional convolutional layer enhances the extraction of input data features. It performs temporal convolution only, because the principal components (PCs) obtained in the MEOF phase already encapsulate the spatial information. The input length is 40 days, and the output length is 7 days. It is worth noting that each forecast day is modeled individually, which results in fewer model parameters and makes training easier. The hyperparameters were chosen through experiments and evaluation: the learning rate is 0.001, the number of training epochs is 300, the batch size is 128, and the dropout rate is 0.2. The model is optimized with the Adam algorithm, a popular optimizer in deep learning. The LSTM was implemented in Python 3.10.6.
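The 40-day input window and the per-day modeling can be made concrete with a small sketch of how training samples might be cut from the PC time series. The function `make_samples` and the dictionary keys are hypothetical; only the numerical values (40-day input, 7 lead days, learning rate 0.001, 300 epochs, batch size 128, dropout 0.2) come from the text.

```python
# Hyperparameter values quoted in the text; the key names are illustrative.
HYPERPARAMS = {
    "input_days": 40,
    "lead_days": 7,
    "learning_rate": 0.001,
    "epochs": 300,
    "batch_size": 128,
    "dropout": 0.2,
}


def make_samples(pcs, input_len=40, lead=1):
    """Cut (input window, target) pairs from a daily PC series.

    pcs: list of per-day PC vectors, ordered in time.
    Each input is input_len consecutive days; the target is the PC vector
    `lead` days after the last input day.
    """
    inputs, targets = [], []
    last_start = len(pcs) - input_len - lead
    for start in range(last_start + 1):
        end = start + input_len                  # one past the last input day
        inputs.append(pcs[start:end])
        targets.append(pcs[end + lead - 1])
    return inputs, targets


# Toy series: 50 days with a single PC whose value equals the day index.
pcs = [[float(t)] for t in range(50)]

# One dataset per lead day, mirroring "each day's forecast is individually modeled".
datasets = {
    lead: make_samples(pcs, HYPERPARAMS["input_days"], lead)
    for lead in range(1, HYPERPARAMS["lead_days"] + 1)
}
x7, y7 = datasets[7]
```

Training one model per lead day on such pairs keeps every model's output to a single day of PCs, at the cost of fitting seven separate models.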

2.5. Performance Metrics

To evaluate and compare the performance of the models, the mean square error (MSE), root mean square error (RMSE), anomaly correlation coefficient (ACC), and skill score (SS) are used in this study. The MSE and RMSE are calculated using the following formulas:

\mathrm{MSE} = \frac{1}{Y} \sum_{i=1}^{Y} (\mathrm{True}_{i} - \mathrm{Pred}_{i})^{2}

(14)

\mathrm{RMSE} = \sqrt{\mathrm{MSE}}

(15)

The calculation formula for ACC is as follows:

\mathrm{ACC} = \frac{1}{NN} \sum_{j=1}^{NN} \left[ \frac{\sum_{i=1}^{Y} (\mathrm{ATrue}_{ij} - \overline{\mathrm{ATrue}_{j}}) (\mathrm{APred}_{ij} - \overline{\mathrm{APred}_{j}})}{\sqrt{\sum_{i=1}^{Y} (\mathrm{ATrue}_{ij} - \overline{\mathrm{ATrue}_{j}})^{2} \sum_{i=1}^{Y} (\mathrm{APred}_{ij} - \overline{\mathrm{APred}_{j}})^{2}}} \right], \quad \overline{\mathrm{ATrue}_{j}} = \frac{1}{Y} \sum_{i=1}^{Y} \mathrm{ATrue}_{ij}, \quad \overline{\mathrm{APred}_{j}} = \frac{1}{Y} \sum_{i=1}^{Y} \mathrm{APred}_{ij}

(16)

The calculation formula for SS is as follows:

\mathrm{SS} = 1 - \frac{\mathrm{MSE}(\mathrm{Pred}, \mathrm{True})}{\mathrm{MSE}(\mathrm{Reference}, \mathrm{True})}

(17)

In these formulas, Y is the number of samples, NN is the number of spatial grid points, \mathrm{Pred}_{i} is the predicted value of the i-th sample, \mathrm{True}_{i} is the true value of the i-th sample, \mathrm{APred} is the predicted anomaly relative to the climatology, \mathrm{ATrue} is the true anomaly relative to the climatology, and \mathrm{Reference} represents the comparison field.
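Equations (14) to (17) translate directly into code. The sketch below uses plain Python lists; the function names and the toy numbers are illustrative and do not come from the study.

```python
import math


def mse(true, pred):
    """Equation (14): mean of squared differences over Y samples."""
    return sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true)


def rmse(true, pred):
    """Equation (15)."""
    return math.sqrt(mse(true, pred))


def acc(a_true, a_pred):
    """Equation (16): correlation over samples at each grid point, averaged over points.

    a_true, a_pred: lists of samples, each sample a list of NN grid-point anomalies.
    """
    n_samples = len(a_true)
    n_points = len(a_true[0])
    total = 0.0
    for j in range(n_points):
        t = [a_true[i][j] for i in range(n_samples)]
        p = [a_pred[i][j] for i in range(n_samples)]
        t_mean = sum(t) / n_samples
        p_mean = sum(p) / n_samples
        num = sum((a - t_mean) * (b - p_mean) for a, b in zip(t, p))
        den = math.sqrt(sum((a - t_mean) ** 2 for a in t)
                        * sum((b - p_mean) ** 2 for b in p))
        total += num / den
    return total / n_points


def skill_score(true, pred, reference):
    """Equation (17): 1 is perfect, 0 matches the reference, negative is worse."""
    return 1.0 - mse(true, pred) / mse(true, reference)


true = [1.0, 2.0, 3.0, 4.0]
pred = [1.0, 2.0, 3.0, 6.0]        # one error of 2
reference = [1.0, 2.0, 3.0, 8.0]   # one error of 4

example_rmse = rmse(true, pred)
example_ss = skill_score(true, pred, reference)

# Two grid points: the first perfectly correlated, the second perfectly anti-correlated.
a_true = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
a_pred = [[2.0, -1.0], [4.0, -2.0], [6.0, -3.0]]
example_acc = acc(a_true, a_pred)
```

In the toy case the prediction's MSE is a quarter of the reference's, so SS is 0.75, and the two grid-point correlations of +1 and -1 average to an ACC of 0.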

3. Results

3.1. Model Selection

To improve the prediction accuracy of SSHA and SST, we build on the conventional MEOF-ANN approach [29,30,31,32]: features of the variables are obtained through MEOF decomposition, and a subset of these features is used as the input to a neural network for prediction. In addition, we use atmospheric variables as predictors, which accounts for the impact of external forcing fields on oceanic processes. Three experiments are designed for comparison: (a) Only the interactions between ocean variables are considered. This model is called the MEOF-LSTM-Sea model. (b) The ocean and atmosphere are fully coupled. This model is called the MEOF-LSTM-Sea-Air model. (c) The atmosphere and ocean are decomposed separately. This model is called the MEOF-LSTM-D model. Figure 3 displays the flow chart for these models.

The MEOF-LSTM-Sea approach solely concentrates on marine components by utilizing MEOF to break down the combined factors of SSHA and SST. This process yields the principal components of marine elements (PC_Sea) for LSTM forecasting, ultimately using reconstruction to obtain the predicted value. The MEOF-LSTM-Sea model is driven by its internal dynamic mechanisms and considers the interrelation among marine components. In the MEOF-LSTM-Sea-Air model, SSW is introduced. Using MEOF to decompose the joint factors of the ocean and atmosphere, the principal components of the joint factors of the ocean and atmosphere (PC_Sea_Air) are obtained for LSTM prediction, and finally, reconstruction is used to obtain the predicted value. This method is strongly coupled, considering both the internal dynamic mechanisms of the ocean itself and the influence of external driving forces. Similarly, in the MEOF-LSTM-D model, the SSW is also introduced. MEOF is used to decompose ocean and atmospheric elements separately, obtaining the principal components of ocean elements (PC_Sea) and the principal components of atmospheric elements (PC_Air). PC_Air is used as a predictor and LSTM is used for prediction, ultimately reconstructing the predicted values. The MEOF-LSTM-D model, in contrast to the MEOF-LSTM-Sea-Air model, is a model with weak coupling that introduces external driving forces using different methods. Through these experimental scenarios, our aim is to evaluate the impact of different coupling methods and external driving factors on the predictive accuracy of sea surface height anomaly (SSHA) and sea surface temperature (SST). This comprehensive analysis allows us to assess the feasibility and effectiveness of each method in capturing the complex dynamics of the ocean-atmosphere system.
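The difference between the three configurations lies in which PCs feed the network and which PCs it predicts. The sketch below is one possible reading, in which the weakly coupled model simply appends PC_Air to PC_Sea on each day; the source does not specify how the predictor enters the network, so the function `build_features` and this concatenation are assumptions.

```python
def build_features(pc_main, pc_extra=None):
    """Per-day input vectors: pc_main alone, or pc_main with pc_extra appended."""
    if pc_extra is None:
        return [list(day) for day in pc_main]
    return [list(main) + list(extra) for main, extra in zip(pc_main, pc_extra)]


# Toy PCs for two days (values are arbitrary).
pc_sea = [[1.0, 2.0], [1.5, 2.5]]          # from MEOF of SSHA + SST
pc_air = [[0.3], [0.4]]                    # from MEOF of Uwind + Vwind
pc_sea_air = [[0.9, 1.8], [1.4, 2.2]]      # from MEOF of all four variables

configs = {
    # name: (network inputs, prediction targets)
    "MEOF-LSTM-Sea": (build_features(pc_sea), pc_sea),
    "MEOF-LSTM-Sea-Air": (build_features(pc_sea_air), pc_sea_air),
    "MEOF-LSTM-D": (build_features(pc_sea, pc_air), pc_sea),
}
```

In this reading, only the weakly coupled model sees the wind as a separate input while still predicting ocean-only PCs, so reconstruction uses the ocean EOFs in the same way as in the MEOF-LSTM-Sea model.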

Here, we adopted a rolling forecast scheme for multi-day forecasting; therefore, the forecast accuracy on the first day is crucial for model evaluation. For this reason, the accuracy of the three models above is measured using the first-day forecasts. Note that we focus on the anomalies of the variables, which are obtained by removing the climatology from each variable, as introduced in Section 2.3. All statistical results in this study are based on these anomalies.

Figure 4 illustrates the spatial distribution of the forecast RMSE of SSHA and SST for the three models. The MEOF-LSTM-D model performs better than both the MEOF-LSTM-Sea model and the MEOF-LSTM-Sea-Air model: its RMSE is lower than that of the other two models, especially in the black box area in Figure 4c. This is because coastal waters are shallower and more susceptible to external driving forces, which makes the MEOF-LSTM-D model the more suitable choice for predicting SSHA and SST in nearshore waters. The RMSEs of the MEOF-LSTM-Sea, MEOF-LSTM-Sea-Air, and MEOF-LSTM-D models were 0.0150 m, 0.0154 m, and 0.0111 m for SSHA, and 0.3226 °C, 0.2753 °C, and 0.2244 °C for SST, respectively.

It should be emphasized that the MEOF-LSTM-Sea-Air model exhibits significantly inferior predictive performance compared to the MEOF-LSTM-D model. This is because the atmospheric and oceanic variables have different temporal scales of variation and different response times to each other’s interactions.

The Bohai Sea is shallow, and the contributions of external atmospheric forcing and internal dynamic mechanisms to the evolution of its marine variables are roughly equal. Therefore, the contribution of external atmospheric forcing must be considered. However, the interaction between the atmosphere and the ocean is not immediate but delayed. The MEOF-LSTM-Sea-Air model is a strongly coupled method, while the MEOF-LSTM-D model is a weakly coupled method. The strongly coupled method decomposes the joint atmospheric and oceanic variables to obtain joint EOFs, which implicitly assumes that the exchange between the atmosphere and the ocean occurs instantaneously; this is not physically reasonable. The weakly coupled method decomposes the atmospheric and oceanic variables separately, which is consistent with the lagged interaction between the atmosphere and the sea. Therefore, the weakly coupled method is more suitable for the Bohai Sea. In addition, a strongly coupled method would need to consider more factors, such as the physical relationship between oceanic and atmospheric variables, and might involve more variables. This issue deserves further consideration.

3.2. Evaluation of MEOF-LSTM-D Model

3.2.1. RMSE and ACC Evaluation

To assess the effectiveness of the MEOF-LSTM-D model, we used the persistence (PER) forecast and climatology as baselines for forecasting SSHA and SST in the Bohai Sea from 2016 to 2020. The persistence forecast is a widely used benchmark in atmospheric and oceanic prediction; it assumes that the initial state of the ocean remains unchanged during the prediction period. The climatology forecast, based on the average of historical data spanning 2016 to 2020, is also used for comparison. Figure 5 displays the RMSE of the SSHA and SST predictions at lead times of 1, 3, 5, and 7 days. The results indicate that the MEOF-LSTM-D model outperforms the PER model throughout the 7-day forecast period. As the lead time increases from 1 to 7 days, the SSHA RMSE of the MEOF-LSTM-D model increases from 0.011 m to 0.016 m, and the SST RMSE increases from 0.2244 °C to 0.3200 °C. The error of the PER model grows faster than that of the MEOF-LSTM-D model throughout the forecast period: the MEOF-LSTM-D error increases gradually, whereas the PER forecast rapidly loses skill. Figure 5 also shows that the spatial distribution of the RMSE of SSHA and SST forecast by the MEOF-LSTM-D model remains stable throughout the prediction period, with good results in both deep and shallow water within the study region. In contrast, the PER model exhibits a marked rise in RMSE in shallow water regions, particularly in Bohai Bay, which provides additional evidence of the advantage of the MEOF-LSTM-D model in shallow water. Overall, the MEOF-LSTM-D model improves substantially on the PER model.
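The persistence baseline described above is simple enough to write out. The sketch below builds persistence forecasts at a given lead time for one grid point and scores them with RMSE; the function names and the toy series are illustrative. A climatology forecast of an anomaly series is taken here to be zero, which assumes the anomalies and the climatology forecast share the same reference mean.

```python
import math


def rmse(truth, forecast):
    return math.sqrt(sum((t - f) ** 2 for t, f in zip(truth, forecast)) / len(truth))


def persistence_pairs(series, lead):
    """Pair each persistence forecast with its verifying value.

    The value on day t is reused unchanged as the forecast for day t + lead.
    Returns (forecasts, truths).
    """
    return series[:-lead], series[lead:]


# Toy anomaly series at one grid point: a steady drift of 1 unit per day.
anomalies = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

per_rmse = {}
for lead in (1, 3):
    forecast, truth = persistence_pairs(anomalies, lead)
    per_rmse[lead] = rmse(truth, forecast)

# Climatology baseline under the zero-anomaly assumption stated above.
clim_rmse = rmse(anomalies, [0.0] * len(anomalies))
```

For this steadily drifting series the persistence error equals the lead time, which mirrors the qualitative behavior reported for PER: its error grows as the forecast horizon lengthens.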

Temporal RMSE for the MEOF-LSTM-D model, PER model, and climatology is displayed in Figure 6. Additionally, Figure 6 also presents the temporal ACC for the MEOF-LSTM-D model and PER model. These values are computed using the forecasts made for every 7 days in the 5-year testing set.

The RMSEs of SSHA and SST are shown in Figure 6a,b, and their ACCs are displayed in Figure 6c,d. Figure 6 shows that both the MEOF-LSTM-D model and the PER model outperform climatology in RMSE across the entire forecast period. This is primarily because climatology, being a multi-year average, fails to capture the short- and medium-term changes in oceanic multiscale processes. In addition, the MEOF-LSTM-D model improves on the PER model across the entire prediction period. At the end of the prediction period, the MEOF-LSTM-D model has an RMSE of approximately 0.016 m for SSHA and 0.32 °C for SST, and the corresponding ACCs are roughly 0.95 and 0.97. The prediction error of the MEOF-LSTM-D model increases slowly and steadily over the 7-day forecast period, whereas the PER error grows at a faster rate.

Figure 7 displays the results of the Skill Score (SS) experiments for both models. If SS is greater than 0, it indicates that the prediction result surpasses the PER. A value of 1 for SS signifies a perfect prediction. On the other hand, if SS is less than 0, it implies that the prediction result is inferior to the PER. By referring to Figure 7, it becomes evident that the MEOF-LSTM-D model exhibits SS values above 0 for both SSHA and SST during the entire prediction period. This outcome suggests that the MEOF-LSTM-D model outperforms the PER model in terms of prediction accuracy.

3.2.2. Case Study

Finally, the predictive performance of the MEOF-LSTM-D model is illustrated with examples. Figure 8 displays snapshots of the MEOF-LSTM-D predictions and the corresponding true fields of the SSHA and SST anomalies in the study area. In the time series of the testing dataset, they correspond to 4 June, 6 June, 8 June, and 10 June 2019, respectively.

Figure 8a,b presents the observed and predicted results for SSHA and SST. The MEOF-LSTM-D model integrates external and internal drivers. As the figure shows, the model produces accurate predictions of SSHA and SST in the Bohai Sea region, and the predicted values correlate strongly with the true values. However, in parts of the study area with extreme (high or low) values, some error between the MEOF-LSTM-D prediction and the true value is unavoidable. To construct the prediction field, we retain the PCs that account for 89% of the variance, which inevitably leads to the loss of some information. Nevertheless, the MEOF-LSTM-D model predicts the regional evolution well: compared with the observations, it reproduces the evolving patterns of SSHA and SST.

4. Discussion

In this study, we consider the coordination between different variables in the real marine environment and the forcing from the atmosphere to the ocean. Using remote sensing data for the Bohai Sea region from January 1993 to December 2020, we focus on the interactions between oceanic variables and the feasibility of using atmospheric variables as predictors. In order to adequately assess the model performance, we utilize the initial 23 years of data for training the model, while the remaining 5 years of data serve as independent experimental samples for forecasting SSHA and SST for seven days.

The Bohai Sea is shallow on average, and the influence of external driving factors cannot be ignored. To improve the prediction accuracy of oceanic variables, we introduced atmospheric variables into the model. We designed three comparative experiments, one modeled without the influence of the wind field and the other two modeled with an external wind field, to represent the strong and weak coupling of atmospheric and oceanic variables. Through these experiments, we identified a weak coupling approach that considers the interactions between oceanic variables and uses atmospheric variables as predictors.

The MEOF-LSTM-D model therefore applies MEOF analysis to the multivariate sea surface fields of SSHA and SST, and the U and V components of SSW are employed as predictors to establish the prediction model for SSHA and SST. At the end of the 7-day prediction period, the MEOF-LSTM-D model achieves RMSEs of 0.016 m and 0.32 °C for SSHA and SST in the Bohai Sea, respectively, which improves on both the PER model and climatology. The model also yields ACC values of approximately 0.95 for SSHA and 0.97 for SST, surpassing the performance of both the PER model and climatology. Throughout the prediction window, the MEOF-LSTM-D model exhibits SS values above 0 for both SSHA and SST. The MEOF-LSTM-D model consistently outperforms the PER model for SSHA and SST in the Bohai Sea case study, which demonstrates the effectiveness of sea surface wind as a predictor.

The contribution of this study lies in improving the prediction accuracy of oceanic environmental variables by incorporating the correlation between different variables in the real oceanic environment and accounting for atmospheric forcing on the ocean. This approach offers useful insights and opens avenues for future research on oceanic variable prediction.

Author Contributions

Conceptualization, S.H., Q.S. and W.L.; methodology, S.H.; software, S.H.; validation, S.H. and Q.Z.; formal analysis, S.H.; investigation, Q.S. and W.L.; resources, S.H., H.L. and R.W.; data curation, S.H.; writing—original draft preparation, S.H.; writing—review and editing, S.H., G.H. and W.L.; visualization, S.H. and Q.S.; supervision, S.H., W.L. and Q.S.; project administration, W.L.; funding acquisition, W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation, grant number 42376190, the National Key Research and Development Program, grant number 2021YFC3101500, and the National Key Research and Development Program, grant number 2022YFC3104800.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The SSHA data were obtained from the Copernicus Marine Environment Monitoring Service (CMEMS), the SST data from the National Oceanic and Atmospheric Administration (NOAA), and the SSW data from the NASA Earth Science Enterprise (ESE).

Acknowledgments

The authors are grateful to Wei Li and Qi Shao for their advice and support on the theoretical aspects of this study and for their help in reviewing and editing this paper. We also thank the reviewers and editors for their suggestions to improve the quality of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Framework of the prediction model.

Figure 2. (a) Forget Gate; (b) Input Gate; (c) Cell state; (d) Output Gate.

Figure 3. Flowchart depicting the algorithm for three different experimental approaches. (a) MEOF-LSTM-Sea model, (b) MEOF-LSTM-Sea-Air model, and (c) MEOF-LSTM-D model.

Figure 4. The spatial forecast RMSE of SSHA (top panels) and SST (bottom panels) for three models with a forecast time horizon of 1 day. (a,d) MEOF-LSTM-Sea model, (b,e) MEOF-LSTM-Sea-Air model, and (c,f) MEOF-LSTM-D model.

Figure 5. Spatial map of RMSE for (a) SSHA (m) and (b) SST (°C).

Figure 6. RMSE of the MEOF-LSTM-D model, PER model, and climatology, and ACC of the MEOF-LSTM-D model and PER model, calculated from forecasts made at 7-day intervals during the period from 2016 to 2020. (a,c) represent SSHA (m), while (b,d) represent SST (°C).

Figure 7. The MEOF-LSTM-D model forecasting skill score (in comparison to PER forecasts).

Figure 8. Snapshots of anomalous SSHA (m) and anomalous SST (°C) forecasts for the 1st day, 3rd day, 5th day, and 7th day, respectively, corresponding to 4 June, 6 June, 8 June, and 10 June 2019. (a) represents anomalous SSHA, and (b) represents anomalous SST.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
