Article

Significant Wave Height Prediction Using LSTM Augmented by Singular Spectrum Analysis and Residual Correction

1
College of Ocean Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
2
First Institute of Oceanography, Ministry of Natural Resources, Qingdao 266061, China
3
Key Laboratory of Marine Science and Numerical Modeling, Ministry of Natural Resources, Qingdao 266061, China
4
Shandong Key Laboratory of Marine Science and Numerical Modeling, Qingdao 266061, China
5
Laboratory for Regional Oceanography and Numerical Modeling, Qingdao Marine Science and Technology Center, Qingdao 266237, China
6
College of Oceanography and Space Informatics, China University of Petroleum, Qingdao 266580, China
7
Qingdao Innovation and Development Center, Harbin Engineering University, Qingdao 266000, China
*
Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2025, 13(9), 1635; https://doi.org/10.3390/jmse13091635
Submission received: 30 June 2025 / Revised: 10 August 2025 / Accepted: 11 August 2025 / Published: 27 August 2025
(This article belongs to the Section Physical Oceanography)

Abstract

Significant wave height (SWH) is a key physical parameter influencing the safety of shipping, fisheries, and marine engineering projects, and is closely related to climate change and marine disasters. Existing models struggle to balance high prediction accuracy with low parameter counts, and are challenging to deploy on platforms such as buoys. To address these issues, this study proposes a method for SWH prediction that combines Singular Spectrum Analysis (SSA) with a residual correction mechanism in a Long Short-Term Memory (LSTM) network. The method uses SSA to decompose the SWH time series and extracts its main feature modes as inputs to the LSTM network, enhancing the model’s ability to capture temporal patterns. Additionally, a residual correction module is introduced to fine-tune the prediction results, improving the model’s 12 h forecasting accuracy. The experimental results show that for 1, 3, 6, and 12 h SWH predictions, incorporating SSA and the residual correction module reduces the Mean Squared Error (MSE), Root-Mean-Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) by 60–95%, and increases the coefficient of determination (R²) by 2–60%. The proposed model has only 10% of the parameters of an LSTM model based on Variational Mode Decomposition (VMD), striking a balance between prediction accuracy and computational efficiency. This study provides a new methodology for deploying SWH prediction models on platforms such as buoys, and holds significant application value in marine disaster warning and environmental monitoring.

1. Introduction

Significant wave height, as an important parameter in marine meteorology, has a significant impact on various fields, such as marine engineering, shipping, fisheries, and climate research. Accurate predictions of significant wave height can help vessels avoid harsh sea conditions, improving their navigation safety and reducing their likelihood of accidents, especially in shipping and marine engineering. Additionally, significant wave height prediction results provide key decision-making support for fishery management, marine resource development, and coastal infrastructure construction [1,2,3,4]. Due to the significant dynamic variability and complex spatial distribution patterns of wave height, achieving accurate predictions remains a core challenge in oceanography. Currently, the main methods for predicting significant wave height include wave spectrum inversion and numerical wave models, as well as traditional machine learning and deep learning approaches.
The wave spectrum inversion method is based on existing wave spectrum data, and it calculates the wave characteristics through inversion, making it suitable for wave forecasting based on observational data, especially in cases where direct physical modeling is not feasible. Its accuracy depends on the quality of the data and the precision of spectral analysis. Torsethaugen et al. proposed a bimodal wave spectrum model that includes both locally generated wind waves and swells, and modeled the two peaks using an extended JONSWAP spectrum [5]. Ochi and Hubble proposed a six-parameter dual-component ocean wave spectrum model, which combines a three-parameter formula to cover the entire evolution of storm seas and establishes a statistical relationship between parameters and significant wave height, thus generating a spectrum family adapted to specific sea conditions [6]. The wave spectrum inversion method relies on existing wave spectrum data, making it difficult for it to handle nonlinear effects and dynamic changes in wave systems, and it also has strict requirements with regard to data quality. To overcome these limitations, the numerical wave model method was introduced. This method simulates the generation, propagation, and evolution of waves by solving physical equations, offering higher accuracy and applicability, particularly under complex sea conditions. The WAMDI Group proposed the WAM model, which integrates wind input, nonlinear transfer, and whitecap dissipation source functions, and demonstrated its wave forecasting capability in multiple sea areas [7]. Tolman developed the third-generation full-spectrum wind–wave model based on the previous two generations, detailing the model’s development, operation, and numerical methods [8]. Booij et al. introduced the third-generation numerical wave model SWAN, which is used to compute random short-crested waves in shallow-water areas and coastal regions with ambient currents. Experimental validation showed that the model’s results were highly consistent with theoretical solutions and experimental observations [9]. Numerical models can forecast waves by simulating multiple factors in a marine environment, but they rely heavily on computational resources. They are very sensitive to the choice of initial and boundary conditions, and can be affected by the accuracy of input data, often yielding poor results in certain localized areas.
With the increase in computational power and data availability, traditional machine learning methods have gradually been introduced into wave height prediction. Traditional machine learning methods can handle large and complex datasets, and can perform predictions by learning underlying patterns from historical data. Many researchers have explored the use of machine learning methods such as Random Forest and Support Vector Machines (SVMs) for predicting significant wave height. Etemad et al. compared the performance of the M5’ model tree and an Artificial Neural Network (ANN) in predicting significant wave height for Lake Superior, finding that the M5’ model tree provided more interpretable rules and slightly outperformed the ANN in terms of prediction accuracy [10]. Mahjoobi et al. proposed using SVM to predict significant wave height and compared it with an ANN and other models. The results showed that the SVM performed excellently in terms of both accuracy and computational efficiency [11]. Malekmohamadi et al. evaluated the effectiveness of various soft computing methods for wave height prediction and found that the SVM, ANN, and Adaptive Network-based Fuzzy Inference System (ANFIS) provided effective predictions, while Bayesian Networks (BNs) performed poorly [12]. Feng et al. developed a machine learning model based on a Multi-Layer Perceptron (MLP) for wave forecasting in Lake Michigan, and the results showed that this model outperformed the traditional SWAN physical model in terms of both prediction accuracy and computational efficiency [13]. Chen et al. used Support Vector Regression (SVR) combined with local meteorological and neighboring wave data to improve the 1 h and 3 h prediction accuracy of coastal significant wave height during typhoon events [14]. 
Although traditional machine learning methods can achieve high prediction accuracy in situations with limited information and small sample sizes, they face the issue of error accumulation as the dataset grows, which limits their performance on larger-scale datasets [15].
In recent years, deep learning methods have been widely applied in the prediction of significant wave height. Specifically, time-series models such as Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRUs) have achieved significant results in wave height prediction. Sadeghifar et al. used an RNN for wave height prediction along the southern coast of the Caspian Sea, and the results showed that the RNN model outperformed previous neural network methods in terms of prediction accuracy across different time scales [16]. Despite the impressive performance of RNNs, issues such as vanishing and exploding gradients arise when processing long sequences, leading to the development of variants such as LSTM. LSTM, by introducing a gating mechanism with forget, input, and output gates, can dynamically control the retention and forgetting of information, significantly enhancing this method’s ability to model long-term dependencies. This structured improvement tailored to the characteristics of time-series data has made LSTM one of the most popular methods for predictive tasks in recent research [17]. Fan et al. used LSTM to predict significant wave height over different durations. The results showed that, compared to Backpropagation Neural Networks (BPNNs), Extreme Learning Machines (ELMs), SVMs, Residual Networks (ResNets), and Random Forest (RF), LSTM achieved better predictive performance [18]. Jörges et al. developed an LSTM-based model for significant wave height prediction and reconstruction. The experiments demonstrated that LSTM excelled in both the reconstruction and prediction of wave heights [19]. Minuzzi et al. used LSTM to predict 6, 12, 18, and 24 h significant wave height at seven different locations in the South-West Atlantic, and compared the LSTM predictions with those from the ERA5 numerical model. The results indicated that LSTM outperformed the ERA5 model [20]. Gao et al. proposed an LSTM-based significant wave height prediction model, which showed higher prediction accuracy than traditional numerical models when applied to three stations in the Bohai Sea [21]. Meng et al. proposed an LSTM-based long-term wave sequence prediction method, and improved the prediction accuracy of irregular waves through multiple prediction steps [22].
Some researchers have effectively improved the accuracy of time-series predictions by integrating LSTM models with other algorithmic architectures or data preprocessing techniques. Fu et al. proposed a hierarchical hybrid model that combines CEEMDAN-RCMSE feature decomposition with VMD secondary decomposition, significantly improving marine wave height prediction accuracy through multi-scale LSTM dynamic modeling and a weighted ensemble strategy [23]. Ni et al. introduced a deep learning model that integrates Principal Component Analysis (PCA) with LSTM for short-term wave height prediction. The results showed that this model outperformed other data-driven methods in terms of prediction accuracy [24]. Martin and Felix proposed an RNN-LSTM-based method for predicting significant wave height. This model can make accurate predictions across different time intervals, and performed better than traditional persistence models over longer prediction periods [25]. Although significant breakthroughs have been made in improving significant wave height prediction accuracy, there remains a clear research gap in lightweight modeling for platforms such as buoys. While signal processing techniques like VMD have effectively enhanced prediction performance through modal forecasting and re-integration, they significantly increase the number of model parameters. This leads to a sharp rise in the demand for computational resources and energy consumption during the training process. The contradiction between computational complexity and energy efficiency severely limits the feasibility of deploying these models on resource-constrained platforms such as marine observation buoys [26,27].
This study aims to effectively improve the prediction accuracy of short- to medium-term significant wave height under the constraint of low model-parameter counts. Current time-series prediction models, such as LSTM, GRUs, and TCNs, generally exhibit insufficient accuracy in significant wave height prediction tasks, making it difficult for them to meet practical application requirements. Although integrating data decomposition techniques is an effective strategy to address this issue, the application of traditional data decomposition methods often leads to a significant increase in model parameters. To resolve this contradiction, this study innovatively constructs an LSTM prediction model based on Singular Spectrum Analysis (SSA) and a residual correction mechanism. The core innovation of this model lies in using the feature modes obtained from SSA decomposition as the input features to the LSTM network. This design balances prediction performance and model complexity. Owing to its ability to capture key fluctuation patterns through feature extraction while effectively controlling its parameter scale, the model achieves the dual objectives of improved prediction accuracy and model lightweighting. This model provides a new methodology for significant wave height prediction on platforms such as buoys, and holds significant application value in marine disaster warning, shipping safety, and marine environmental monitoring.

2. Materials and Methods

2.1. Materials

2.1.1. Data Source

The National Data Buoy Center (NDBC), under the National Oceanic and Atmospheric Administration (NOAA), is entrusted with data collection using a monitoring network composed of nearly 100 buoys and Coastal-Marine Automated Network (C-MAN) stations. This study uses data from 2016 to 2018 from NDBC site 44013, located approximately 16 nautical miles east of Boston, Massachusetts. The geographical coordinates of the site are 42°20′44″ N, 70°39′4″ W (i.e., latitude 42.346° N, longitude 70.651° W), and the buoy’s depth is 64.6 m. The climate in this region exhibits significant seasonal characteristics: the average wind speed ranges from 10 to 15 knots (18 to 28 km/h), with stronger winds observed in spring and autumn. The prevailing wind direction is southwest, a phenomenon closely related to the local seasonal climate patterns and the land–sea breeze effect. In terms of temperature, the annual average temperature is about 11 °C, with summer highs reaching over 22 °C and winter lows around 0 °C. The corresponding sea surface temperature reaches 20 °C in summer and falls to about 5 °C in winter. Furthermore, humidity shows a clear seasonal contrast, with summer humidity often exceeding 80%, while winter humidity is lower. The overall atmospheric pressure remains relatively stable, with an annual average of approximately 1013 hPa. A wind rose for this location during the study period is shown in Figure 1.

2.1.2. Dataset Construction and Processing

A dataset was constructed using 26,051 data points from NOAA buoy 44013, spanning from 1 January 2016 to 31 December 2018. Each data entry included ten variables: wind direction (WDIR), wind speed (WSPD), gust speed (GST), wave period (DPD), average wave period (APD), wave direction (MWD), atmospheric pressure (PRES), air temperature (ATMP), water temperature (WTMP), and significant wave height (SWH).
The dataset had an hourly sampling interval, with a total of 1153 missing values, accounting for 4.4% of the data. The proportion of missing values was relatively small, with no long periods of continuous missing data, and the distribution was uniform. Missing values were handled through interpolation. The processed dataset was divided into training, validation, and test sets, accounting for 64%, 16%, and 20% of the data, respectively. Table 1 presents the details of the dataset used in this study.
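The preprocessing described above can be sketched as follows. The source states only that missing values were interpolated and gives the 64%/16%/20% ratios, so the use of linear interpolation, the chronological (unshuffled) split, and the helper names `fill_missing` and `split_series` are assumptions made for illustration.

```python
import numpy as np

def fill_missing(values):
    """Fill NaN gaps by linear interpolation (assumed method).

    Gaps at either end of the series take the nearest valid value.
    """
    v = np.asarray(values, dtype=float).copy()
    idx = np.arange(v.size)
    ok = ~np.isnan(v)
    v[~ok] = np.interp(idx[~ok], idx[ok], v[ok])
    return v

def split_series(values, train=0.64, val=0.16):
    """Chronological train/validation/test split (assumed to be unshuffled)."""
    n = len(values)
    train_end = int(n * train)
    val_end = int(n * (train + val))
    return values[:train_end], values[train_end:val_end], values[val_end:]

# Hypothetical hourly SWH samples with one gap
swh = fill_missing([0.8, np.nan, 1.2, 1.1, 0.9])
train_set, val_set, test_set = split_series(swh)
```

Keeping the split chronological avoids leaking future observations into the training set, which matters for any forecasting evaluation.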

2.1.3. Feature Factor Selection

The generation of, and variation in, waves are influenced by multiple factors, with wind speed, wind direction, atmospheric pressure, and temperature playing crucial roles in wave generation and propagation. Studies have shown that these meteorological elements can serve as input variables for predicting wave height. Hashim et al. investigated the major climatic parameters affecting offshore wave height prediction, and used ANFIS to select influencing factors, finding that wind speed, wind direction, air temperature, and sea surface temperature were the most important input parameters [28]. Pang et al. incorporated wind speed and wind direction into an LSTM wave height model, significantly improving both short-term and long-term significant wave height prediction accuracy through multi-dimensional input [29].
Although existing research has extensively explored the impact of these factors on waves, quantitative analyses of their correlation with wave height remain relatively scarce. Traditional studies mostly use empirical models or physics-based analytical methods; however, these approaches do not enable in-depth investigation of the degree of correlation between influencing factors and wave height; thus, certain key features may not have been fully considered [30]. Common correlation analysis methods include Pearson, Spearman, and Kendall correlation analyses. Pearson correlation analysis is primarily used to measure the linear relationship between variables, with the assumption that the data follows a normal distribution [31]. In contrast, while Kendall’s rank correlation analysis can effectively capture nonlinear relationships, it faces limitations in practical applications, such as high computational complexity and inefficiency when processing large-scale datasets [32]. On the other hand, Spearman’s rank correlation analysis, as a non-parametric method, can effectively address issues of nonlinear relationships and outliers, without relying on distributional assumptions of the data [33].
Spearman’s rank correlation analysis is a non-parametric statistical method used to measure the monotonic relationship between two variables. This method involves ranking the data of each variable and then calculating the differences between the ranks to assess the degree of association between the variables. Spearman’s rank correlation coefficient ranges from −1 to +1, where +1 indicates a perfect positive correlation, −1 indicates a perfect negative correlation, and 0 indicates no correlation.
The calculation formula for Spearman’s rank correlation analysis is shown in Equation (1) [34].
$$\rho = 1 - \frac{6\sum_{i=1}^{N} d_i^{2}}{N\left(N^{2}-1\right)} \tag{1}$$
where $N$ is the number of data pairs and $d_i$ is the rank difference for the $i$-th data pair in the two variables; the calculation formula is shown in Equation (2).
$$d_i = R(X_i) - R(Y_i) \tag{2}$$
where $R(X_i)$ is the rank of the $i$-th data point in the first variable, and $R(Y_i)$ is the rank of the $i$-th data point in the second variable.
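As a worked illustration of Equations (1) and (2), the following minimal sketch computes the coefficient directly from rank differences. It assumes there are no tied values (the closed-form expression is exact only in that case); the function names and the sample numbers are hypothetical.

```python
def ranks(values):
    """Return ranks 1..N (smallest value gets rank 1); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rho(x, y):
    """Spearman's rank correlation via Equations (1) and (2)."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))  # sum of d_i^2
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical wind speed (m/s) and wave height (m) samples
wspd = [3.1, 5.4, 4.2, 7.8]
swh = [0.4, 0.9, 1.0, 1.6]
rho = spearman_rho(wspd, swh)
```

With real buoy records ties are common, so a library routine that applies tie corrections would normally be preferred over this closed form.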
Since wave generation is a highly nonlinear process, the linear relationship between historical features is relatively weak. Therefore, features with positive correlations are selected as input features [18]. This study conducted Spearman’s rank correlation analysis on the ten features and, based on statistical conventions, set a correlation coefficient of 0.5 as the threshold for determining significant correlations. This value is widely regarded as a reasonable cutoff for measuring moderate-strength correlations between variables [35]. As shown in the Spearman’s rank correlation matrix in Figure 2, the correlation coefficients between WSPD and GST with significant wave height are 0.56 and 0.57, respectively, both exceeding the threshold of 0.5, indicating a relatively high correlation between these features and significant wave height. WSPD reflects wind speed variations, which are generally strongly associated with the generation and propagation of waves, thereby establishing a significant relationship with significant wave height. In contrast, GST only reflects instantaneous peak values over short time periods. Although its correlation coefficient is also relatively high, it primarily represents short-term fluctuations and fails to effectively capture the long-term variations of significant wave height [36]. Therefore, despite the correlation between GST and significant wave height, it is unsuitable as an independent predictor for a prediction horizon exceeding one hour, in this study.
In comparison, historical significant wave height data provides valuable temporal information that aids the model in capturing the changing trends of significant wave height, which is critical for long-term forecasting. Significant wave height exhibits clear temporal characteristics, and historical data reflects the underlying patterns of wave height variations, providing strong contextual information for the model. Therefore, by selecting WSPD and historical significant wave height data as input features, the model can account for both the immediate changes in wind speed and the temporal dynamics of past wave heights, thereby enhancing the accuracy and stability of the predictions.

2.2. Methods

2.2.1. SSA Principle

SSA originates from Karhunen–Loève theory [37]. A one-dimensional wave height sequence x is arranged into a two-dimensional time-delay matrix, according to the embedding dimension M and a certain time lag. PCA is performed on the phase space of the original wave height sequence x in the time-delay matrix, to obtain the eigenvalues and eigenvectors of the matrix. Generally, the eigenvectors obtained through this method are orthogonal, with different eigenvectors corresponding to different fluctuation signals in the wave height sequence x. Given the wave height sequence x, embedding dimension M, and time lag 1, the time-delay matrix X is arranged as shown in Equation (3).
$$X = \begin{bmatrix} x_1 & x_2 & \cdots & x_{N-M+1} \\ x_2 & x_3 & \cdots & x_{N-M+2} \\ \vdots & \vdots & & \vdots \\ x_M & x_{M+1} & \cdots & x_N \end{bmatrix} = \begin{bmatrix} X_{10} & X_{11} & \cdots & X_{1,N-M} \\ X_{20} & X_{21} & \cdots & X_{2,N-M} \\ \vdots & \vdots & & \vdots \\ X_{M0} & X_{M1} & \cdots & X_{M,N-M} \end{bmatrix} \tag{3}$$
where $N$ is the sequence length and $M$ is the embedding dimension. Many practical studies show that $M$ is generally taken as $N/3$ [38]. The $i$-th state value of the delay matrix $X$ is given by Equation (4).
$$X_i = \begin{bmatrix} x_{i+1} \\ x_{i+2} \\ \vdots \\ x_{i+M} \end{bmatrix} = \begin{bmatrix} X_{1i} \\ X_{2i} \\ \vdots \\ X_{Mi} \end{bmatrix}, \qquad i = 0, 1, 2, \ldots, N-M \tag{4}$$
Equation (4) yields $N - M + 1$ states, and the elements of the delay matrix $X$ correspond to the elements of the original wave height sequence x, according to the relationship $X_{ji} = x_{j+i}$. The covariance between the variables in Equation (3) represents the autocovariance of different lagged elements of the original sequence $x_i$. The constructed lagged autocovariance matrix $T_x$ is shown in Equation (5).
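The layout of the time-delay matrix in Equations (3) and (4) can be illustrated with a short NumPy sketch. The function name `delay_matrix` is hypothetical, and a time lag of 1 is assumed, as in the text.

```python
import numpy as np

def delay_matrix(x, M):
    """Build the M x (N - M + 1) time-delay matrix of Equation (3).

    Column i (0-based) is the state X_i = (x_{i+1}, ..., x_{i+M}) of Equation (4).
    """
    x = np.asarray(x, dtype=float)
    N = x.size
    rows = np.arange(M)[:, None]             # j - 1 for j = 1..M
    cols = np.arange(N - M + 1)[None, :]     # i = 0..N-M
    return x[rows + cols]                    # entry (j, i) holds x_{j+i}

# A toy sequence x_1..x_7 with embedding dimension M = 3
X = delay_matrix([1, 2, 3, 4, 5, 6, 7], M=3)
print(X.shape)
print(X[:, 0])
```

Each column is one lagged window of the series, so neighbouring columns overlap in all but one element.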
$$T_x = \begin{bmatrix} C(0) & C(1) & C(2) & \cdots & C(M-1) \\ C(1) & C(0) & C(1) & \cdots & C(M-2) \\ C(2) & C(1) & C(0) & \cdots & C(M-3) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ C(M-1) & C(M-2) & C(M-3) & \cdots & C(0) \end{bmatrix} \tag{5}$$
$T_x$ is a Toeplitz matrix and a real symmetric matrix. The main diagonal elements of $T_x$ are the variances of the wave height sequence x. $C(j)$ is the autocovariance of the wave height sequence x with a delay of $j$, where $0 \le j \le M-1$. $C(j)$ is calculated using the Yule–Walker estimation method [39], as shown in Equation (6).
$$C(j) = \frac{1}{N-j}\sum_{i=1}^{N-j} x_i x_{i+j}, \qquad j = 0, 1, 2, \ldots, M-1 \tag{6}$$
$$T_x E^{k} = \lambda_k E^{k}, \qquad k = 1, 2, 3, \ldots, M \tag{7}$$
The eigenvalues and eigenvectors of the matrix $T_x$ are obtained from Equation (7). The eigenvector $E^{k}$ of $T_x$ represents a time series composed of $M$ components, reflecting the temporal evolution pattern of the wave height sequence x. The projection of the state vector $X_i$ onto the $k$-th eigenvector is then defined as shown in Equation (8).
$$a_i^{k} = X_i \cdot E^{k} = \sum_{j=1}^{M} x_{i+j} E_j^{k}, \qquad 0 \le i \le N-M \tag{8}$$
The components of x can be reconstructed from a subset of the eigenvectors and time coefficients, as shown in Equation (9).
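$$x_i^{k} = \begin{cases} \dfrac{1}{M}\displaystyle\sum_{j=1}^{M} a_{i-j}^{k} E_j^{k}, & M \le i \le N-M+1 \\[2ex] \dfrac{1}{i}\displaystyle\sum_{j=1}^{i} a_{i-j}^{k} E_j^{k}, & 1 \le i \le M-1 \\[2ex] \dfrac{1}{N-i+1}\displaystyle\sum_{j=i-N+M}^{M} a_{i-j}^{k} E_j^{k}, & N-M+2 \le i \le N \end{cases} \tag{9}$$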
The sum of all reconstructed components $x^{k}$ equals the wave height sequence x. The eigenvalues $\lambda_k$ of the matrix $T_x$ are arranged in descending order, i.e., $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots \ge \lambda_M \ge 0$. The sum of the reconstructed components $x^{k}$ corresponding to the first $p$ eigenvalues yields a reconstructed sequence that reflects the overall characteristics of the original wave height sequence while removing noise and random errors from the original sequence, as shown in Equation (10).
$$x^{y} = \sum_{k \le p} x^{k}, \qquad 1 \le p \le M \tag{10}$$
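The steps in Equations (5) to (10) can be sketched end to end in NumPy. This is an illustrative reading of the equations rather than the authors' implementation: the function name `ssa_decompose` and the synthetic test series are assumptions, the series is not centred (Equation (6) is applied as written, although the mean is often removed first in practice), and `numpy.linalg.eigh` is used as one possible symmetric eigen-solver.

```python
import numpy as np

def ssa_decompose(x, M):
    """Toeplitz-style SSA: returns eigenvalues (descending) and M reconstructed components."""
    x = np.asarray(x, dtype=float)
    N = x.size
    K = N - M + 1                                   # number of states, i = 0..N-M
    # Eq. (6): lagged autocovariance C(j), j = 0..M-1
    C = np.array([np.dot(x[:N - j], x[j:]) / (N - j) for j in range(M)])
    # Eq. (5): Toeplitz matrix with entries C(|row - col|)
    lags = np.abs(np.arange(M)[:, None] - np.arange(M)[None, :])
    T = C[lags]
    # Eq. (7): eigen-decomposition, sorted by descending eigenvalue
    lam, E = np.linalg.eigh(T)
    order = np.argsort(lam)[::-1]
    lam, E = lam[order], E[:, order]
    # Eq. (4): row i holds the state (x_{i+1}, ..., x_{i+M})
    states = x[np.arange(K)[:, None] + np.arange(M)[None, :]]
    # Eq. (8): time coefficients, A[i, k] = a_i^k
    A = states @ E
    # Eq. (9): average every product a_{i-j}^k E_j^k that lands on the same time index
    comps = np.zeros((M, N))
    counts = np.zeros(N)
    for j in range(M):
        comps[:, j:j + K] += (A * E[j]).T
        counts[j:j + K] += 1
    return lam, comps / counts

# Synthetic example: a 12-step cycle plus noise, with M = N/3
rng = np.random.default_rng(0)
t = np.arange(120)
series = np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(t.size)
lam, comps = ssa_decompose(series, M=40)
smoothed = comps[:10].sum(axis=0)                   # Eq. (10) with p = 10
```

Because the eigenvectors are orthonormal, summing all `M` components recovers the input series up to floating-point error, which gives a quick sanity check on any implementation.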

2.2.2. LSTM Principle

LSTM is a special type of recurrent neural network designed to address the vanishing- or exploding-gradient problems encountered by standard RNNs during the training of long sequences [40]. LSTM controls the flow of information by introducing three gating mechanisms—the input, forget, and output gates—thereby retaining long-term dependencies in time series. Specifically, the input gate determines the input of new information, the forget gate decides which information should be discarded, and the output gate controls the output of information at the current time step. LSTM effectively captures and memorizes information over long time spans, and it is widely applied in tasks such as speech recognition, natural language processing, and time-series prediction, particularly excelling in scenarios that require consideration of long-term dependencies. The specific structure is shown in Figure 3.
The calculation formula for the forget gate is shown below.
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$$
where $W_f$ is the weight matrix of the forget gate; $b_f$ is the bias term; $\sigma$ is the sigmoid activation function; $x_t$ is the current input; and $h_{t-1}$ is the hidden output at the previous moment.
The input gate calculation formulas are as follows:
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$$
$$\tilde{C}_t = \tanh\left(W_c \cdot [h_{t-1}, x_t] + b_c\right)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
where $W_i$ is the weight matrix of the input gate sigmoid layer; $b_i$ is the bias term of the input gate sigmoid layer; $W_c$ is the weight matrix of the input gate tanh layer; $b_c$ is the bias term of the input gate tanh layer; $\tilde{C}_t$ is the candidate cell state; $C_t$ is the new cell state; and $\odot$ denotes element-wise multiplication.
The output gate calculation formula is as follows:
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$$
where $W_o$ is the weight matrix of the output gate, and $b_o$ is the bias term of the output gate.
The final output of the LSTM is shown below.
$$h_t = o_t \odot \tanh\left(C_t\right)$$
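A single LSTM time step following the gate equations above can be written in a few lines of NumPy. This is a minimal sketch for illustration: the function name `lstm_step`, the dictionary layout of the weights, and the toy dimensions are assumptions, and each weight matrix is taken to act on the concatenation of the previous hidden state and the current input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W[g] has shape (hidden, hidden + input) for g in 'f', 'i', 'c', 'o'."""
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])         # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])         # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])     # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde         # new cell state
    o_t = sigmoid(W["o"] @ z + b["o"])         # output gate
    h_t = o_t * np.tanh(c_t)                   # hidden output
    return h_t, c_t

# Toy dimensions (hypothetical): 3 input features, 4 hidden units
rng = np.random.default_rng(0)
hidden, n_in = 4, 3
W = {g: 0.1 * rng.standard_normal((hidden, hidden + n_in)) for g in "fico"}
b = {g: np.zeros(hidden) for g in "fico"}
h, c = lstm_step(rng.standard_normal(n_in), np.zeros(hidden), np.zeros(hidden), W, b)
```

Running the step repeatedly over a window of inputs, carrying `h` and `c` forward, gives the sequence encoding that a final fully connected layer would map to a forecast.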

2.2.3. Design and Implementation of SWH Prediction Model

Due to the non-stationarity of the significant wave height sequence, directly using the raw wave data without preprocessing for prediction would make it difficult for the model to capture key patterns, severely affecting its prediction performance. To address this, this study introduces Singular Spectrum Analysis (SSA) for data preprocessing. SSA is well-suited to identifying multi-time-scale variation features in data, and can effectively separate the main trend components from periodic fluctuation patterns. Through this feature extraction mechanism, the LSTM model can focus on learning the more predictive wave features, significantly improving its prediction accuracy.
This study first constructs an SSA-LSTM model for preliminary significant wave height prediction. It uses 6 h of significant wave height data to predict the following 1 h significant wave height, 18 h of data for 3 h prediction, 40 h of data for 6 h prediction, and 80 h of data for 12 h prediction, yielding preliminary prediction results. For the 12 h significant wave height prediction, to further uncover the unstable features in the original wave height data and correct any patterns and information not captured in the initial prediction, a second, two-layer LSTM network is used to model the resulting residual sequence. The residual correction module is applied to predict the test set residuals, constructing an SSA-LSTM-R model for the 12 h significant wave height prediction, further improving the model’s prediction accuracy and reducing overall errors.
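The pairing of input windows with forecast horizons can be sketched as a sliding-window sample builder. The window lengths are taken from the text; the function name `make_windows`, the array layout, and the reading of an "h-hour prediction" as a single value h hours after the last input step are assumptions.

```python
import numpy as np

# Input window length (hours) for each forecast horizon (hours), as stated in the text
WINDOW_FOR_HORIZON = {1: 6, 3: 18, 6: 40, 12: 80}

def make_windows(features, target, horizon):
    """Build (samples, window, n_features) inputs and the matching targets.

    Sample s uses rows s .. s+window-1 and is paired with the target value
    `horizon` steps after the last input row (assumed interpretation).
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    window = WINDOW_FOR_HORIZON[horizon]
    n = len(target) - window - horizon + 1
    X = np.stack([features[s:s + window] for s in range(n)])
    first = window + horizon - 1
    y = target[first:first + n]
    return X, y

# Hypothetical data: 200 hourly rows with 12 input features
feats = np.zeros((200, 12))
swh = np.arange(200, dtype=float)
X12, y12 = make_windows(feats, swh, horizon=12)
```

Longer horizons are given longer histories here, so the number of usable samples shrinks slightly as the horizon grows.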
The architectures of the SSA-LSTM and SSA-LSTM-R models are shown in Figure 4 and Figure 5, respectively.
The processes in Figure 4 and Figure 5 are primarily composed of the following stages:
(1) Data processing stage: this study uses Spearman rank correlation analysis to select two key feature parameters, WSPD and historical significant wave height. The significant wave height sequence is then decomposed into modes using SSA, extracting 10 key modal components ($SIMF_1$, $SIMF_2$, $SIMF_3$, …, $SIMF_{10}$). Through feature fusion, traditional statistical features are combined with time–frequency domain features, ultimately constructing a hybrid input vector consisting of 12 dimensions, which provides multidimensional spatiotemporal feature representations for the subsequent LSTM model.
(2) Data input layer: this layer includes two key parameters: the time window length N and the input feature dimension. The time window N is dynamically adjusted based on the prediction duration, corresponding to the needs of different prediction scenarios (1, 3, 6, and 12 h predictions). The input feature dimension is fixed at 12, integrating the multi-source hybrid features extracted by the data processing module.
(3) Preliminary prediction model architecture: this stage uses a single-layer LSTM network structure, where the LSTM layer is configured with 128 memory units to capture temporal sequence features. After feature extraction, dimensional mapping is performed through a fully connected layer, designed as a linear transformation from a 128-dimensional input to a 1-dimensional output. Ultimately, the high-dimensional feature vector is compressed into a single significant wave height prediction value.
(4) Residual correction module: for 12 h wave height prediction, to further improve prediction accuracy, this study introduces a residual correction module based on the preliminary prediction. Residual modeling is achieved through the construction of a dual-layer LSTM network. The module employs a cascaded architecture design: the first layer is configured with an LSTM network containing 100 memory units for coarse-grained feature extraction, while the second layer, using an LSTM network with 50 units, models fine-grained features. The residual prediction value is then generated through a fully connected layer, which transforms a 50-dimensional input into a 1-dimensional output. This residual value is used to compensate for and correct the preliminary prediction results through linear addition, effectively capturing nonlinear error distribution features that are difficult for traditional single models to represent. Finally, the preliminary prediction and residual correction results are combined to form the final prediction output, which possesses fine-grained error correction capabilities.
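To give a feel for the size of the two networks described in stages (3) and (4), the sketch below counts trainable parameters from the stated layer widths and shows the additive correction step. Several points are assumptions rather than details from the source: one bias vector per LSTM gate (some frameworks use two), a one-dimensional input to the residual module, and a residual defined as observation minus preliminary prediction. The helper names are hypothetical, and the totals are not the figures reported by the authors.

```python
def lstm_params(input_dim, units):
    """Trainable parameters of one LSTM layer, assuming a single bias vector per gate."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, output_dim):
    """Weights plus biases of a fully connected layer."""
    return input_dim * output_dim + output_dim

# Preliminary model: 12 input features -> LSTM(128) -> Dense(128 -> 1)
primary_total = lstm_params(12, 128) + dense_params(128, 1)      # 72321 under these assumptions

# Residual module: assumed 1 input feature -> LSTM(100) -> LSTM(50) -> Dense(50 -> 1)
residual_total = (lstm_params(1, 100) + lstm_params(100, 50)
                  + dense_params(50, 1))                         # 71051 under these assumptions

def corrected_forecast(preliminary, predicted_residual):
    """Final output = preliminary prediction + predicted residual (linear addition)."""
    return [p + r for p, r in zip(preliminary, predicted_residual)]

# Hypothetical values in metres
final = corrected_forecast([1.20, 1.35], [0.05, -0.10])
```

Counting parameters this way makes it easy to compare alternative layer widths before training anything.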

2.2.4. Definitions and Background Information of Comparative Models

This section provides a brief introduction to the experimental comparison models, including Bidirectional Long Short-Term Memory Network (BiLSTM), Convolutional Neural Network–Long Short-Term Memory Network (CNN-LSTM), Gated Recurrent Unit (GRU), Bidirectional Gated Recurrent Unit (BiGRU), Convolutional Neural Network–Gated Recurrent Unit (CNN-GRU), and Temporal Convolutional Network (TCN).
BiLSTM is an extension of LSTM that processes data in both forward and backward directions, capturing bidirectional temporal dependencies within the input data. In the context of significant wave height prediction, BiLSTM can more comprehensively capture the dynamic features of the time series, particularly aiding in the prediction of future wave height trends. However, due to its bidirectional structure, BiLSTM incurs increased computational complexity and training time, resulting in higher computational cost and resource consumption.
CNN-LSTM combines CNN with LSTM, using CNN to extract spatial features and LSTM to model temporal dependencies. In significant wave height prediction, CNN-LSTM can simultaneously handle both spatial and temporal dependencies, making it suitable for tasks involving complex spatial patterns and time-series data. While CNN-LSTM excels in spatial feature extraction, it requires more computational resources and longer training times compared to LSTM when dealing with pure time-series data.
GRU is a simplified version of LSTM with fewer parameters and greater computational efficiency. Despite having fewer gates, GRU demonstrates similar performance to LSTM in time-series modeling. In significant wave height prediction, GRU offers advantages in training speed and real-time performance, due to its lower computational demands, making it suitable for applications requiring fast responses. However, GRU’s accuracy in capturing long-term dependencies is slightly inferior to that of LSTM.
BiGRU is an extension of GRU that adopts a bidirectional structure, enabling it to capture both forward and backward temporal dependencies. Similar to BiLSTM, BiGRU processes the input window in both directions when forecasting significant wave height. However, due to its more streamlined architecture, BiGRU incurs lower computational costs and faster training times compared to BiLSTM. Despite these advantages, BiGRU’s modeling capability is somewhat limited when dealing with complex wave height patterns.
CNN-GRU integrates CNN with GRU, where CNN extracts spatial features and GRU models temporal dependencies. In significant wave height prediction, CNN-GRU is capable of processing data in shorter time frames and performing real-time predictions. Although it performs well in spatial feature extraction and computational efficiency, CNN-GRU does not match CNN-LSTM in terms of handling complex temporal dependencies.
TCN is a convolutional network specifically designed for sequential data, utilizing causal convolutions to ensure that the model only relies on past and present time steps for predictions. TCN captures long-range temporal dependencies effectively through dilated convolutions and skip connections, avoiding the vanishing gradient problem commonly encountered in traditional recurrent neural networks for long sequences. Although TCN handles long-term dependencies efficiently and offers faster training speeds, its accuracy in modeling complex temporal patterns is not as high as that of LSTM or BiLSTM.
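The causal and dilated convolutions mentioned for TCN can be illustrated with a minimal NumPy sketch. The function name and the toy input are hypothetical, and this is a single unlearned filter rather than a full TCN layer: the output at step t uses only the current sample and samples spaced `dilation` steps into the past, with zero padding on the left.

```python
import numpy as np


def causal_dilated_conv1d(x: np.ndarray, kernel: np.ndarray, dilation: int = 1) -> np.ndarray:
    """1-D causal convolution: output[t] depends only on x[t], x[t-d], x[t-2d], ..."""
    k = len(kernel)
    pad = (k - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])  # left padding only, so no future leakage
    out = np.zeros(len(x))
    for t in range(len(x)):
        for j in range(k):
            # kernel[j] weights the sample j * dilation steps in the past
            out[t] += kernel[j] * padded[pad + t - j * dilation]
    return out


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = causal_dilated_conv1d(x, kernel=np.array([1.0, 1.0]), dilation=2)
print(y)  # [1. 2. 4. 6. 8.]
```

Stacking such filters with increasing dilation widens the receptive field without adding parameters per layer, which is how a TCN reaches long-range dependencies.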

2.2.5. Evaluation Indicators

This study constructs a multidimensional evaluation system using MSE, RMSE, MAE, MAPE, and R2. MSE amplifies error differences through squaring, placing emphasis on larger prediction deviations. RMSE, derived by taking the square root of MSE, retains the same unit as the original data, allowing it to directly correspond to the error magnitude on the actual measurement scale. MAE averages the absolute errors, so that positive and negative errors cannot cancel each other out, providing a more balanced reflection of the overall prediction bias. MAPE presents the relative error as a percentage, enabling cross-comparison across data with different scales. R2 quantifies the proportion of data variability explained by the model; it is at most 1 and typically lies between 0 and 1, although it becomes negative when a model performs worse than predicting the mean [41,42]. This combination balances absolute errors, relative errors, dimensional consistency, and statistical explanatory power, focusing on both error distribution characteristics and overall prediction accuracy, thus forming a comprehensive performance evaluation framework.
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|$$
$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}$$
where $N$ represents the number of significant wave height data points; $y_i$ is the actual value of significant wave height; $\hat{y}_i$ is the predicted value of significant wave height; and $\bar{y}$ is the mean of the actual significant wave height values. The smaller the values of MSE, RMSE, MAE, and MAPE, and the closer R2 is to 1, the better the model's prediction performance.
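The five indicators can be computed directly from their definitions. The following is a minimal NumPy sketch, not the authors' evaluation code. The function name `evaluate` and the sample values are made up for illustration, and R2 is returned as a fraction, whereas the tables in this paper report it in percent.

```python
import numpy as np


def evaluate(y_true, y_pred) -> dict:
    """Compute MSE, RMSE, MAE, MAPE (in %) and R2 (as a fraction)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "MAPE": np.mean(np.abs(err / y_true)) * 100.0,  # undefined if any y_true is 0
        "R2": 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2),
    }


# Made-up wave heights in metres, for illustration only
scores = evaluate([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```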

3. Results

3.1. Experimental Environment and Parameter Settings

The experimental environment of this study is the Windows 11 operating system, with an NVIDIA T1000 (8 GB) GPU and 64 GB of memory. The experiment uses the TensorFlow deep learning framework, with the model in this study tuned using the hyperparameter settings shown in Table 2.

3.2. Comparative Analysis of SWH Prediction Effects

3.2.1. SSA Decomposition

When utilizing SSA for data decomposition, selecting a correct number of modes is critical. Based on comprehensive evaluations encompassing multiple experimental outcomes and performance metrics, this study ultimately adopts 10 modes as the optimal count for SSA decomposition. Through systematic exploration of varying mode quantities, the optimal configuration was identified by assessing the model’s predictive error, goodness-of-fit statistics, and training efficiency. Figure 6 illustrates that an insufficient number of modes leads to inadequate capture of both low-frequency trend components and mid-to-high-frequency oscillatory patterns, resulting in an incomplete characterization of the data’s dynamic behavior and elevating the risk of underfitting. Conversely, as depicted in Figure 7, exceeding 10 modes compromises the effective representation of dominant fluctuation patterns, where redundant modal components may be erroneously interpreted as valid signals, thereby precipitating overfitting. The selection of 10 modes strikes a deliberate balance between preserving characteristic integrity and mitigating noise contamination. As evidenced in Figure 8, this configuration successfully retains critical time-frequency features—including low-frequency trends and salient oscillations—while suppressing noise interference through exclusion of higher-order noisy modes. The 10-mode SSA decomposition maintains the completeness of feature information while circumventing the curse of dimensionality, providing the LSTM with high-quality features that balance information density with computational efficiency.
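For readers unfamiliar with SSA, the decomposition into a fixed number of modes can be sketched in NumPy as follows. This is a basic textbook SSA (embedding, singular value decomposition, diagonal averaging) and not necessarily the authors' implementation. The function name, the window length of 24, and the synthetic series are assumptions chosen for illustration, and only the mode count of 10 comes from the text.

```python
import numpy as np


def ssa_decompose(series, window: int, n_modes: int) -> np.ndarray:
    """Return the first n_modes SSA components, each as long as the input series."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    k = n - window + 1
    # 1. Embedding: trajectory (Hankel) matrix whose columns are lagged windows
    traj = np.column_stack([x[i:i + window] for i in range(k)])  # shape (window, k)
    # 2. Singular value decomposition, singular values sorted in descending order
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    modes = np.empty((n_modes, n))
    for m in range(n_modes):
        # 3. Rank-one elementary matrix for mode m
        elem = s[m] * np.outer(u[:, m], vt[m])
        # 4. Diagonal averaging: average the entries on each anti-diagonal i + j = t
        flipped = elem[::-1]
        modes[m] = [flipped.diagonal(t - (window - 1)).mean() for t in range(n)]
    return modes


# Synthetic stand-in for an hourly SWH series (not buoy data)
t = np.arange(240)
swh = 1.0 + 0.3 * np.sin(2 * np.pi * t / 24) + 0.1 * np.sin(2 * np.pi * t / 6)
modes = ssa_decompose(swh, window=24, n_modes=10)
print(modes.shape)  # (10, 240)
```

Keeping all `window` modes reproduces the series exactly, so truncating to the leading modes discards only the components with the smallest singular values.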

3.2.2. Comparison of Model Prediction Effects

To validate the prediction performance of the SSA-LSTM model, it is compared with seven other significant wave height prediction models. Considering the model’s deployability, this study also evaluates the model’s performance, based on the number of parameters and training time, in addition to prediction accuracy. A comparison of the performance of SSA-LSTM with that of the other seven models on the test set is presented in Table 3, and the experimental data are taken from multiple trials, to ensure optimal results.
As shown in Table 3, apart from SSA-LSTM, the LSTM model achieves the best MSE, RMSE, MAE, and R2 for the 1 h, 3 h, and 6 h predictions, and the best MAPE at 1 h and 6 h (GRU has a marginally lower MAPE at 3 h). At 12 h, TCN and CNN-GRU outperform LSTM on all five metrics. In terms of model parameter count, LSTM is larger than BiGRU and GRU, but smaller than the other models. Regarding model training time, only CNN-LSTM trains faster than LSTM. These results indicate that, for significant wave height prediction up to 6 h ahead, LSTM not only offers high prediction accuracy, but also has a relatively low model parameter count and short training time, demonstrating well-balanced overall performance.
Building on the LSTM model, the modes obtained from SSA decomposition were added as input features. Table 3 shows that the SSA-decomposed modes significantly improve the prediction accuracy of the LSTM model. For instance, in the 1 h prediction case, the model's MSE, RMSE, MAE, and MAPE decreased by 99.82%, 95.94%, 95.80%, and 95.03%, respectively, while R2 increased by 2.11 percentage points. Meanwhile, the number of model parameters and the training time increased by only 7.62% and 6.80%, respectively. These results indicate that SSA decomposition of the significant wave height sequence can significantly enhance the model's prediction performance, primarily because SSA decomposition removes noise, extracts multi-scale features, and improves the LSTM model's ability to capture long-term dependencies. In the 3 h, 6 h, and 12 h predictions, the SSA-LSTM model also shows significant improvements in prediction performance over the LSTM model.
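The percentages quoted for the 1 h case follow directly from the LSTM and SSA-LSTM rows of Table 3. The short sketch below reproduces them; the helper `pct_change` is a hypothetical name introduced only for this illustration.

```python
def pct_change(baseline: float, new: float) -> float:
    """Relative change from baseline to new, in percent (negative means a reduction)."""
    return (new - baseline) / baseline * 100.0


# 1 h rows of Table 3: (LSTM, SSA-LSTM)
table3_1h = {
    "MSE": (0.00561, 0.00001),
    "RMSE": (0.07607, 0.00309),
    "MAE": (0.05022, 0.00211),
    "MAPE": (7.04013, 0.34991),
    "Parameter count": (67201, 72321),
    "Training time": (160.27, 171.17),
}

for name, (lstm, ssa_lstm) in table3_1h.items():
    print(f"{name}: {pct_change(lstm, ssa_lstm):+.2f}%")

# R2 is already expressed in percent, so its gain is a difference in percentage points
r2_gain = 99.99733 - 97.88904
print(f"R2: {r2_gain:+.2f} percentage points")
```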
Significant wave height prediction curves for different models at various prediction durations are shown in Figure 9, Figure 10, Figure 11 and Figure 12.
In the 1 h significant wave height prediction graph, the prediction curves of all models generally exhibit similar trends. In the zoomed-in view, the blue SSA-LSTM prediction curve closely matches the black true-value curve, indicating that this model provides the best prediction performance. The prediction curves of the other models show slightly worse performance, but the overall difference is not significant.
In the 3 h significant wave height prediction graph, the prediction curves of all models exhibit greater fluctuation compared to the 1 h prediction, but the overall trend remains stable. In the zoomed-in view, the blue SSA-LSTM prediction curve closely matches the black true-value curve, indicating that this model still provides the best prediction performance. The prediction curves of the other models show increased fluctuation and perform worse than SSA-LSTM.
In the 6 h significant wave height prediction chart, the fluctuation of the prediction curves for each model is larger, compared to the 1 h prediction. In the zoomed-in view, the SSA-LSTM curve exhibits the best performance, with the smallest error. The prediction curves of the other models show larger fluctuations, and their prediction performance significantly decreases.
In the 12 h significant wave height prediction graph, all curves show a significant increase in fluctuation. In the zoomed-in view, only the SSA-LSTM curve remains closely synchronized with the true-value curve, while the prediction performance of the other models significantly declines, demonstrating poor results. Nevertheless, the prediction accuracy of SSA-LSTM no longer meets the practical application requirements and needs to be further improved.
To overcome the prediction accuracy bottleneck of the SSA-LSTM model for 12 h predictions, this study introduces a residual correction module into the SSA-LSTM model architecture, constructing an SSA-LSTM-R hybrid prediction model. During the training process for the SSA-LSTM model using the training and validation sets, residuals are generated between the model’s predicted values and the true values. These residuals are treated as a new dataset, which is divided into new training, validation, and test sets with the same proportions as the original sets, to train the residual correction module. After training, the residual module can correct the output of the SSA-LSTM model, further improving the significant wave height prediction accuracy. The effects of the residual correction are illustrated in Table 4.
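The residual workflow described above (compute residuals, train a second model on them, add its output to the preliminary prediction) can be sketched with a deliberately simple stand-in. In this illustration a persistence forecast replaces SSA-LSTM and a linear least-squares model on lagged residuals replaces the dual-layer LSTM. The data are synthetic, the single 70/30 split simplifies the train/validation/test partition, and all names are hypothetical.

```python
import numpy as np


def lagged(values: np.ndarray, lags: int):
    """Design matrix of the previous `lags` values (plus intercept) and the matching targets."""
    rows = [values[t - lags:t][::-1] for t in range(lags, len(values))]
    X = np.column_stack([np.array(rows), np.ones(len(rows))])
    return X, values[lags:]


# Synthetic stand-in data: a smooth "wave height" series and a naive preliminary forecast
t = np.arange(400)
y_true = 1.0 + 0.5 * np.sin(2 * np.pi * t / 24)
y_prelim = np.roll(y_true, 1)                # persistence forecast standing in for SSA-LSTM
y_true, y_prelim = y_true[1:], y_prelim[1:]  # drop the wrapped-around first sample

# Step 1: residuals between true values and preliminary predictions
residual = y_true - y_prelim

# Step 2: train the residual model on the first 70% (least-squares stand-in for the LSTM module)
LAGS = 2
X, target = lagged(residual, LAGS)
split = int(0.7 * len(target))
coef, *_ = np.linalg.lstsq(X[:split], target[:split], rcond=None)

# Step 3: correct the held-out preliminary predictions by linear addition
residual_hat = X[split:] @ coef
y_final = y_prelim[LAGS:][split:] + residual_hat

mse_prelim = np.mean((y_true[LAGS:][split:] - y_prelim[LAGS:][split:]) ** 2)
mse_final = np.mean((y_true[LAGS:][split:] - y_final) ** 2)
print(mse_prelim, mse_final)
```

On this noise-free toy series the residuals are almost perfectly predictable, so the corrected error is close to zero; real buoy data would show a smaller gain.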
As shown in Table 4, for the 12 h significant wave height prediction, SSA-LSTM-R reduced MSE, RMSE, MAE, and MAPE by 90.74%, 68.15%, 63.24%, and 62.54%, respectively, compared to SSA-LSTM, while R2 increased by 9.43 percentage points, demonstrating significant improvement.
A comparison chart for SSA-LSTM-R 12 h significant wave height prediction is shown in Figure 13.
To highlight the superiority of the SSA-LSTM and SSA-LSTM-R models proposed in this study, in terms of prediction accuracy and model parameter count, a comparison is made with the LSTM model based on VMD. VMD is an efficient algorithm that uses variational optimization to decompose a signal into multiple frequency-independent modes, with good adaptability and precise frequency separation capabilities. It performs exceptionally well when dealing with complex and non-stationary signals, and is an important tool in the field of signal processing. The number of decomposition modes for the VMD algorithm is set to 10, consistent with the SSA decomposition method. A comparison of the model prediction results is shown in Table 5.
As shown in Table 5, in terms of prediction accuracy, the VMD-LSTM model was outperformed by both the SSA-LSTM model for 1 h, 3 h, and 6 h predictions and the SSA-LSTM-R model for 12 h predictions. In terms of model parameters and training time, the proposed models also have an advantage. For instance, in the 1 h prediction case, the SSA-LSTM model has 72,321 parameters, which is only 10.76% of the VMD-LSTM's parameters, and a training time of 171.17 s, which is just 9.51% of that of VMD-LSTM, demonstrating a significant advantage. The main reason for these differences is that VMD-LSTM needs to perform predictions separately for each mode decomposed by VMD and then integrate all the predicted values, requiring substantial resources during the training process.
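The ratios quoted above can be checked with a few lines of arithmetic on the 1 h values from Tables 3 and 5. The last line is an observation inferred from the table values rather than a statement made in the paper: the VMD-LSTM parameter count equals ten times the baseline LSTM count, which is consistent with one baseline-sized LSTM per VMD mode.

```python
# Values taken from Tables 3 and 5 (1 h prediction)
lstm_params = 67_201
ssa_lstm_params, ssa_lstm_time = 72_321, 171.17
vmd_lstm_params, vmd_lstm_time = 672_010, 1799.08
n_vmd_modes = 10

param_ratio = ssa_lstm_params / vmd_lstm_params * 100
time_ratio = ssa_lstm_time / vmd_lstm_time * 100
print(f"parameters: {param_ratio:.2f}% of VMD-LSTM")    # 10.76%
print(f"training time: {time_ratio:.2f}% of VMD-LSTM")  # 9.51%

# Ten baseline-sized LSTM models reproduce the VMD-LSTM parameter count exactly
print(n_vmd_modes * lstm_params == vmd_lstm_params)     # True
```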
The above experimental results indicate that the proposed SSA-based LSTM model with residual correction achieves high prediction accuracy while having a low number of model parameters, a short training time, and low resource consumption. It demonstrates excellent overall performance, making it suitable for deployment on energy-limited offshore platforms such as buoys.

4. Discussion

To address the issue of balancing high prediction accuracy with a low model parameter count in current significant wave height prediction models, this study proposes an innovative SSA-based LSTM model with residual correction for significant wave height prediction. Unlike previous studies, this study uses the decomposed modes as input features for the LSTM model, achieving a good balance between model parameters and prediction accuracy. To predict 12 h significant wave height, a residual correction module is added after the initial LSTM prediction, to further improve accuracy. The experimental results show that the proposed model demonstrates superior prediction performance on the constructed significant wave height test set.
In terms of prediction accuracy, early studies using sequential learning algorithms such as MRAN and GAP-RBF achieved wave height prediction with a small number of neurons, but the highest accuracy reached only 93.63% [43]. In [18], the researchers used LSTM for 1 h significant wave height prediction, increasing the model’s accuracy to 98.04%. Furthermore, the CNN-LSTM model increased prediction accuracy to 99.52%, although the model’s parameter scale was significantly larger than that of the proposed model [44]. Compared to the aforementioned models, the proposed model not only improves prediction accuracy, but also effectively controls the model parameter count. Notably, when compared to the VMD-based LSTM model, the proposed model demonstrates superior prediction accuracy across various prediction durations, including 1, 3, 6, and 12 h.
In terms of model parameters, the LSTM model with the addition of SSA only increases the parameter count by about 7%, which is just 10% of the VMD-LSTM model’s parameters. The training time for the model using the same dataset is also approximately 10% of that for the VMD-LSTM model, making it suitable for deployment on energy-limited platforms. Considering factors such as prediction accuracy, model parameters, and training time, the proposed model demonstrates the best performance, achieving accurate, fast, and efficient significant wave height prediction.
However, there is still room for improvement in the proposed model. First, for long-term significant wave height predictions, such as those conducted over 24 h, 48 h, and 72 h, the prediction accuracy of the current model does not meet the required standards. Future work will focus on improving the model’s prediction performance for long-term forecasts by refining the data and model structure. Second, due to significant differences in wave motion models across different marine areas, the model’s generalization ability is somewhat limited when applied to data from other marine areas, and its prediction accuracy is lower compared to when it is applied to data from the marine area used for training. Future studies will involve training the model with data from multiple marine areas, to enhance its generalization capability. Overall, the proposed model exhibits a significantly improved prediction accuracy with only a slight increase in model parameters, representing a basis for developing new ideas and methods for deploying high-accuracy significant wave height prediction models. Deploying this model on offshore platforms such as buoys will facilitate long-term site-specific predictions of significant wave height in marine areas, providing strong support for marine disaster warning, shipping safety, and climate change research.

Author Contributions

Conceptualization, H.L., Z.W. and C.N.; methodology, H.L., L.Z. and C.N.; software, H.L.; validation, H.L.; formal analysis, C.N. and W.S.; investigation, H.L. and L.Z.; resources, C.N., W.S. and L.Z.; data curation, S.N., C.N. and C.L.; writing—original draft, H.L., C.N. and S.N.; writing—review and editing, Z.W., C.N. and W.S.; visualization, H.L. and W.S.; supervision, C.N., C.L. and Z.W.; project administration, C.N.; funding acquisition, C.N. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key Research and Development Program of China [2022YFC3104301].

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available from the National Data Buoy Center at https://www.ndbc.noaa.gov/station_history.php?station=44013 (accessed on 10 March 2025); see Table 1. These data were derived from the following resources in the public domain: the National Oceanic and Atmospheric Administration (https://www.noaa.gov/, accessed on 10 March 2025) and the National Data Buoy Center (https://www.ndbc.noaa.gov/station_history.php?station=44013, accessed on 10 March 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
| Abbreviation | Definition |
|---|---|
| SWH | Significant Wave Height |
| SSA | Singular Spectrum Analysis |
| LSTM | Long Short-Term Memory |
| MSE | Mean Squared Error |
| RMSE | Root-Mean-Squared Error |
| MAE | Mean Absolute Error |
| MAPE | Mean Absolute Percentage Error |
| R2 | Coefficient of Determination |
| VMD | Variational Mode Decomposition |
| ANN | Artificial Neural Network |
| SVM | Support Vector Machine |
| ANFIS | Adaptive Neuro-Fuzzy Inference System |
| BN | Bayesian Network |
| MLP | Multilayer Perceptron |
| SVR | Support Vector Regression |
| RNN | Recurrent Neural Network |
| GRU | Gated Recurrent Unit |
| BPNN | Backpropagation Neural Network |
| ELM | Extreme Learning Machine |
| ResNet | Residual Network |
| PCA | Principal Component Analysis |
| BiLSTM | Bidirectional Long Short-Term Memory |
| BiGRU | Bidirectional Gated Recurrent Unit |
| TCN | Temporal Convolutional Network |
| RF | Random Forest |
| CNN-LSTM | Convolutional Neural Network–Long Short-Term Memory |
| CNN-GRU | Convolutional Neural Network–Gated Recurrent Unit |

References

  1. Young, I.R.; Zieger, S.; Babanin, A.V. Global trends in wind speed and wave height. Science 2011, 332, 451–455.
  2. Temarel, P.; Bai, W.; Bruns, A.; Derbanne, Q.; Dessi, D.; Dhavalikar, S.; Fonseca, N.; Fukasawa, T.; Gu, X.; Nestegård, A.; et al. Prediction of wave-induced loads on ships: Progress and challenges. Ocean Eng. 2016, 119, 274–308.
  3. Pérez-Collazo, C.; Greaves, D.; Iglesias, G. A review of combined wave and offshore wind energy. Renew. Sustain. Energy Rev. 2015, 42, 141–153.
  4. Bonar, P.A.J.; Bryden, I.G.; Borthwick, A.G.L. Social and ecological impacts of marine energy development. Renew. Sustain. Energy Rev. 2015, 47, 486–495.
  5. López, I.; Andreu, J.; Ceballos, S.; de Alegría, I.M.; Kortabarria, I. Review of wave energy technologies and the necessary power-equipment. Renew. Sustain. Energy Rev. 2013, 27, 413–434.
  6. Ochi, M.K.; Hubble, E.N. Six-parameter wave spectra. In Coastal Engineering 1976; Coastal Engineering Press: Tokyo, Japan, 1976; pp. 301–328.
  7. Group, T.W. The WAM model—A third generation ocean wave prediction model. J. Phys. Oceanogr. 1988, 18, 1775–1810.
  8. Tolman, H.L. User Manual and System Documentation of WAVEWATCH III TM Version 3.14; Technical Note; MMAB Contribution No. 276; U. S. Department of Commerce; National Oceanic and Atmospheric Administration; National Weather Service; National Centers for Environmental Prediction: Camp Springs, MD, USA, 2009.
  9. Booij, N.; Ris, R.C.; Holthuijsen, L.H. A third-generation wave model for coastal regions: 1. Model description and validation. J. Geophys. Res. Ocean. 1999, 104, 7649–7666.
  10. Etemad-Shahidi, A.; Mahjoobi, J. Comparison between M5′ model tree and neural networks for prediction of significant wave height in Lake Superior. Ocean Eng. 2009, 36, 1175–1181.
  11. Mahjoobi, J.; Mosabbeb, E.A. Prediction of significant wave height using regressive support vector machines. Ocean Eng. 2009, 36, 339–347.
  12. Malekmohamadi, I.; Bazargan-Lari, M.R.; Kerachian, R.; Nikoo, M.R.; Fallahnia, M. Evaluating the efficacy of SVMs, BNs, ANNs and ANFIS in wave height prediction. Ocean Eng. 2011, 38, 487–497.
  13. Feng, X.; Ma, G.; Su, S.-F.; Huang, C.; Boswell, M.K.; Xue, P. A multi-layer perceptron approach for accelerated wave forecasting in Lake Michigan. Ocean Eng. 2020, 211, 107526.
  14. Chen, S.T.; Wang, Y.W. Improving coastal ocean wave height forecasting during typhoons by using local meteorological and neighboring wave data in support vector regression models. J. Mar. Sci. Eng. 2020, 8, 149.
  15. Caruana, R.; Niculescu-Mizil, A. An empirical comparison of supervised learning algorithms. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 161–168.
  16. Sadeghifar, T.; Motlagh, M.N.; Azad, M.T.; Mahdizadeh, M.M. Coastal wave height prediction using Recurrent Neural Networks (RNNs) in the south Caspian Sea. Mar. Geod. 2017, 40, 454–465.
  17. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270.
  18. Fan, S.; Xiao, N.; Dong, S. A novel model to predict significant wave height based on long short-term memory network. Ocean Eng. 2020, 205, 107298.
  19. Jörges, C.; Berkenbrink, C.; Stumpe, B. Prediction and reconstruction of ocean wave heights based on bathymetric data using LSTM neural networks. Ocean Eng. 2021, 232, 109046.
  20. Minuzzi, F.C.; Farina, L. A deep learning approach to predict significant wave height using long short-term memory. Ocean Model. 2023, 181, 102151.
  21. Gao, S.; Huang, J.; Li, Y.; Liu, G.; Bi, F.; Bai, Z. A forecasting model for wave heights based on a long short-term memory neural network. Acta Oceanol. Sin. 2021, 40, 62–69.
  22. Meng, Z.-F.; Chen, Z.; Khoo, B.C.; Zhang, A.-M. Long-time prediction of sea wave trains by LSTM machine learning method. Ocean Eng. 2022, 262, 112213.
  23. Fu, Y.; Ying, F.; Huang, L.; Liu, Y. Multi-step-ahead significant wave height prediction using a hybrid model based on an innovative two-layer decomposition framework and LSTM. Renew. Energy 2023, 203, 455–472.
  24. Ni, C.; Ma, X. An integrated long-short term memory algorithm for predicting polar westerlies wave height. Ocean Eng. 2020, 215, 107715.
  25. VS, F.E. Forecasting significant wave height using RNN-LSTM models. In Proceedings of the 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 13–15 May 2020; pp. 1141–1146.
  26. Dragomiretskiy, K.; Zosso, D. Variational mode decomposition. IEEE Trans. Signal Process. 2013, 62, 531–544.
  27. Rilling, G.; Flandrin, P.; Goncalves, P. On empirical mode decomposition and its algorithms. In Proceedings of the IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing NSIP-03, Grado, Italy, 8–11 June 2003.
  28. Hashim, R.; Roy, C.; Motamedi, S.; Shamshirband, S.; Petković, D. Selection of climatic parameters affecting wave height prediction using an enhanced Takagi-Sugeno-based fuzzy methodology. Renew. Sustain. Energy Rev. 2016, 60, 246–257.
  29. Pang, J.; Dong, S. A novel multivariable hybrid model to improve short and long-term significant wave height prediction. Appl. Energy 2023, 351, 121813.
  30. Sabique, L.; Annapurnaiah, K.; Nair, T.B.; Srinivas, K. Contribution of Southern Indian Ocean swells on the wave heights in the Northern Indian Ocean—A modeling study. Ocean Eng. 2012, 43, 113–120.
  31. Schober, P.; Boer, C.; Schwarte, L.A. Correlation coefficients: Appropriate use and interpretation. Anesth. Analg. 2018, 126, 1763–1768.
  32. Xu, W.; Hou, Y.; Hung, Y.S.; Zou, Y.X. A comparative analysis of Spearman's rho and Kendall's tau in normal and contaminated normal models. Signal Process. 2013, 93, 261–276.
  33. De Winter, J.C.F.; Gosling, S.D.; Potter, J. Comparing the Pearson and Spearman correlation coefficients across distributions and sample sizes: A tutorial using simulations and empirical data. Psychol. Methods 2016, 21, 273.
  34. Ramsey, P.H. Critical values for Spearman's rank order correlation. J. Educ. Stat. 1989, 14, 245–253.
  35. Xu, J.; Mu, H.; Wang, Y.; Huang, F. Feature genes selection using supervised locally linear embedding and correlation coefficient for microarray classification. Comput. Math. Methods Med. 2018, 2018, 5490513.
  36. Holmes, J.D.; Ginger, J.D. The gust wind speed duration in AS/NZS 1170.2. Aust. J. Struct. Eng. 2012, 13, 207–217.
  37. Karhunen, K. Über lineare Methoden in der Wahrscheinlichkeitsrechnung. Ann. Acad. Sci. Fenn. 1947, 37, 1.
  38. Hassani, H. Singular spectrum analysis: Methodology and comparison. J. Data Sci. 2007, 05, 396.
  39. Yule, G.U. On the theory of correlation. J. R. Stat. Soc. 1897, 60, 812–854.
  40. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
  41. Botchkarev, A. Performance metrics (error measures) in machine learning regression, forecasting and prognostics: Properties and typology. arXiv 2018, arXiv:1809.03006.
  42. Chicco, D.; Warrens, M.J.; Jurman, G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput. Sci. 2021, 7, e623.
  43. Savitha, R.; Al Mamun, A. Regional ocean wave height prediction using sequential learning neural networks. Ocean Eng. 2017, 129, 605–612.
  44. Guan, X. Wave height prediction based on CNN-LSTM. In Proceedings of the 2020 2nd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), Taiyuan, China, 23–25 October 2020; IEEE: New York, NY, USA, 2020; pp. 10–17.
Figure 1. Wind rose diagram.
Figure 2. Results of Spearman's rank correlation analysis.
Figure 3. LSTM structure.
Figure 4. SSA-LSTM model for significant wave height prediction.
Figure 5. SSA-LSTM-R model for significant wave height prediction.
Figure 6. Five-mode data decomposition.
Figure 7. Fifteen-mode data decomposition.
Figure 8. Ten-mode data decomposition.
Figure 9. One-hour significant wave height prediction and a zoomed-in view of the results.
Figure 10. Three-hour significant wave height prediction and a zoomed-in view of the results.
Figure 11. Six-hour significant wave height prediction and a zoomed-in view of the results.
Figure 12. Twelve-hour significant wave height prediction and a zoomed-in view of the results.
Figure 13. SSA-LSTM-R 12 h significant wave height prediction and a zoomed-in view of the results.
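Table 1. Details of the dataset.
| Indicator | Unit | Mean | Maximum | Minimum | Standard Deviation |
|---|---|---|---|---|---|
| WDIR | ° | 189.60 | 360.00 | 1.00 | 98.99 |
| WSPD | m/s | 6.32 | 22.80 | 0 | 3.36 |
| GST | m/s | 7.68 | 28.70 | 0.10 | 4.15 |
| SWH | m | 0.94 | 8.16 | 0.25 | 0.74 |
| DPD | s | 7.40 | 19.05 | 2.25 | 3.03 |
| APD | s | 4.83 | 11.73 | 2.66 | 1.32 |
| MWD | ° | 126.01 | 360.00 | 1.00 | 85.83 |
| PRES | hPa | 1015.61 | 1043.90 | 972.50 | 8.65 |
| ATMP | °C | 10.26 | 29.30 | −19.50 | 7.97 |
| WTMP | °C | 11.73 | 25.50 | 2.40 | 5.68 |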
Table 2. Model hyperparameter configuration.
| Epochs | Batch Size | Learning Rate | Optimizer | Loss | Units | Layers |
|---|---|---|---|---|---|---|
| 100 | 36 | 0.001 | Adam | MSE | 128 | 3 |
Table 3. Comparison of prediction performance across models.

| Model | Prediction Duration (h) | MSE | RMSE | MAE | MAPE (%) | R² (%) | Parameter Count | T-Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SSA-LSTM | 1 | 0.00001 | 0.00309 | 0.00211 | 0.34991 | 99.99733 | 72,321 | 171.17 |
| LSTM | 1 | 0.00561 | 0.07607 | 0.05022 | 7.04013 | 97.88904 | 67,201 | 160.27 |
| BiLSTM | 1 | 0.00697 | 0.0842 | 0.05455 | 7.36843 | 97.60021 | 134,401 | 243.63 |
| CNN-LSTM | 1 | 0.01612 | 0.12594 | 0.08667 | 11.79888 | 94.37273 | 99,265 | 159.34 |
| GRU | 1 | 0.00638 | 0.07896 | 0.05172 | 7.13618 | 97.78301 | 50,817 | 205.73 |
| BiGRU | 1 | 0.00637 | 0.08243 | 0.05331 | 7.10264 | 97.57611 | 16,301 | 298.83 |
| CNN-GRU | 1 | 0.01466 | 0.12037 | 0.08103 | 10.95153 | 94.92639 | 74,945 | 399.11 |
| TCN | 1 | 0.00672 | 0.08324 | 0.05726 | 8.55907 | 97.51558 | 136,577 | 1062.66 |
| SSA-LSTM | 3 | 0.00007 | 0.00638 | 0.00472 | 0.77113 | 99.98361 | 72,579 | 419.36 |
| LSTM | 3 | 0.02257 | 0.14912 | 0.09893 | 13.92864 | 91.92131 | 67,459 | 388.31 |
| BiLSTM | 3 | 0.02637 | 0.16249 | 0.10502 | 15.02073 | 90.71068 | 134,915 | 701.28 |
| CNN-LSTM | 3 | 0.03901 | 0.19822 | 0.13309 | 18.51217 | 85.77199 | 99,523 | 333.25 |
| GRU | 3 | 0.02422 | 0.15305 | 0.10106 | 13.74597 | 91.76214 | 51,075 | 528.44 |
| BiGRU | 3 | 0.02422 | 0.15735 | 0.10276 | 14.82183 | 91.07953 | 16,503 | 834.77 |
| CNN-GRU | 3 | 0.04337 | 0.20778 | 0.13497 | 18.91261 | 84.90501 | 75,203 | 566.32 |
| TCN | 3 | 0.02499 | 0.15886 | 0.10588 | 15.31471 | 90.97502 | 136,707 | 1672.13 |
| SSA-LSTM | 6 | 0.00029 | 0.01373 | 0.00997 | 1.22876 | 99.93202 | 72,966 | 918.71 |
| LSTM | 6 | 0.06702 | 0.25889 | 0.16029 | 22.14553 | 75.76151 | 67,846 | 793.07 |
| BiLSTM | 6 | 0.07781 | 0.27802 | 0.17627 | 23.91034 | 72.61257 | 135,686 | 1522.28 |
| CNN-LSTM | 6 | 0.09638 | 0.30982 | 0.20441 | 29.21059 | 65.44921 | 99,910 | 679.61 |
| GRU | 6 | 0.08407 | 0.28961 | 0.19223 | 26.91282 | 70.29759 | 51,462 | 1182.32 |
| BiGRU | 6 | 0.07951 | 0.28148 | 0.18059 | 25.48862 | 71.51824 | 16,806 | 1471.71 |
| CNN-GRU | 6 | 0.11804 | 0.34483 | 0.22162 | 31.91234 | 57.94445 | 75,590 | 1367.73 |
| TCN | 6 | 0.08934 | 0.29829 | 0.19547 | 28.48735 | 68.05859 | 136,902 | 1783.93 |
| SSA-LSTM | 12 | 0.03089 | 0.17434 | 0.11047 | 15.01539 | 89.49211 | 73,740 | 1762.43 |
| LSTM | 12 | 0.19399 | 0.44031 | 0.28235 | 38.88538 | 31.78927 | 68,620 | 1517.26 |
| BiLSTM | 12 | 0.23507 | 0.48571 | 0.30402 | 43.26738 | 18.05041 | 137,228 | 3061.17 |
| CNN-LSTM | 12 | 0.20062 | 0.44817 | 0.29819 | 43.55292 | 29.42923 | 100,684 | 1253.46 |
| GRU | 12 | 0.21351 | 0.46173 | 0.31849 | 46.48286 | 24.81872 | 52,236 | 2206.91 |
| BiGRU | 12 | 0.18188 | 0.4276 | 0.28571 | 41.32891 | 36.84573 | 17,412 | 3192.65 |
| CNN-GRU | 12 | 0.18127 | 0.42583 | 0.27601 | 38.50116 | 36.08692 | 76,364 | 1681.33 |
| TCN | 12 | 0.16293 | 0.40338 | 0.25327 | 34.01271 | 43.43838 | 137,292 | 1854.22 |

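The five error measures reported in Table 3 can be sketched with their standard textbook definitions. This is a minimal illustration, not the authors' evaluation code: the function name `regression_metrics` and the sample values are hypothetical, MAPE and R² are returned as percentages to match the table headers, and the paper's own aggregation over forecast steps may differ in detail.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, MAPE (%) and R2 (%) for two equal-length sequences.

    MAPE assumes no true value is zero; R2 assumes the true values are not all equal.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "R2": 100.0 * (1.0 - ss_res / ss_tot),
    }


# Hypothetical wave heights in metres, for illustration only.
observed = [1.0, 2.0, 4.0]
predicted = [1.0, 2.5, 3.5]
print(regression_metrics(observed, predicted))
```

Lower MSE, RMSE, MAE, and MAPE and a higher R² indicate a better fit, which is how the rows of Table 3 should be read.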
Table 4. Effects of residual correction.

| Model | Prediction Duration (h) | MSE | RMSE | MAE | MAPE (%) | R² (%) |
| --- | --- | --- | --- | --- | --- | --- |
| SSA-LSTM | 12 | 0.03089 | 0.17434 | 0.11047 | 15.01539 | 89.49211 |
| SSA-LSTM-R | 12 | 0.00286 | 0.05553 | 0.04061 | 5.62437 | 98.91838 |

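The effect of the residual correction in Table 4 is easier to judge as a relative change per metric. The following sketch simply recomputes those percentages from the tabulated 12 h values; the helper name `relative_change` is hypothetical and the calculation is plain arithmetic on the published numbers, not an additional result.

```python
def relative_change(before: float, after: float) -> float:
    """Signed percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before


# 12 h values copied from Table 4.
ssa_lstm = {"MSE": 0.03089, "RMSE": 0.17434, "MAE": 0.11047,
            "MAPE": 15.01539, "R2": 89.49211}
ssa_lstm_r = {"MSE": 0.00286, "RMSE": 0.05553, "MAE": 0.04061,
              "MAPE": 5.62437, "R2": 98.91838}

changes = {name: relative_change(ssa_lstm[name], ssa_lstm_r[name])
           for name in ssa_lstm}

for name, pct in changes.items():
    # Negative values are reductions (errors); a positive value is an increase (R2).
    print(f"{name}: {pct:+.2f}%")
```

A negative change for the four error metrics together with a positive change for R² means the corrected model improves on every measure in the table.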
Table 5. Comparison of prediction performance of SSA-LSTM, SSA-LSTM-R, and VMD-LSTM.

| Model | Prediction Duration (h) | MSE | RMSE | MAE | MAPE (%) | R² (%) | Parameter Count | T-Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SSA-LSTM | 1 | 0.00001 | 0.00309 | 0.00211 | 0.34991 | 99.99733 | 72,321 | 171.17 |
| VMD-LSTM | 1 | 0.00172 | 0.04034 | 0.02651 | 3.65238 | 99.42166 | 672,010 | 1799.08 |
| SSA-LSTM | 3 | 0.00007 | 0.00638 | 0.00472 | 0.77113 | 99.98361 | 72,579 | 419.36 |
| VMD-LSTM | 3 | 0.00259 | 0.04862 | 0.03312 | 4.65024 | 99.15418 | 674,590 | 4817.43 |
| SSA-LSTM | 6 | 0.00029 | 0.01373 | 0.00997 | 1.22876 | 99.93202 | 72,966 | 918.71 |
| VMD-LSTM | 6 | 0.00457 | 0.06539 | 0.04886 | 7.25499 | 98.49747 | 678,460 | 8561.29 |
| SSA-LSTM-R | 12 | 0.00286 | 0.05553 | 0.04061 | 5.62437 | 98.91838 | 145,191 | 1793.41 |
| VMD-LSTM | 12 | 0.00429 | 0.06798 | 0.04672 | 6.87207 | 98.36697 | 686,200 | 16,936.08 |

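The abstract's statement that the proposed model has only about 10% of the parameters of VMD-LSTM can be checked directly against Table 5. The sketch below computes the parameter and T-Time shares of the proposed model relative to VMD-LSTM at each horizon; the names `ROWS` and `share` are hypothetical, and the numbers are copied from the table rather than newly measured.

```python
# (horizon in h, proposed model, its parameter count, its T-Time in s,
#  VMD-LSTM parameter count, VMD-LSTM T-Time in s), copied from Table 5.
ROWS = [
    (1, "SSA-LSTM", 72_321, 171.17, 672_010, 1799.08),
    (3, "SSA-LSTM", 72_579, 419.36, 674_590, 4817.43),
    (6, "SSA-LSTM", 72_966, 918.71, 678_460, 8561.29),
    (12, "SSA-LSTM-R", 145_191, 1793.41, 686_200, 16936.08),
]


def share(proposed: float, baseline: float) -> float:
    """Proposed value as a percentage of the baseline value."""
    return 100.0 * proposed / baseline


for horizon, name, params, seconds, vmd_params, vmd_seconds in ROWS:
    print(
        f"{horizon:>2} h {name}: "
        f"{share(params, vmd_params):.2f}% of VMD-LSTM parameters, "
        f"{share(seconds, vmd_seconds):.2f}% of VMD-LSTM T-Time"
    )
```

The 12 h row uses SSA-LSTM-R, whose residual-correction module roughly doubles the parameter count, so its share of the VMD-LSTM parameters is correspondingly larger than at the shorter horizons.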
Ning, C.; Li, H.; Wang, Z.; Li, C.; Zeng, L.; Shao, W.; Nie, S. Significant Wave Height Prediction Using LSTM Augmented by Singular Spectrum Analysis and Residual Correction. J. Mar. Sci. Eng. 2025, 13, 1635. https://doi.org/10.3390/jmse13091635
