Improving Significant Wave Height Forecasts Using a Joint Empirical Mode Decomposition–Long Short-Term Memory Network

Abstract: Wave forecasts, though integral to ocean engineering activities, are often conducted using computationally expensive and time-consuming numerical models with accuracies that are blunted by numerical-model-inherent limitations. Additionally, artificial neural networks, though significantly cheaper computationally, faster, and effective, also experience difficulties with nonlinearities in the wave generation and evolution processes. To solve both problems, this study couples empirical mode decomposition (EMD) and a long short-term memory (LSTM) network in a joint model for significant wave height forecasting, a method widely used in wind speed forecasting, but not yet for wave heights. Following a comparative analysis, the results demonstrate that EMD-LSTM significantly outperforms LSTM at every forecast horizon (3, 6, 12, 24, 48, and 72 h), considerably improving forecasting accuracy, especially for forecasts exceeding 24 h. Additionally, EMD-LSTM responds faster than LSTM to large waves. An error analysis comparing LSTM and EMD-LSTM demonstrates that LSTM errors are more systematic. This study also identifies that LSTM is not able to adequately predict high-frequency significant wave height intrinsic mode functions, which leaves room for further improvements.


Introduction
Surface gravity waves (hereinafter, waves) are crucial physical phenomena to be considered in ocean engineering and renewable energy [1][2][3], shipping [4,5], scour protection, offshore wind foundations and breakwaters [6,7], amongst other activities. As such, accurate forecasts of wave evolution are indispensable. Currently, these forecasts are performed using third-generation numerical wave models such as WaveWatch III [8] and SWAN [9], but these models consume significant computational resources and time, in addition to being imperfect due to theoretical and computing rigidities. Copula approaches are also popular and are mandatory in an array of design norms [10,11], but may be limited in their ability to accurately represent inter-series dependencies. Artificial-intelligence-based methods can provide forecasts of similar quality for a fraction of the computational and time costs and display strong abilities to overcome nonlinear physics problems. For example, in an early study, Deo and Naidu [12] used an artificial neural network (ANN) for wave predictions over 3 to 24 h horizons and found a satisfactory agreement with observations. More recently, Mandal and Prabaharan [13] used a recurrent neural network (RNN) to predict wave heights at 3, 6, and 12 h horizons, achieving correlation coefficients with the observations of 0.95, 0.9, and 0.87, respectively. Zubier [14] used the nonlinear auto-regressive network with exogenous inputs (NARX) ANN for wave-height predictions in the eastern central Red Sea at 3, 6, 12, and 24 h horizons, and observed that model performance could be enhanced when the difference between wind and wave directions was used as an input. Ali and Prasad [15] built a machine learning model for 30 min significant wave height predictions in the eastern coastal zones of Australia by coupling the extreme learning machine (ELM) with the improved complete ensemble empirical mode decomposition method with adaptive noise.
Relevant for the current study, Gao et al. [16] used LSTM for wave height forecasting in the Bohai Sea at a variety of buoy locations.
Due to strong nonlinearities in wave generation and evolution processes, forecasting is complicated and requires additional tools. For example, Fan et al. [17] coupled a third-generation wave model to LSTM and found that the joint SWAN-LSTM model's forecasting efficacy was superior to those of the extreme learning machine and the support vector machine. Empirical mode decomposition is a computationally efficient alternative for dealing with wave nonlinearities. Although joint EMD-LSTM models were successfully applied in a variety of fields such as foreign exchange rate and stock price [18][19][20], electrical load [21], and metro passenger flow [22] forecasting, their use in earth science applications is comparatively rare. Dai et al. [23] applied the model to PM2.5 concentrations in Beijing and found that EMD-LSTM provided a higher accuracy than using LSTM alone. A similar result was observed by Liu et al. [24], who found that EMD-LSTM can improve Yangtze River streamflow predictions even during floods. Within the field of meteorology and oceanography, Guo et al. [25] demonstrated that EMD-LSTM can provide more accurate, stable, and reliable El Niño forecasting results compared to traditional neural networks. Huang et al. [26] first used EMD to decompose wind speed time series into several intrinsic mode functions before LSTM and Gaussian process regression (GPR) were employed for predictions. The results demonstrated that this methodology outperformed other wind forecasting methods such as the autoregressive integrated moving average and the back propagation neural network. Tang et al. [27] combined EMD with particle swarm optimization and a least squares support vector machine in a joint EMD-PSO-LSSVM model for significant wave height predictions at 1, 3, and 6 h lead times and optimized forecasts in the offshore and deep-sea areas of the North Atlantic.
Raj and Brown [28] used a hybrid Boruta random forest (BRF)-ensemble empirical mode decomposition (EEMD)-bidirectional LSTM (BiLSTM) algorithm to predict significant wave height over 24 h based on inputs of zero-up-crossing wave period, peak energy wave period, and sea surface temperature. The results showed that the EEMD-BiLSTM outperformed all other tested models for short-term forecasting (up to 24 h), and the authors noted that the forecast horizon can be extended to medium and long terms.
Therefore, this study extends the forecast horizon to 48 and 72 h. Due to the lack of research in this area and the need to improve wave forecasts as demonstrated by Hurricane Dorian's (2019) catastrophic landfall in The Bahamas, we use the National Data Buoy Center (NDBC) buoy observations of significant wave height in the Atlantic Ocean and initiate forecasting using a joint EMD-LSTM model. The rest of this paper is structured as follows: Section 2 describes the data and methodology employed, Section 3 presents the results, and Section 4 summarizes the main findings of this study.

Buoy Data and Data Preprocessing
Observations of significant wave height were acquired from two buoys deployed in the Atlantic Ocean, east of The Bahamas (Figure 1). These buoys are owned, operated, and maintained by the NDBC. The acquired data range from 2018 to 2019 and are provided at an hourly resolution. Because EMD requires a continuous data stream without missing values, spline interpolation was performed before LSTM model training to fill the gaps in the time series. All relevant buoy statistics concerning geographical positions, water depth, and the number of observations before and after interpolation are available in Table 1.
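As a rough illustration of the gap-filling step, the sketch below fills missing hourly values by linear interpolation. This is a deliberately simpler stand-in for the spline interpolation used in this study, and the function name `fill_gaps_linear` and the sample values are hypothetical.

```python
from typing import List, Optional


def fill_gaps_linear(series: List[Optional[float]]) -> List[float]:
    """Fill None entries so that the series is continuous.

    Interior gaps are filled by linear interpolation between the nearest
    valid neighbours; leading and trailing gaps take the nearest valid value.
    NOTE: linear interpolation is an assumption made for brevity; the study
    itself used spline interpolation.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no valid observations")
    filled = list(series)
    # Leading and trailing gaps: repeat the nearest valid observation.
    for i in range(known[0]):
        filled[i] = series[known[0]]
    for i in range(known[-1] + 1, len(series)):
        filled[i] = series[known[-1]]
    # Interior gaps: weight the two neighbouring observations by distance.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            w = (i - left) / span
            filled[i] = (1.0 - w) * series[left] + w * series[right]
    return filled


# Hypothetical hourly significant wave heights (m) with missing records.
hourly_swh = [1.2, None, None, 1.8, 2.0, None, 1.6]
print(fill_gaps_linear(hourly_swh))
```

Whatever interpolant is chosen, the output has the same length as the input and contains no missing values, which is the property EMD needs.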

In Figure 2, the full time series of significant wave height at both buoy locations is plotted. Wave heights fluctuate widely over the course of the two years but, excluding extreme events such as those caused by winter storms or passing hurricanes (e.g., Hurricanes Dorian and Humberto in September 2019), generally range from 1-4 m. Here, the whole of 2018 was used as the training dataset for LSTM and EMD-LSTM, and the 2019 wave heights were used as the testing dataset. To ensure that wave conditions throughout the entire year, rather than only seasonal trends, could be captured, a full year was used for training instead of splitting the data into the traditional 70% training and 30% test datasets.

The Long Short-Term Memory Network
LSTM belongs to a class of RNNs that is especially capable of analyzing time series and was designed to eliminate the problems associated with vanishing gradients. Through a series of forget ($f_t$), input ($i_t$), and output ($o_t$) gates, the LSTM network can selectively remember patterns in long sequences of data, providing an advantage over conventional feed-forward neural networks and other RNNs. Past information is forgotten by the forget gate, with decisions on which information to delete defined by the value obtained by applying the sigmoid function to $h_{t-1}$ and $x_t$. The output of the sigmoid function ranges from 0 to 1 so that if the value is 0, information of the previous state is completely forgotten, and if the value is 1, information is completely retained. Information to be retained is handled by the input gate, which receives $h_{t-1}$ and $x_t$ and applies the sigmoid function to them. The candidate value processed with the default tanh function is then combined with the input gate through the Hadamard product operator ($\odot$) [29]. To represent the strength and direction of current information storage, $i_t$ ranges from 0 to 1 and $\tilde{C}_t$ (the candidate for the current state) ranges from −1 to 1. Each gate can be sequentially computed as follows:
$$f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right)$$
$$i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right)$$
$$\tilde{C}_t = \tanh\left(W_C [h_{t-1}, x_t] + b_C\right)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
$$o_t = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right)$$
$$h_t = o_t \odot \tanh(C_t)$$
where $W$ and $b$ are the weight and bias assigned to each layer, $C_t$ is the new cell state, $h_t$ is the hidden state, and $x_t$ is the input at time step $t$. A schematic of the LSTM memory block is depicted in Figure 3.
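The gate computations described above can be sketched as a single forward step of an LSTM cell. This is a minimal, standard-library illustration of the textbook formulation, not the implementation used in this study; the function names, the parameter dictionary keys (`W_f`, `b_f`, and so on), and the weight values are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function with output in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(W: List[Vector], v: Vector, b: Vector) -> Vector:
    """Return W v + b for a row-major weight matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t: Vector, h_prev: Vector, c_prev: Vector,
              p: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One LSTM time step; returns the new hidden state and cell state."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]    # forget gate
    i = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]    # input gate
    g = [math.tanh(a) for a in affine(p["W_c"], z, p["b_c"])]  # candidate state
    o = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]    # output gate
    # Element-wise (Hadamard) products combine old and candidate information.
    c_t = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h_t = [oj * math.tanh(cj) for oj, cj in zip(o, c_t)]
    return h_t, c_t


# Hypothetical parameters: hidden size 2, input size 1 (one wave height).
params = {name: [[0.1, -0.2, 0.3], [0.0, 0.2, -0.1]]
          for name in ("W_f", "W_i", "W_c", "W_o")}
params.update({name: [0.0, 0.0] for name in ("b_f", "b_i", "b_c", "b_o")})
h, c = lstm_step([1.5], [0.0, 0.0], [0.0, 0.0], params)
print(h, c)
```

Because the forget and input gates are sigmoid outputs, they act as soft switches between 0 (discard) and 1 (keep), which is the behaviour described in the text.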

Empirical Mode Decomposition
Widely used in time series feature extraction, EMD provides a set of intrinsic mode functions (IMFs) decomposed from a signal, which allows users to decompose singular values and avoid being trapped in a local optimum [30]. To ensure model performance and robustness, all IMFs must meet the following two conditions. First, over a given data sequence, the number of extrema and the number of zero crossings must be equal or differ by at most one. Second, at any point, the mean value of the envelopes defined by the local maxima and the local minima must be zero.

For an original signal x(t), EMD decomposes x(t) through a sifting process, which is described as follows. An accompanying flowchart of the process is provided in Figure 4.

1. For signal $x(t)$, identify all the local maxima and minima.
2. Through cubic spline interpolation, fit the upper envelope $u(t)$ and the lower envelope $l(t)$ of signal $x(t)$. The mean of the two envelopes is the average envelope curve $m_1(t)$:
$$m_1(t) = \frac{u(t) + l(t)}{2}$$
3. To obtain an IMF candidate, subtract $m_1(t)$ from $x(t)$:
$$h_1(t) = x(t) - m_1(t)$$
4. If $h_1(t)$ does not satisfy the two IMF conditions, then $h_1(t)$ is set as the original signal and the prior steps are repeated $k$ times. Here, $h_{1k}(t)$ can be estimated as follows:
$$h_{1k}(t) = h_{1(k-1)}(t) - m_{1k}(t)$$
where $h_{1(k-1)}(t)$ and $h_{1k}(t)$ represent the signal after sifting $k-1$ and $k$ times, respectively, and $m_{1k}(t)$ is the average envelope of $h_{1(k-1)}(t)$.
5. Once $h_{1k}(t)$ satisfies the two IMF conditions, define $h_{1k}(t)$ as $c_1(t)$. The standard deviation (SD) between consecutive sifting results, which serves as the stopping criterion for sifting, is defined as follows:
$$SD = \sum_{t=0}^{T} \frac{\left|h_{1(k-1)}(t) - h_{1k}(t)\right|^2}{h_{1(k-1)}^2(t)}$$
6. To obtain a new signal $r_1(t)$, subtract $c_1(t)$ from $x(t)$:
$$r_1(t) = x(t) - c_1(t)$$
7. Repeat steps 1-6 until $r_n(t)$ cannot be further decomposed into IMFs. The residual of the original signal $x(t)$ is given by $r_n(t)$. The original signal $x(t)$ can finally be presented as a collection of $n$ components $c_i(t)$ $(i = 1, 2, \ldots, n)$ and a residual $r_n(t)$:
$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t)$$

With the theoretical foundations for both LSTM and EMD established, the full flowchart of the EMD-LSTM model is depicted in Figure 5. A total of thirteen IMFs and one residual were used in conjunction with the 2018 buoy data as the training dataset, and predictions were compared to the 2019 buoy data that served as the testing dataset. The model's activation function is the default tanh setting, as a sensitivity test with the alternative ReLU activation function showed no meaningful difference between the two. The time step was set to six, as this is the maximum allowed time step established by Fan et al. [17] when using 1 year of data. Significant wave height forecasts were performed for the 3, 6, 9, 12, 24, 48, and 72 h forecast windows.
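As an illustration of the sifting procedure in steps 1-7, the following standard-library sketch decomposes a synthetic signal. It is not the implementation used in this study. Several choices are assumptions made for brevity: natural boundary conditions for the cubic spline, the first and last samples added as knots of both envelopes, a ratio-of-sums variant of the SD criterion (to avoid dividing by near-zero samples), and the hypothetical parameters `max_imfs`, `max_sifts`, and `sd_tol`.

```python
import bisect
import math
from typing import Callable, List, Optional, Tuple


def natural_cubic_spline(xs: List[float], ys: List[float]) -> Callable[[float], float]:
    """Return an evaluator for the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (ys[i + 1] - ys[i]) / h[i] - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal system for the quadratic coefficients.
    l, mu, z = [1.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    for j in range(n - 1, -1, -1):  # back substitution
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x: float) -> float:
        j = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + dx * (b[j] + dx * (c[j] + dx * d[j]))

    return evaluate


def local_extrema(x: List[float]) -> Tuple[List[int], List[int]]:
    """Step 1: indices of interior local maxima and minima."""
    maxima, minima = [], []
    for t in range(1, len(x) - 1):
        if x[t - 1] < x[t] >= x[t + 1]:
            maxima.append(t)
        elif x[t - 1] > x[t] <= x[t + 1]:
            minima.append(t)
    return maxima, minima


def mean_envelope(x: List[float]) -> Optional[List[float]]:
    """Step 2: mean of the spline envelopes, or None if x has too few extrema."""
    maxima, minima = local_extrema(x)
    if len(maxima) + len(minima) < 3:
        return None
    ends = [0, len(x) - 1]  # assumption: end samples act as envelope knots
    envelopes = []
    for idx in (sorted(set(maxima + ends)), sorted(set(minima + ends))):
        envelopes.append(natural_cubic_spline([float(i) for i in idx], [x[i] for i in idx]))
    upper, lower = envelopes
    return [(upper(float(t)) + lower(float(t))) / 2.0 for t in range(len(x))]


def emd(x: List[float], max_imfs: int = 10, max_sifts: int = 50,
        sd_tol: float = 0.25) -> Tuple[List[List[float]], List[float]]:
    """Steps 3-7: extract IMFs one at a time; returns (imfs, residual)."""
    residual, imfs = list(x), []
    for _ in range(max_imfs):
        if mean_envelope(residual) is None:
            break  # the residual can no longer be decomposed
        h = list(residual)
        for _ in range(max_sifts):
            m = mean_envelope(h)
            if m is None:
                break
            h_new = [hv - mv for hv, mv in zip(h, m)]
            # Ratio-of-sums variant of the SD stopping criterion (assumption).
            sd = sum((a - b) ** 2 for a, b in zip(h, h_new)) / (sum(a * a for a in h) + 1e-12)
            h = h_new
            if sd < sd_tol:
                break
        imfs.append(h)
        residual = [r - c for r, c in zip(residual, h)]
    return imfs, residual


# Synthetic test signal: fast oscillation + slow oscillation + linear trend.
signal = [math.sin(2 * math.pi * t / 10) + 0.5 * math.sin(2 * math.pi * t / 100) + 0.01 * t
          for t in range(300)]
imfs, residual = emd(signal)
print(len(imfs), "IMFs extracted")
```

By construction, summing all extracted IMFs and the residual recovers the input signal, which mirrors the final reconstruction formula in step 7. In the joint model, each component would be forecast separately by LSTM and the component forecasts summed.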

Performance Indicators
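To assess the relative forecast efficacy of the LSTM and EMD-LSTM models, three commonly applied statistical indicators are used. The correlation coefficient (R), root mean square error (RMSE), and mean absolute percentage error (MAPE) are given as follows:
$$R = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}} \tag{13}$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - x_i)^2} \tag{14}$$
$$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{x_i - y_i}{x_i}\right| \tag{15}$$
where $x_i$ and $\bar{x}$ are the observed and mean observed significant wave heights, respectively; $y_i$ and $\bar{y}$ are the predicted and mean predicted significant wave heights, respectively; and the length of the time series is given by $n$.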
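The three indicators can be computed directly from paired observation and forecast series. The sketch below uses only the standard library; the function names and sample values are hypothetical, and MAPE is returned in percent and assumes strictly positive observations (as is the case for significant wave height).

```python
import math
from typing import Sequence


def correlation(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Pearson correlation coefficient R between observations and forecasts."""
    n = len(obs)
    mean_obs, mean_pred = sum(obs) / n, sum(pred) / n
    cov = sum((x - mean_obs) * (y - mean_pred) for x, y in zip(obs, pred))
    var_obs = sum((x - mean_obs) ** 2 for x in obs)
    var_pred = sum((y - mean_pred) ** 2 for y in pred)
    return cov / math.sqrt(var_obs * var_pred)


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error, in the same unit as the inputs (m)."""
    return math.sqrt(sum((y - x) ** 2 for x, y in zip(obs, pred)) / len(obs))


def mape(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; observations must be nonzero."""
    return 100.0 * sum(abs((x - y) / x) for x, y in zip(obs, pred)) / len(obs)


# Hypothetical observed and forecast significant wave heights (m).
observed = [1.0, 2.0, 3.0, 4.0]
forecast = [1.1, 1.9, 3.2, 3.8]
print(f"R = {correlation(observed, forecast):.3f}")
print(f"RMSE = {rmse(observed, forecast):.3f} m")
print(f"MAPE = {mape(observed, forecast):.2f} %")
```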

Results
Although currently widely used in time series forecasting due to its powerful ability to selectively remember and forget information, LSTM is, as are all methods, restricted by its mathematical underpinnings and the scientific inability to completely observe physical phenomena. The addition of other tools is thus necessary. In this section, the efficacy of LSTM alone is compared with that of LSTM integrated with EMD. Using the performance indicators in Equations (13)-(15), with results presented in Figure 6, significant wave height (SWH) forecasting efficacy based on LSTM and EMD-LSTM for the 3, 6, 9, and 12 h windows can be examined. EMD-LSTM predicts trend changes in SWH significantly better than LSTM. At the 3 h forecast, the discrepancies in forecast accuracy between the two methods are minor, and neither shows a distinct advantage. However, as the forecast horizon lengthens, LSTM's forecast errors accumulate precipitously. For example, in Figure 6b-d, where the forecast horizon is set to 6, 9, and 12 h, respectively, the LSTM and EMD-LSTM forecasting results fall gradually out of phase with the observations, thus indicating worsening predictions. This is especially the case with large SWH values: when there are rapid changes in SWH, EMD-LSTM is superior. Thus, it can be concluded that LSTM on its own cannot accurately predict large SWH values.
To compare the forecast efficacy of these two methods more objectively, histograms of forecast errors are plotted in Figure 7. Errors larger than 1 m are binned together, as are errors smaller than −1 m, and the total frequency of each occurrence (i.e., the occurrence of an error of a particular magnitude) is shown. As the forecast window lengthens (Figure 7a-d gives, in sequence, forecast errors at the 3, 6, 9, and 12 h windows), LSTM forecast errors gradually increase. Although these errors are generally concentrated at 0.1 m, the frequency of errors at this magnitude also increases over time. By contrast, EMD-LSTM forecast errors are more evenly distributed and are centered at around the 0 m mark. The frequency of errors greater than 1 m is also much lower for EMD-LSTM than for LSTM. Thus, the joint EMD-LSTM method not only displays a significantly higher prediction accuracy compared to LSTM, but also effectively limits the accumulation of errors.
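The error binning used for such histograms can be sketched as follows. The ±1 m overflow limit follows the text, while the 0.1 m bin width, the function name, and the sample errors are assumptions introduced only for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def bin_forecast_errors(errors: Iterable[float], width: float = 0.1,
                        limit: float = 1.0) -> Tuple[Dict[int, int], int, int]:
    """Count forecast errors (forecast minus observation, in m) per bin.

    Bin k is centred on k * width. Errors above +limit or below -limit are
    pooled into two overflow counts. Returns (counts, n_under, n_over).
    """
    counts: Counter = Counter()
    n_under = n_over = 0
    for e in errors:
        if e > limit:
            n_over += 1
        elif e < -limit:
            n_under += 1
        else:
            counts[round(e / width)] += 1
    return dict(counts), n_under, n_over


# Hypothetical forecast errors in metres.
errors = [0.02, -0.03, 0.12, 0.98, 1.5, 2.0, -1.2]
counts, n_under, n_over = bin_forecast_errors(errors)
for k in sorted(counts):
    print(f"bin centred at {k * 0.1:+.1f} m: {counts[k]}")
print(f"errors < -1 m: {n_under}, errors > 1 m: {n_over}")
```

A histogram concentrated in a nonzero bin, as described for LSTM, points to a systematic bias, whereas a histogram centered on the zero bin indicates errors that are closer to unbiased.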
Listed in Table 2 are the results of the comprehensive error analysis conducted to determine the efficacy of both LSTM and EMD-LSTM over the 3, 6, 9, 12, 24, 48, and 72 h forecast windows. As the forecast window lengthens, RMSE and MAPE both steadily increase while R decreases for both LSTM and EMD-LSTM. However, EMD-LSTM's performance, as measured by R, was noticeably higher than LSTM's at each forecast window for either buoy. The degree of improvement of EMD-LSTM over LSTM demonstrates that over increasingly longer time frames, the joint EMD-LSTM model was able to slow the growth of errors and maintain good correlations with the observations, with the most dramatic improvement seen at the 72 h horizon, which represented an ~93% increase in forecast skill. A different pattern was seen for MAPE, however. While the RMSE for EMD-LSTM was improved by approximately 25-45%, MAPE's degree of improvement fluctuated. For buoy 41046's short-term forecasts, MAPE's degree of improvement was initially very large, ~78% at the 3 h horizon, but then dropped precipitously to ~27% at the 6 h window.
Over a moderate-term horizon (9 and 12 h horizons), the degree of improvement fell to ~8 and ~9%, respectively, before it increased again to ~40% at the 24 and 72 h long-term horizons. This pattern was not observed for buoy 41047 and, thus, this discrepancy may be due in part to data quality differences between the two buoys. Although buoys are assumed to be ground truth, their measurement errors must be considered, as they imperfectly record the wave state. Additionally, because intermittency forces the use of interpolation (as in this study) or the introduction of reanalysis or model data (as may be found in other studies, e.g., [31]), the resulting series are not a perfect representation of the wave state. Errors consequently creep naturally and unavoidably into the results and are compounded by EMD-LSTM's inherent errors, thus necessitating caution.
Though correlation is often the most valued performance metric, the RMSE, which carries the same unit as the measured variable, is even more important when considering, for example, the sensitivity of wave energy estimates to SWH (i.e., $P \propto H_s^2 T_p$). Here, minor increases or decreases in wave height have a disproportionately large impact on total energy estimates because the height enters as $H_s^2$; thus, precise forecasts for wave energy conversion operations are of primary importance to the commercial viability of this emerging field [32][33][34]. An example is provided in Figure 8: for an observed significant wave height of 3 m (black line) with forecast errors of ±0.5 m over wave periods ranging from 2-6 s, the maximum total wave energy predicted could range anywhere from 18.75 kW/m for underestimations to 36.75 kW/m for overestimations, though the actual maximum value is 27 kW/m.
Observations of RMSE's evolution over time from the LSTM and EMD-LSTM models listed in Table 2 show that over increasingly long forecast horizons, RMSE increases. At the 72 h forecast window, LSTM's RMSE was 0.58 m and EMD-LSTM's RMSE was 0.38 m for NDBC buoy 41046, a difference of 0.2 m. Similarly, for NDBC buoy 41047, the RMSE was 0.55 m for LSTM and 0.28 m for EMD-LSTM, representing a halving of errors. Though these differences appear minor, the total energy is proportional to the square of the wave height. Thus, these minor differences are essential to capture as they can lead to dramatically different forecasts, affecting not only wave energy estimates, but also forecasts of hurricane-induced waves [35][36][37] and storm surges [38,39]. Consequently, the joint EMD-LSTM model displays a dramatically lower RMSE for wave forecasts compared to LSTM alone and thus, wherever possible, should be used. As it is yet to be determined whether the lower RMSEs produced by the joint EMD-LSTM model compared to LSTM are universally the case, caution should be applied in future studies.
In Figure 9a, where the high-frequency IMFs are depicted, LSTM poorly predicts the observations: only the main trends are captured, and the finer but important details are missed entirely. Forecast skill improves significantly with the gradual lowering of frequencies, as shown in Figure 9b-d. Thus, the EMD-LSTM model is able to significantly improve forecast quality primarily because LSTM alone is unable to capture high-frequency signals, whereas decomposing a given signal into components through EMD separates lower-frequency signals from the higher-frequency signals that contaminate underlying trends. As a result, forecast skill is significantly improved. With regard to high-frequency IMFs, it is hypothesized that the inclusion of wind information may increase the accuracy of wave forecasts, as shown in past research [40][41][42]. Although this is outside the scope of the present work, additional research into this hypothesis is of significant value for overcoming the present limitations of the EMD-LSTM model.
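The wave-energy sensitivity example associated with Figure 8 can be checked with a few lines of arithmetic. The sketch below assumes the approximation P ≈ 0.5 H_s² T (in kW/m, with H_s in m and T in s). The coefficient 0.5 is not stated in the text; it is an assumption chosen because it reproduces the quoted 18.75, 27, and 36.75 kW/m values at the 6 s period. The function name is hypothetical.

```python
def wave_power_kw_per_m(hs_m: float, period_s: float, coeff: float = 0.5) -> float:
    """Approximate wave energy flux per metre of crest, P = coeff * Hs^2 * T.

    coeff = 0.5 is an assumed coefficient (units kW s / m^3), not a value
    given in the text.
    """
    return coeff * hs_m ** 2 * period_s


observed_hs = 3.0   # observed significant wave height (m)
hs_error = 0.5      # forecast error magnitude (m)
period = 6.0        # longest wave period considered in the example (s)

reference = wave_power_kw_per_m(observed_hs, period)
for label, hs in (("underestimate", observed_hs - hs_error),
                  ("observed", observed_hs),
                  ("overestimate", observed_hs + hs_error)):
    power = wave_power_kw_per_m(hs, period)
    change = 100.0 * (power / reference - 1.0)
    print(f"{label:>13}: Hs = {hs:.1f} m -> P = {power:.2f} kW/m ({change:+.1f}%)")
```

Under this assumption, a height error of about 17% (±0.5 m on 3 m) shifts the energy estimate by roughly −31% to +36%, which illustrates why RMSE matters for energy applications.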

Conclusions
For oceanographic and maritime applications, precise, and not merely accurate, estimates and forecasts of significant wave height are crucial. It is for this reason that an equally wide array of physics-based numerical wave models and statistical techniques

Conclusions
For oceanographic and maritime applications, precise, and not merely accurate, estimates and forecasts of significant wave height are crucial. It is for this reason that a wide array of physics-based numerical wave models and statistical techniques was developed to achieve both short- and long-term forecasts. We attempted to improve such forecasts by investigating the efficacy of coupling EMD with an LSTM network through a comparative analysis against wave forecasts conducted using LSTM alone. The results strongly suggest that in terms of accuracy, the joint EMD-LSTM network is superior to LSTM because EMD is able to decompose the original nonlinear significant wave height signal into a variety of IMFs, which then allows LSTM to better capture the changes in the trend. This result remains true even up to a 72 h lead time: judging from the forecasting evaluation indicators (root mean square error, mean absolute percentage error, and correlation), EMD-LSTM remained superior to LSTM.
However, for rapid changes in the higher-frequency intrinsic mode functions (IMFs), LSTM is still unable to make effective predictions. Consequently, for raw, undecomposed significant wave height signals that contain high-frequency components, these components interfere with the accuracy of the model. For high-frequency IMFs that cannot currently be accurately predicted, errors may accrue from both instrumentation (i.e., buoys) and other noise. Additional errors may arise because only wave data are used to predict wave data, whereas waves are generated by wind forcing (wind speed and direction) and modulated by ocean currents, which is a significant hindrance to further improvements. Naturally, wind speed should be a feasible significant wave height predictor [43], but a weaker relationship between wind and waves may occur if there is swell contamination of the wave field. This necessitates careful screening when predictors are added, as these may serve not only to improve the prediction accuracy of significant wave height, but also to improve the response speed to extreme wave heights.
This study used the EMD-LSTM method to predict significant wave height. Compared to previous studies, the accuracy of the forecast is considerably improved, and the correlation of the 72 h forecast reaches 0.7. However, in practical applications, because EMD requires continuous data and actual data inevitably contain gaps, some interpolation must be performed. The EMD-LSTM network can provide more precise forecasts of significant wave height and thus can increase capacity for real-time scheduling of fishing boat operations, wave energy, or other offshore engineering activities.