1. Introduction
Significant wind wave height (SWH) is important for predicting seaquakes, tsunamis, and tropical cyclones; wave period and wavelength are also needed for ships, maritime structures, and other commercial activities [
1]. Accurate short-term SWH measurements are essential for planning tsunami protection measures, hydraulic structures, and wave energy facilities [
2,
3]. Hourly estimation of SWH is essential for short-term management, such as power generation [
4]. Prediction of ship movements, construction of maritime structures, dredging operations, and disaster warnings are all examples of marine engineering that benefit from accurate real-time predictions of SWH characteristics. However, due to natural waves’ unpredictable and irregular nature, predicting wave power and building wave power plants is difficult [
2]. The height of the waves is affected by environmental factors and climatic variations [
5]. Wind causes waves and is the most important meteorological factor in determining wave height. The accuracy of weather forecasts is affected by the non-stationarity and non-linearity of wind and wave properties [
6]. In early wave models, nonlinear interactions and energy dissipation were not adequately accounted for, resulting in unpredictable wind fields that made predictions difficult [
4]. Many wave height prediction methods rely on semi-analytical approaches such as the Pierson–Neumann–James and Sverdrup–Munk–Bretschneider methods. However, these cannot provide sufficient information about the surface waves [
3]. Numerical models are widely used for wave prediction. However, due to the large amount of data and the complexity of the calculations, they require high-performance computers and a considerable amount of time [
7,
8]. Although numerical models are useful for simulating the interaction between flow and structure, they may not be practical in critical situations where fast solutions are required [
9].
Rahimian et al., 2022 [
10] performed atmospheric simulations using the Weather Research and Forecasting (WRF) model and compared the results with meteorological observations. Their results show that the Mellor–Yamada–Nakanishi–Niino (MYNN) scheme for the planetary boundary and surface layers performed best for stations over water, while the Mellor–Yamada–Janjic scheme for the planetary boundary layer, combined with the Eta-like surface layer scheme, performed best for stations over land. Lira-Loarca et al. [
11] studied the wave hazard in the Mediterranean Sea using long-term hourly data and an unstructured grid wave model. Their results show that an SPI of 3 and 5 at the beginning and at the peak of the storm, respectively, leads to an SPI of 3–5, depending on the characteristics and socioeconomic importance of the coastal sections. Myslenkov et al. [
12] studied the wind wave height in the Black Sea using different models. They concluded that for an SWH range of 0 to 3 m, the error does not exceed 0.5 m. However, for an SWH range of 3–4 m, the error increased significantly to −2 or −3 m. The quality of wave prediction was evaluated for several storm cases. Raj et al. [
13] performed wind wave simulations in the Indian Ocean. Their results show that all wave simulations have significant errors at low wind speeds compared to medium and strong winds, regardless of the error in the wind forecast.
Advances in AI have enabled the widespread use of soft computing techniques to predict SWH. These methods are more efficient and versatile than their linear counterparts because they can represent nonlinear waves without requiring prior knowledge of the input–output relationships. Soft computing models have developed rapidly and are increasingly used as computation times decrease. Several soft computing techniques, such as the RBM-DBN hybrid model, the BMA-MARS/RF/GBRT ensemble approach, the DBN-IF model, the En-RLMD-RF ensemble method, LSTM and GRU networks, and the CLTS-Net deep neural network model, have been studied and found to predict significant wave height well. The Restricted Boltzmann Machine (RBM) and the conventional Deep Belief Network (DBN) model were used in a hybrid form by Zhang and Dai [
14] to predict the SWH on an hourly basis. According to their results, the hybrid model could predict the short-term maximum wave height with a relative error of less than 26%. Adnan et al. [
15] used a Bayesian Model Averaging (BMA) ensemble strategy that included multivariate adaptive regression splines (MARS), random forests (RF), and gradient-boosted regression trees (GBRT). Specifically, they found that the BMA model predicted SWH up to six days in advance with slightly higher accuracy than previous techniques. Li and Liu [
16] focused on short-term wave height prediction with the DBN-IF model, which combines a dynamic Bayesian network with information flow. Their results showed the superior performance of the proposed DBN-IF model in predicting SWH. Ali et al. [
15] proposed an ensemble local mean decomposition combined with random forest (En-RLMD-RF) to predict the short-term SWH. The results showed that the En-RLMD-RF model outperformed its benchmarks in prediction accuracy. Long short-term memory (LSTM) and gated recurrent unit (GRU) networks were two of the recurrent neural network (RNN) types that Feng et al. [
17] investigated to predict SWH. Their results showed that gating-based LSTM and GRU networks performed better than conventional RNNs. Recently, a deep neural network model called CLTS-Net was developed by Li et al. [
18] to predict SWH. Their results show that the CLTS-Net can simultaneously capture the temporal relationships in the data, which enables accurate prediction of future large wave heights.
The generalization capability and gradient-based parameter learning of soft computing algorithms still have limitations despite their superior accuracy in predicting significant wave height [
5,
19,
20]. Therefore, reliance on a single machine learning approach can increase statistical variance and uncertainty due to limited input data for wave parameter prediction. To address this problem, the results of several different models can be combined using a multi-model approach. In this study, the performance of a refined neuro-fuzzy method combined with the marine predators algorithm (MPA) was evaluated. This novel hybrid machine learning approach (ANFIS-MPA) was compared with the Adaptive Neuro-Fuzzy Inference System (ANFIS), ANFIS with genetic algorithm (ANFIS-GA), and ANFIS with particle swarm optimization (ANFIS-PSO). Half-hour time series data and a prediction method that looks several steps into the future were used for the analysis. This study is innovative in using the ANFIS-MPA approach to make multi-step predictions for SWH.
Section 2 discusses the use of soft computing models and details the study area and data collection.
Section 3 provides the main results and examines their implications for extending the newly tested model to additional climatic conditions.
Section 4 summarizes the main results of this paper.
4. Development of Hybrid ANFIS-PSO, ANFIS-GA, and ANFIS-MPA Models
In the final model developed, MARS was used to determine the best input combination, i.e., the best scenario for predicting SWH. Each scenario considers different lagged SWH values. Then, all input combinations are analyzed with three hybrid ANFIS models: ANFIS-PSO, ANFIS-GA, and ANFIS-MPA. Three statistical indices, RMSE, MAE, and R2, are used for comparison.
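The three comparison indices can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and sample values are hypothetical, and R2 is assumed here to be the coefficient of determination (1 − SS_res/SS_tot); some studies instead report the squared correlation coefficient.

```python
import numpy as np

def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def mae(obs, pred):
    """Mean absolute error between observations and predictions."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.mean(np.abs(obs - pred)))

def r2(obs, pred):
    """Coefficient of determination (assumed definition: 1 - SS_res / SS_tot)."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical wave heights in metres, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, predicted), mae(observed, predicted), r2(observed, predicted))
```

Lower RMSE and MAE and a higher R2 indicate a better model, which is the ordering used in the comparisons that follow.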
5. Results and Discussion
This section compares the results of the MPA-based neuro-fuzzy approach in predicting significant wave heights for multiple horizons from t + 1 (one hour ahead) to t + 24 (one day ahead) with other hybrid neuro-fuzzy methods.
5.1. Results
In this study, we first apply the MARS method to determine the best input combination. The goal was to investigate whether this method can be applied to determine the best scenario for predicting SWH. This was then evaluated via hybrid ANFIS methods for all input combinations. The training and test results of the MARS method are shown in
Table 2 for the first station. As seen from the input combinations, three lagged inputs were used because inputs beyond this lag did not improve the prediction accuracy, and our goal was to predict SWH for multiple horizons from t + 1 to t + 24.
Table 2 shows that adding earlier lags slightly improves the accuracy of MARS. Therefore, three lagged inputs were selected as the best input combination. Then, this combination was used to predict SWH for the other horizons. As expected, the model’s accuracy deteriorates as the prediction horizon increases: the RMSE and MAE increased from 0.0325 and 0.0232 to 0.1410 and 0.1076, and R2 decreased from 0.9748 to 0.5201 over the test period.
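The construction of lagged inputs and multi-horizon targets described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper `make_supervised` and the synthetic series are hypothetical, and the study's actual preprocessing may differ. Each input row holds the current and two previous SWH values, and the target is the value `horizon` steps ahead.

```python
import numpy as np

def make_supervised(series, n_lags, horizon):
    """Build (X, y) where each X row is [Hs(t), Hs(t-1), ..., Hs(t-n_lags+1)]
    and the matching y entry is Hs(t + horizon)."""
    s = np.asarray(series, dtype=float)
    first_t = n_lags - 1              # earliest t with enough history
    last_t = len(s) - horizon - 1     # latest t whose target still exists
    if last_t < first_t:
        raise ValueError("series too short for this lag/horizon choice")
    times = range(first_t, last_t + 1)
    # Slice the last n_lags values up to t, then reverse so Hs(t) comes first.
    X = np.array([s[t - n_lags + 1 : t + 1][::-1] for t in times])
    y = np.array([s[t + horizon] for t in times])
    return X, y

# Synthetic stand-in for an SWH record (hypothetical values, not buoy data).
swh = 1.5 + np.sin(np.linspace(0.0, 6.0, 48))

# One dataset per forecast horizon, all using the same three lagged inputs.
datasets = {h: make_supervised(swh, n_lags=3, horizon=h) for h in (1, 2, 4, 8, 12, 24)}
for h, (X, y) in datasets.items():
    print(f"t + {h}: X shape {X.shape}, y shape {y.shape}")
```

Longer horizons leave fewer usable samples because the last `horizon` values of the record have no target to pair with.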
Table 3,
Table 4 and
Table 5 summarize the training and testing results of the hybrid models ANFIS-PSO, ANFIS-GA, and ANFIS-MPA in predicting the SWH of the first station. The accuracy of the implemented methods is consistent, and all three methods provide the best performance for the third input combination. The ANFIS-MPA model showed the lowest RMSE (0.0277) and MAE (0.0192) and the highest R2 (0.9831) during the test period, followed by ANFIS-GA with an RMSE, MAE, and R2 of 0.0302, 0.0216, and 0.9787, and ANFIS-PSO with an RMSE, MAE, and R2 of 0.0312, 0.0226, and 0.9753. From t + 1 (1 h ahead) to t + 24 (1 day ahead), the accuracy of ANFIS-MPA decreases significantly; the RMSE, MAE, and R2 range from 0.0277, 0.0192, and 0.9831 to 0.1344, 0.1019, and 0.5833, respectively. At all forecast horizons, ANFIS-MPA is superior to the other hybrid methods. At the 1 h lead time in the test period, ANFIS-MPA improves the RMSE by 11.2% relative to ANFIS-PSO and by 8.3% relative to ANFIS-GA. In contrast, the improvements reported at the one-day lead time (t + 24) are 3.38% and 0.59% for ANFIS-GA and ANFIS-PSO, respectively.
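The 1 h improvement percentages can be reproduced from the test-period RMSE values quoted in the text. The sketch below assumes the improvement is the relative RMSE reduction with respect to the baseline model; the function name is hypothetical.

```python
def relative_improvement(baseline_rmse, new_rmse):
    """Percentage reduction in RMSE relative to the baseline model."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Test-period RMSE at t + 1 for the first station, as quoted in the text.
rmse_mpa, rmse_ga, rmse_pso = 0.0277, 0.0302, 0.0312

vs_pso = relative_improvement(rmse_pso, rmse_mpa)
vs_ga = relative_improvement(rmse_ga, rmse_mpa)
print(f"ANFIS-MPA vs ANFIS-PSO: {vs_pso:.1f}%")
print(f"ANFIS-MPA vs ANFIS-GA:  {vs_ga:.1f}%")
```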
Table 6 shows the training and test results of the MARS method for the second station. Again, accuracy for this station improved slightly when lagged inputs were added. Accuracy decreased significantly when the prediction horizon was increased from 1 h (t + 1) to 1 day (t + 24): the RMSE, MAE, and R2 range from 0.1067, 0.0824, and 0.9635 to 0.2928, 0.2029, and 0.7303 in the test period. The best accuracy is obtained by the model with inputs Hs(t), Hs(t−1), and Hs(t−2), with the lowest RMSE (0.1067) and MAE (0.0824) and the highest R2 (0.9635) in the test period.
The training and test results of the hybrid ANFIS methods in predicting SWH at the second station are shown in
Table 7,
Table 8 and
Table 9. At this station, the accuracy of the implemented methods is also consistent with that of MARS, and the best performance is obtained with the third input combination. Again, ANFIS-MPA outperforms ANFIS-PSO and ANFIS-GA in the 1 h SWH prediction with the lowest RMSE (0.0689) and MAE (0.0475) and the highest R2 (0.9847) in the test period. The use of ANFIS-MPA improves the RMSE accuracy of ANFIS-PSO by about 7% in predicting SWH 1 h ahead. Similar to the first station, the accuracy of the hybrid methods decreases significantly as the forecast horizon increases. For example, the RMSE, MAE, and R2 of ANFIS-MPA range from 0.0689, 0.0475, and 0.9847 to 0.2640, 0.1962, and 0.7735, respectively, for forecast horizons t + 1 to t + 24. The ANFIS-MPA outperforms the other hybrid methods at all forecast horizons.
The hybrid neuro-fuzzy and MARS models for predicting SWH are compared in
Figure 6 and
Figure 7 using scatter plots. The MPA-based ANFIS has the least scattered predictions, with the fitting equation closest to the exact line (y = x) and the highest R2 at both stations. The models with three inputs (best models) are compared using Taylor diagrams in
Figure 8 and
Figure 9. This type of graph is very useful for observing the accuracy of the models based on RMSE, standard deviation, and correlation. The plots show that the MPA-based ANFIS has the highest correlation and the lowest squared error in predicting the SWH of both stations. The violin charts in
Figure 10 and
Figure 11 compare the distributions of the SWH predictions and observations. The figures show that the mean, median, and distribution of the MPA-based ANFIS predictions are closest to those of the observed values.
Figure 12 illustrates the average RMSE and MAE of all implemented models in predicting the SWH at both stations. The bar charts clearly show that ANFIS-MPA has lower RMSE and MAE in the short-term prediction of SWH at both sites.
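The quantities summarized in a Taylor diagram can be computed as in the following sketch. It relies on the standard relation between the centered RMS difference E′, the standard deviations of observations and predictions, and their correlation R: E′² = σ_obs² + σ_pred² − 2 σ_obs σ_pred R. The function name and sample values are hypothetical and are not taken from the study's data.

```python
import numpy as np

def taylor_statistics(obs, pred):
    """Return (sigma_obs, sigma_pred, correlation, centered RMS difference)."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    sigma_obs = obs.std()       # population standard deviation (ddof=0)
    sigma_pred = pred.std()
    corr = np.corrcoef(obs, pred)[0, 1]
    # Remove each series' mean before differencing (centered pattern error).
    centered = (pred - pred.mean()) - (obs - obs.mean())
    crmsd = np.sqrt(np.mean(centered ** 2))
    return float(sigma_obs), float(sigma_pred), float(corr), float(crmsd)

# Hypothetical wave heights in metres, for illustration only.
obs = [0.8, 1.1, 1.5, 1.2, 0.9, 1.4]
pred = [0.9, 1.0, 1.4, 1.3, 1.0, 1.3]

sigma_obs, sigma_pred, corr, crmsd = taylor_statistics(obs, pred)
# Law-of-cosines identity that underlies the diagram's geometry.
identity_rhs = sigma_obs**2 + sigma_pred**2 - 2.0 * sigma_obs * sigma_pred * corr
print(sigma_obs, sigma_pred, corr, crmsd, identity_rhs)
```

A model plots closer to the observation point on the diagram when its correlation is high, its standard deviation matches the observed one, and its centered RMS difference is small.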
5.2. Discussion
This study uses a new hybrid neuro-fuzzy method (ANFIS-MPA) to predict SWH using previous values as input. The results are compared with other hybrid neuro-fuzzy models. The MPA-based model is observed to outperform the other models in predicting SWH for multiple horizons from 1 h to 1 day.
The best input combination is investigated using the MARS method. Next, hybrid ANFIS methods are applied to the same scenarios to see if MARS is suitable for determining the best input combination in predicting SWH. A similar trend is observed between the MARS and hybrid ANFIS methods, indicating that MARS can successfully determine the best input combination in SWH prediction. The comparison of the two stations shows that the methods are more successful in predicting SWH at the second station. The main reason could be the higher autocorrelation of SWH at the second station. These results are consistent with the previous literature [
48,
50].
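The autocorrelation argument can be illustrated with a short sketch. The estimator below is the usual sample autocorrelation at a given lag; the two synthetic series are hypothetical stand-ins and are not the buoy records. A series with higher lag-1 autocorrelation carries more information about its next value in its recent history, which favors models driven only by lagged inputs.

```python
import numpy as np

def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a positive integer lag."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    s = np.asarray(series, dtype=float)
    s = s - s.mean()
    return float(np.sum(s[:-lag] * s[lag:]) / np.sum(s * s))

# Slowly varying series: consecutive values are strongly related.
smooth = np.sin(np.linspace(0.0, 4.0 * np.pi, 200))
# Alternating series: each value is the negative of the previous one.
rough = np.array([1.0, -1.0] * 100)

r_smooth = autocorrelation(smooth, lag=1)
r_rough = autocorrelation(rough, lag=1)
print(r_smooth, r_rough)
```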
As the horizon increases from 1 h to 24 h, the models’ accuracy deteriorates markedly. However, the ANFIS-MPA generally remains superior in such cases, which can be useful in monitoring SWH.
Machine learning allows us to find connections between physical parameters that we do not see or do not know. The formation of waves has a nonlinear and complex physical mechanism, and SWH is affected by different parameters, including wind speed, sea surface temperature, water depth, air humidity, and some other weather parameters. In the present study, only SWH data were used as inputs because of the unavailability of other influencing parameters.
6. Conclusions
This study examined the performance of a new hybrid neuro-fuzzy model, ANFIS-MPA, in predicting significant wave height over multiple horizons from 1 h to 1 day. Hourly data were obtained from two buoy stations, Cairns and Palm Beach, Australia. MARS was used as a simple tool to determine the best significant wave height inputs for the much more complex hybrid ANFIS methods. This choice was verified by applying the hybrid methods to the same input combinations. It was observed that MARS can be successfully used for selecting the best input combination in predicting significant wave height. The results of ANFIS-MPA were compared with those of the hybrid models ANFIS-PSO and ANFIS-GA. The results showed that the ANFIS-MPA model performed better than the other hybrid models in predicting significant wave height at both stations. At the second station, ANFIS-GA and ANFIS-PSO provided better accuracy than ANFIS-MPA for predicting the significant wave height 1 h ahead, while the latter model outperformed the ANFIS-GA and ANFIS-PSO for other forecasting horizons involving wave heights 2, 4, 8, 12, and 24 h ahead. Assessment criteria involving average RMSE and MAE and graphical inspections such as Taylor diagrams and violin charts revealed that the ANFIS-MPA is superior to the other models in predicting SWH for multiple horizons. Overall, the results recommend the use of ANFIS-MPA in monitoring significant wave height for multiple time horizons, using only earlier values as inputs.
In this study, we used hourly data from two sites. The results can be generalized if data from other sites and other data intervals (daily or monthly) are used. The developed methods can also be compared with other hybrid machine learning methods to evaluate the accuracy of the implemented methods in predicting significant wave height. In this study, only previous SWH data were used as inputs; in future studies, more influential parameters such as wind speed, sea surface temperature, water depth, and air humidity can be included to develop more robust and accurate models for predicting SWH over multiple horizons.