Article

Machine Learning Approaches for Probabilistic Prediction of Coastal Freak Waves

1 Department of Hydraulic and Ocean Engineering, National Cheng Kung University, Tainan 701, Taiwan
2 Coastal Ocean Monitoring Center, National Cheng Kung University, Tainan 701, Taiwan
3 Marine Meteorology and Climate Division, Central Weather Administration, Taipei 100, Taiwan
4 Department of Marine Environmental Informatics, National Taiwan Ocean University, Keelung 202, Taiwan
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2026, 14(8), 689; https://doi.org/10.3390/jmse14080689
Submission received: 17 March 2026 / Revised: 1 April 2026 / Accepted: 3 April 2026 / Published: 8 April 2026
(This article belongs to the Special Issue Coastal Disaster Assessment and Response—2nd Edition)

Abstract

Coastal freak waves (CFWs) are sudden and hazardous wave events that occur near shorelines and can pose serious threats to coastal visitors and infrastructure. Due to the complex interactions among coastal bathymetry, wave dynamics, and environmental conditions, the mechanisms governing CFW formation remain poorly understood, making reliable prediction difficult. This study investigates the feasibility of applying machine learning techniques to predict CFW occurrences using observational environmental data. Three machine learning models, based on the Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN) algorithms, were developed to generate probability-based predictions of CFW events. Environmental variables derived from buoy observations, including wave characteristics, wind conditions, swell parameters, wave grouping indicators, and nonlinear wave interaction indices, were used as model inputs. Hyperparameters were optimized using grid search combined with k-fold cross-validation. The results show that all three models achieved comparable predictive performance, with AUC values close to 0.80 and overall prediction accuracy around 74%. The ANN model achieved the highest recall, indicating strong capability in detecting CFW events, while the RF and SVM models showed more balanced precision and recall. Analysis of high-probability prediction events suggests that CFW occurrences are associated with swell-dominated conditions, strong wave grouping behavior, and enhanced nonlinear wave interactions. These results demonstrate that machine learning provides a promising framework for probabilistic prediction of coastal freak waves and has potential applications in coastal hazard assessment and early warning systems.

1. Introduction

Most freak waves occur in the open ocean and are characterized by their sudden appearance and exceptionally large wave height. They are commonly defined as waves whose height exceeds twice the significant wave height of the surrounding sea state. Numerous shipwreck incidents have been associated with the unexpected occurrence of freak waves [1,2,3,4,5,6]. Over the past decades, extensive research has investigated the generation mechanisms of freak waves through physical experiments, numerical simulations, and theoretical analyses. These studies suggest that several mechanisms may contribute to the formation of freak waves, including modulation instability, nonlinear wave–wave interactions, wind forcing, ocean currents, and wave focusing processes [7,8,9,10,11,12,13]. However, most of these mechanisms primarily apply to deep-water conditions.
When ocean waves propagate toward shallow coastal areas, their behavior becomes strongly influenced by local bathymetry, coastal morphology, and interactions with coastal structures such as rocks, armor blocks, or breakwaters. Under certain circumstances, these interactions can produce sudden and powerful splashing waves near the shoreline, commonly referred to as coastal freak waves (CFWs). Unlike deep-water freak waves, CFWs typically occur when waves break against coastal structures or irregular seabed topography, producing large splashes that may unexpectedly sweep people into the sea [14,15,16]. CFWs often occur without obvious warning signs and may appear even during apparently calm sea conditions. According to Didenkulova et al. (2023), a total of 429 coastal or offshore freak wave events were recorded worldwide between 2005 and 2021 [17]. In Taiwan, such events occur relatively frequently, particularly during typhoon seasons or strong monsoon periods. Previous reports indicate that more than 1000 injuries or fatalities have been associated with CFW incidents along the Taiwanese coast [18,19].
Despite the increasing awareness of this hazard, the generation mechanisms of CFWs remain poorly understood. The formation of CFWs involves complex interactions among wave dynamics, coastal bathymetry, and local hydrodynamic conditions, which vary significantly from site to site. As a result, developing deterministic prediction models based solely on physical mechanisms remains challenging. Nevertheless, predicting the occurrence of coastal freak waves is critically important for coastal safety, as anglers and tourists are frequently exposed to these hazards. Reducing casualties and improving public safety are important responsibilities for coastal management authorities. Therefore, in the absence of reliable physics-based prediction models, alternative approaches capable of identifying potential risk conditions are needed.
In recent years, artificial intelligence (AI), particularly machine learning (ML), has emerged as a powerful tool for analyzing complex natural phenomena with nonlinear and partially unknown mechanisms. ML methods are capable of learning patterns and relationships directly from observational data without requiring explicit physical formulations. In ocean engineering and coastal studies, ML techniques have been increasingly applied to various prediction tasks, including wave forecasting, wave energy optimization, wave breaking dynamics, and extreme event prediction. For example, ML models have been used to predict wave characteristics such as wave height, period, and direction, as well as ocean currents and long-term wave climate patterns [20,21,22,23,24]. In addition, machine learning approaches have been employed to estimate the probability of extreme ocean events, including extreme waves, storm surges, and coastal wave heights during typhoons [25,26,27]. Recent studies have also applied ML to simulate the morphological evolution of sandbars and beaches, demonstrating that ML models outperform traditional statistical approaches in most scenarios [28]. Moreover, interpretable ML has been employed for shoreline and wave runup prediction, enabling the identification of region-specific physical phenomena and providing insights into the underlying mechanisms [29,30]. Furthermore, recent research has integrated ML into coastal vulnerability assessment frameworks by incorporating physical, environmental, and socio-economic factors. These approaches leverage ML to capture the nonlinear relationships among variables, thereby offering advanced decision-support tools for coastal management authorities [31]. These studies demonstrate the potential of ML methods to provide reliable predictions even when the underlying physical mechanisms are not fully understood.
Machine learning can generally be categorized into three major types: supervised learning, unsupervised learning, and reinforcement learning. Among them, supervised learning has been widely adopted in ocean wave research because it allows models to learn from labeled historical data. Previous studies have applied artificial neural networks (ANNs) to predict the occurrence of coastal freak waves. For instance, Doong et al. (2018) developed an ANN-based model to predict CFW occurrences [32], and later studies further extended this approach to probabilistic forecasting frameworks [33]. Although ANN models have shown promising results, relying solely on a single machine learning method may limit prediction accuracy and model robustness. Different machine learning algorithms possess distinct strengths in capturing nonlinear relationships and handling complex datasets. Therefore, exploring additional machine learning approaches is essential for improving prediction performance and reliability. Furthermore, comparing multiple algorithms can provide valuable insights into their respective advantages and limitations in predicting coastal freak waves. To address this research gap, this study differs from previous works by not only applying the ANN method but also incorporating other common ML algorithms. Furthermore, rather than merely providing a general overview of model training performance, this study further analyzes the distinct characteristics of different ML models in predicting CFWs.
Despite the growing body of research on extreme waves, the prediction of coastal freak waves (CFWs) remains largely unexplored, particularly in terms of probabilistic forecasting. Most existing studies rely on deterministic or binary classification approaches, which are often insufficient for practical coastal hazard management. In addition, only limited studies have systematically compared different machine learning algorithms for predicting CFW occurrences using real observational datasets. To address these research gaps, this study investigates the feasibility of applying multiple machine learning techniques to predict the occurrence of coastal freak waves. Three machine learning algorithms, including Artificial Neural Networks (ANN), Support Vector Machines (SVM), and Random Forest (RF), are employed to develop probabilistic prediction models that estimate the likelihood of CFW occurrence rather than providing simple binary outputs. A comparative analysis of these algorithms is conducted to evaluate their predictive performance and characteristics. The results are expected to provide new insights into the application of machine learning methods for coastal freak wave prediction and contribute to the development of more reliable coastal hazard assessment and early warning systems.

2. Machine Learning Approaches for CFW Prediction

2.1. Random Forest (RF)

Random Forest (RF), a machine learning technique originally proposed in 2001 [34], builds multiple decision trees from randomly selected data samples and merges them into an ensemble, or "forest". As a collection of individual decision trees, RF is inspired by the Classification and Regression Trees (CART) algorithm developed by Breiman and colleagues in 1984 [35]. The basic components of a decision tree are a root node, intermediate nodes, leaf nodes, and the branches connecting them. Data entering at the root node is partitioned at each node according to a splitting rule and passed down to the next node; the branches between nodes encode these partitioning rules, and the optimal splitting point is identified at each node. The data is sequentially bifurcated according to the chosen partitioning criteria until a complete classification is obtained, with the final results stored in the leaf nodes.
The rule at each branch splits the data so as to minimize the sum of squared errors. Given a data sequence $(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)$, where $x_i$ is the input data in vector form, $y_i$ is its category or group, and $N$ is the number of samples, suppose each $x_i$ has $P$ attributes, $x_i = (x_i^1, x_i^2, \dots, x_i^P)$. For a chosen attribute $j \in \{1, 2, \dots, P\}$ and splitting point $s$, data with $x_i^j > s$ is assigned to region $R_1$, and data with $x_i^j \le s$ to region $R_2$. The mean values of the data in the $R_1$ and $R_2$ regions are:
$c_1 = \mathrm{mean}(y_i \mid x_i \in R_1)$
$c_2 = \mathrm{mean}(y_i \mid x_i \in R_2)$
To determine the optimal split point $s$ at each node, the sum of squared errors is calculated for the $j$-th input attribute when split at point $s$. The optimal split is the pairing of $j$ and $s$ that yields the lowest sum of squared errors:
$(j^*, s^*) = \arg\min_{j,s} \left[ \min_{c_1} \sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2} \sum_{x_i \in R_2(j,s)} (y_i - c_2)^2 \right]$
This iterative splitting continues until further splitting is no longer possible, for example when all the data at a node belongs to a single category, at which point the decision tree is complete. The term "random" refers to the randomness in how the trees are constructed: for each tree, a subset of samples is drawn at random from the dataset, and at each node the attribute $j$ used to partition the data is selected at random from the $P$ available attributes. The final output combines the results of all the individual decision trees. The architecture of RF and of a single decision tree is shown in Figure 1.
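As a minimal sketch of this procedure, the following example trains an RF classifier on synthetic two-class data and reads off the ensemble's tree-vote probabilities (scikit-learn assumed available; the feature names and data are hypothetical, not the study's buoy observations):

```python
# Illustrative Random Forest sketch on synthetic data (scikit-learn assumed).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))                   # 200 samples, 5 hypothetical features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # synthetic event / non-event label

# n_estimators = number of trees; max_features = attributes sampled per split
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
rf.fit(X, y)

# Each tree votes for a class; the class probability is the fraction of votes
proba = rf.predict_proba(X[:3])
print(proba)
```

The per-split attribute subsampling (`max_features`) and the bootstrap sampling of rows are the two sources of randomness described above.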

2.2. Support Vector Machine (SVM)

The Support Vector Machine (SVM) is a well-known machine learning technique first introduced in 1995 [36]. SVM is based on the principle of structural risk minimization and seeks a linearly separable function that maximizes the margin between two groups of data. The data points that determine this function are referred to as support vectors, and the separating function itself is called the hyperplane; it is placed so as to maximize the margin between the two groups, defining the optimal separation between them. The objective of SVM is therefore to determine the hyperplane that maximizes the distance between the support vectors and the hyperplane itself. If the dataset cannot be separated by a linear function, SVM applies a technique known as the kernel trick: a kernel function maps the data into a higher-dimensional space in which a maximum-margin hyperplane can be found. The hyperplane is defined by an n-dimensional vector $w$ and a scalar $b$, such that all data can be separated into two categories according to the sign of $w^T x + b$. The hyperplane can be described as:
$w^T x + b = 0$
where $w^T$ is the normal vector of the hyperplane, $x$ is the input data, and $b$ is a scalar bias. Within the SVM framework, two boundaries are determined that separate the two groups of data. These boundaries are two functions parallel to the hyperplane, $w^T x + b = 1$ and $w^T x + b = -1$, where the values 1 and −1 denote the respective classes to which the data points belong. The support vectors lie on these lines parallel to the hyperplane. SVM's objective is to maximize the distance $d_{max}$ between the support vectors and the hyperplane, expressed as:
$d_{max} = \max\left\{ \frac{2 y (w^T x + b)}{\lVert w \rVert} \right\} = \max\left( \frac{2}{\lVert w \rVert} \right)$
where $\lVert w \rVert = \sqrt{w_1^2 + \dots + w_n^2}$. This expression is derived from the formula for the distance between a point and a line, extended to n-dimensional space; the objective is to maximize the distance of the support vectors from the hyperplane. When the data is not linearly separable, SVM transforms it into a higher-dimensional vector space using a kernel function, denoted $x \mapsto \varphi(x)$. The kernel function can take various forms, including the polynomial kernel, the sigmoid kernel, and the radial basis function (RBF) kernel. In summary, SVM identifies the optimal hyperplane separating two groups of data and employs the kernel trick for non-linearly separable cases. When new data is input, the trained SVM model classifies it relative to the hyperplane found during training, yielding a binary prediction. This concept is illustrated in Figure 2.
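The following sketch illustrates these ideas with an RBF-kernel SVM on two synthetic clusters (scikit-learn assumed; the data and parameter values are illustrative, not the study's configuration):

```python
# Minimal SVM sketch with an RBF kernel on toy two-class data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, size=(50, 2)),   # class 0 cluster near (-2, -2)
               rng.normal(+2, 1, size=(50, 2))])  # class 1 cluster near (+2, +2)
y = np.array([0] * 50 + [1] * 50)

# C controls the tolerance for misclassification; the RBF kernel implements
# the kernel trick, implicitly mapping data to a higher-dimensional space.
svm = SVC(kernel="rbf", C=1.0, gamma="scale")
svm.fit(X, y)

# decision_function gives the signed distance to the hyperplane;
# its sign determines the binary prediction.
f = svm.decision_function(X[:3])
pred = svm.predict([[-2, -2], [2, 2]])   # two well-separated query points
print(pred)
```

The cluster centers here are far apart, so the two query points fall cleanly on opposite sides of the learned hyperplane.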

2.3. Artificial Neural Network (ANN)

The Artificial Neural Network (ANN) was originally proposed in 1943, when McCulloch and Pitts introduced an artificial information-processing model inspired by the structure and operation of the human brain [37], built from basic mathematical operations connecting a network of artificial neurons. The emergence of backpropagation in 1986 provided the foundation for training ANN models [38]. An ANN consists of multiple neurons that together give it computational and information-processing capabilities. Its architecture includes an input layer, one or several hidden layers, and an output layer: data enters at the input layer, is processed within the hidden layer(s), which form the principal computational core, and results are produced at the output layer. Neurons are the essential units of an ANN, transmitting information between layers via weighted connections. Each neuron takes the outputs of the previous layer's neurons, multiplies them by the corresponding weights, and adds a bias term that provides additional flexibility. The result is then passed through an activation function, which restricts it to a specific range before it is transmitted to the next layer. In this way the ANN imitates the information-processing capacity of the human brain. This computation can be expressed as:
$y^{(l+1)} = f\left( \sum_{i} y_i^{(l)} w_i^{(l+1)} + b^{(l+1)} \right)$
where $y_i^{(l)}$ are the inputs from the previous layer's neurons, $w_i^{(l+1)}$ are the weights between the previous layer's neurons and the current neuron, $b^{(l+1)}$ is the bias added to the weighted sum before the activation function is applied, and $f$ is the activation function. Commonly used activation functions include the sigmoid function, the hyperbolic tangent (tanh), and the rectified linear unit (ReLU). An ANN is trained using gradient descent, which iteratively adjusts the network weights to fit the data: the error gradient indicates the direction of increasing error, and the weights are updated in the opposite direction, with the step size controlled by the learning rate. This process continues until the error is minimized. The weight adjustment can be expressed as:
$w_{t+1} = w_t - \alpha e$
where $w_t$ is the current weight of the neural network, $\alpha$ is the learning rate, a training parameter that controls the magnitude of each weight update, $e$ is the gradient of the error, and $w_{t+1}$ is the updated weight. The structure of an ANN and the computation process of a neuron are shown in Figure 3.
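A minimal NumPy sketch of a single neuron's forward pass and one gradient-descent weight update, following $w_{t+1} = w_t - \alpha e$ (all numbers are synthetic and for illustration only):

```python
# Single-neuron forward pass and one gradient-descent step (pure NumPy).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

y_prev = np.array([0.2, 0.7, 0.5])   # outputs of the previous layer's neurons
w = np.array([0.1, -0.4, 0.3])       # connection weights
b = 0.05                             # bias term
target = 1.0                         # desired output
alpha = 0.5                          # learning rate

# Forward pass: weighted sum plus bias, passed through the activation function
z = y_prev @ w + b
y_out = sigmoid(z)

# Error gradient of the squared error 0.5*(y_out - target)^2 w.r.t. the weights,
# via the chain rule through the sigmoid activation
e = (y_out - target) * y_out * (1.0 - y_out) * y_prev

# Gradient-descent update: step against the gradient, scaled by alpha
w_new = w - alpha * e
print(w_new)
```

After the update, the neuron's output moves closer to the target, illustrating how repeated updates drive the error down.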

2.4. Probability-Based Prediction Output

Machine learning algorithms used for event prediction are typically formulated as classification models that provide binary outputs, indicating whether a specific event occurs or not. In the context of CFW prediction, such models usually produce outputs of either 0 or 1 to represent the absence or occurrence of CFW events. However, binary predictions alone are insufficient for effective hazard forecasting and risk assessment. For coastal safety management, it is more informative to quantify the likelihood of an event rather than simply indicating its occurrence. Probability-based predictions provide a more meaningful representation of the potential risk level and can support decision-making processes in early warning systems.
To address this limitation, the machine learning models used in this study are designed to produce probabilistic outputs instead of simple binary classifications. The predicted probability represents the likelihood of CFW occurrence under given environmental conditions. Different machine learning algorithms employ different mechanisms to convert classification outputs into probabilities.
For the RF model, the probability prediction is derived from the ensemble voting of the individual decision trees. Each decision tree produces a classification result, and the final probability is calculated as the proportion of trees predicting the occurrence of the event. The probability of CFW occurrence can therefore be expressed as
$P = \frac{1}{N} \sum_{n=1}^{N} D_n$
where D n represents the output of the n-th decision tree and N denotes the total number of trees in the forest. This ensemble-based probability reflects the consensus among the decision trees and provides a probabilistic interpretation of the classification result.
For the SVM model, probability estimation is achieved using the method proposed by Platt (1999), commonly referred to as Platt scaling [39]. In this approach, the decision function output of the SVM is first computed as the distance between each data point and the separating hyperplane. These values are then transformed into probabilities using a sigmoid function:
$P = \frac{1}{1 + \exp(Af + B)}$
where f represents the original SVM decision function output, and A and B are parameters estimated using maximum likelihood estimation. This transformation allows the SVM model to produce probability values ranging from 0 to 1.
For the ANN model, probability outputs are generated through the activation function used in the output layer. In this study, the sigmoid activation function is applied to the final neuron in the output layer to constrain the output values within the range of 0 to 1, thereby representing the predicted probability of CFW occurrence. The probability output can be expressed as:
$P = \hat{y} = f\left( \sum_{i} y_i^{(l-1)} w_i^{(l)} + b^{(l)} \right)$
where $y_i^{(l-1)}$ represents the input from the neurons in the previous layer, $w_i^{(l)}$ and $b^{(l)}$ denote the corresponding weights and bias of the output neuron, and $f$ is the sigmoid activation function. During training, the network adjusts these parameters to minimize the prediction error and improve the reliability of the probability estimates.
Through these probability-based prediction mechanisms, the machine learning models are able to quantify the likelihood of coastal freak wave occurrence, which provides more informative outputs for hazard assessment and early warning applications.
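The three probability mechanisms above can be sketched side by side with scikit-learn, which exposes tree-vote fractions for RF, Platt scaling for SVM (via `probability=True`), and the output-layer activation for the neural network (all data here is synthetic; the model settings are illustrative, not the study's):

```python
# Sketch: probabilistic (rather than binary) predictions from all three models.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

models = {
    "RF":  RandomForestClassifier(n_estimators=200, random_state=0),  # tree-vote fractions
    "SVM": SVC(probability=True, random_state=0),                     # Platt scaling
    "ANN": MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                         random_state=0),                             # sigmoid/softmax output
}

probs = {}
for name, model in models.items():
    model.fit(X, y)
    # Column 1 of predict_proba is the predicted probability of occurrence
    probs[name] = model.predict_proba(X[:5])[:, 1]
    print(name, np.round(probs[name], 3))
```

Regardless of the internal mechanism, each model exposes the same `predict_proba` interface, so downstream hazard thresholds can be applied uniformly.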

2.5. Hyperparameter Tuning

Machine learning models aim to learn patterns and relationships from training data to produce accurate predictions for unseen samples. During the training process, model parameters are automatically adjusted to minimize prediction errors. However, in addition to these trainable parameters, machine learning models also contain a set of predefined parameters that control the model structure and learning process. These parameters, known as hyperparameters, must be specified prior to model training and can significantly influence the performance and generalization capability of the model.
The key hyperparameters differ among machine learning algorithms. For the ANN model, important hyperparameters include the number of hidden layers, the number of neurons in each layer, the learning rate, and the activation function. These parameters determine the network architecture and control the learning process during model training. For the SVM model, the major hyperparameters include the penalty parameter controlling the tolerance for classification errors, the kernel function used to transform data into a higher-dimensional feature space, and the associated kernel parameters that determine the distribution of the mapped data. For the RF model, the primary hyperparameters include the number of decision trees in the ensemble, the maximum number of features considered at each split, and the splitting criteria used to construct the trees.
The selection of appropriate hyperparameter values plays a critical role in achieving optimal model performance. In this study, hyperparameters were optimized using a grid search strategy combined with k-fold cross-validation, following the approach widely adopted in previous studies [40]. For each hyperparameter, a predefined range of candidate values was specified, and all possible combinations were evaluated through systematic model training and validation. The model performance corresponding to each hyperparameter combination was compared, and the configuration yielding the best validation performance was selected as the optimal model. This procedure ensures that the machine learning models are properly tuned for predicting CFW occurrences.
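The grid search with k-fold cross-validation can be sketched as follows (scikit-learn assumed; the candidate ranges, data, and k = 5 are illustrative placeholders, not the study's actual settings):

```python
# Sketch of grid search combined with k-fold cross-validation for RF tuning.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Predefined candidate values for each hyperparameter
param_grid = {
    "n_estimators": [50, 100, 200],     # number of trees in the ensemble
    "max_features": ["sqrt", "log2"],   # features considered at each split
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation within the training set
    scoring="roc_auc",   # AUC, matching the metric reported in the study
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

`GridSearchCV` evaluates every combination in `param_grid` across the folds and retains the configuration with the best averaged validation score.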

2.6. Performance Metric

To evaluate the predictive performance of the machine learning models, the confusion matrix was employed as the primary evaluation framework. The confusion matrix summarizes the classification results into four categories: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). In the context of coastal freak wave prediction, TP represents cases where CFW events are correctly predicted, while FP indicates instances where the model incorrectly predicts a CFW event that does not occur. TN corresponds to correctly predicted non-occurrence events, whereas FN represents missed detections where an actual CFW event is not identified by the model.
Based on the confusion matrix, three commonly used evaluation metrics were adopted to assess model performance: accuracy, recall, and precision. Accuracy measures the overall proportion of correctly classified samples and is defined as
$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$
Recall, also known as sensitivity, measures the ability of the model to correctly detect actual CFW events and is defined as
$\mathrm{Recall} = \frac{TP}{TP + FN}$
Precision quantifies the reliability of positive predictions by measuring the proportion of correctly predicted CFW events among all predicted occurrences:
$\mathrm{Precision} = \frac{TP}{TP + FP}$
These evaluation metrics provide complementary perspectives for assessing model performance. In particular, recall reflects the model’s ability to capture rare but critical hazard events, while precision evaluates the reliability of warning predictions. Together, these indicators provide a comprehensive assessment of the prediction capability of the machine learning models for coastal freak wave events.
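A short worked example of the three metrics, using hypothetical confusion-matrix counts (not the study's actual results):

```python
# Computing accuracy, recall, and precision from confusion-matrix counts.
TP, FP, TN, FN = 80, 25, 85, 24   # hypothetical counts for a test set

accuracy = (TP + TN) / (TP + FP + TN + FN)
recall = TP / (TP + FN)           # fraction of actual CFW events detected
precision = TP / (TP + FP)        # fraction of CFW warnings that were correct

print(f"accuracy={accuracy:.3f}, recall={recall:.3f}, precision={precision:.3f}")
```

With these counts, accuracy is about 0.771, recall 0.769, and precision 0.762, illustrating how a model can trade missed detections (FN) against false alarms (FP) while overall accuracy barely changes.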

3. Data and Study Area

3.1. Study Area and CFW Event Dataset

The study area is located along the Longdong Coast on the northeastern coast of Taiwan, with a particular focus on the Longdong Bay Cape region, as illustrated in Figure 4. The coastal landscape of this area consists primarily of sandstone cliffs formed by long-term marine erosion, with steep coastal topography on one side and open sea on the other. Due to its geographical orientation, the Longdong coast is frequently influenced by the northeast monsoon, which generates persistent swells and energetic wave conditions.
Despite the hazardous wave environment, the area is a popular destination for recreational activities such as fishing, sightseeing, and coastal tourism. Numerous incidents have been reported in which anglers or tourists were unexpectedly swept into the sea by sudden CFWs, highlighting the potential danger of this location.
This study selected the Longdong coast as the research site because of the frequent occurrence of CFW events and the availability of nearby marine meteorological buoy data. To capture the timing of CFW occurrences, on-site coastal monitoring cameras were deployed at locations known for frequent CFW activity. When a CFW event was visually detected, the corresponding timestamp was recorded. Marine and meteorological data corresponding to these timestamps were then retrieved from nearby buoy stations. Through this approach, a dataset of CFW events and their associated environmental conditions was constructed for subsequent machine learning analysis. The dataset used in this study contains 717 CFW events recorded between 2016 and 2018, together with 717 non-CFW samples extracted from buoy observations. Non-CFW samples were randomly selected from the 2016–2018 observation period. To ensure data integrity, we only included intervals where the CCTV system was fully operational and ambient lighting was sufficient, thereby excluding nighttime periods and instances of hardware malfunction.

3.2. Environmental Factors for CFW Prediction

To identify the environmental conditions associated with coastal freak wave occurrences, this study collected marine and meteorological observations from nearby buoy stations corresponding to the recorded CFW events. These data provide valuable information that can be used by machine learning models to identify patterns and relationships related to CFW formation. Based on previous studies and the physical characteristics of wave dynamics, several environmental factors that may influence CFW occurrence were considered. These variables include sea state conditions, meteorological forcing, wave and wind directions, swell characteristics, wave grouping behavior, and nonlinear wave interactions.
Sea state conditions are represented by significant wave height, mean wave period, and peak wave period, which describe the general energy level and temporal characteristics of the wave field. Meteorological forcing is represented by wind conditions, including mean wind speed and maximum gust speed, which influence wave generation and the intensity of waves approaching the coastline. The direction of waves and wind is also included as an important factor because waves propagating toward the shore, particularly when accompanied by onshore winds, may significantly enhance coastal wave impacts.
Swell-related parameters are also considered, as long-period swells are often associated with the occurrence of extreme waves near the coast. Swell characteristics are represented by parameters such as wave steepness, swell height, and swell period, which can be derived from buoy observations using wind–wave and swell separation techniques. In addition, wave grouping behavior is considered, since previous studies suggest that freak waves may be associated with the concentration of wave energy within wave groups. Wave grouping characteristics are quantified using the narrowness factor and the peakedness factor (Qp), both derived from wave spectral analysis.
Furthermore, nonlinear wave interactions are represented by the Benjamin–Feir Index (BFI), which quantifies the degree of wave instability and nonlinear interactions within the wave field. The BFI is calculated using wave steepness and spectral peakedness parameters and has been widely used as an indicator of conditions favorable for freak wave formation. By incorporating these environmental factors into the machine learning models, the dataset provides comprehensive information about the marine and meteorological conditions associated with CFW occurrences.

3.3. Data Preprocessing and Dataset Preparation

Before training the machine learning models, the collected dataset was carefully preprocessed to ensure data quality and improve model performance. In machine learning applications, both the quantity and quality of data play a crucial role in determining the predictive capability of the models. Poorly distributed or unbalanced datasets may lead to biased model training and unreliable predictions. Therefore, appropriate data preprocessing procedures were applied to prepare the dataset for model development.
In general, machine learning datasets are divided into training, validation, and testing subsets. The training dataset is used to train the model and adjust internal model parameters. The validation dataset is used to evaluate model performance during training and to determine optimal hyperparameter settings. Finally, the test dataset is used to evaluate the final predictive performance of the trained model.
However, due to the limited availability of recorded CFW events, dividing the dataset into three independent subsets would significantly reduce the amount of training data. To address this limitation, k-fold cross-validation was adopted to generate validation data. In k-fold cross-validation [41], the training dataset is divided into k equal subsets. During each iteration, k − 1 subsets are used for training while the remaining subset is used for validation. This process is repeated until each subset has been used as validation data, and the overall performance is obtained by averaging the results across all folds.
In this study, the entire dataset was first divided into training and testing datasets, with 70% of the data used for training and 30% reserved for testing. The validation process was then performed using k-fold cross-validation within the training dataset.
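The split-and-validate procedure described above can be sketched with a minimal NumPy implementation; the toy threshold "model" and synthetic data below are purely illustrative stand-ins for the actual classifiers and buoy features.

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Shuffle sample indices and split them into k roughly equal folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

def cross_val_accuracy(fit, predict, X, y, k=5):
    """Train on k-1 folds, validate on the held-out fold, average the scores."""
    folds = kfold_indices(len(y), k)
    scores = []
    for i in range(k):
        val = folds[i]
        trn = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[trn], y[trn])
        scores.append(np.mean(predict(model, X[val]) == y[val]))
    return float(np.mean(scores))

# Demo: 70/30 split first, then 5-fold CV inside the training portion only
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 1))
y = (X[:, 0] > 0).astype(int)
n_trn = int(0.7 * len(y))                      # 70% for training
X_trn, y_trn = X[:n_trn], y[:n_trn]            # remaining 30% reserved for testing

fit = lambda Xt, yt: 0.0                       # toy "model": a fixed threshold
predict = lambda thr, Xv: (Xv[:, 0] > thr).astype(int)
cv_acc = cross_val_accuracy(fit, predict, X_trn, y_trn, k=5)
```

The key point is that the test set never participates in the cross-validation loop, so hyperparameters are chosen without touching the held-out data.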
To prevent model bias caused by imbalanced data, an equal number of non-CFW samples were included in the dataset to represent conditions without CFW occurrence. This balanced dataset allows the machine learning models to better learn the distinguishing features between CFW and non-CFW events. In addition, all input variables were normalized to a range between 0 and 1 to ensure consistent scaling among different features and improve the stability of model training.
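A minimal sketch of the two preprocessing steps follows. The balancing shown here is random undersampling of the majority (non-CFW) class, which matches the stated 1:1 ratio but is an assumption about the exact sampling scheme; the scaler fits its min/max on the training set only, so no information leaks from the test set.

```python
import numpy as np

def minmax_scale(X_train, X_test):
    """Scale features to [0, 1] using training-set min/max only."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return (X_train - lo) / span, (X_test - lo) / span

def balance_classes(X, y, seed=0):
    """Keep every CFW sample and an equal-sized random draw of non-CFW samples
    (assumed undersampling scheme; the paper's selection may differ)."""
    rng = np.random.default_rng(seed)
    pos = np.where(y == 1)[0]
    neg = np.where(y == 0)[0]
    keep = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=False)])
    return X[keep], y[keep]
```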

4. Results and Discussion

4.1. Hyperparameter Tuning Results

To optimize the predictive performance of the machine learning models, hyperparameter tuning was conducted using grid search combined with k-fold cross-validation. This approach systematically evaluates different combinations of hyperparameters to identify the configuration that yields the best validation performance. Table 1 describes the search ranges of hyperparameters for each machine learning model.
For the RF model, the number of decision trees and the maximum number of features considered at each split were explored. The search range included 10 to 500 trees and 1 to 13 features. The results indicate that RF achieved stable performance even when the number of trees was relatively small, suggesting that the ensemble structure of RF provides robustness against overfitting and allows reliable learning from limited datasets.
For the SVM model, several kernel functions were evaluated, and different combinations of the penalty parameter C and kernel parameter γ were explored. The search range for both parameters was set between 2⁻⁸ and 2⁸. The results show that the polynomial kernel outperformed the linear kernel, indicating that the relationship between environmental variables and CFW occurrences is strongly nonlinear. This finding is consistent with the complex hydrodynamic interactions that contribute to extreme wave formation.
For the ANN model, multiple hidden layer configurations, activation functions, and learning rates were tested. The network architecture consisted of three hidden layers, with the number of neurons ranging from 4 to 10 per layer and learning rates between 0.0005 and 0.05. The results indicate that networks with larger hidden layers achieved better performance, suggesting that a higher model capacity is beneficial for capturing the nonlinear relationships embedded in the environmental dataset.
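The search spaces described above can be written down directly as grid-search dictionaries. The sketch below mirrors the stated ranges; the grid spacings within each range (which Table 1 specifies in full) are assumed for illustration.

```python
import numpy as np

# Hyperparameter search spaces mirroring the ranges stated in the text
# (intermediate grid values are illustrative assumptions)
param_grids = {
    "RF": {
        "n_estimators": [10, 50, 100, 200, 300, 400, 500],   # 10 to 500 trees
        "max_features": list(range(1, 14)),                  # 1 to 13 features per split
    },
    "SVM": {
        "kernel": ["linear", "poly", "rbf"],
        "C": (2.0 ** np.arange(-8, 9)).tolist(),             # 2^-8 ... 2^8
        "gamma": (2.0 ** np.arange(-8, 9)).tolist(),         # 2^-8 ... 2^8
    },
    "ANN": {
        "hidden_layer_sizes": [(n, n, n) for n in range(4, 11)],  # 3 hidden layers
        "learning_rate_init": [0.0005, 0.001, 0.005, 0.01, 0.05],
    },
}
```

Dictionaries in this form can be passed directly to a grid-search routine that enumerates all combinations and scores each via k-fold cross-validation.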
The grid search produced three graphs, one per method, comparing the tuning results of different hyperparameter combinations; these are shown in Figure 5. Each panel indicates the accuracy obtained under varying hyperparameter combinations for the corresponding method.
The optimal hyperparameter configurations for the three models are summarized in Table 2. After tuning, the validation accuracy of all models exceeded 70%, whereas models trained with random hyperparameter combinations often achieved accuracy close to 50%, which is comparable to random guessing. This result demonstrates the critical importance of hyperparameter tuning in improving the learning capability and predictive performance of machine learning models.

4.2. Probability Threshold and ROC Analysis

In this study, the machine learning models generate probability-based predictions for the occurrence of coastal freak waves rather than simple binary classifications. To evaluate the reliability of these probabilistic predictions, Receiver Operating Characteristic (ROC) curves were constructed for each model by varying the probability threshold used for classification.
The ROC curve illustrates the trade-off between the true positive rate and the false positive rate at different probability thresholds. The area under the ROC curve (AUC) is widely used as a quantitative indicator of classification performance. An AUC value of 0.5 corresponds to random prediction, whereas values approaching 1.0 indicate strong discriminative capability.
The ROC curves of the three machine learning models are shown in Figure 6. All models achieved AUC values close to 0.80, indicating good predictive capability. Specifically, the RF model achieved an AUC of 0.82, the SVM model obtained an AUC of 0.79, and the ANN model produced an AUC of 0.80. These results suggest that all three models have a comparable ability to distinguish between CFW and non-CFW conditions.
The optimal classification threshold was determined based on the ROC curves. The results indicate that a probability threshold of 50% provides the best balance between true positive and false positive rates for all three models. This threshold corresponds to a point near the upper-left region of the ROC space, where the classification performance is maximized. Therefore, a probability threshold of 50% was adopted for generating confusion matrices and computing evaluation metrics in the following analyses.
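The ROC construction and threshold selection can be reproduced with a short NumPy sketch. Selecting the upper-left point is implemented here via Youden's J statistic (TPR minus FPR), one common criterion; the study's exact selection rule may differ.

```python
import numpy as np

def roc_points(y_true, p):
    """TPR and FPR as the probability threshold sweeps from high to low."""
    thr = np.sort(np.unique(p))[::-1]
    tpr = np.array([np.mean(p[y_true == 1] >= t) for t in thr])
    fpr = np.array([np.mean(p[y_true == 0] >= t) for t in thr])
    # prepend the (0, 0) corner so the curve starts at the origin
    return np.concatenate([[0.0], fpr]), np.concatenate([[0.0], tpr]), thr

def auc(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(0.5 * (tpr[1:] + tpr[:-1]) * np.diff(fpr)))

def best_threshold(y_true, p):
    """Threshold maximizing Youden's J = TPR - FPR (upper-left ROC corner)."""
    fpr, tpr, thr = roc_points(y_true, p)
    return float(thr[np.argmax(tpr[1:] - fpr[1:])])
```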

4.3. Comparison of Machine Learning Models

4.3.1. Prediction Performance

After determining the optimal hyperparameters and probability threshold, the predictive performance of the models was evaluated using the independent test dataset. The evaluation metrics include accuracy, recall, and precision, as summarized in Table 3.
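For reference, the three metrics follow directly from the confusion-matrix counts, as in this minimal sketch:

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, recall (detection rate), and precision (alarm reliability)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn),       # fraction of actual CFW events detected
        "precision": tp / (tp + fp),    # fraction of CFW alarms that are correct
    }
```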
The results show that all three machine learning models achieved similar levels of prediction accuracy, with values around 74%. Although the difference in accuracy among the models is relatively small, their recall and precision values reveal important differences in prediction behavior.
The ANN model achieved the highest recall (78.2%), indicating that it is particularly effective in detecting the occurrence of CFW events. A high recall is desirable in hazard prediction applications because it reduces the likelihood of missing dangerous events. However, the ANN model also exhibited lower precision (71.3%), suggesting that it tends to produce more false positive predictions compared to the other models.
In contrast, the RF and SVM models show a more balanced relationship between recall and precision. Both models achieved precision values higher than their recall values, indicating that their predictions are slightly more conservative. These models are therefore less likely to generate false alarms while still maintaining reliable detection capability.
The differences in model performance can be attributed to the underlying learning mechanisms. The ANN model, with its multilayer structure, has a strong capacity for capturing complex nonlinear relationships but may be more sensitive to limited training data. In contrast, RF benefits from its ensemble structure, which reduces variance and enhances robustness. Similarly, SVM identifies optimal decision boundaries based on a subset of representative samples, allowing stable performance even when the dataset is relatively small.
Overall, the comparable performance of these models indicates that machine learning methods are capable of capturing meaningful relationships between environmental conditions and the occurrence of coastal freak waves.
To provide a more robust statistical foundation for these comparative observations and ensure the reliability of our conclusions, we extended the analysis to include formal significance testing. We conducted McNemar’s test to compare the predictive accuracy among the models. The calculated p-values are 1.000 for the ANN versus SVM comparison, 0.822 for the ANN versus RF comparison, and 0.905 for the SVM versus RF comparison. These results indicate that there is no statistically significant evidence of differences in predictive performance among the three models on our current dataset. However, this does not imply that the models are functionally equivalent, but rather that no significant differences could be detected given the current data scale.
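McNemar's test depends only on the discordant pairs, i.e., the test samples that one model classifies correctly and the other does not. The sketch below implements the exact binomial form of the test (asymptotic chi-squared variants also exist; the paper does not state which form was used).

```python
from math import comb

def mcnemar_exact(n01, n10):
    """Exact two-sided McNemar p-value from the discordant-pair counts:
    n01 = model A correct / model B wrong, n10 = the reverse.
    Under H0 the discordant pairs split 50/50, so the smaller count
    follows a Binomial(n01 + n10, 0.5) tail."""
    n = n01 + n10
    if n == 0:
        return 1.0
    k = min(n01, n10)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

Perfectly balanced disagreements (n01 = n10) give a p-value of 1.0, matching the near-unity values reported for the model pairs above.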
Beyond the evaluation of discriminative power, the practical implementation of these models within operational disaster early-warning systems necessitates a thorough assessment of their probabilistic reliability. Therefore, we calculated the Brier Scores for the three ML models. The results yield Brier Scores of 0.1738 for the RF model, 0.1829 for the SVM model, and 0.2003 for the ANN model. Given that a Brier Score of 0.25 represents a non-informative model in a balanced 1:1 dataset, all values are significantly below this threshold, thereby demonstrating that the models provide not only accurate but also stable and reliable probabilistic forecasts.
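The Brier score is simply the mean squared difference between forecast probabilities and binary outcomes, as in this sketch; an uninformative forecast of 0.5 on a balanced dataset scores exactly 0.25, the baseline used above.

```python
import numpy as np

def brier_score(y_true, p):
    """Mean squared error between forecast probabilities and 0/1 outcomes.
    Lower is better; 0 is a perfect probabilistic forecast."""
    return float(np.mean((np.asarray(p, float) - np.asarray(y_true, float)) ** 2))
```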

4.3.2. Probability Prediction of CFW Events

An advantage of the machine learning models developed in this study is their ability to generate probabilistic predictions rather than simple binary classifications. These probability outputs provide more informative results for hazard forecasting and allow the estimation of the likelihood of CFW occurrence.
Figure 7 shows the probability predictions produced by the three models for actual CFW events. Although all models successfully identify most CFW events above the 50% probability threshold, differences in probability distributions can be observed.
The RF model generally produces higher probability values, frequently exceeding 80% when the model predicts a high likelihood of CFW occurrence. The SVM model typically outputs probabilities above 70%, whereas the ANN model tends to generate slightly lower probability values around 60%. These differences are mainly related to the distinct probability calibration mechanisms used by the three models.
Despite these differences, the probability outputs consistently exceed the 50% threshold for most recorded CFW events, demonstrating that the models are capable of providing reliable probabilistic predictions. Such probability-based outputs are particularly useful for coastal safety management, as they allow the establishment of risk thresholds for issuing hazard warnings.

4.3.3. Importance of Factors for CFW Prediction

To evaluate the contribution of individual factors to the model’s performance in predicting CFWs, a sensitivity analysis was conducted by systematically excluding specific input factors during the testing phase.
The results, shown in Figure 8, indicate that excluding individual factors does not substantially degrade predictive performance. Even when critical factors (e.g., wave direction) were removed, the overall accuracy decreased by at most 3%. This suggests that the models can recover CFW-related signatures from the remaining variables: the input features are not mutually independent but exhibit a high degree of interdependence, so CFW characteristics are implicitly embedded across multiple physical features.
Interestingly, a marginal improvement in accuracy was observed when certain factors were removed, suggesting that some features may introduce minor noise under specific conditions. Nevertheless, to preserve the integrity of the physical feature set and the robustness of the predictions, using the full set of factors remains the most comprehensive and appropriate approach for CFW prediction.
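One way to sketch such a sensitivity analysis is to neutralize each feature at test time by replacing it with its training-set mean and measuring the resulting accuracy drop. This masking scheme is an assumption for illustration; the study's exact exclusion procedure may differ.

```python
import numpy as np

def feature_exclusion_sensitivity(predict, X_test, y_test, X_train):
    """Accuracy drop when each feature is replaced by its training mean
    at test time (one common masking scheme, assumed here)."""
    base = float(np.mean(predict(X_test) == y_test))
    drops = {}
    for j in range(X_test.shape[1]):
        Xm = X_test.copy()
        Xm[:, j] = X_train[:, j].mean()   # neutralize feature j
        drops[j] = base - float(np.mean(predict(Xm) == y_test))
    return base, drops
```

A large drop flags a feature the model depends on heavily; a drop near zero (or slightly negative) indicates redundancy with the remaining features, matching the behavior reported above.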

4.3.4. Influence of Training Data Quantity

To investigate the influence of dataset size on model performance, additional experiments were conducted using different amounts of training data. Multiple models were trained using subsets of the dataset, and their predictive accuracy was evaluated using the same test dataset.
The results, shown in Figure 9, indicate that the predictive accuracy of all three models increases as the amount of training data grows. When the dataset is small, the RF and SVM models outperform the ANN model. This result suggests that RF and SVM are more robust under limited-data conditions.
The RF model benefits from its ensemble structure, which combines multiple decision trees to reduce variance and improve stability. Meanwhile, the SVM model relies on a subset of representative support vectors to determine optimal decision boundaries, allowing efficient learning even with limited data.
In contrast, the ANN model shows weaker performance when the training dataset is small. This is likely due to the large number of parameters in neural networks, which require sufficient data to effectively learn the underlying relationships in the dataset. However, as the amount of training data increases, the performance of the ANN model gradually improves and approaches that of the other models.
These results suggest that the current dataset size may still be insufficient to fully exploit the learning capability of ANN models. With larger datasets in future studies, neural network models may potentially achieve superior predictive performance due to their ability to capture complex nonlinear relationships.
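The data-quantity experiment amounts to a learning curve: train on growing subsets of the training data and score each model on the same fixed test set. The sketch below shows the mechanics with a toy nearest-class-mean classifier; the models, data, and fractions are illustrative stand-ins.

```python
import numpy as np

def learning_curve(fit, predict, X_trn, y_trn, X_te, y_te, fractions):
    """Test accuracy of models trained on growing fractions of the training set."""
    accs = []
    for frac in fractions:
        m = max(2, int(frac * len(y_trn)))
        model = fit(X_trn[:m], y_trn[:m])
        accs.append(float(np.mean(predict(model, X_te) == y_te)))
    return accs

# Demo: nearest-class-mean classifier on two well-separated 1-D clusters
rng = np.random.default_rng(3)
y_trn = np.tile([0, 1], 50)                               # interleaved labels
X_trn = rng.normal(loc=4.0 * y_trn - 2.0, scale=0.5)[:, None]
y_te = np.tile([0, 1], 50)
X_te = rng.normal(loc=4.0 * y_te - 2.0, scale=0.5)[:, None]

fit = lambda X, y: (X[y == 0].mean(), X[y == 1].mean())   # per-class means
predict = lambda mu, X: (np.abs(X[:, 0] - mu[1]) < np.abs(X[:, 0] - mu[0])).astype(int)
accs = learning_curve(fit, predict, X_trn, y_trn, X_te, y_te, [0.1, 0.5, 1.0])
```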

4.3.5. Future Works

The occurrence of CFWs is fundamentally linked to local coastal bathymetry and structures. Because the model in this study was trained using data from the Longdong coast, it reflects the specific hydrodynamic conditions of that region. Consequently, the model is site-specific and requires recalibration before application to other coastal environments. To ensure broader applicability, future deployments should use fine-tuning or transfer learning with localized datasets to adapt pre-trained weights. This study thus provides a technical framework for establishing a multi-site CFW monitoring system. CCTV monitoring also has limitations, particularly during nighttime or extreme weather conditions; these can be mitigated by specialized hardware such as low-light or thermal cameras and equipment with anti-droplet, anti-fog, and stabilization capabilities.
While this study has successfully integrated swell, wave groups, and nonlinear factors to achieve promising performance in CFW prediction, some limitations still exist, providing directions for future work. Recent research highlights that advanced feature engineering, such as incorporating extreme wave statistics and time-series descriptors, can substantially enhance forecasting quality. Moreover, as prediction models must overcome spatial and temporal scale challenges and complex seabed interactions, adopting advanced adaptive normalization methods has been shown to effectively boost accuracy [42,43,44]. Consequently, advanced feature engineering and data normalization will be the core focus of our future work. We plan to explore more complex spatial domain features and real-time energy distribution modeling, integrated with explainable machine learning frameworks, to further address the generalization challenges in ocean phenomenon prediction.

5. Conclusions

This study investigated the application of machine learning techniques for predicting the occurrence of coastal freak waves using observational environmental data. Three widely used machine learning algorithms, the Random Forest, Support Vector Machine, and Artificial Neural Network, were developed and evaluated to generate probability-based predictions of CFW events.
The results demonstrate that machine learning models are capable of capturing meaningful relationships between environmental conditions and the occurrence of coastal freak waves. All three models achieved comparable predictive performance, with AUC values close to 0.80 and overall prediction accuracy around 74%. These results indicate that machine learning approaches can provide reliable predictions even when the physical mechanisms governing the phenomenon are not fully understood.
Among the three models, the ANN model exhibited the highest recall, indicating superior ability to detect CFW events and reduce missed hazard occurrences. In contrast, the RF and SVM models showed more balanced precision and recall, suggesting greater robustness in avoiding false alarms. The differences in prediction characteristics highlight the importance of selecting appropriate machine learning models depending on the operational objectives of coastal hazard warning systems.
The probabilistic prediction framework developed in this study provides a more informative representation of coastal hazard risk compared with traditional binary classification approaches. By estimating the probability of CFW occurrence under given environmental conditions, the models can support risk-based decision-making and the development of coastal early warning systems.
Analysis of high-probability prediction events further reveals that CFW occurrences are associated with specific environmental characteristics. In particular, swell-dominated wave conditions, strong wave grouping behavior, and increased nonlinear wave interactions appear to play important roles in the formation of coastal freak waves. These findings suggest that CFW events are likely governed by complex interactions among multiple environmental factors rather than a single dominant parameter.
In addition, the results show that the predictive performance of machine learning models improves with increasing training data. While RF and SVM models perform well under limited-data conditions, the ANN model exhibits greater potential for improvement as more training data become available. This suggests that future studies incorporating larger datasets may further enhance the predictive capability of neural network-based models.
Despite these promising results, several limitations remain. The number of recorded CFW events used in this study is relatively limited, and the dataset is derived from a single coastal location. Future research should aim to expand the dataset by incorporating longer-term observations and additional coastal sites. Integrating machine learning approaches with physical wave models may also provide further insights into the mechanisms governing CFW formation.
Overall, this study demonstrates that machine learning offers a promising framework for predicting coastal freak wave occurrences and provides a foundation for developing data-driven coastal hazard forecasting systems.

Author Contributions

Conceptualization, D.-J.D. and C.-H.T.; Methodology, D.-J.D. and W.-C.C.; Software, W.-C.C.; Validation, W.-C.C.; Formal analysis, W.-C.C.; Investigation, D.-J.D., W.-C.C. and C.-H.T.; Resources, C.P.; Data curation, W.-C.C. and F.-J.L.; Writing—original draft, W.-C.C.; Writing—review & editing, D.-J.D., F.-J.L., C.P. and C.-H.T.; Supervision, D.-J.D., F.-J.L., C.P. and C.-H.T.; Project administration, D.-J.D.; Funding acquisition, F.-J.L. and C.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Central Weather Administration and the National Science and Technology Council of Taiwan. The APC was funded by the National Science and Technology Council of Taiwan (NSTC 113-2221-E-006-171-MY2).

Data Availability Statement

The data supporting the findings of this study were provided by the Central Weather Administration of Taiwan. Restrictions apply to the availability of these data, which were used under license for the current study. Data is available from the authors with the permission of the CWA.

Acknowledgments

The authors would like to express their sincere gratitude to the Central Weather Administration and the National Science and Technology Council of Taiwan (NSTC 113-2221-E-006-171-MY2) for their financial support and the provision of research data, which made this study possible. The authors also sincerely thank the anonymous reviewers for their valuable comments and constructive suggestions, which have significantly improved the quality and clarity of this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Lavrenov, I.V. The wave energy concentration at the Agulhas current off South Africa. Nat. Hazards 1998, 17, 117–127.
2. Divinsky, B.V.; Levin, B.V.; Lopatukhin, L.I.; Pelinovsky, E.N.; Slyunyaev, A.V. A freak wave in the Black Sea: Observations and simulation. Dokl. Earth Sci. 2004, 395, 438–443.
3. Toffoli, A.; Waseda, T.; Houtani, H.; Cavaleri, L.; Greaves, D.; Onorato, M. Rogue waves in opposing currents: An experimental study on deterministic and stochastic wave trains. J. Fluid Mech. 2015, 769, 277–297.
4. Tamura, H.; Waseda, T.; Miyazawa, Y. Freakish sea state and swell-windsea coupling: Numerical study of the Suwa-Maru incident. Geophys. Res. Lett. 2009, 36, L01607.
5. Waseda, T.; Tamura, H.; Kinoshita, T. Freakish sea index and sea states during ship accidents. J. Mar. Sci. Technol. 2012, 17, 305–314.
6. Cavaleri, L.; Bertotti, L.; Torrisi, L.; Bitner-Gregersen, E.; Serio, M.; Onorato, M. Rogue waves in crossing seas: The Louis Majesty accident. J. Geophys. Res. Oceans 2012, 117, C00J10.
7. Pelinovsky, E.; Talipova, T.; Kharif, C. Nonlinear-dispersive mechanism of the freak wave formation in shallow water. Phys. D 2000, 147, 83–94.
8. Mori, N.; Liu, P.C.; Yasuda, T. Analysis of freak wave measurements in the Sea of Japan. Ocean Eng. 2002, 29, 1399–1414.
9. Kharif, C.; Pelinovsky, E. Physical mechanisms of the rogue wave phenomenon. Eur. J. Mech. B-Fluids 2003, 22, 603–634.
10. Janssen, P.A.E.M. Nonlinear Four-Wave Interactions and Freak Waves. J. Phys. Oceanogr. 2003, 33, 863–884.
11. Lavrenov, I.; Porubov, A. Three reasons for freak wave generation in the non-uniform current. Eur. J. Mech. B-Fluids 2006, 25, 574–585.
12. Zakharov, V.; Dyachenko, A.; Prokofiev, A. Freak waves as nonlinear stage of Stokes wave modulation instability. Eur. J. Mech. B-Fluids 2006, 25, 677–692.
13. Waseda, T.; Kinoshita, T.; Tamura, H. Evolution of a Random Directional Wave and Freak Wave Occurrence. J. Phys. Oceanogr. 2009, 39, 621–639.
14. Soomere, T.; Engelbrecht, J. Weakly two-dimensional interaction of solitons in shallow water. Eur. J. Mech. B-Fluids 2006, 25, 636–648.
15. Didenkulova, I.; Rodin, A. Statistics of shallow water rogue waves in Baltic Sea conditions: The case of Tallinn Bay. In Proceedings of the 2012 IEEE/OES Baltic International Symposium (BALTIC), Klaipeda, Lithuania, 8–10 May 2012; pp. 1–6.
16. Ankiewicz, A.; Bokaeeyan, M.; Akhmediev, N. Shallow-water rogue waves: An approach based on complex solutions of the Korteweg–de Vries equation. Phys. Rev. E 2019, 99, 050201.
17. Didenkulova, E.; Didenkulova, I.; Medvedev, I. Freak wave events in 2005–2021: Statistics and analysis of favourable wave and wind conditions. Nat. Hazards Earth Syst. Sci. 2023, 23, 1653–1663.
18. Chien, H.W.A.; Kao, C.C.; Chuang, L.Z. On the characteristics of observed coastal freak waves. Coast. Eng. J. 2002, 44, 301–319.
19. Tsai, C.; Su, M.; Huang, S. Observations and conditions for occurrence of dangerous coastal waves. Ocean Eng. 2004, 31, 745–760.
20. James, S.C.; Zhang, Y.; O’Donncha, F. A machine learning framework to forecast wave conditions. Coast. Eng. 2018, 137, 1–10.
21. Gramstad, O.; Bitner-Gregersen, E. Predicting extreme waves from wave spectral properties using machine learning. Int. Conf. Offshore Mech. Arct. Eng. 2019, 58783, V003T02A005.
22. Callens, A.; Morichon, D.; Abadie, S.; Delpey, M.; Liquet, B. Using Random forest and Gradient boosting trees to improve wave forecast at a specific location. Appl. Ocean Res. 2020, 104, 102339.
23. Sinha, A.; Abernathey, R. Estimating ocean surface currents with machine learning. Front. Mar. Sci. 2021, 8, 672477.
24. Zhang, J.; Zhao, X.; Jin, S.; Greaves, D. Phase-resolved real-time ocean wave prediction with quantified uncertainty based on variational Bayesian machine learning. Appl. Energy 2022, 324, 119711.
25. Chen, S.T. Probabilistic forecasting of coastal wave height during typhoon warning period using machine learning methods. J. Hydroinform. 2019, 21, 343–358.
26. Lee, J.W.; Irish, J.L.; Bensi, M.T.; Marcy, D.C. Rapid prediction of peak storm surge from tropical cyclone track time series using machine learning. Coast. Eng. 2021, 170, 104024.
27. Afzal, M.S.; Kumar, L.; Chugh, V.; Kumar, Y.; Zuhair, M. Prediction of significant wave height using machine learning and its application to extreme wave analysis. J. Earth Syst. Sci. 2023, 132, 51.
28. Viñes, M.; Sánchez-Arcilla, A., Jr.; Epelde, I.; Mösso, C.; Franco, J.; Sospedra, J.; Gracia, V.; Sánchez-Arcilla, A. Morphodynamic predictions based on Machine Learning. Performance and limits for pocket beaches near the Bilbao port. Front. Environ. Sci. 2025, 13, 1600473.
29. Kim, T.; Lee, W.-D. Prediction of wave runup on beaches using interpretable machine learning. Ocean Eng. 2024, 297, 116918.
30. Al Najar, M.; Wilson, D.G.; Almar, R. Interpretable machine learning for shoreline forecasting. Sci. Rep. 2026, 16, 11457.
31. Vadivel, M.; Sundar, A.S.; Murthy, P.V.R.K.; Soundararajan, M.; Rajan, D.; Priya, V. Dynamic coastal vulnerability index: A machine learning approach to predict future impacts of climate change and human activity on coastal environments. J. S. Am. Earth Sci. 2025, 165, 105692.
32. Doong, D.J.; Peng, J.P.; Chen, Y.C. Development of a warning model for coastal freak wave occurrences using an artificial neural network. Ocean Eng. 2018, 169, 270–280.
33. Doong, D.J.; Chen, S.T.; Chen, Y.C.; Tsai, C.H. Operational probabilistic forecasting of coastal freak waves by using an artificial neural network. J. Mar. Sci. Eng. 2020, 8, 165.
34. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
35. Breiman, L. Classification and Regression Trees; Routledge: New York, NY, USA, 1984.
36. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
37. McCulloch, W.S.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 5, 115–133.
38. Rumelhart, D.; Hinton, G.; Williams, R. Learning representations by back-propagating errors. Nature 1986, 323, 533–536.
39. Platt, J. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv. Large Margin Classif. 1999, 10, 61–74.
40. Lin, S.; Ying, K.; Chen, S.; Lee, Z. Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Syst. Appl. 2008, 35, 1817–1824.
41. Dietterich, T.G. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput. 1998, 10, 1895–1923.
42. Durap, A. Predicting ocean parameters with explainable machine learning: Overcoming scale and time challenges. Reg. Stud. Mar. Sci. 2025, 90, 104424.
43. Durap, A. Explainable machine learning for bathymetric mapping: Adaptive normalization and feature engineering in complex seabed terrains. Ocean Sci. J. 2025, 60, 52.
44. Durap, A. From Black Box to Transparency: An Explainable Machine Learning (ML) Framework for Ocean Wave Prediction Using SHAP and Feature-Engineering-Derived Variable. Mathematics 2025, 13, 3962.
Figure 1. Conceptual illustration of the Random Forest (RF) model. (a) Ensemble prediction generated by aggregating the outputs of multiple decision trees. (b) Structure of an individual decision tree, including root nodes, intermediate nodes, leaf nodes, and decision branches used for data partitioning. Individual predictions from each decision tree are combined through an ensemble process to produce the final comprehensive RF result.
Figure 2. Conceptual illustration of the Support Vector Machine (SVM). (a) Optimal separating hyperplane for two classes of data in feature space, where support vectors define the maximum-margin boundary. (b) Kernel transformation is used to map nonlinearly separable data into a higher-dimensional feature space to enable linear separation.
Figure 3. Architecture of the Artificial Neural Network (ANN). (a) Typical ANN structure consisting of an input layer, multiple hidden layers, and an output layer. (b) Computational process of a neuron, where inputs are multiplied by weights, summed with a bias term, and transformed through an activation function. The network is composed of multiple neurons, which serve as the fundamental building units of the entire structure.
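The single-neuron computation in Figure 3b (weighted sum plus bias, passed through an activation function) can be written out directly. The weights and inputs below are arbitrary illustrative numbers.

```python
import numpy as np

def neuron(x, w, b, activation=np.tanh):
    """Single-neuron computation from Figure 3b: weighted sum + bias, then activation."""
    return activation(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])   # inputs from the previous layer
w = np.array([0.2, 0.4, -0.1])   # learned weights
b = 0.05                          # bias term
relu = lambda z: np.maximum(z, 0.0)

# max(0, 0.2*0.5 + 0.4*(-1.0) + (-0.1)*2.0 + 0.05) = max(0, -0.45)
print(neuron(x, w, b, relu))  # -> 0.0
```

A full network is then just layers of such neurons, with each layer's outputs feeding the next layer's inputs.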
Figure 4. Study area of this research, located at Longdong Cape on the northeastern coast of Taiwan. (a) Geographic location of the study site. (b) Example of a coastal freak wave event observed in the study area. Surveillance cameras were used to record the occurrence time of CFW events, which were subsequently matched with buoy-based marine and meteorological observations.
Figure 5. Hyperparameter tuning results obtained using the grid-search procedure for the three machine learning models: (a) Random Forest (RF), (b) Support Vector Machine (SVM), and (c) Artificial Neural Network (ANN). The plots illustrate the validation accuracy achieved under different hyperparameter configurations.
Figure 6. Receiver Operating Characteristic (ROC) curves for the three machine learning models (RF, SVM, and ANN). The curves illustrate the trade-off between the true positive rate and false positive rate under different probability thresholds. The area under the curve (AUC) provides a quantitative measure of the overall classification performance.
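The ROC/AUC evaluation in Figure 6 can be reproduced in miniature: sweep the probability threshold, record the (false positive rate, true positive rate) pairs, and integrate. The labels and probabilities below are invented for illustration, not the models' actual outputs.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical predicted CFW probabilities and true labels (illustrative only)
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.9])

# roc_curve sweeps the probability threshold and records (FPR, TPR) pairs;
# the AUC summarizes the whole curve in a single number.
fpr, tpr, thresholds = roc_curve(y_true, y_prob)
auc = roc_auc_score(y_true, y_prob)
print(round(auc, 3))  # -> 0.875 (14 of 16 positive/negative pairs correctly ranked)
```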
Figure 7. Probability predictions of coastal freak wave (CFW) events generated by the three machine learning models. The predicted probabilities represent the likelihood of CFW occurrence under the corresponding environmental conditions; most CFW events are correctly identified with predicted probabilities above 50% (the red dashed line).
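Turning the predicted probabilities of Figure 7 into binary warnings is a simple thresholding step at the 50% line; the probabilities below are illustrative values, not model output.

```python
import numpy as np

# Convert model probabilities into binary CFW warnings with a 50% threshold.
probs = np.array([0.12, 0.48, 0.51, 0.73, 0.95])
warnings = (probs >= 0.5).astype(int)
print(warnings)  # -> [0 0 1 1 1]
```

In an operational warning system the threshold could be lowered to trade precision for recall, since missed CFW events are typically costlier than false alarms.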
Figure 8. Sensitivity analysis of the influence of individual input factors on model predictive accuracy. Excluding any single factor degrades performance only minimally (at most a 3% drop in accuracy), indicating strong interdependence among the features associated with CFW occurrence.
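A leave-one-feature-out sensitivity analysis of the kind summarized in Figure 8 can be sketched as follows: retrain and re-evaluate the model with each factor excluded in turn, and compare against the full-feature baseline. The synthetic data, feature count, and model settings here are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))          # stand-in for the buoy-derived input factors
y = (X[:, 0] + X[:, 1] + X[:, 2] > 0).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0)
base = cross_val_score(model, X, y, cv=5).mean()  # baseline with all factors

changes = []
for i in range(X.shape[1]):
    X_drop = np.delete(X, i, axis=1)   # exclude factor i, retrain, re-evaluate
    acc = cross_val_score(model, X_drop, y, cv=5).mean()
    changes.append(acc - base)
print([f"{c:+.3f}" for c in changes])  # accuracy change per excluded factor
```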
Figure 9. Influence of training dataset size on the prediction accuracy of the three machine learning models (RF, SVM, and ANN). The results demonstrate how model performance improves as the amount of training data increases.
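The dataset-size experiment of Figure 9 corresponds to a standard learning-curve analysis: cross-validated accuracy is evaluated at increasing training-set fractions. The snippet below sketches this with scikit-learn's `learning_curve` on synthetic data (data and model settings are illustrative assumptions).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve

rng = np.random.default_rng(4)
X = rng.normal(size=(400, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Evaluate cross-validated accuracy at 20%, 40%, ..., 100% of the
# available training data, analogous to the experiment in Figure 9.
sizes, train_scores, val_scores = learning_curve(
    RandomForestClassifier(n_estimators=50, random_state=0), X, y,
    train_sizes=np.linspace(0.2, 1.0, 5), cv=5)
for n, s in zip(sizes, val_scores.mean(axis=1)):
    print(n, round(s, 3))  # validation accuracy vs. training-set size
```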
Table 1. Hyperparameter search range for machine learning models. These configurations were determined through a grid search method and k-fold cross-validation (k = 10) to ensure optimal model tuning results.
| Model | Hyperparameter | Search range |
| --- | --- | --- |
| RF | N_trees | 10~500 |
| RF | f_max | 1~13 |
| RF | Criteria | Gini, Log-loss, Entropy |
| SVM | γ | 2^−8 ~ 2^8 |
| SVM | C | 2^−8 ~ 2^8 |
| SVM | Kernel function | Linear, Poly, RBF |
| ANN | Hidden layers | 3 layers, 4~10 neurons |
| ANN | Learning rate (η) | 0.005, 0.001, 0.005, 0.01, 0.05 |
| ANN | Activation function | Logistic, Tanh, ReLU |
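The grid-search-with-10-fold-cross-validation procedure described for Table 1 can be sketched with scikit-learn's `GridSearchCV`. The data are synthetic and the grid is reduced for speed; the paper's actual search spans 2^−8 to 2^8 for γ and C.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

# Reduced illustrative grid over the SVM hyperparameters from Table 1.
param_grid = {"C": [2.0**k for k in (-2, 0, 2)],
              "gamma": [2.0**k for k in (-2, 0, 2)],
              "kernel": ["linear", "poly", "rbf"]}

# Every combination is scored by 10-fold cross-validated accuracy;
# the best combination is refit on the full training set.
search = GridSearchCV(SVC(), param_grid, cv=10, scoring="accuracy").fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```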
Table 2. Optimal hyperparameter configurations obtained from the grid-search tuning procedure for the three machine learning models (RF, SVM, and ANN). The reported validation results correspond to the best-performing hyperparameter combination for each model based on cross-validation accuracy.
| Model | Optimal hyperparameters | Validation accuracy |
| --- | --- | --- |
| RF | N_trees = 60, f_max = 12, Criteria = Entropy | 71.0% |
| SVM | γ = 4, C = 0.156, Kernel function = Poly | 73.3% |
| ANN | Hidden layers = (9, 10, 10), η = 0.001, Activation function = ReLU | 73.2% |
Table 3. Prediction performance of the three machine learning models (RF, SVM, and ANN) evaluated using the independent test dataset. Model performance is quantified using three evaluation metrics: accuracy, recall, and precision.
| Metric | RF | SVM | ANN |
| --- | --- | --- | --- |
| Accuracy | 74.0% | 74.5% | 73.3% |
| Recall | 72.7% | 72.2% | 78.2% |
| Precision | 74.8% | 75.7% | 71.3% |
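The three evaluation metrics in Table 3 follow directly from the confusion-matrix counts. The worked example below uses hypothetical counts (not the study's actual confusion matrix) chosen only to illustrate the formulas.

```python
# Hypothetical confusion-matrix counts for CFW classification (illustrative only):
# TP = CFW events predicted as CFW, FP = false alarms,
# FN = missed CFW events, TN = correct non-CFW predictions.
tp, fp, fn, tn = 80, 27, 30, 83

accuracy = (tp + tn) / (tp + fp + fn + tn)   # fraction of all predictions correct
recall = tp / (tp + fn)                      # fraction of CFW events detected
precision = tp / (tp + fp)                   # fraction of CFW alarms that were real

print(f"accuracy={accuracy:.3f}, recall={recall:.3f}, precision={precision:.3f}")
# -> accuracy=0.741, recall=0.727, precision=0.748
```

The ANN's higher recall at the cost of lower precision, as reported in Table 3, corresponds to shifting counts from FN to FP: more events caught, more false alarms raised.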

Share and Cite

MDPI and ACS Style

Doong, D.-J.; Chen, W.-C.; Lin, F.-J.; Pan, C.; Tsai, C.-H. Machine Learning Approaches for Probabilistic Prediction of Coastal Freak Waves. J. Mar. Sci. Eng. 2026, 14, 689. https://doi.org/10.3390/jmse14080689
