Maritime Risk Assessment: A Cutting-Edge Hybrid Model Integrating Automated Machine Learning and Deep Learning with Hydrodynamic and Monte Carlo Simulations

Balas, Egemen Ander; Balas, Can Elmar

doi:10.3390/jmse13050939

by Egemen Ander Balas ¹,* and Can Elmar Balas ²

¹ Department of Civil Engineering, Faculty of Engineering, Başkent University, Ankara 06790, Türkiye

² Sea and Aquatic Sciences Application and Research Center, Gazi University, Ankara 06570, Türkiye

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2025, 13(5), 939; https://doi.org/10.3390/jmse13050939

Submission received: 22 March 2025 / Revised: 27 April 2025 / Accepted: 7 May 2025 / Published: 11 May 2025

(This article belongs to the Special Issue Recent Advances in Maritime Safety and Ship Collision Avoidance)

Abstract

In this study, a Hybrid Maritime Risk Assessment Model (HMRA) integrating automated machine learning (AML) and deep learning (DL) with hydrodynamic and Monte Carlo simulations (MCS) was developed to assess maritime accident probabilities and risks. The machine learning models Light Gradient Boosting (LightGBM), XGBoost, Random Forest, and Multilayer Perceptron (MLP) were employed. Cross-validation of model architectures, calibrated baseline configurations, and hyperparameter optimization improved predictive precision and generalizability. The hybrid model establishes a robust framework for predicting maritime accident probability through a multi-stage methodology built on an ensemble learning architecture. The model was applied to İzmit Bay (Türkiye), a highly congested maritime area with dense traffic patterns, providing a complete methodology to evaluate and rank risk factors. This research improves maritime safety studies by developing an integrated, simulation-based decision-making model that supports risk assessment actions for policymakers and stakeholders in marine spatial planning (MSP). The potential spill of 20 barrels (bbl) from an accident between two tankers was simulated using the developed model, which couples HYDROTAM-3D with the MCS. The average accident probability in İzmit Bay was estimated to be 5.5 × 10⁻⁴ in the AML-based MCS, with a probability range between 2.15 × 10⁻⁴ and 7.93 × 10⁻⁴. The order of magnitude of the predictions was consistent with the Undersecretariat of the Maritime Affairs Search and Rescue Department accident data for İzmit Bay. In the simulation, the spill reaches the narrow strait of the inner basin within the first six hours. This study identifies areas within the bay at high risk of accidents and advocates for establishing emergency response centers in these critical areas.

Keywords:

maritime risk assessment; automated machine learning; deep learning; İzmit Bay; hydrodynamic coupled Monte Carlo simulation; maritime accident; HYDROTAM-3D

1. Introduction

The accident records from the Undersecretariat of the Maritime Affairs Search and Rescue Department of Türkiye [1] compiled by the İstanbul Port Authority for 2000–2020 were utilized in this paper to train and validate the following machine learning models: Light Gradient Boosting (LightGBM), XGBoost, Random Forest, and Multilayer Perceptron (MLP).

Maritime transportation is a pivotal sector of global trade; it carries a substantial portion of the world’s oil and holds considerable economic significance. Notably, over 80% of the volume of international trade in goods is carried by sea, and this share is higher in several developing nations [2]. Nonetheless, the maritime transport of oil carries substantial risks. Maritime incidents involving oil tankers cause severe ecological damage and environmental pollution [3]. Consequently, the maritime transportation industry must adhere to stringent safety protocols and procedures, particularly when conveying oil and natural gas. Maritime accidents result in ecological disasters, loss of human life, and vast economic damage [4]. The Turkish Straits and the Marmara Sea have experienced circa 200,000 tons of oil spills from shipping accidents [5]. The procedures employed to mitigate the environmental damage from ship accidents are expensive. The International Petroleum Industry Environmental Conservation Association (IPIECA) indicates that cleaning up one barrel of oil in marine environments can cost between USD 700 and 3000 [6]. The International Maritime Organization (IMO) aims to prevent maritime accidents through risk assessment models [7]. These models include Fault Tree Analyses (FTA), multi-objective network flow [8], and machine learning models [9], and they aim to minimize the environmental risks associated with oil transportation [10]. According to data from the IMO’s Global Integrated Shipping Information System [11], stranding and collision are high-risk accident categories for oil tankers owing to the influence of human error. Previous studies [12,13] highlight decision errors in applying the International Regulations for Preventing Collisions at Sea (COLREGs) [14]. The existing analysis of shipping accidents highlights the need to recognize the underlying factors and establish infrastructure to reduce accidents by estimating relations between environmental, ship, and maneuvering parameters [4].

Stochastic models are used to estimate the probability of oil contamination using the database of marine accidents in Türkiye [15]. A study of the transit materials from the Marmara Sea and Turkish Straits showed that 13% of vessels carried hazardous goods. Of these goods, 63.5% contained flammable fluids, 10.6% toxic gases, and 9.4% corrosive materials, half of which transported petroleum products [16]. A comparable effort is made in [17], the authors of which adopt a fuzzy-logic-based multi-criteria decision-making system, integrating expert judgment with GIS-based spatial visualization to prioritize the removal of navigational obstacles.

This study emphasizes predictive modeling using machine learning algorithms trained on historical accident data and environmental simulations, thus offering a more quantitatively driven approach based on automated machine learning (AML). AML techniques are used for maritime accident prediction, focusing on tabular datasets derived from sources such as weather reports and accident records. Gradient-boosted decision trees (GBDTs), such as XGBoost and LightGBM, have proven effective for structured data because they handle both categorical and numerical parameters, making them appropriate for evaluating maritime incidents. Studies such as [18,19] have confirmed the strong predictive abilities of GBDTs, with findings indicating that gradient boosting outperforms other methods, such as Support Vector Machines (SVMs), in predicting accident risks associated with environmental conditions and ship characteristics. However, despite their effectiveness, challenges such as overfitting and the need for hyperparameter tuning persist. Deep learning (DL) models have a high learning capacity, but they have not yet matched GBDTs in handling tabular data related to maritime accidents. The authors of [20] applied DL methods to ship accidents in New Orleans and reported inadequate prediction capacity, indicating the limitations of these methods. Although AML methods can achieve acceptable success rates, they can also produce high false positive rates owing to the influence of extreme environmental factors.

Other studies [21,22] used Bayesian Networks to analyze maritime accidents in Chinese coastal waters and found a strong correlation between weather conditions and extreme events. These limitations suggest that coupling AML with 3D hydrodynamic models, which concurrently predict currents under the influence of wind and waves, can improve the prediction capacity of artificial intelligence models. Previous studies highlighted the difficulty of maritime accident prediction and the importance of incorporating environmental data to refine predictive models, improving accuracy and reducing overfitting.

Ref. [23] revealed the key gaps in predicting the risk of marine accidents. The authors introduced a two-stage feature selection (FS) method to identify and rank risk influential factors (RIFs), advancing the accuracy and interpretability of AML models. They compared six AML models and determined that LightGBM performed better than the other models in predicting accident risks. Another study [24] integrated weather data with accident records using AML techniques. The authors analyzed data from the Norwegian Maritime Authorities from 1981 to 2021, including 51 weather parameters from Visual Crossing. They identified LightGBM with Early Stopping as the best-performing model, achieving a five-fold cross-validation accuracy of 70.23% when weather data were included in AML.

Another study [25] evaluated the performance of an additional 29 AML algorithms and obtained the same result, identifying LightGBM as the best-performing model for the same database. The authors also obtained enhanced accuracy in predicting five major accident types: grounding, contact, fire, collision, and weather damage. Studies such as [26,27] emphasized the importance of model selection and fine-tuning to enhance predictive performance, where AML’s flexibility and robustness made it a preferred choice. FTA and multiple correspondence analyses of data from major organizations, such as the UK Marine Accident Investigation Branch (MAIB) and the IMO Global Integrated Shipping Information System (GISIS), concluded that maneuvering and perception errors were the most critical factors causing ship collisions [28]. Similarly, [5] presented the Human Factor Analysis and Classification System for Passenger Vessel Collisions (HFACS-PV) to study human factors in passenger vessel accidents, demonstrating its effectiveness in determining the reasons behind grounding, contact, sinking, and collision incidents.

The authors of [29] showed that weather conditions impact marine risks, and Gradient Boosting outperformed other AML models in predicting accident risk related to visibility conditions based on regional weather data. Another study [30] assessed the performance of AML models using AIS data to identify challenges related to data finding. The authors suggest future research directions for integrating artificial intelligence with simulation models. The authors of [31] integrated AIS and weather data using a combination of fuzzy logic and DL models. They proposed an Enhanced Collision Risk Index with Weather (ECRI-W), which combined the Collision Risk Index (CRI) with environmental factors to provide a risk assessment framework. They employed the Self-Attention and Intersample Attention Transformer (SAINT), a deep learning model for tabular data. The Analytic Hierarchy Process (AHP) was applied to identify and rank factors affecting navigational safety in İzmit Bay [32].

In this paper, the Hybrid Maritime Risk Assessment Model (HMRA), which integrates automated machine learning (AML) and deep learning (DL) with hydrodynamic and Monte Carlo simulations (MCS), was developed to address these research gaps in the literature. The Undersecretariat of the Maritime Affairs Search and Rescue Department accident documents were analyzed to identify marine accident risks. The flow chart of the HMRA model is given in Figure 1, which explains the procedure for estimating accident probabilities, potential risks, and the average volume of spills in an accident.

The model developed in this study establishes a robust methodological framework for maritime accident risk prediction, balancing precision against computational effort through a multi-stage methodology that integrates ensemble and deep learning architectures under a dual-aspect evaluation.

Six AML algorithms, including LightGBM, XGBoost, Random Forest, Multilayer Perceptron (MLP), and Histogram-Based Gradient Boosting (HGB), are deployed using cross-validation, with the model architectures calibrated based on a dual strategy of baseline configurations and hyperparameter optimization. Figure 2 summarizes this integrative model structure by explaining its operational workflow, which addresses continuous probability estimation and discrete risk categorization.

The hybrid model utilizes a bifurcated assessment procedure: it evaluates regression performance using the root mean squared error (RMSE), mean absolute error (MAE), and coefficient of determination (R²), and it assesses binary risks according to the F1-score, balanced accuracy, and logarithmic loss at a decision threshold of 0.5. This dual evaluation addresses two connected objectives in risk modeling: continuous probability estimation and discrete risk categorization. Accident probability prediction is assessed through regression errors; however, risk management decisions often impose binary thresholds for warnings, such as high-risk versus low-risk scenarios.

The RMSE, MAE, and R² regression metrics measure the accuracy of the model’s continuous predictions. RMSE squares the residuals before averaging and therefore penalizes large deviations more heavily, whereas MAE averages the absolute residuals and is less sensitive to outliers; both are expressed in the units of the predicted variable. The R² coefficient indicates predictive reliability by measuring the proportion of variance explained relative to a baseline that always predicts the mean. These metrics tie the model’s calibration to the continuous nature of accident probabilities, which is necessary for risk prioritization. The classification metrics of the F1-score, balanced accuracy, and logarithmic loss estimate the model’s effectiveness by reflecting the balance between type I and type II errors. The dual approach evaluates how well probabilistic predictions stratify cases into risk categories [33]. Balanced accuracy accounts for potential class imbalance by averaging recall across these categories [34]. Logarithmic loss applies a penalty related to the Kullback–Leibler divergence [35], penalizing confident but wrong classifications most heavily and thus verifying uncertainty estimation within the classification patterns [36,37].
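The dual evaluation described above can be illustrated with a minimal sketch in plain Python. The function names, the normalized risk scores, and the rule used to derive binary labels are illustrative assumptions and are not taken from the HMRA implementation; only the metric definitions and the 0.5 threshold follow the text.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: residuals are squared, so large misses dominate."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error, expressed in the units of the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Share of variance explained relative to always predicting the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def balanced_accuracy(labels, scores, threshold=0.5):
    """Mean of the recall on the high-risk (1) and low-risk (0) classes."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, q in zip(labels, preds) if y == 1 and q == 1)
    tn = sum(1 for y, q in zip(labels, preds) if y == 0 and q == 0)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)

def log_loss(labels, scores, eps=1e-15):
    """Binary cross-entropy; scores are clipped to avoid log(0)."""
    total = 0.0
    for y, s in zip(labels, scores):
        s = min(max(s, eps), 1.0 - eps)
        total -= y * math.log(s) + (1 - y) * math.log(1.0 - s)
    return total / len(labels)

# Hypothetical normalized risk scores, not data from the study.
y_true = [0.10, 0.40, 0.80, 0.90]
y_pred = [0.20, 0.30, 0.70, 0.95]
# Assumed labeling rule: a case is high-risk when its true score is at least 0.5.
labels = [1 if y >= 0.5 else 0 for y in y_true]

print("RMSE:", rmse(y_true, y_pred))
print("MAE:", mae(y_true, y_pred))
print("R2:", r2(y_true, y_pred))
print("Balanced accuracy:", balanced_accuracy(labels, y_pred))
print("Log loss:", log_loss(labels, y_pred))
```

The same predicted scores feed both groups of metrics: the regression metrics compare them with the continuous targets, while the classification metrics compare them with binary labels after thresholding.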

This dual approach enables accurate risk predictions to serve two groups of stakeholders: engineers requiring probability estimates and emergency response teams needing explicit warnings. The hybrid model can be adjusted to local regulations on risk acceptance and can provide a standard comparison for binary risk assessments. To support reproducibility, the model combines five-fold cross-validation for variance estimation, regression, and confusion matrices [38]. Computational efficiency is achieved using early stopping criteria for gradient boosting [39]. This multilayered strategy evaluates the predictive performance of the 1540 AML and DL model configurations developed in this paper to identify the maritime risks in İzmit Bay. The best-performing configurations were identified during the pre-application stage of the HMRA.

2. The Hybrid Maritime Risk Assessment (HMRA) Model

The new Hybrid Maritime Risk Assessment (HMRA) Model integrates AML and DL with Hydrodynamic and Monte Carlo Simulations and is applied to the İzmit Bay of the Marmara Sea. The three-dimensional hydrodynamic transport model HYDROTAM-3D developed by [40] is employed to simulate the impact of environmental factors. The HMRA model used the database of maritime accidents and simulated environmental data of current, wind, and waves from the HYDROTAM-3D model.

Integrating MCS, three-dimensional hydrodynamic transport, AML, and DL, this model enables accident risk prediction and categorization to determine the possible location of the main emergency response center. The hydrodynamic transport model provides a computational framework with geographic information and a cloud computing system [41]. This model uses bathymetric data from the Turkish maritime regions to simulate marine currents and pollution transport phenomena [42,43]. High-resolution wind data from the European Centre for Medium-Range Weather Forecasts Operational Archive [44] are incorporated. The wind data have a spatial resolution of 0.1 degrees on a horizontal grid and are resolved at six-hour intervals. The dataset spans more than two decades, covering the period from 2000 to 2024 over Turkish coastal waters, and provides the input for the model to simulate environmental conditions. The three-dimensional hydrodynamic transport model addresses the impact of environmental conditions on maritime safety [45,46,47]. The efficiency of the hydrodynamic model has been verified with scenarios along the Turkish coast, demonstrating its operational success [48].

2.1. AML Sub-Models

The probabilistic framework for accident categorization in the hybrid model uses the database of the Ministry of Transport and Infrastructure, Main Search and Rescue Coordination Center (MSRCC) [1,49,50]. The main dataset includes 2115 maritime incidents recorded by the İstanbul Port Authority over the 20 years from 2000 to 2020, specifying the location, time, environmental conditions, and severity of each accident according to the number of missing, deceased, and rescued individuals. On average, 0.48 people are reported missing and 0.83 fatalities are recorded per incident, suggesting that a considerable share of reported incidents involve at least one casualty. The number of rescued individuals per incident varies, with an expected value of 16 people. The detailed sub-dataset contains 248 accident records (Figure 3), of which 169 randomly selected records comprising vessel characteristics, environmental conditions, and maneuvering data were used for training and testing.

The mean gross register tonnage (GRT) for the first vessel involved in accidents is approximately 6400, while, for the second vessel, it is 5420, with a high standard deviation, suggesting high variability in vessel sizes. Environmental conditions, such as the daily average wind speed and daily total precipitation, have mean values of 3 m/s and 9 mm, respectively, with a level of variability indicating diverse meteorological conditions during recorded accidents.

The dataset captures key maneuverability parameters, such as maneuver difficulty, which has a mean value of 0.45, suggesting that most cases involved moderate difficulty in navigation based on the Guide Specifications and Commentary for Vessel Collision Design of Highway Bridges [51]. Considering the environmental conditions, site characteristics were incorporated using the wind, wave, and current sub-models. The severity of the accident, rated on a scale of 1 to 5, has a mean of 1.7, indicating that most incidents were considered to be of low to moderate severity [52,53].

The model conducts accident experiments using multiple AML methods, including LightGBM, XGBoost, Random Forest, and Multilayer Perceptron (MLP). It processes vessel and environmental data such as wind speed, precipitation, the number of foggy days in a month, and vessel characteristics to train and evaluate models for regression and classification tasks.

The model splits the data into training and test sets, performs cross-validation to assess model performance, and fine-tunes hyperparameters to optimize predictive accuracy. It evaluates the model’s performance using metrics such as the RMSE, MAE, and R² for regression and accuracy, precision, recall, F1-score, and log-loss for classification. Performance plots are generated to visualize trends and potential areas of improvement. The model aims to identify the most effective method and feature set for predicting accident probabilities accurately in İzmit Bay while preventing overfitting and ensuring generalization to unseen data.

The Light Gradient-Boosting Machine (LightGBM) is a gradient-boosting method that uses tree-based learning algorithms to boost prediction accuracy. It allows large-scale datasets to be analyzed at higher computational speeds and with lower memory usage. LightGBM grows trees leaf-wise, splitting the leaf that yields the largest loss reduction, and therefore tends to converge faster than the level-wise growth used by other boosting algorithms. Its key features include native handling of categorical features and early stopping to prevent overfitting. Because it deals efficiently with high-dimensional data while producing accurate predictions, it is preferred for regression and classification problems such as accident probability and risk estimation.

XGBoost (eXtreme Gradient Boosting) is another widely used gradient-boosting algorithm that adds regularization, parallel processing, and memory management techniques. XGBoost creates a sequence of decision trees, with each new tree correcting the residual errors of the previous ones. It applies Lasso (L1) and Ridge (L2) regularization to mitigate overfitting and enhance generalization. Its efficiency stems from histogram-based optimization and sparsity-aware procedures for handling sparse data, which make XGBoost well suited to sparse datasets.

During training, Random Forest creates numerous decision trees and combines their predictions to enhance precision. In contrast to boosting algorithms, which create trees one at a time, Random Forest fits trees independently, and potentially in parallel, on bootstrap samples of the training data. For regression, the final prediction is the average of the individual tree outputs; for classification, it is typically the majority vote across trees. Random Forest reduces overfitting by randomizing the data and features used to build each tree.
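The bootstrap-and-aggregate idea behind Random Forest can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: each "tree" is a one-split regression stump, there is a single hypothetical feature (so the per-split feature sampling of a real Random Forest is omitted), and the toy data are invented for the example.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep at least one point on the right
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: predict the mean everywhere
        mean_y = sum(ys) / len(ys)
        return xs[0], mean_y, mean_y
    return best[1], best[2], best[3]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump independently on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Regression: average the outputs of all trees."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)

def majority_vote(class_predictions):
    """Classification: return the class predicted by most trees."""
    return Counter(class_predictions).most_common(1)[0][0]

# Hypothetical toy data: one feature and a risk score that rises with it.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0.1, 0.1, 0.2, 0.2, 0.7, 0.8, 0.8, 0.9]
forest = fit_forest(xs, ys)
print(predict_forest(forest, 2.5), predict_forest(forest, 6.5))
print(majority_vote(["high", "low", "high"]))
```

Because every stump sees a different bootstrap sample, the split thresholds differ between trees, and averaging them smooths the step-like prediction of any single stump.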

Multi-Layer Perceptron (MLP) is a type of artificial neural network (ANN) that consists of multiple layers of units (neurons), including one input layer, one or more hidden layers, and one output layer. Each neuron computes a weighted sum of its inputs, followed by a nonlinear activation function. Using these functions and weights, the ANN learns complex relationships between input parameters and target variables. MLPs can capture nonlinear representations of accident incidents and are commonly fitted via backpropagation, which minimizes the prediction error by iteratively updating the connection weights. Although MLPs perform well in classification and regression tasks, they require the fine-tuning of hyperparameters such as the number of layers, neurons, and learning rate to achieve optimal performance without overfitting.
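The forward pass of a neuron layer described above, a weighted sum of inputs followed by a nonlinear activation, can be sketched as follows. The weights, biases, and inputs are arbitrary illustrative numbers, and ReLU is assumed as the activation; none of these values come from the trained models.

```python
def relu(z):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, z)

def dense_layer(inputs, weights, biases, activation=None):
    """Each neuron computes a weighted sum of the inputs plus a bias."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs

# Hypothetical network: 2 inputs -> 2 hidden neurons (ReLU) -> 1 linear output.
inputs = [1.0, 2.0]
hidden = dense_layer(inputs, weights=[[0.5, -1.0], [1.0, 1.0]],
                     biases=[0.0, -1.0], activation=relu)
output = dense_layer(hidden, weights=[[1.0, 0.5]], biases=[0.1])
print(hidden, output)
```

In this example the first hidden neuron receives a negative weighted sum and is switched off by ReLU, so only the second neuron contributes to the output. Backpropagation would adjust the weights and biases to reduce the prediction error.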

The input parameters used in accident probability predictions can be classified into four main classes: vessel characteristics, accident severity, environmental parameters, and maneuvering difficulties. Regarding vessel characteristics, the model considers the gross register tonnage (GRT), overall length (LOA), and type of ship. The accident’s severity serves as a categorical indicator of incidents. Environmental parameters play a significant role in influencing accident probability and comprise the monthly average wind speed (m/s), monthly foggy days, monthly total precipitation (mm), Beaufort wave scale of sea conditions, daily maximum wind speed and direction, daily average wind speed, daily total precipitation (mm), hourly wind speed and direction, and average current velocity (cm/s). Maneuvering difficulties are a main influence on the occurrence of accidents. These parameters allow the model to capture the data patterns of maritime accident scenarios, thereby improving the predictive capacity of the AML and DL methods.

The flowchart of the AML model (Figure 2) presents the workflow for accident risk probability predictions, breaking down the procedure into five main stages: data preprocessing, model training, evaluation, optimization, and implementation. During preprocessing, the model handles missing values, applies feature scaling, and divides the data into training and testing sets to sustain balanced training and assessment. Normalization is optional for the tree-based LightGBM, XGBoost, and Random Forest models, which are insensitive to monotonic feature scaling, but it is necessary for the convergence of gradient descent in the MLP.
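The feature scaling applied during preprocessing can be sketched as a z-score standardization. The choice of z-scores (rather than, for example, min-max scaling), the function names, and the wind speed values are assumptions made for illustration; fitting the scaler on the training portion only is common practice and is likewise an assumption about the workflow.

```python
import statistics

def fit_standardizer(train_column):
    """Estimate the mean and standard deviation from the training data only."""
    mean = statistics.fmean(train_column)
    std = statistics.pstdev(train_column)
    return mean, (std if std > 0 else 1.0)  # guard against constant columns

def transform(column, mean, std):
    """Rescale a column to zero mean and unit variance (z-scores)."""
    return [(x - mean) / std for x in column]

# Hypothetical daily average wind speeds in m/s, not values from the dataset.
train_wind = [2.0, 4.0, 6.0]
test_wind = [3.0, 7.0]

mean, std = fit_standardizer(train_wind)
print(transform(train_wind, mean, std))
print(transform(test_wind, mean, std))  # reuses the training statistics
```

Reusing the training statistics for the test data keeps information about the test set from leaking into the preprocessing step.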

In the training phase of the gradient-boosting models, decision trees are built along the paths that yield the maximum reduction in residual errors. Random Forest trains multiple decision trees independently using bootstrap sampling to increase generalization. MLP employs backpropagation with gradient descent to revise neuron weights, making it computationally more time-consuming than decision tree models. Early stopping is then employed to prevent overfitting and improve the computational performance of the model. In MLP, early stopping monitors the validation loss across epochs and stops learning when the loss reaches a steady state. In LightGBM and XGBoost, early stopping monitors the validation loss and terminates training when a specified number of iterations bring no improvement. Random Forest does not require early stopping because its trees are built independently of one another. Hence, tree-based models require fewer iterations than ANNs, which demand extensive computations for each training cycle.
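The early stopping rule described above can be expressed as a short patience-based loop. The sketch below works on a precomputed list of validation losses; the function name, the `patience` and `min_delta` parameters, and the loss values are illustrative assumptions rather than settings reported for the HMRA.

```python
def early_stopping_round(val_losses, patience=3, min_delta=0.0):
    """Return (best_iteration, stop_iteration) for a sequence of validation losses.

    Training stops once `patience` consecutive iterations have passed without
    the loss improving on the best value by more than `min_delta`.
    """
    best_loss = float("inf")
    best_iter = -1
    for i, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_iter = i
        elif i - best_iter >= patience:
            return best_iter, i
    return best_iter, len(val_losses) - 1

# Hypothetical validation losses per boosting round or training epoch.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.60, 0.65]
best, stopped = early_stopping_round(losses, patience=3)
print(best, stopped)  # best iteration is 2; training stops at iteration 5
```

The model state from the best iteration, not the last one, is the one that would normally be kept.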

Accuracy is the ratio of correctly classified instances to the total number of instances. Precision measures the ratio of correctly predicted positive cases to all cases predicted as positive; higher precision implies fewer false positives. The F1-score is the harmonic mean of precision and recall, providing a balanced measure when the class distribution is uneven. It is used when both false positives and false negatives have significant consequences, balancing the trade-off between precision and recall.

Recall measures the proportion of correctly identified positive instances among all actual positive ones. These metrics characterize model performance, with accuracy giving an overall evaluation, precision indicating the reliability of positive predictions, recall indicating the coverage of actual positives, and the F1-score summarizing the balance between the two:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \mathrm{Precision} = \frac{TP}{TP + FP}, \quad \mathrm{Recall} = \frac{TP}{TP + FN} \tag{1}$$

$$\mathrm{F1\text{-}Score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{2}$$

where:

True Positives (TP): correctly predicted positive cases.

True Negatives (TN): correctly predicted negative cases.

False Positives (FP): negative cases incorrectly predicted as positive.

False Negatives (FN): positive cases incorrectly predicted as negative.
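Equations (1) and (2) can be evaluated directly from the four confusion-matrix counts. The sketch below uses hypothetical counts that happen to sum to 34, the size of the test set; they are not results from the study, and the function name is an illustrative choice.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for a 34-sample test set (not results from the paper).
metrics = classification_metrics(tp=8, tn=20, fp=2, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy is 28/34, precision is 8/10, recall is 8/12, and the F1-score is 16/22, which shows how the F1-score sits between precision and recall but closer to the lower of the two.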

Step 1: Division of the Dataset

The developed model divides the dataset into training and test sets to evaluate the performance of multiple models, including LightGBM, XGBoost, Random Forest, and Multilayer Perceptron (MLP). The dataset, which consists of environmental, ship, and maneuvering characteristics influencing accident probability, is split using an 80–20 ratio, allocating 135 samples for training and 34 samples for testing. This allocation ensures that all models have sufficient data to learn patterns while a portion of the detailed dataset is retained for evaluation. While tree-based models (LightGBM, XGBoost, Random Forest) handle raw data without feature scaling, MLP requires normalization to optimize gradient-based learning. The splitting strategy supports model evaluation by ensuring fair comparisons across algorithms.
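The 80–20 division of the 169 detailed records can be reproduced with a short index-based sketch. The shuffling procedure, the seed, and the function name are assumptions; only the record count and the split ratio come from the text.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle record indices and reserve a fraction of them for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = train_test_split_indices(169)
print(len(train_idx), len(test_idx))  # 135 training and 34 test records
```

Using the same index split for every algorithm is what makes the comparison across LightGBM, XGBoost, Random Forest, and MLP fair.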

Step 2: Training Models

LightGBM and XGBoost use the gradient-boosting algorithm, sequentially generating decision trees to minimize residual errors. LightGBM uses a leaf-wise scheme to increase accuracy while maintaining computational efficiency. XGBoost utilizes a level-wise approach to simplify complex parametric relations. Random Forest builds multiple decision trees independently using bootstrap sampling, which helps avoid overfitting but increases the computational load. MLP is an ANN model that relies on forward and backward propagation to adjust neuron weights iteratively, thus requiring more computational effort than tree-based methods.

Step 3: Multi-Classification Machine Learning

LightGBM and XGBoost classify risk levels into discrete categories, such as low, medium, and high, using the softmax objective function to estimate the probability of each category. MLP can capture complex and nonlinear interactions thanks to its ANN structure. Random Forest employs majority voting across decision trees to classify data. These models allow regression and classification assessments to be combined to evaluate maritime risks.
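The softmax function that converts raw class scores into category probabilities can be sketched as follows. The three risk labels follow the text, whereas the raw scores and helper names are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to one."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_category(scores, categories=("low", "medium", "high")):
    """Return the category with the highest softmax probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs

# Hypothetical raw scores produced by a classifier for one accident scenario.
category, probs = predict_category([0.2, 1.5, 0.3])
print(category, [round(p, 3) for p in probs])
```

The predicted risk level is the category with the largest probability, and the probabilities themselves indicate how decisive that prediction is.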

Step 4: Breaking Point

The breaking point ensures that models balance performance and computational efficiency; it is reached when additional iterations no longer improve model accuracy. The breaking point is identified for LightGBM and XGBoost by monitoring validation error trends, while, in Random Forest, performance stabilizes once enough trees have been generated. MLP uses learning curves to track the rate at which model errors decrease and thus to prevent overfitting.

Step 5: Testing and Evaluation

Performance metrics measure the prediction accuracy of models. Generally, LightGBM and XGBoost demonstrate lower prediction errors because of their ability to capture interactions. Random Forest provides stable but less accurate results due to its random tree structure. MLP’s performance depends on tuning hyperparameters, such as the number of layers and the learning rate.

Step 6: Cross-Validation

K-fold cross-validation is used to assess model robustness because the accident dataset is small. The dataset is split into k equal parts, and each model is trained on k-1 parts and tested on the remaining one, so that every data point is used for testing exactly once. The cross-validation of LightGBM and XGBoost incorporates built-in algorithms to enhance model learning, whereas Random Forest and MLP depend on repeated random divisions. Cross-validation estimates how well the models generalize to unseen data and guides hyperparameter adjustments, improving performance by reducing overfitting.
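The fold construction can be sketched in plain Python. The example assumes five folds (the value used elsewhere in the paper) applied to the 169 detailed records; the shuffling, the seed, and the function name are illustrative assumptions.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield k (train, test) index pairs; each record is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one record.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds

folds = k_fold_indices(169, k=5)
for number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"fold {number}: train={len(train_idx)}, test={len(test_idx)}")
```

Since 169 is not divisible by five, four folds hold 34 test records and one holds 33. The spread of the metric across folds gives the variance estimate used to judge model robustness.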

2.2. DL Sub-Models

2.2.1. The CNN-DL Model

The Convolutional Neural Network (CNN) Deep Learning Model is implemented in PyTorch version 2.0.0+cpu for deep learning prediction and classification [54]. The model architecture is shown in Figure 4, in which normalization layers, activation functions, pooling operations, and connected layers work together to learn input data characteristics. The model starts with a convolutional layer [1,3,4], employing four filters for the input data. A batch normalization layer follows, normalizing the activations to improve convergence. After convolution and normalization, the rectified linear unit (ReLU) activation function is employed to learn non-linear data patterns. Then, an additional convolutional layer with eight filters is used to refine learning, followed by a second batch normalization layer for stable learning. The predictions are produced from the connected layers with their bias terms. The activation function at the output scales the predictions, and the gradients for each weight and bias parameter are accumulated. The weights are adjusted iteratively to improve training stability and convergence. The CNN model follows a multi-layer feature extraction approach, combining convolutional layers, batch normalization, and pooling operations in a deep learning architecture.

2.2.2. The MLP-DL Model

The PyTorch implementation of the Multi-Layer Perceptron Deep Learning (MLP-DL) model is applied to smart maritime safety [55]. The model takes five input features, which feed several dense layers with weight and bias parameters, batch normalization, activation functions, and optimization layers, as represented in Figure 5. The first dense layer consists of 128 neurons: a linear transformation is applied to the input data, which are then passed through a batch normalization layer. The rectified linear unit (ReLU) activation then captures non-linearity and enhances learning ability.

A second layer consists of 128 neurons, another batch normalization, and a ReLU activation. The following layers continue with 64 neurons, reducing dimensionality. Each layer applies batch normalization and ReLU activation to stabilize the gradients. The output of the last intermediate layer is compressed to 32 dimensions and then fed to the output layer, which has a single neuron to perform regression. Using batch normalization, ReLU activations, and the Adam optimizer yields a model with advanced learning capacity. The architecture is designed to process the input data for predicting accident probability.
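To give a sense of the size of such a network, the following sketch counts trainable parameters for one reading of the architecture. The layer widths 5 → 128 → 128 → 64 → 32 → 1 are an assumption inferred from the description (in particular, a single 64-neuron layer is assumed), as is the placement of batch normalization after every hidden layer with two learnable parameters per feature. The count is illustrative and is not a figure reported in the paper.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output neuron."""
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    """Learnable scale and shift for each normalized feature."""
    return 2 * n_features

def count_parameters(widths):
    """Total trainable parameters for dense layers with batch norm on hidden layers."""
    total = 0
    n_layers = len(widths) - 1
    for i, (n_in, n_out) in enumerate(zip(widths[:-1], widths[1:])):
        total += dense_params(n_in, n_out)
        if i < n_layers - 1:  # every layer except the output layer is hidden
            total += batchnorm_params(n_out)
    return total

# Assumed layer widths: 5 inputs, hidden layers of 128, 128, 64, 32, one output.
layer_widths = [5, 128, 128, 64, 32, 1]
print(count_parameters(layer_widths))  # 28353 under these assumptions
```

Most of the parameters sit in the 128-by-128 dense layer, which is typical of fully connected networks: the count grows with the product of adjacent layer widths.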

2.2.3. The TensorFlow CNN-DL Model

The TensorFlow Convolutional Neural Network (TF CNN) is used for sequential data analysis [56]. It consists of convolutional layers, batch normalization, activation functions, pooling layers, dropout regularization, and densely connected layers, which make it suitable for learning patterns from structured data sets. The first convolutional layer has 32 filters for extracting patterns from the input data. A batch normalization layer follows, stabilizing activations for better convergence.

The dimension is reduced to 64 neurons through the first dense layer, and non-linearity is introduced through the ReLU activation function. The dropout layer randomly deactivates neurons during training to limit overfitting. The final layer has a single neuron that outputs the accident probability. The model combines the convolutional and pooling layers with batch normalization and dropout layers to enhance the DL model’s predictive ability.

2.2.4. The TensorFlow LSTM-DL Model

The TensorFlow LSTM model is a deep recurrent neural network (RNN) structure with multiple long short-term memory (LSTM) layers for detecting data variability [57]. It is utilized for structured time series. The first four LSTM layers have 64 neurons each, with batch normalization and dropout regularization, for extracting long-term parameter relationships. The output is passed through a 16-neuron fully connected layer, which further processes the extracted features. The final dense layer projects the output onto one neuron for accident probability prediction. The LSTM architecture can identify time-dependent causal relationships. The fully connected layers refine the learned patterns before the final prediction, making it a powerful model for time series.

2.2.5. The TensorFlow MLP-DL Model

The TensorFlow Multi-Layer Perceptron (MLP) model has a feedforward network structure [58] with five input features. The model consists of ReLU activations, batch normalization, and dropout layers. The input layer maps the five input parameters onto 128 neurons with a dense layer. The dropout layer randomly deactivates neurons during training. The second layer maps down from 128 neurons to 64 to refine the learned patterns. The TensorFlow MLP model combines batch normalization for stable learning, dropout for regularization, and ReLU activations to enhance accident probability predictions.
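
The layer that randomly shuts down neurons during training is commonly known as dropout. The sketch below illustrates the usual "inverted" form in plain Python rather than TensorFlow: each activation is zeroed with probability `rate` during training and the survivors are rescaled so that the expected value is unchanged, while at inference the layer is an identity. The rate of 0.5 and the sample activations are assumptions chosen for illustration; the text does not report the rate used.

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout: zero each activation with probability `rate` while training."""
    if not training or rate == 0.0:
        return list(values)  # inference: activations pass through unchanged
    keep = 1.0 - rate
    # Survivors are divided by the keep probability to preserve the expected value.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)
activations = [0.5, 1.0, 1.5, 2.0]  # hypothetical layer outputs
train_out = dropout(activations, rate=0.5, training=True, rng=rng)
eval_out = dropout(activations, rate=0.5, training=False, rng=rng)
print(train_out, eval_out)
```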

2.3. MCS Sub-Model

The MCS model developed in [46] estimates the likelihood of incidents caused by navigational and meteorological conditions, verified using numerous databases of accidents and spills [59,60,61,62,63,64]. During the 2020s, there have been 27 oil spills, each surpassing 7 tons, cumulatively leading to 28,000 tons of oil being lost. Five major events are responsible for 91% of the spilled oil, with the remaining 9% coming from 22 smaller incidents. Figure 6a represents the spills exceeding 7 tons (1970–2023), indicating the volume of tanker trade in million metric tons and demonstrating a slight increase in the incidence rate of oil spills over the last five years [65]. In 2023 alone, a significant spill of over 700 tons was recorded, in addition to nine medium-sized spills with volumes ranging from 7 to 700 tons. The total amount of oil that leaked into the environment from tanker spills in 2023 was approximately 2000 tons.

Figure 6b displays the causes of oil spills categorized by spill size for the given time frame [66]. This figure shows that most oil spills exceeding seven tons, spanning the period from 1970 to 2023, predominantly resulted from contact/collisions and strandings/groundings, as per the International Tanker Owners Pollution Federation report [65].

The report suggests that larger spills are more likely to result from navigational incidents, while smaller spills have a broader range of causes. When examining the frequency and magnitude of oil spills, a small number of high-volume spill incidents significantly influence the total volume of oil discharged per decade, as illustrated in Figure 7 [66].

Integrating the MCS model with the hydrodynamic model enabled the determination of the probabilities and severities of maritime accidents under a range of oceanographic and meteorological conditions. This study pinpoints areas within the bay at high risk of specific accidents and promotes the establishment of emergency response centers in these critical areas. The research shows the significance of emergency response plans for İzmit Bay, considering the estimated increase in hazardous material traffic due to the Black Sea Natural Gas and Oil Project of Türkiye. The Monte Carlo Simulation (MCS) model emerges as a fundamental tool in MSP [67,68,69], enabling the evaluation of potential risk scenarios. The model employs random value assignments from probability density functions across numerous simulations of possible accidents.

This approach aids in identifying the most likely accident scenarios and facilitating preemptive measures. Hence, this study leveraged the MCS model with the hydrodynamic model to ascertain the probability and severity of accidents from contact/collision incidents under challenging oceanographic and meteorological conditions.

The risk assessment involved generating 30,000 maneuver simulations within the MCS model and incorporating the probability distributions defined within it. The HYDROTAM-3D model was coupled with the MCS to simulate wind, wave, current, and weather conditions [45,46,70], and these conditions were linked to accidents modeled by probability distributions derived from the data.

The MCS model was instrumental in determining the annual probability distribution of the mean volume of hazardous spills per vessel. The HMRA model provided a comprehensive accident risk evaluation methodology for various accident types.

Recurring simulations using probability distributions facilitate the detection and moderation of potential accidents before their occurrence, which is fundamental for protecting infrastructure and reducing environmental impacts in MSP.

2.4. Wind Climate Model

Long-term and extreme wind analyses are performed for meteorological stations and the European Centre for Medium-Range Weather Forecasts Operational Archive (ECMWF-OA) data, which are provided on a 0.1-degree horizontal grid at 6 h intervals for 2000–2024. Yearly and seasonal wind roses are obtained to demonstrate the directional distribution of winds. Maximum winds and their directions are explored, and the dominant wind directions for the bay are revealed. The Fisher–Tippett Type I (Gumbel) probability distribution of yearly maximum winds is obtained using the database of hourly wind data.
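
As an illustration of the extreme-value step, the sketch below fits a Gumbel distribution to yearly maxima by the method of moments and evaluates a return level. The fitting method is an assumption, since the text does not state how the distribution parameters are estimated, and the sample wind speeds are hypothetical values, not data from the study.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel_moments(annual_maxima):
    """Method-of-moments estimates of the Gumbel location and scale."""
    mean = statistics.fmean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    scale = std * math.sqrt(6.0) / math.pi   # variance = (pi * scale)^2 / 6
    location = mean - EULER_GAMMA * scale    # mean = location + gamma * scale
    return location, scale


def gumbel_cdf(x, location, scale):
    """Non-exceedance probability F(x) = exp(-exp(-(x - location) / scale))."""
    return math.exp(-math.exp(-(x - location) / scale))


def return_level(return_period_years, location, scale):
    """Value exceeded on average once every `return_period_years` years."""
    p = 1.0 - 1.0 / return_period_years
    return location - scale * math.log(-math.log(p))


# Hypothetical yearly maximum wind speeds (m/s), for illustration only.
sample_maxima = [14.2, 16.8, 15.1, 18.3, 13.9, 17.4, 16.0, 19.1, 15.7, 16.5]
loc, scale = fit_gumbel_moments(sample_maxima)
u50 = return_level(50, loc, scale)
print(round(loc, 2), round(scale, 2), round(u50, 2))
```

The same functions apply unchanged to yearly maxima of significant wave height.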

2.5. Wave Climate Model

Long-term and extreme wave statistics are generated to obtain the annual and seasonal wave roses, significant wave heights, and periods. Wave transformations are performed to determine the effect of combined refraction and diffraction on ship maneuvering. The sub-model predicts the 6 h significant wave heights and periods at every 0.1-degree horizontal grid for 2000–2024 using the ECMWF database. The wave sub-model gives statistics on extreme waves and the bay’s yearly and seasonal spatial wave distributions.

2.6. Current Climate Model

The three-dimensional modeling of currents caused by density stratification, wind, tide, and storm surges is accomplished. The currents and pollutant transport associated with oil spills that occurred during accidents are predicted and the spatial distribution is obtained using Geographic Information Systems (GIS) and cloud computing modules. The current climate sub-model utilizes its database, which includes the Turkish coasts’ bathymetries and hourly wind data of meteorological stations. The three-dimensional k-ε turbulence model predicts oil transport processes in the bay.

2.7. Pollutant Transport Model

The dissolution, dispersion, emulsification, evaporation, and sedimentation processes of crude oil are simulated to determine the transport of oil spills in the bay. All mechanisms jointly act in oil spread; however, most modeling studies include only the weathering mechanisms that have a strong effect in the first hours, such as evaporation and emulsification. In contrast, this model can simultaneously include all the transport mechanisms. Environmental factors such as temperature, salinity, wind, waves, currents, turbulence, and the specific type of oil influence the oil dispersion and weathering processes. The main parameters influencing spill transport in the hydrodynamic simulations are the spill volume, American Petroleum Institute (API) gravity, viscosity, and density parameters of specific types of crude oil [48].

3. Application of the Hybrid Model

The İzmit Refinery has a capacity of 11.9 million tons, including 2.0 million tons of semi-finished products and 9.9 million tons of crude oil. The consequences of maritime accidents would be severe for İzmit Bay since it is the most vulnerable industrial area, retaining Türkiye’s major factories, refineries, industries, and ports. İzmit Bay is the most important industrial and transportation hub, with a high population density. Türkiye has focused its exploration efforts on oil and natural gas in the western Black Sea, where its fleet of production rigs is actively working. The Turkish Petroleum Corporation (TPAO) acquired a new Floating Production Storage and Offloading (FPSO) platform. The Black Sea Natural Gas Project of Türkiye will transport oil and gas to İzmit Bay. The Sakarya Gas Field, established offshore of the northwestern Zonguldak province, was discovered in August 2020, and, since then, production has reached 4.5 million cubic meters per day.

Three main types of crude oil, namely, Russian, Iraqi, and Kazakhstani, are commonly handled in the bay. The hybrid risk assessment HMRA model is implemented in İzmit Bay, enabling the risk assessment of such huge projects. The spill simulations are based on an accident case in the outer basin where a tanker transported Kazakhstani crude oil to the refinery.

The İzmit Vessel Traffic Services (VTS) system has been operational since June 2016, and its Automatic Identification System (AIS) was established under the Maritime Traffic Management framework. The system is mandatory for vessels over 20 m in length or those carrying dangerous cargo, categorized as active vessels, while smaller local traffic vessels are designated as passive participants. The İzmit VTS provides information, navigational assistance, and traffic organization services required for managing vessel movements according to regulatory frameworks. The VTS area is divided into three operational sectors—Yalova, Hereke, and Körfez—each with individual VHF communication channels. To ensure full-time supervision, the center incorporates Radar, AIS, Radio Direction Finder, CCTV, and VHF Radio Position Reporting.

The vessel traffic density map is presented in Figure 8 [32,49]. The vessel traffic patterns within İzmit Bay can be observed from the density map. The east–west directional movements correspond to vessels arriving at the VTS area from the Marmara Sea, proceeding towards industrial and commercial hubs. These movements belong to the cargo ships, tankers, and bulk carriers, highlighting the significance of the maritime trade corridor of İzmit Bay. The map’s north–south direction density indicates the local traffic with short-distance crossings taken by service vessels, tugboats, pilot boats, and smaller commercial vessels that operate between the anchorage and industrial zones. Such localized traffic patterns indicate the dynamic nature of maritime operations within the bay, involving frequent maneuvering and docking activities. The density map is a tool for maritime authorities to monitor vessel traffic and enhance navigational safety by identifying high-risk areas with heavy traffic concentrations and potential congestion points.

The İzmit Vessel Traffic Services Center covers 36 nautical miles along the east–west axis and has a liability area of approximately 200 square nautical miles. Its area of responsibility includes three regional port authorities (Kocaeli, Yalova, and Tuzla Regional Port Authorities) and nine anchorage areas (Tuzla 1, Tuzla 2, Yalova 1, Yalova 2, Eskihisar, Hereke, Hereke Barge, İzmit Inner Bay, and İzmit anchorage areas). A total of 269 coastal facilities, comprising 40 ports, 59 shipyards, 21 passenger and 5 ferry piers, 113 boat manufacturing locations, 22 fishing harbors, 3 marinas, 1 maritime school pier, and 5 military facilities, together with the Osmangazi Bridge, are located within this area.

3.1. Wind Climate

The HYDROTAM-3D model [40] utilizes hourly wind data from meteorological stations throughout Türkiye, encompassing the period from their establishment to the present. In addition, it combines six-hourly wind data derived from the European Centre for Medium-Range Weather Forecasts (ECMWF) Operational Archive (OA) with a resolution of 0.10°, ECMWF ERA-Interim reanalysis data at a resolution of 0.25°, and the Climate Forecast System Reanalysis (CFSR) from the National Centers for Environmental Prediction (NCEP) with a resolution of 0.50°.

Wind and wave data are obtained from the ECMWF and the İzmit Meteorological Station. Wind predictions from the ECMWF-OA at the nearest offshore grid point (40.7° N, 29.6° E) were analyzed to characterize the wind at the spill location (Figure 9).

The yearly and seasonal wind roses are presented in Figure 9. Winds blowing from the NE and ENE dominate, with secondary directions of SE–SSE. The monthly distribution of mean and extreme winds is illustrated in Figure 10, in which the annual extreme mean wind is circa 7 m/s. During the winter months, prevailing winds originate from the SSE and NE directions, while during the other seasons they mainly blow from the NNE to ENE sectors.

3.2. Wave Climate

Wave climate analyses for İzmit Bay are carried out using the ECMWF Operational Archive (OA) predictions at the coordinates 40.7° N, 29.6° E. Long-term and extreme wave statistics were obtained using the wave climate sub-model for 2000–2024. The annual and seasonal wave roses are presented in Figure 11. WAM is the third-generation wave model developed by the Wave Model Development and Implementation Group [71] and is used worldwide. The dominant wave directions are WSW–NW, indicating that waves travel from the Marmara Sea towards İzmit Bay. Waves in the inner bay are observed from the NE sector during summer and spring. The extreme-value Fisher–Tippett Type I (Gumbel) probability distribution F(H_s) is determined from the ECMWF Operational Archive (OA) data at the coordinates 40.7° N, 29.6° E for 2000–2024 and is illustrated in Figure 12.

3.3. Current Climate

The current roses are determined at the accident coordinate of 40.7546° N–29.5443° E based on the hydrodynamic simulations of HYDROTAM-3D, as shown in Figure 13. Bathymetric data for İzmit Bay were obtained from the Department of Navigation and Hydrography at a scale of 1:50,000 (Mercator ED 50 291, 41°).

The study area extends from 40°48.733′ N, 29°15.240′ E to 40°39.6426′ N, 29°56.7679′ E and was discretized into a 120 × 36 grid, with each grid cell measuring 486.65 m by 467 m, covering a total area of 58.398 km by 16.824 km. The circulation within İzmit Bay is determined as turbulent and irregular (Figure 13).

The hydrodynamic model integrates eddy viscosities and diffusivities to characterize turbulent flows, as the k-ε turbulence model computes the vertical eddy viscosity. Wind force is included to generate a time series of current fields, from which seasonal current roses (Figure 13) are obtained to characterize the circulation pattern of the bay. The resulting three-dimensional velocity field then functions as the input for the pollutant transport model, simulating the dispersion process of oil spills within the bay. The dominant surface currents are in the NNE, WSW, and WNW directions, with maximum velocities of 22.2 cm/s, 19.3 cm/s, and 20.1 cm/s, respectively. It has been determined that the prevailing currents in the winter season are in the NNE and NE directions, with maximum current velocities of 20.2 cm/s and 19.6 cm/s, respectively. In other seasons, the directions of the dominant current change. In spring, they are in the SW and NNE directions; in summer, SW and WSW; in autumn, the WSW, WNW, and NNE directions. Maximum current velocities vary between 16.6 cm/s and 11 cm/s. Recent research [47,48] has proven that HYDROTAM-3D successfully simulates real-life scenarios in İzmit Bay.

3.4. The AML and DL Sub-Models

The accident prediction experiments cover multiple AML models, including LightGBM, XGBoost, Random Forest (RF), MLP (Multilayer Perceptron), and Histogram-Based Gradient Boosting (HistGB). Each model was run in baseline and tuned versions to evaluate performance based on the 17 key maritime accident parameters in Figure 14, including maneuvering, environmental parameters, and ship characteristics. The dataset is split into training and test fractions, with an 80–20% split used consistently across all experiments. Cross-validation results and test set evaluations showed that the models were effective in both regression and classification tasks, as measured by metrics such as accuracy, precision, recall, F1-score, balanced accuracy, and log loss. The SHAP (Shapley Additive exPlanations) summary plot of influence, presented in Figure 15 for the RF model, explains the relative influence and directional impact of the 17 maritime parameters on predictive outcomes, combining interpretability with algorithmic precision. Interpretability focuses on understanding the algorithm of an AML model, while explainability aims to provide reasons for the model’s outputs. This method bridges the gap in black-box models by identifying critical factors in predictions [72,73].

The SHAP plot of influence is illustrated in Figure 16 for the baseline LightGBM with 17 parameters. Maneuver difficulty emerges as the primary factor, showing a high positive SHAP value peaking near 0.2, emphasizing its role in risk predictions. Following closely, the average current velocity (cm/s) and average turn angle of the vessel display moderate-to-high positive contributions, while environmental variables such as monthly total precipitation (mm) and hourly wind direction demonstrate less of an effect in İzmit Bay, based on their SHAP values. Vessel-specific attributes such as tonnage (GRT) and length (LOA) affect maneuvering metrics. The asymmetry in SHAP distributions, ranging from 0.3 to 0.2, reveals a strong interaction between maneuvering and environmental factors and highlights the importance of dynamic navigational challenges over vessel characteristics. This visualization validates the model’s alignment with maritime domain expertise and strengthens its interpretability, offering stakeholders a rational framework for identifying the multivariate drivers of maritime risk.

The SHAP plot of influence illustrated in Figure 16 for the baseline LightGBM with 17 parameters explains the directional influence of the input parameters on predictive outcomes. Such a pattern aligns with LightGBM’s tendency to highlight splits that minimize prediction errors through gradient-based optimization, potentially attenuating the magnitude of positive contributions observed in ensemble methods such as RF. LightGBM shows smaller positive contributions from maritime parameters because LightGBM’s optimization method focuses on minimizing errors collectively [74], distributing impact evenly across features. The AML baseline models were trained on a dataset for all 17 parameters. Upon comparing the true versus predicted plots across all models in Figure 17, we found that XGBoost delivers the best performance in accident probability predictions. The scatter plot of XGBoost demonstrates a closer alignment of the predicted values with the ideal fit, indicating higher accuracy and lower error variance. Most data points are tightly clustered around the diagonal, suggesting that XGBoost effectively captures the underlying patterns in the dataset with minimal deviations.

Additionally, fewer extreme outliers are observed compared to other models, reinforcing its ability to generalize well across the test data. This performance can be attributed to XGBoost’s advanced gradient-boosting technique, which iteratively minimizes residual errors, leveraging regularization to avoid overfitting and optimizing feature contributions. The Random Forest and LightGBM show successful performances, with data points following the ideal prediction line well with small scatter patterns.

The Histogram-Based Gradient Boosting (HistGB) model shows deviations from the diagonal, particularly in lower values, implying underfitting where the model may have modest difficulties capturing certain patterns.

The Multilayer Perceptron (MLP) model exhibits higher variability, with data points scattered away from the ideal fit, suggesting challenges in learning feature interactions with the given parameter set. In the confusion matrices for all baseline models in Figure 18, the LightGBM and RF models achieve high predictive accuracy with the 17 key parameters.

Both models correctly classified 12 out of 14 negative instances (true negatives) and all 20 positive events (true positives), achieving the highest precision and recall among all models; this indicates their strong ability to distinguish accidents with minimal misclassification errors. The XGBoost model performed well, with 11 correct negative classifications and 3 false positives, suggesting a decrease in specificity. However, this model achieved good recall by accurately predicting all the positive cases. The HistGB model is also balanced for classifying accidents.
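
The counts quoted above for LightGBM and RF (12 of 14 negatives and all 20 positives classified correctly) can be converted into the standard classification metrics. The sketch below uses the textbook definitions; the function name `classification_metrics` is hypothetical and the code is not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)     # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
    }


# 14 negatives: 12 true negatives, 2 false positives; 20 positives: all detected.
metrics = classification_metrics(tp=20, fp=2, tn=12, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.941, precision: 0.909, recall: 1.000, f1: 0.952, balanced_accuracy: 0.929
```

The resulting accuracy of 0.941 and F1-score of 0.952 agree with the values reported for LightGBM later in this section.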

On the other hand, the MLP (Multilayer Perceptron) model exhibited the weakest performance, with five false positives, misclassifying more than a third of the 14 negative cases. Although it maintained its recall by correctly predicting the accident cases, the high number of false positives reduces its reliability in real applications, where false predictions can be costly in terms of the emergency response. Based on the confusion matrices, LightGBM, RF, and XGBoost emerge as balanced models, offering the best trade-off between precision and recall, making them suitable choices for accident probability predictions in maritime safety applications.

The analysis of these results shows that the tree-based models tend to make better predictions, with XGBoost generating the highest test R² values and lowest MSE in both regression and classification. The calibrated LightGBM and RF models also show compelling performance due to hyperparameter optimization, such as modifying the number of leaves, learning rates, and number of trees. The Multilayer Perceptron (MLP) models demonstrate comparatively reduced regression performance but sustain reasonable classification precision. The prediction strategy behind the artificial intelligence models highlights the importance of environmental and operational causes of accidents, with the tuned models achieving higher accuracy than the baseline models. These results underline the importance of hyperparameter tuning and cross-validation in developing reliable maritime risk assessment models.

The four AML models of RF, XGBoost, LightGBM, and HistGB were trained using 17 key factors. Among these, RF performed best for numerical predictions by regression, with an R² of 0.907, while LightGBM excelled at classifying outcomes, reaching 94.1% accuracy and a high F1-score of 0.952. All models showed strong consistency, as their performance during training and cross-validation closely matched the testing results. XGBoost’s test error of MSE = 0.0065 was lower than its training-phase average of 0.0108, showing good adaptation to new data.

The sensitivity analysis was carried out for the maneuvering parameters, environmental parameters, and ship characteristics, and the models were calibrated using a dual strategy of baseline configurations and hyperparameter optimization, as given in Appendix A (Table A1 and Table A2), where the results are ordered according to the accuracy of the evaluation metrics. While the comprehensive risk assessment incorporated all 17 parameters identified in the preliminary studies, practical constraints, including the scarcity of high-quality datasets and the inherent difficulty in ensuring the correctness of all parameters, necessitated a focused approach. For this reason, the hydrodynamic model [41] provided validated inputs for dynamic variables such as “average current velocity”, whereas structural metrics such as “LOA” were sourced from standardized maritime registries. Hence, five core parameters were selected from this sensitivity study based on their operational relevance, computational tractability, and consistency with established maritime frameworks, considering the accuracy ranking of the dual prioritized parameters.

Table A2 in Appendix A provides only a sample of the dual prioritization study in which maneuver difficulty, average current velocity, length overall ship (LOA), direction of daily maximum wind, and monthly foggy days were highlighted as the five key input parameters according to the accuracy of evaluation metrics calculated for each AML simulation.

The hybrid model also utilizes DL models, and a comparison of these applications based on the training history, RMSE, F1-score, and balanced accuracy is illustrated in Figure 19 for five input parameters tuned according to the accuracy of evaluation metrics.

These figures highlight MLP as the most effective DL model because it achieved the lowest RMSE, indicating its ability to minimize prediction errors (Figure 19). Moreover, it attained the highest balanced accuracy and F1-score, suggesting a better balance between precision and recall than the other models.

The TensorFlow MLP model demonstrated the lowest performance, underperforming in all evaluation metrics and showing the highest RMSE and the lowest F1-score and balanced accuracy, which suggests poor generalization to test data (Figure 19). Regarding training times, CNN was significantly slower than all other models, with high variance, making it less computationally efficient despite its moderate predictive performance in terms of the F1-score and balanced accuracy and its slightly higher RMSE. MLP, which performed best in the accuracy metrics, also showed the shortest training time, making it the most efficient option for accident prediction tasks. Hence, these evaluation metrics identified the Multi-Layer Perceptron (MLP) as the best DL model. Even so, the MLP DL model demonstrated only a moderate prediction capability, as seen from the scatter plot and confusion matrix provided in Figure 17 and Figure 18; the true versus predicted scatter plot shows a fair alignment along the diagonal, indicating that the best DL model captures only general trends.

The MLP DL model correctly predicts 13 out of 14 non-accident cases (true negatives) in the confusion matrix but misclassifies 7 out of 20 accident cases as non-accidents (false negatives). These misclassifications suggest that the DL model is more conservative in predicting accidents, potentially leading to an underestimation of risk.

The TensorFlow CNN and LSTM models had longer training times, and their lower predictive performance relative to MLP makes them less favorable. The poor performance of the DL models is evident in the scatter plots, where several deviating points suggest underfitting in specific cases, and in the high false negative rates, which indicate that additional hyperparameter optimization is needed. Although the MLP DL model performed reasonably well in differentiating between accident and non-accident cases, its limitations in precise probability estimation and accident classification reduce its reliability for high-risk maritime safety applications. MLP is the best DL model for accident prediction, but its performance is worse than that of the AML models.

3.5. The MCS Sub-Model

The MCS sub-model was used to obtain the ship accident probabilities and consequences through the new risk assessment approach by identifying the key factors of a spill. The methodology of the sub-model is illustrated in Figure 20. The novel aspect of this study is the integration of the MCS with the three-dimensional hydrodynamic numerical model, HYDROTAM-3D. The analysis consists of five model steps: (1) classifying spill data from marine accidents using the database of the General Directorate of Maritime Trade and the marine accident data for leakage per ship based on the type of accident using the Maritime Accidents Database of the General Directorate of Shipping; (2) obtaining ship characteristics based on accident type; (3) determining the amount of leakage based on the type of accident; (4) estimating the amount of leakage per ship according to the type of accident; and (5) calculating the total amount of leakage per accident type, as shown in Figure 20. Then, the probable spill volume is simulated using the Monte Carlo Simulation sub-model. Hence, the probabilities of accidents resulting from contact/collisions, considering hydrographic and meteorological conditions, were evaluated by integrating the hydrodynamic model with the MCS.

The MCS sub-model substantiated the model’s predictive robustness through ten million random simulations for İzmit Bay. The parameter distributions reflected the outcomes of real-time scenarios and operational constraints inherent to maritime systems. Probability density functions model the hydrodynamic and meteorological variables, as visualized in Figure 21. The average current velocity for the calm sea state is modeled by a normal distribution with a mean of μ = 2.5 cm/s and a standard deviation of σ = 0.5 cm/s in the range of [1, 4] cm/s. Monthly foggy days are modeled by a normal distribution with μ = 5.0 and σ = 2.0, representing the variability around the central tendency. Circular parameters such as the direction of daily maximum wind, which is modeled by a uniform distribution over [0°, 360°], and ordinal metrics such as maneuver difficulty, which is modeled by a uniform distribution across [1.0, 5.0], use uniform distributions to avoid imposing artificial structures on the phenomena. Operational parameters, exemplified by the ship length overall (LOA) with μ = 50 m and σ = 5 m, represent the coaster vessel size distribution of domestic trade in the range of [30, 70] m. Using these simulation ranges, the model produced consistent predictions within the 95% confidence interval, while the computational feasibility of processing 10⁷ samples in 541 s underscores the methodological scalability of the model.
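
The input sampling for the calm sea state can be sketched as follows. This is an illustrative sketch, not the study's code: it assumes that the stated ranges are enforced by rejection sampling, that negative draws of foggy days are clipped to zero (the text gives no range for this variable), and it uses 10,000 draws instead of the 10⁷ reported. The dictionary keys and function names are hypothetical.

```python
import random

N_SAMPLES = 10_000  # illustrative; the study reports 10^7 draws


def truncated_normal(rng, mean, std, low, high):
    """Draw from a normal distribution, rejecting values outside [low, high]."""
    while True:
        value = rng.gauss(mean, std)
        if low <= value <= high:
            return value


def sample_scenario(rng):
    """One random scenario for the calm sea state, using the parameters in the text."""
    return {
        "current_velocity_cm_s": truncated_normal(rng, 2.5, 0.5, 1.0, 4.0),
        # Clipping at zero is an assumption; no range is stated for foggy days.
        "foggy_days_per_month": max(0.0, rng.gauss(5.0, 2.0)),
        "wind_direction_deg": rng.uniform(0.0, 360.0),
        "maneuver_difficulty": rng.uniform(1.0, 5.0),
        "loa_m": truncated_normal(rng, 50.0, 5.0, 30.0, 70.0),
    }


rng = random.Random(42)
scenarios = [sample_scenario(rng) for _ in range(N_SAMPLES)]
mean_velocity = sum(s["current_velocity_cm_s"] for s in scenarios) / N_SAMPLES
print(round(mean_velocity, 2))
```

Each sampled scenario would then be passed to a trained classifier to obtain one accident probability; repeating this over all scenarios gives the simulated probability distribution.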

In the next step, the average current velocity is modeled by a normal distribution for the storm sea state with a mean of μ = 45 cm/s in the range of [40, 50] cm/s to observe the effect of currents on the occurrence of accidents. The LOA with μ = 150 m and σ = 10 m characterizes the standard vessel size distribution of international trade in the range of [100, 200] m. These new probability density functions of the input distributions are visualized in Figure 22.

The pre-trained RF classifier, chosen for its balance between non-linear pattern recognition and operational tractability, generated a probability distribution with the central tendency of µ = 3.026 × 10⁻⁴ and a constrained variability of σ = 0.75 × 10⁻⁵ under the modeled conditions, as shown in Figure 23. A sensitivity analysis via Spearman’s partial rank correlation coefficients for the pre-trained RF exposed the noticeable parameter influence of the average current velocity (ρ = 0.618, p < 0.001). At the same time, monthly foggy days demonstrated a statistically significant association (ρ = 0.113, p < 0.001).
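
For orientation, the sketch below computes an ordinary Spearman rank correlation in plain Python. It is a simplification: the study reports partial rank correlation coefficients, which additionally control for the other inputs, and that adjustment is not reproduced here. The five data pairs are hypothetical values chosen for illustration.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical current velocities (cm/s) and predicted accident probabilities.
velocity = [1.2, 2.0, 2.4, 3.1, 3.8]
probability = [1.0e-4, 1.5e-4, 1.4e-4, 2.2e-4, 3.0e-4]
rho = spearman_rho(velocity, probability)
print(round(rho, 3))  # 0.9
```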

Hence, the sensitivity study quantified that the current velocity and foggy days can increase the probability of accidents with correlations of ρ = 0.618 and ρ = 0.113, respectively, when the pre-trained RF model is considered. The comparative evaluation of AML methods within the MCS framework reveals the behavioral patterns with maritime risk parameters presented in Figure 24. Among all models, LightGBM, RF, and XGBoost emerged as the best performers based on the evaluation metrics.
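To make the rank-based sensitivity measure concrete, the sketch below computes a plain Spearman rank correlation between sampled inputs and a model output. It is a simplification: the study reports partial rank correlation coefficients, which additionally control for the other inputs, and the response used here is a hypothetical toy function, not the trained RF.

```python
import random

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

rng = random.Random(0)
current = [rng.uniform(40, 50) for _ in range(500)]
fog = [rng.gauss(5, 2) for _ in range(500)]
# Toy response (assumption): strong dependence on current, weak dependence on fog
risk = [0.8 * c + 0.2 * f + rng.gauss(0, 2) for c, f in zip(current, fog)]
print(round(spearman(current, risk), 3), round(spearman(fog, risk), 3))
```

In this toy setting the current velocity receives the larger coefficient, which mirrors the ordering reported for the pre-trained RF without reproducing its values.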

XGBoost achieved the highest correlation with wind direction (ρ = 0.438) but maintained low feature relationships across other predictors. LightGBM relied on vessel size (ρ = 0.830), making it an effective model for scenarios where ship size dominates the accident probability. MCS determined the accident probabilities in İzmit Bay by considering factors such as wind, waves, and currents in the hybrid model, and the annual accident probability distribution for the outer section of İzmit Bay is presented in Figure 24. The average accident probability is 5.5 × 10⁻⁴ in the AML-based MCS, with a standard deviation of 2.5 × 10⁻⁵. The annual accident probability ranges between 2.15 × 10⁻⁴ and 7.93 × 10⁻⁴.

An examination of the frequency and magnitude of oil spills (Figure 7) shows that 37 spills of 7 tons or more were recorded between 2020 and 2024, with a cumulative spilled volume of 38,000 tons. In total, 91% of the spilled volume was associated with 10 large spills, while the remaining 9% was distributed across 27 smaller incidents. Hence, the current data indicate an average spill of 3800 tons per tanker. In addition, an outflow spill model [75] is used, assuming side collision damage with a geometric center height of 6 m on a double-hull ship and a reference oil tanker draught of 13.5 m. The vessel’s ballast tanks were assumed to be empty, and the side damage was modeled as a rectangular prism during the spill.

The outflow spill model gives a release rate of 3800 tons per hour that continues throughout the emergency response [75]. It is modeled in MCS by a normal distribution with a mean value of 3800 tons and a variation coefficient of 10%. The estimated emergency response time is six hours until the ship’s tank can be welded to stop the leak.

The spill risk is the product of the annual accident probability and the expected amount of spill per tanker per year when the accident occurs [76]. The spill risk for a tanker can be determined using the MCS sub-model, as in Figure 25. The MCS sub-model assesses an average spill risk of approximately 2.8 tons (20 bbl) per tanker per year for Tier 1 (Figure 25). The estimated spill risk is consistent with the study [77]. The MCS sub-model developed by [46] can offer a practical and reliable approach for assessing accident and spill probabilities.
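The risk definition above (probability times expected consequence) can be illustrated with a short Monte Carlo sketch. It assumes a fixed accident probability equal to the reported average and a normally distributed spill amount with the quoted mean and 10% coefficient of variation; the function name and the clipping at zero are hypothetical. It is not meant to reproduce the 2.8 tons reported above, which comes from the full MCS sub-model.

```python
import random
import statistics

def spill_risk_samples(p_accident, mean_spill_t, cov, n, seed=0):
    """Sample risk = accident probability x spill amount (tons per tanker per year)."""
    rng = random.Random(seed)
    sigma = cov * mean_spill_t  # standard deviation from the coefficient of variation
    return [
        p_accident * max(0.0, rng.gauss(mean_spill_t, sigma))  # clip at zero: assumption
        for _ in range(n)
    ]

# Inputs quoted in the text; holding the probability fixed is a simplifying assumption
risk = spill_risk_samples(p_accident=5.5e-4, mean_spill_t=3800.0, cov=0.10, n=100_000)
print(f"mean spill risk: {statistics.fmean(risk):.2f} t per tanker per year")
print(f"std of spill risk: {statistics.stdev(risk):.3f} t per tanker per year")
```

Sampling the accident probability from its own distribution, instead of fixing it, would widen the resulting risk distribution.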

4. Hydrodynamic Transport Spill Simulation

The HYDROTAM-3D environmental simulation model was verified by extensive experimental and analytical comparison studies published since 2000 [78]. The effectiveness of this model has been confirmed by projects and applications along the Turkish coastline, providing evidence for its successful implementation. The model’s Geographic Information System (GIS) was employed to detect hazard zones in İzmit Bay, ranking high-risk areas for oil spills from tanker accidents. The vulnerable black spot location of high collision risk, identified from the AIS data of the vessel traffic density map, is shown in Figure 26. The HYDROTAM-3D model was employed to simulate wind, waves, current, and weather conditions, while MCS simulated accidents.

Russian, Iraqi, and Kazakhstani oils were the most frequently handled/imported crude oil types in 2024 [79]. From January to December 2024, Türkiye imported 2,945,452 tons of crude oil from Kazakhstan.

The American Petroleum Institute (API) gravity metric measures how heavy or light a petroleum liquid is compared to water. The hydrodynamic model was calibrated using the studies of [48,80]. MCS obtains the expected spill scenario for Tier 1 as 2.8 tons of Kazakhstan light crude oil of API 42 [81]. The oil discharge continued for six hours, during which emergency response measures were implemented. The ship’s tank was ultimately welded to seal the leak and terminate the release. This simulation scenario investigated the oil transport and degradation mechanisms in İzmit Bay, as outlined in Table 1.
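For reference, API gravity relates to specific gravity (SG, at 60 °F) through the standard definition API = 141.5/SG − 131.5. The short sketch below inverts this relation for the API 42 crude of the scenario; the water density constant is an approximate value introduced here for illustration.

```python
def api_to_specific_gravity(api_degrees):
    """Invert the standard definition API = 141.5 / SG - 131.5 (SG at 60 °F)."""
    return 141.5 / (api_degrees + 131.5)

WATER_DENSITY_G_CM3 = 0.999  # approximate density of fresh water at 60 °F (assumption)

sg = api_to_specific_gravity(42.0)
density = sg * WATER_DENSITY_G_CM3
print(f"API 42 -> SG {sg:.3f}, density about {density:.3f} g/cm^3")
```

An API gravity above 10 corresponds to a liquid lighter than water, which is why the fresh light crude floats and is carried by the upper-layer currents.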

The 3D hydrodynamic transport model simulates the density change and evaporation processes for the scenario shown in Table 1. Upper-layer currents influenced the spill movement since oil has a lower density than seawater. The critical current direction is NE, towards the specially protected Hersek lagoon. Figure 27 illustrates the transport of the spill six hours after the release of 20 bbl (2.8 tons) of Kazakhstan oil began in the outer basin of the bay, driven by currents generated by a moderate breeze in the NE direction (Beaufort scale 4, 7 m/s). The six-hour mark is presented because time is critical for the emergency response of the authorities. In Figure 27, C denotes the concentration of the spilled Kazakhstan light oil six hours into the spill scenario. The variations in evaporation, density, and viscosity parameters six hours after the 20 bbl spill started are given in Table 2.

The simulation study determined the area affected by the spill at the end of the sixth hour. Figure 28 describes the variations in the Kazakhstan oil parameters, which were affected by a moderate breeze in the NE direction with a mean extreme wind speed of 7 m/s. The change in oil parameters after six hours is given as follows:

The evaporated fraction of the spill is 19.6%,
The density increases to 0.863 g/cm³,
The viscosity increases to 144 cSt,
The spatial extent of the oil spill is 2 km.

As a result, the annual probability distribution of contact/collision accidents was determined through MCS using the hybrid model, together with the probable spill volume per vessel. The novel, extensive risk model introduced here represents the first step towards identifying and instituting the requisite preventive and protective measures against pollution from conceivable maritime accidents within İzmit Bay. The possible spill location in the bay is illustrated in Figure 26, which shows the black spot coordinates of collision risk.

5. Discussion of Results

The hybrid model predicts accident probabilities based on environmental and meteorological variables such as monthly foggy days, sea conditions (Beaufort wave scale), daily total precipitation, hourly wind speed, and average current velocity, ensuring the comprehensive coverage of critical risk factors. The hybrid model compares AML and DL model performance utilizing multiple AML and DL frameworks, including LightGBM, XGBoost, Random Forest, and Multilayer Perceptron (MLP). These models were chosen because of their distinct approaches to pattern recognition and computational efficiency. LightGBM and XGBoost are decision-tree-based learning methods that implement gradient boosting, while RF constructs multiple independent decision trees. The DL MLP model processes data through interconnected layers of neurons, enabling it to capture complex nonlinear relationships.

The comparison of AML models based on the mean squared error (MSE) metric, as shown in Figure 29a, highlights significant variations across different models and feature sets. XGBoost and RF exhibit the lowest MSE values when utilizing all features, demonstrating their robust predictive capabilities when leveraging comprehensive data. In contrast, the MLP model shows a significantly higher MSE, suggesting a weaker generalization ability under the same conditions. When transitioning to pairwise and single-feature-based analyses, the MSE increases across all models, indicating that reducing the feature set decreases predictive accuracy.

XGBoost maintains relatively low MSE values in the pair and single-feature conditions compared to other models, reinforcing its efficiency in handling feature-reduced datasets. LightGBM and HistGBM also demonstrate competitive performance, with MSE values remaining under 0.071 in the reduced feature conditions; this suggests that tree-based models perform consistently well across different feature set conditions. In contrast, MLP exhibits greater sensitivity to feature selection.

The accuracy-based evaluation in Figure 29b underscores the distinction among these models. RF achieves the highest accuracy (0.956) when all features are included, slightly outperforming XGBoost (0.941) and LightGBM (0.941), highlighting its strong classification performance. However, the accuracy of all models significantly declines when moving from all features to pairwise and single-feature-based approaches. For instance, the accuracy of XGBoost drops from 0.941 to 0.671 (pair) and 0.631 (single), indicating that feature reduction adversely impacts its classification strength. RF’s accuracy decreases from 0.956 to 0.678 and 0.638 under the same conditions, suggesting a similar trend. MLP exhibited the highest MSE and the lowest accuracy, especially in pairwise feature selection (0.587), confirming its weak adaptability to reduced feature sets.

HistGBM and LightGBM maintain slightly better accuracy levels, reinforcing their stability under constrained feature conditions. Overall, tree-based ensemble models, particularly RF and XGBoost, outperform MLP in both MSE and accuracy, demonstrating moderate generalization across different feature selection scenarios. The model choice hinges on operational priorities, such as accuracy, interpretability, and computational efficiency, emphasizing the need for context-driven model selection. The hybrid model streamlines AML and DL by enabling faster model selection and deployment and reducing the dependency on human expertise for hyperparameter tuning and feature selection.

The speed of model estimation and its relative accuracy are the main reasons for selecting AML models. The speed with which 1000 predictions are made is compared to the accuracy of the models in Figure 30. The comparison of AML models in terms of prediction times and accuracy showed distinct performances among their algorithms. The fastest model is the baseline MLP, achieving the lowest prediction time of approximately 0.8 ms; however, this speed comes at the cost of low accuracy. The tuned versions of the models generally showed a marked improvement in accuracy without major degradation in computational time, indicating the effectiveness of hyperparameter optimization. The tuned LightGBM model has balanced speed and accuracy, with a prediction time of around 2.5 ms and an accuracy of 94%. Similarly, the tuned XGBoost model offers high accuracy at a moderate computational cost of approximately 1.8 ms, making it competitive for scenarios requiring a trade-off between speed and accuracy.

When focusing on maximizing predictive accuracy, the tuned RF model emerges as the top performer, achieving the highest test accuracy of nearly 96%, but at the expense of a significantly longer prediction time of approximately 9.5 ms. This trade-off suggests that, while RF can be more accurate, its computational framework may not be suitable for resource-constrained applications. On the other hand, the tuned HistGBM model offered a good compromise, providing about 90% accuracy with moderate computational demands of circa 5.6 ms.

The computational experiments were conducted on an Alienware X15 R2 system with a 12th Gen Intel® Core™ i9-12900H processor [Dell Technologies, New Orleans, USA], an NVIDIA GeForce RTX 3070 Ti Laptop GPU, and 32 GB of RAM. Overall, the results indicate that the tuned LightGBM and XGBoost are optimal selections when both fast inference and high accuracy are required. In contrast, the tuned RF is preferable for applications where predictive performance is the key criterion and computational resources are less critical.
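A batch-latency measurement of the kind compared above can be sketched with the standard library alone. The helper name, the use of the median over repeated runs, and the stand-in predictor are all assumptions for illustration; in practice the `predict` method of a trained model would take the place of `toy_predict`.

```python
import statistics
import time

def time_batch_ms(predict, batch, repeats=20):
    """Median wall-clock time in milliseconds for predict(batch) over several repeats."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(batch)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)

def toy_predict(batch):
    """Hypothetical stand-in for a trained classifier's predict method."""
    return [
        1 if 0.02 * current + 0.1 * difficulty > 0.5 else 0
        for current, difficulty in batch
    ]

# 1000 synthetic (current velocity, maneuver difficulty) pairs
batch = [(2.5 + 0.001 * i, 1.0 + (i % 5)) for i in range(1000)]
print(f"1000 predictions: {time_batch_ms(toy_predict, batch):.3f} ms")
```

Reporting the median rather than a single run reduces the effect of operating-system jitter; absolute values depend on the hardware, so only comparisons made on the same machine are meaningful.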

As a result, the presented framework cross-examined feature subset efficacy across three hierarchical dimensions: individual variable significance, pairwise interactions, and multivariate combinatorial effects. These models were employed to predict accident probabilities based on environmental, vessel-, and maneuvering-related parameters. LightGBM and XGBoost showed high performance, with their gradient-boosting structures optimized for speed and accuracy. RF showed good generalization abilities, and MLP captured relationships within the data. The proposed hybrid risk model combined machine and deep learning algorithms with the three-dimensional hydrodynamic model HYDROTAM-3D, which simulated environmental conditions such as wind, waves, and currents to generate probability distributions of accident scenarios in MCS.

This integration permits a more complete and dynamic risk assessment of accidents by quantifying the influence of environmental factors using simulation-based approaches. The MCS strengthens the model’s ability to assess uncertainty and variability in maritime safety, advancing data-driven risk mitigation policies.

6. Conclusions

The growth in the number and capacity of tankers carrying dangerous cargo across the Marmara Sea and Turkish Straits poses a considerable risk to maritime safety in İzmit Bay. The new hybrid model is a comprehensive model that can incorporate environmental risks into Marine Spatial Planning (MSP). The risk analyses showed that the model is a valuable tool for assessing risks associated with onshore oil transportation and can be used to improve MSP efforts to ensure that oil transport operates sustainably to protect the environment.

Hence, the new model can identify and mitigate environmental risks in oil and gas transportation. The proposed model combines the Three-Dimensional Hydrodynamic Transport and Water Quality model HYDROTAM-3D with the MCS risk assessment model to identify the stochastic nature of environmental parameters causing accidents, including the effects of waves, currents, and winds. The model can improve maritime safety through its data-driven risk evaluation, aiding planners and decision makers in risk mitigation policies. The AML and DL models can increase the accuracy of accident predictions. This study shows the importance of integrating environmental and ship factors in the models and offers a complete perspective compared to previous studies.

The comparative analysis of various AML and DL models provides an up-to-date overview of maritime safety assessments. Artificial intelligence recognizes the key factors affecting maritime accidents, highlighting the potential of AML and DL models in decision making. Hence, integrating AML and DL models with the MCS and hydrodynamic simulations provides policymakers with the necessary tools. This research bridges the gap between traditional statistical methods and modern machine learning approaches in maritime risk analysis by demonstrating how automated approaches can improve model accuracy and efficiency while reducing human intervention. The study reveals that critical weather variables significantly impact maritime accident risk predictions. Maneuver difficulty and the average current quantify dynamic navigational challenges grounded in spatially resolved hydrodynamic simulations. LOA and monthly foggy days represent ship and environmental characteristics, the former being a ship registry parameter and the latter derived from long-term meteorological analyses by HYDROTAM-3D.

The direction of daily maximum wind was utilized due to its verified impact on the vessel’s docking maneuver, where transient wind forces in storms dominate risk scenarios. By leveraging these five parameters, the models balanced practicality and predictive accuracy. This streamlined approach mitigated the overfitting risks associated with excessive feature complexity, which would have arisen from including 17 parameters despite uncertainties in their availability or precision.

Since the detailed accident sub-dataset used for training and testing was limited, the further hyperparameter optimization of DL models would lead to overfitting. Although DL models can be satisfactorily applied to accident prediction, they require large, detailed datasets for predictions due to their complex learning nature. DL models require excessive training and learning times [82]. Hence, AML models were selected for MCS.

As a result, DL techniques, though promising, are still evolving to match the interpretability and efficiency of GBDTs in processing tabular data, as observed in maritime accident studies. The spill probability was determined by MCS using environmental factors such as wind, waves, and currents obtained from HYDROTAM-3D for İzmit Bay. The average accident probability is 5.5 × 10⁻⁴ in the AML-based MCS, with a standard deviation of 2.5 × 10⁻⁵. The annual accident probability ranges between 2.15 × 10⁻⁴ and 7.93 × 10⁻⁴. Accident reports and annual vessel-port data from the past five years, compiled by the Undersecretariat of the Maritime Affairs Search and Rescue Department and İzmit Vessel Traffic Services, were utilized to verify the annual accident probabilities. The accident statistics for this period are presented in Table 3, which lists the 37 maritime accidents that occurred in the last five years.

A typical collision accident occurred on 10 March 2020, when a 110 m long, 5586 DWT cargo vessel collided with a 222 m long, 37,938 DWT fully cellular containership en route from Rodaport to Evyap port in İzmit Bay, near Darıca (Figure 31). The cargo vessel, sailing from Evyap to Aliağa, hit the port side of the stern of the containership, which sustained a tear in the hull above the waterline, while the cargo vessel suffered significant bow damage [83].

Considering the data from the last five years, the order of magnitude of the accident probabilities in İzmit Bay was consistent with the hybrid model predictions. The pre-trained RF classifier generated a probability distribution with a central tendency of µ = 3.026 × 10⁻⁴ and a variability of σ = 0.75 × 10⁻⁵, consistent with the present situation in the bay. Other AML and DL models predicted accident probabilities in a higher range, revealing an increasing tendency with increasing international ship traffic. The MCS of the risk index then yielded a potential spill of 20 bbl from a collision between two oil tankers (Tier 1).

The hybrid model revealed the possible effects of an oil spill on the vulnerable Hersek lagoon in İzmit Bay. The Tüpraş refinery handles 11.9 million tons of oil and oil products in the bay yearly. The hydrodynamic model evaluated the spill’s expected travel time, trajectory, and probable effects. Under the assessed spill scenario, the mean arrival time on the shore was 274 min after the spill commenced. Determining the expected arrival time can assist in response measures and containment attempts. The model indicated that the spill would drift to the southern coastline, causing a risk to the sensitive Hersek Lagoon, as illustrated in Figure 32.

Crude oil and petroleum products shipped through the Gulf of İzmit account for 16% of Türkiye’s total cargo amount. Therefore, interruptions to refinery operations and port activities during emergency response efforts have economic consequences. As a result, the possible location for an Emergency Response Center in İzmit Bay would be near the southern anchorage of the Osman Gazi Bridge, on the western side of the bridge’s approach in the Hersek region (Figure 32).

The simulation results indicate that the oil spill is likely to drift toward ecologically sensitive areas near the southern embankment of the bridge. Hersek Lagoon, formed by alluvial deposits from the Yalakdere River, is a key geomorphological feature along the southern coastline of outer İzmit Bay. This lagoon covers an area of 1.4 km² with a shallow depth of 0.5 to 0.6 m. It is a critical breeding and feeding habitat for 232 bird species, making it an internationally important wetland (Photograph 1). Flamingos arriving at the lagoon in the fall spend the winter in the area before migrating to breed. During the winter months, the number of flamingos feeding and resting in the area can reach up to 1500 individuals. The greater flamingo (Phoenicopterus roseus), one of the six flamingo species in the world, is among the 232 bird species observed in the Hersek Lagoon, as shown in Photograph 1 [84].

Hersek Lagoon is protected under the Ministry of Agriculture and Forestry General Directorate of Nature Conservation and National Parks, emphasizing its ecological significance. Its bird species, including waterfowl and flamingos, utilize the lagoon for shelter, feeding, and breeding. The area also lies on a major migratory passage for hundreds of thousands of birds traveling between Europe and Africa during spring and autumn [85]. The interface between the sea, land, and lagoon provides the breeding area for species including the sandwich tern, Mediterranean gull, Eurasian oystercatcher, little ringed plover, Caspian gull, and greater flamingo [86]. However, it is under pressure due to the deterioration of the natural water regime and increasing pollution [87].

In emergency response plans, it is fundamental to assess the weathering process of oil to monitor the success of spill contingency actions. The model results are important in İzmit Bay, where the three commonly handled crude oils represent a substantial part of the bulk volume processed. The outcomes of this study provide critical inferences for marine spatial planners, policymakers, and researchers contributing to oil spill management and environmental protection actions. The hybrid model can assist in the development of effective mitigation strategies to minimize the ecological impacts of maritime accidents.

The proposed location may offer immediate access to both the bridge and the surrounding ports and shipyards, making it a suitable position for emergency response operations in the Gulf of İzmit. This location has been proposed considering only the principle of effective and quick response as the center of operations, and it is consistent with the study of [88], in which the AHP-TOPSIS method was utilized to find the optimum location.

In summary, while the scenarios presented have a low probability of occurrence, the potential consequences are severe enough to warrant careful planning and consideration, especially considering the expected changes in shipping traffic volume and patterns. Establishing the Emergency Response Center in the outer bay is crucial for quickly and effectively responding to potential hazards. Hence, the proposed emergency response center is recommended to plan for team formation and readiness within 15 min of an incident.

7. Model Limitations

The complexity of the Hybrid Maritime Risk Assessment (HMRA) model, particularly with advanced neural networks and ensemble techniques, can hinder interpretability. Incorporating SHAP (SHapley Additive exPlanations) plots in the model’s predictions to provide insights into the contributions of different features, such as maneuvering difficulty and current velocity, improved the transparency of the model.
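The idea behind Shapley-based attributions can be shown with an exact computation on a toy model. The sketch below enumerates all feature orderings and replaces baseline values with the instance's values one feature at a time; the scoring function, the baseline, and the feature values are hypothetical, and the SHAP library itself relies on faster model-specific approximations rather than this brute-force enumeration.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal contributions over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        z = list(baseline)          # start from the reference input
        prev = model(z)
        for i in order:
            z[i] = x[i]             # switch feature i to its actual value
            cur = model(z)
            phi[i] += cur - prev    # marginal contribution of feature i
            prev = cur
    return [p / len(orders) for p in phi]

def toy_risk(z):
    """Hypothetical risk score with an interaction term (not the fitted HMRA model)."""
    difficulty, current, fog = z
    return 0.3 * difficulty + 0.01 * current + 0.02 * difficulty * fog

x = [4.0, 45.0, 5.0]         # instance to explain (illustrative values)
baseline = [1.0, 2.5, 0.0]   # reference input (assumption)
phi = shapley_values(toy_risk, x, baseline)
print([round(p, 3) for p in phi])
```

The attributions sum to the difference between the model output at the instance and at the baseline, which is the additivity property that makes such plots readable.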

The model’s reliance on localized data may limit its generalizability to other regions. When applying the model to different geographic areas, there is a need for calibration, considering local variations in environmental conditions, vessel traffic, and accident types. The steps required to adapt the Hybrid Maritime Risk Assessment (HMRA) model for use in other regions while ensuring its predictive accuracy can be summarized as:

i. Data Collection and Preparation

- Environmental Data: HYDROTAM-3D is calibrated using local environmental data, including meteorological conditions (wind, waves, currents).
- Vessel Traffic Data: Region-specific data on vessel types, traffic patterns, and operational behaviors, such as frequency, routes, and maneuvers, are obtained from maritime authorities.
- Accident Data: Accident records, including the types, causes, and outcomes of incidents, such as collisions, groundings, and oil spills, are acquired from maritime authorities.

ii. Calibration of the Hydrodynamic Model

The HYDROTAM-3D hydrodynamic model is calibrated with the local environmental data, including bathymetry, wind, wave, and current measurements reflecting the coastal features.

iii. Modeling Local Accident Types and Risk Factors

- Accident Categories: Accident types for the new region based on local maritime traffic and historical incident data are identified and categorized.
- Maneuvering and Vessel Behavior: Vessel-specific features, such as GT, LOA, and maneuvering difficulty, are determined using accident records to reflect regional ship types and behaviors.

iv. Calibration of AML and DL models

- Feature Selection and Training: Environmental conditions, vessel characteristics, and operational factors are used to determine the most predictive features for the new region. AML models of LightGBM, XGBoost, Random Forest, and MLP are trained using region-specific accident data.
- Hyperparameter Tuning: Hyperparameter optimization and cross-validation are performed to fine-tune model performance for the target region.

v. Calibration of MCS

- Local Spill Scenarios: MCS simulations of accident scenarios specific to the new region, including spill volumes, trajectories, and environmental impact based on regional maritime operations, are performed.
- Environmental Variability: Regional variability in factors such as wind speed, currents, and sea conditions, which affect oil spill behavior and environmental risk, is incorporated by the sensitivity analyses of MCS.

vi. Performance Evaluation and Validation

The adapted model is validated by comparing predictions to maritime incidents and spill data from the local region. Performance metrics such as RMSE, R², the F1-score, and confusion matrices are used for model validation.
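The validation metrics named above follow directly from a confusion matrix. The sketch below computes them from four counts; the counts are hypothetical, chosen only so that the rounded results coincide with one of the rows reported in Table A2, since the actual confusion matrices and test-set sizes are not given in the text.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
    }

# Hypothetical counts (assumption): 20 true positives, 2 false positives,
# 12 true negatives, 0 false negatives
metrics = classification_metrics(tp=20, fp=2, tn=12, fn=0)
print({name: round(value, 3) for name, value in metrics.items()})

# RMSE is the square root of the MSE, e.g. for a test MSE of 0.00700
print(round(math.sqrt(0.00700), 4))
```

Balanced accuracy averages the recall of both classes, so it is less flattering than plain accuracy when false positives are concentrated in the smaller class, as in this example.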

8. Future Studies

Using an intuitive decision-support interface, such as a GIS-based dashboard, for real-time navigation risk management will improve the model’s practical capabilities. Integrating GIS tools with the HMRA model to visualize risk zones and guide emergency response will enhance the model’s serviceability and provide actionable awareness for maritime authorities.

Integrating the HMRA model with fuzzy-inference-based decision frameworks will improve decision making by combining quantitative predictions with qualitative perceptions. This integration can facilitate decision making to enhance both risk communication and policymaking.

Author Contributions

Data analysis, simulations, AML, and DL models: E.A.B.; writing—original draft preparation: C.E.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data supporting the findings of this study can be obtained from the HYDROTAM-3D model. Data are available from the authors upon reasonable request and with permission of HYDROTAM-3D http://www.hydrotam.com/ (accessed on 3 May 2023) and ECMWF https://www.ecmwf.int/ (accessed on 3 May 2023).

Acknowledgments

The authors thank DLTM Software Technologies at the Gazi University Technopark for the HYDROTAM-3D model and database that supported this study’s findings.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Major accidents that occurred in the İstanbul Port Authority (Board of Transportation Safety, 2025).

Ship Type	Accident Date	Time	Accident Location	Environmental Condition	GRT	LOA (m)	Port of Departure	Port of Arrival	Cargo Information	Crew	Severity
Bulk Cargo	30 December 2024	15:20	Tekirdağ	Wind and sea calm, good visibility	14,909	159.9	Dakar/Senegal	Aden/Yemen	Flour	16	Very Serious
Ferry	2 March 2023	6:05	40°52′39″ N–37°34′04″ E	Wind NE 4 BF, Weather sunny, Wave 0.5–1.25 m, good visibility	495	70	Bandırma-Çelebi Port	Tekirdağ Ceyport	Passenger/Car	6	Serious
Tug	31 August 2022	13:14	Tuzla/İstanbul	Wind N to NE 7–12 knot calm sea, good visibility, weather clear	103	18.7	Özkaradeniz Shipyard	Özkaradeniz Shipyard		4	Very Serious
Bulk Cargo	31 August 2022	13:14	Tuzla/İstanbul	Wind N to NE 7–12 knot calm sea, good visibility, weather clear	5632	108.2	İzmit	Tuzla		12	Very Serious
Petrol Chemical Tanker	28 August 2022	17:00	Bandırma	Calm sea and wind, good weather, good visibility	8391	135.6	Reni/Ukraine	Bandırma	Crude sunflower oil	15	Very Serious
Chemical Tanker	3 March 2022	12:15	Bosphorus South Entrance Area C (40°56′6″ N–28°51′0″ E )/İstanbul	Wind from N to NW 4 to 6 BS, Wave height 1–1.5 m, weather rainy-cloudy, sea calm, good visibility	2788	96	Sisam/Greece	Kulevi/Georgia		13	Very Serious
Agency Boat	3 March 2022	12:15	Bosphorus South Entrance Area C (40°56′6″ N–28°51′0″E)/İstanbul	Wind from N to NW 4 to 6 BS, Wave height 1–1.5 m, weather rainy-cloudy, sea 7 C, good visibility	24.97	14.3	Zeyport/İstanbul	Zeyport/İstanbul		2	Very Serious
Bulk Cargo	8 September 2021	14:10	İzmit Belde Port/İzmit	Wind from E 4 BS, weather partly cloudy, the Sea is at 2 Beaufort scale, good visibility	19,825	175.53	Shanghai China	Poti/Georgia	Roll Sheet/Chipboard/Sheet Metal	23	Very Serious
Container	17 June 2021	15:12	Bosphorus Northern Exit/İstanbul	Breeze north, calm sea, good visibility, weather clear	17,068	180.42	Haydarpaşa	Constanta/Romania	6840 MT containers	17	Very Serious
Fishing Boat	17 June 2021	15:12	Bosphorus Northern Exit/İstanbul	Breeze north, calm sea, good visibility, weather clear	3.09	7.3	Poyrazköy Port	Büyük Liman		3	Very Serious
Bulk Cargo	26 November 2019	19:15	Karabiga	Wind S to SE 3–5 BF, calm sea, Rainy weather, good visibility	784	54.9	Marmara Island	Offshore of Şarköy/Tekirdağ	Concrete block	6	Very Serious
Bulk Cargo	13 March 2019	17:40	Marmara Island	Wind NE 4 BF, Weather cloudy, Wave 2–2.5 m, Good visibility	994	76.15	Marmara Island/Turkey	İzmit/Turkey	1606 MT calcite	7	Very Serious
Bulk Cargo	7 April 2018	15:33	Bosphorus/İstanbul	Wind N to NE 4 BF, moderate sea, good visibility, overcast weather	38,732	225	Kavkaz Russia	Cidde Saudi Arabia		20	Serious
Split Barge	31 January 2018	4:30	Safiport Derince/İzmit	Wind from N 12–19 km/h, Wave height 0.2–0.3m, good visibility	376.31	48.5			Dirt	3	Occupational Accident
Bulk Cargo	10 January 2018	15:30	Derince Port/İzmit	Calm sea and wind, cloudy weather, good visibility	1249	79.3	Elefsis/Greece	İzmit/Turkey		11	Occupational Accident
Ferry	7 December 2017	0:13	Front of Port of Gestaş/Gelibolu	Wind SE direction 1–3 knot, sea calm, good visibility	466	47.66	Çardak/Çanakkale	Gelibolu/Çanakkale	Passenger/Car	5	Very Serious
Boat	7 December 2017	0:13	Front of Gestaş Port/Gelibolu	Wind SE direction 1–3 knot, sea calm, good visibility		4–5m				2	Very Serious
Bulk Cargo	5 December 2017	15:06	Marmara Ereğli/Tekirdağ	Wind from N 1–2 BS, weather clear, calm sea, good visibility	31,538	190	Kocaeli	Cristobal/Panama	33,378 MT steel rebar	20	Very Serious
Bulk Cargo	1 November 2017	3:52	Şile/İstanbul	Wind from W to NW 4 to 5 BS, Wave height 1.5–2.5 m, weather partly cloudy, calm sea, good visibility	1863	78.5	Gemlik	Karadeniz Ereğli	3150 MT tuff	9	Very Serious
LPG TANKER	29 April 2017	16:30	Habaş Platform/Yarımca	Wind East to SE 2–4 BF, Wave height 0.5–1 m, Good Visibility	6529	112.16	Temruk/Russia	Habaş Platform/İzmit	Air LPG mix	24	Very Serious
Agency Boat	29 April 2017	16:30	Habaş Platform/Yarımca	Wind East to SE 2–4 BF, Wave height 0.5–1 m, Good Visibility	10.96	9.5				11	Very Serious
Bulk Cargo	14 May 2017	15:50	Çelebi Bandırma Port/Sea of Marmara	Wind from NE 3 to 4 BS, clear weather, calm sea, good visibility	1998	77	Gemlik	Ambarlı	96 TEU containers	13	Serious

Table A2. Sample of the dual strategy of hyperparameter optimization for baseline configurations.

Model	Hyperparameters	Feature List	CV MSE	CV R²	Test MSE	Test RMSE	Test MAE	Test R²	CV Accuracy	Test Accuracy	Test Precision	Test Recall	Test F1	Test Balanced Accuracy	Test Log Loss
XGBoost	Objective: R² error, Random state: 42, Learning rate: 0.05, Max depth: 6, N estimators: 200	Direction of daily max wind, Maneuver Difficulty	0.00638	0.893	0.00700	0.0837	0.0597	0.907	0.919	0.941	0.909	1.000	0.952	0.929	0.348
XGBoost	Objective: R² error, Random state: 42, Learning rate: 0.05, Max depth: 6, N estimators: 200	Monthly Average Wind, Maneuver Difficulty	0.00645	0.893	0.01612	0.1269	0.0726	0.785	0.896	0.882	0.833	1.000	0.909	0.857	0.364
Random Forest	Random state: 42	Direction of daily max wind, Maneuver Difficulty	0.00678	0.885	0.00696	0.0834	0.0601	0.907	0.926	0.941	0.909	1.000	0.952	0.929	0.347
XGBoost	Objective: R² error, Random state: 42	Sea Condition, Maneuver Difficulty	0.00684	0.888	0.01144	0.1070	0.0656	0.847	0.926	0.971	0.952	1.000	0.976	0.964	0.357
HistGB	Random state: 42, Max iteration: 200, Max depth: 6, Learning rate: 0.05, Min samples leaf: 10	Maneuver Difficulty, Average Turn of Vessel	0.00684	0.886	0.00835	0.0914	0.0579	0.889	0.926	0.941	0.909	1.000	0.952	0.929	0.348
Random Forest	Random state: 42	Maneuver Difficulty	0.00720	0.877	0.00823	0.0907	0.0569	0.890	0.926	0.941	0.909	1.000	0.952	0.929	0.349
Random Forest	Random state: 42	Monthly Average Wind, Average Current	0.00724	0.881	0.01117	0.1057	0.0608	0.851	0.896	0.912	0.870	1.000	0.930	0.893	0.356
Random Forest	Random state: 42	Monthly Foggy Days, Direction of daily max wind, Average Current, Maneuver Difficulty, LOA	0.00762	0.876	0.00678	0.0823	0.0554	0.910	0.911	0.941	0.909	1.000	0.952	0.929	0.342
XGBoost	Objective: R² error, Random state: 42	Monthly Foggy Days, Direction of daily max wind, Average Current, Maneuver Difficulty, LOA	0.00827	0.865	0.00784	0.0885	0.0602	0.895	0.896	0.941	0.909	1.000	0.952	0.929	0.333
Random Forest	Random state: 42, N estimators: 200, Max depth: 10, Min samples split: 5, Min samples leaf: 2	Monthly Foggy Days, Direction of daily max wind, Average Current, Maneuver Difficulty, LOA	0.00925	0.842	0.00726	0.0852	0.0569	0.903	0.896	0.941	0.909	1.000	0.952	0.929	0.347
HistGB	Random state: 42	Monthly Foggy Days, Direction of daily max wind, Average Current, Maneuver Difficulty, LOA	0.00963	0.837	0.01088	0.1043	0.0651	0.855	0.904	0.882	0.833	1.000	0.909	0.857	0.365
HistGB	Random state: 42, Max iteration: 200, Learning rate: 0.05, Max depth: 6, Min samples leaf: 20	Monthly Foggy Days, Direction of daily max wind, Average Current, Maneuver Difficulty, LOA	0.00977	0.835	0.01038	0.1019	0.0642	0.862	0.896	0.912	0.905	0.950	0.927	0.904	0.363
LightGBM	Objective: Regression, Random state: 42, Verbosity: −1	Monthly Foggy Days, Direction of daily max wind, Average Current, Maneuver Difficulty, LOA	0.01215	0.785	0.01017	0.1009	0.0746	0.864	0.889	0.941	0.909	1.000	0.952	0.929	0.384
LightGBM	Objective: Regression, Random state: 42, Verbosity: −1, Num leaves: 31, Learning rate: 0.05, Max depth: 6, Min samples: 20, Subsample: 0.8, Colsample by tree: 0.8	Monthly Foggy Days, Direction of daily max wind, Average Current, Maneuver Difficulty, LOA	0.01580	0.727	0.01499	0.1225	0.0979	0.800	0.881	0.941	0.909	1.000	0.952	0.929	0.413
MLP	Random state: 42, Hidden layer sizes: 50, Max iteration: 500	Monthly Foggy Days, Direction of daily max wind, Average Current, Maneuver Difficulty, LOA	0.03775	0.381	0.04799	0.2191	0.1827	0.360	0.800	0.706	0.679	0.950	0.792	0.654	0.580
MLP	Random state: 42, Hidden layer sizes: 100, Max iteration: 1000, Learning rate: Adaptive, Early stopping: True	Monthly Foggy Days, Direction of daily max wind, Average Current, Maneuver Difficulty, LOA	0.05964	0.033	0.01976	0.1406	0.0890	0.736	0.533	0.765	0.714	1.000	0.833	0.714	0.439
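
The metric columns of Table A2 are linked by two standard identities: Test RMSE is the square root of Test MSE, and Test F1 is the harmonic mean of Test Precision and Test Recall. The short Python sketch below checks both identities on four rows copied from the table. The function names, the row selection, and the rounding tolerance of 10⁻³ are illustrative assumptions, not part of the authors' code.

```python
import math

# (label, test MSE, test RMSE, test precision, test recall, test F1), copied from Table A2
ROWS = [
    ("XGBoost, 2 features", 0.00700, 0.0837, 0.909, 1.000, 0.952),
    ("Random Forest, 5 features", 0.00678, 0.0823, 0.909, 1.000, 0.952),
    ("HistGB tuned, 5 features", 0.01038, 0.1019, 0.905, 0.950, 0.927),
    ("MLP, 5 features", 0.04799, 0.2191, 0.679, 0.950, 0.792),
]

TOLERANCE = 1e-3  # assumed allowance for the rounding used in the table


def rmse_from_mse(mse: float) -> float:
    """Root mean squared error implied by a mean squared error."""
    return math.sqrt(mse)


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 score as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def row_is_consistent(row) -> bool:
    """True when the reported RMSE and F1 agree with the two identities."""
    _, mse, rmse, precision, recall, f1 = row
    rmse_ok = abs(rmse_from_mse(mse) - rmse) < TOLERANCE
    f1_ok = abs(f1_from_precision_recall(precision, recall) - f1) < TOLERANCE
    return rmse_ok and f1_ok


if __name__ == "__main__":
    for row in ROWS:
        print(row[0], "consistent" if row_is_consistent(row) else "inconsistent")
```

The same two helper functions can be applied to any other row of the table by appending it to `ROWS`.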

Table A3. Performance of deep learning models for five parameter inputs based on evaluation metrics.

Model	Training Time (s)	CV MSE (Mean)	CV MSE (std)	CV R² (Mean)	Test MSE	Test RMSE	Test MAE	Test R²	CV Accuracy (Mean)	CV Balanced Acc (Mean)	CV Log Loss (Mean)	Test Accuracy	Test Precision	Test Recall	Test F1	Test Balanced Acc
CNN Tuned	43.9682	0.0320	0.0101	0.4882	0.0559	0.2365	0.1995	0.2537	0.7630	0.7835	0.5258	0.7353	0.7619	0.8000
CNN Baseline	319.3604	0.0382	0.0114	0.3787	0.0388	0.1971	0.1528	0.4816	0.7111	0.6938	0.5830	0.6176	0.6400	0.8000
TF LSTM Tuned	90.7243	0.0646	0.0108	−0.0271	0.0746	0.2731	0.2407	0.0048	0.6000	0.4983	0.6764	0.5882	0.5882	1.0000
TF CNN Tuned	63.8258	0.0719	0.0331	−0.0970	0.0812	0.2850	0.2473	−0.0843	0.5407	0.5706	0.6725	0.6176	0.6522	0.7500
MLP Tuned	17.8127	0.0820	0.0560	−0.4342	0.0447	0.2115	0.1820	0.4032	0.7037	0.7148	0.6027	0.7647	0.9286	0.6500
TF CNN Baseline	54.5599	0.0970	0.0701	−0.4581	0.0863	0.2938	0.2413	−0.1520	0.5333	0.6012	0.7529	0.5882	0.6875	0.5500
TF MLP Baseline	31.0620	0.1010	0.0553	−0.7347	0.0876	0.2960	0.2479	−0.1695	0.5556	0.5365	0.7792	0.4706	1.0000	0.1000
TF MLP Tuned	32.6604	0.1406	0.1089	−1.0670	0.0634	0.2519	0.2294	0.1533	0.4963	0.5387	0.9898	0.5882	0.5882	1.0000
MLP Baseline	4.3379	0.1602	0.1500	−1.9381	0.0555	0.2357	0.2126	0.2587	0.6963	0.6867	1.5766	0.7059	1.0000	0.5000

References

1. Main Search and Rescue Coordination Center. Accident and Events Statistics, 2016–2024; Main Search and Rescue Coordination Center: Ankara, Turkey, 2024. Available online: https://ulasimemniyeti.uab.gov.tr/uploads/pages/istatistikler/deniz-1.pdf (accessed on 17 April 2025).
2. UNCTAD. UNCTAD/RMT/2023 Review of Maritime Transport; United Nations Conference on Trade and Development: New York, NY, USA, 2023; ISBN 978-92-1-358456-9. Available online: https://unctad.org/publication/review-maritime-transport-2023 (accessed on 3 February 2024).
3. OECD. Globalisation, Transport and the Environment; Organisation for Economic Co-operation and Development (OECD): Paris, France, 2010; ISBN 9789264079199.
4. Usluer, H.B.; Alkan, G.; Turan, O. A Ship Maneuvers Could be Predicted in the Turkish Straits by Marine Science Effects? Int. J. Environ. Geoinformatics 2022, 9, 95–101.
5. Yildiz, S.; Uğurlu, Ö.; Wang, J.; Loughney, S. Application of the HFACS-PV approach for identification of human and organizational factors (HOFs) influencing marine accidents. Reliab. Eng. Syst. Saf. 2021, 208, 107395.
6. IPIECA. Oil Spill Preparedness and Response: An Introduction Guidance Document for the Oil and Gas Industry; International Petroleum Industry Environmental Conservation Association: London, UK, 2019; Available online: https://www.ipieca.org/work/marine-spill-preparedness-and-response (accessed on 4 February 2024).
7. IMO. Guide to Maritime Security & The ISPS Code; International Maritime Organization: London, UK, 2021; ISBN 978-9280117400. Available online: https://www.imo.org/en/OurWork/Security/Pages/SOLAS-XI-2%20ISPS%20Code.aspx (accessed on 3 February 2024).
8. Kayıran, B.; Yazır, D.; Aslan, B. Data-driven Bayesian network approach to maritime accidents involved by dry bulk carriers in Turkish search and rescue areas. Reg. Stud. Mar. Sci. 2023, 67, 103193.
9. Arslanoglu, B.; Elidolu, G.; Uyanık, T. Application of Machine Learning Methods for Prediction of Seafarer Safety Perception. Int. J. Marit. Eng. 2023, 164, 269–282.
10. Korçak, M.; Balas, C.E. Reducing the probability for the collision of ships by changing the passage schedule in Istanbul Strait. Int. J. Disaster Risk Reduct. 2020, 48, 101593.
11. Sakar, C.; Toz, A.C.; Buber, M.; Koseoglu, B. Risk Analysis of Grounding Accidents by Mapping a Fault Tree Into a Bayesian Network. Appl. Ocean Res. 2021, 113, 102764.
12. Galieriková, A.; Dávid, A.; Materna, M.; Mako, P. Study of maritime accidents with hazardous substances involved: Comparison of HNS and oil behaviours in marine environment. Transp. Res. Procedia 2021, 55, 1050–1064.
13. Ekici, C.V.; Öztürk, Ü.; Şenol, Y.E. A data-driven Ship Risk Profile model for Turkish Straits (TS-SRP) using Machine Learning. Ocean Eng. 2024, 311, 119002.
14. IMO. Convention on the International Regulations for Preventing Collisions at Sea (COLREG); IMO: London, UK, 2016; Available online: https://www.imo.org/en/About/Conventions/Pages/COLREG.aspx (accessed on 4 February 2024).
15. Saçu, Ş.; Şen, O.; Erdik, T. A stochastic assessment for oil contamination probability: A case study of the Bosphorus. Ocean Eng. 2021, 231, 109064.
16. Dimitrakiev, D.; Milev, D.; Gunes, E. The Risk Analysis of Chemical Tankers Passing Through the Turkish Straits between 2010–2022. Strateg. Policy Sci. Educ.-Strateg. Na Obraz. Nauchnata Polit. 2023, 31, 45–55.
17. Lee, D.; Namgung, H.; Yoo, S.-L. Development of a Risk Assessment System for Navigational Obstacles Considering Collision and Pollution Risks. Appl. Sci. 2025, 15, 2325.
18. Uyanık, T.; Karatuğ, Ç.; Arslanoğlu, Y. Machine learning based visibility estimation to ensure safer navigation in strait of Istanbul. Appl. Ocean Res. 2021, 112, 102693.
19. Rawson, A.; Brito, M.; Sabeur, Z.; Tran-Thanh, L. A machine learning approach for monitoring ship safety in extreme weather events. Saf. Sci. 2021, 141, 105336.
20. Merrick, J.R.W.; Dorsey, C.A.; Wang, B.; Grabowski, M.; Harrald, J.R. Measuring Prediction Accuracy in a Maritime Accident Warning System. Prod. Oper. Manag. 2022, 31, 819–827.
21. Rawson, A.; Brito, M.; Sabeur, Z. Spatial Modeling of Maritime Risk Using Machine Learning. Risk Anal. 2022, 42, 2291–2311.
22. Liu, K.; Yu, Q.; Yuan, Z.; Yang, Z.; Shu, Y. A systematic analysis for maritime accidents causation in Chinese coastal waters using machine learning approaches. Ocean Coast. Manag. 2021, 213, 105859.
23. Feng, Y.; Wang, X.; Chen, Q.; Yang, Z.; Wang, J.; Li, H.; Xia, G.; Liu, Z. Prediction of the severity of marine accidents using improved machine learning. Transp. Res. E Logist. Transp. Rev. 2024, 188, 103647.
24. Brandt, P.; Munim, Z.H.; Chaal, M.; Kang, H.-S. Maritime accident risk prediction integrating weather data using machine learning. Transp. Res. D Transp. Environ. 2024, 136, 104388.
25. Munim, Z.H.; Sørli, M.A.; Kim, H.; Alon, I. Predicting maritime accident risk using Automated Machine Learning. Reliab. Eng. Syst. Saf. 2024, 248, 110148.
26. Madsen, P.M.; Dillon, R.L.; Morris, E.T. Using near misses, artificial intelligence, and machine learning to predict maritime incidents: A U.S. Coast Guard case study. Risk Anal. 2024, 45, 830–845.
27. Yuzui, T.; Kaneko, F. Toward a hybrid approach for the risk analysis of maritime autonomous surface ships: A systematic review. J. Mar. Sci. Technol. 2025, 30, 153–176.
28. Ugurlu, H.; Cicek, I. Analysis and assessment of ship collision accidents using Fault Tree and Multiple Correspondence Analysis. Ocean Eng. 2022, 245, 110514.
29. Adland, R.; Jia, H.; Lode, T.; Skontorp, J. The value of meteorological data in marine risk assessment. Reliab. Eng. Syst. Saf. 2021, 209, 107480.
30. Yang, Y.; Liu, Y.; Li, G.; Zhang, Z.; Liu, Y. Harnessing the power of Machine learning for AIS Data-Driven maritime Research: A comprehensive review. Transp. Res. E Logist. Transp. Rev. 2024, 183, 103426.
31. Korupoju, A.K.; Kapadia, V.; Vilwathilakam, A.S.; Samanta, A. Ship Collision Risk Evaluation using AIS and weather data through fuzzy logic and deep learning. Ocean Eng. 2025, 318, 120116.
32. Dağkıran, B.; Bolat, P. Weighting the factors affecting safety of navigation: A case study for the Gulf of İzmit, Türkiye. Marit. Technol. Res. 2024, 7, 274135.
33. Altalhan, M.; Algarni, A.; Turki-Hadj Alouane, M. Imbalanced Data Problem in Machine Learning: A Review. IEEE Access 2025, 13, 13686–13699.
34. Xiao, Y.; Jin, M.; Qi, G.; Shi, W.; Li, K.X.; Du, X. Interpreting the influential factors in ship detention using a novel random forest algorithm considering dataset imbalance and uncertainty. Eng. Appl. Artif. Intell. 2024, 133, 108369.
35. Painsky, A.; Wornell, G. On the Universality of the Logistic Loss Function. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018; pp. 936–940, ISBN 978-1-5386-4781-3.
36. Tian, Y.; Meng, H.; Yuan, F. FREGNet: Ship Recognition Based on Feature Representation Enhancement and GCN Combiner in Complex Environment. IEEE Trans. Intell. Transp. Syst. 2024, 25, 15641–15653.
37. Kamal, B.; Çakır, E. Data-driven Bayes approach on marine accidents occurring in Istanbul strait. Appl. Ocean Res. 2022, 123, 103180.
38. Wang, M.; Wang, Y.; Ding, J.; Yu, W. Interaction aware and multi-modal distribution for ship trajectory prediction with spatio-temporal crisscross hybrid network. Reliab. Eng. Syst. Saf. 2024, 252, 110463.
39. Bologna, G. A Rule Extraction Technique Applied to Ensembles of Neural Networks, Random Forests, and Gradient-Boosted Trees. Algorithms 2021, 14, 339.
40. Balas, L.; Özhan, E. A Baroclinic Three Dimensional Numerical Model Applied to Coastal Lagoons. Comput. Sci.—ICCS 2003 2003, 2658, 205–212.
41. HYDROTAM-3D. HYDROTAM-3D, Three Dimensional Hydrodynamic Transport and Water Quality Model. 2025. Available online: http://www.hydrotam.com/ (accessed on 3 May 2023).
42. Balas, L.; Inan, A.; Genc Numanoglu, A. Modelling of Dilution of Thermal Discharges in Enclosed Coastal Waters. Res. J. Chem. Env. 2013, 17, 82–89.
43. Genc Numanoğlu, A.; Inan, A.; Yilmaz, N.; Balas, L. Modeling of Erosion at Goksu Coasts. J. Coast. Res. 2013, 65, 2155–2160.
44. ECMWF. IFS Documentation CY41R2—Part III: Dynamics and Numerical Procedures. In IFS Documentation CY41R2; ECMWF: Reading, UK, 2016.
45. Uğurlu, A.; Balas, E.A.; Balas, C.E.; Akbaş, S.O. Unleashing the Potential of a Hybrid 3D Hydrodynamic Monte Carlo Risk Model for Maritime Structures’ Design in the Imminent Climate Change Era. J. Mar. Sci. Eng. 2024, 12, 931.
46. Balas, E.A. A hybrid Monte Carlo simulation risk model for oil exploration projects. Mar. Pollut. Bull. 2023, 194, 115270.
47. Durap, A.; Balas, C.E.; Çokgör, Ş.; Balas, E.A. An Integrated Bayesian Risk Model for Coastal Flow Slides Using 3-D Hydrodynamic Transport and Monte Carlo Simulation. J. Mar. Sci. Eng. 2023, 11, 943.
48. Ülker, D.; Burak, S.; Balas, L.; Çağlar, N. Mathematical modelling of oil spill weathering processes for contingency planning in Izmit Bay. Reg. Stud. Mar. Sci. 2022, 50, 102155.
49. General Directorate of Coastal Safety. Vessel Traffic Service of İzmit Bay. Available online: https://www.kiyiemniyeti.gov.tr/gemi_trafik_bilgi_sistemleri (accessed on 17 December 2024).
50. Board of Transportation Safety. Marine Accident Investigation Reports. Available online: https://ulasimemniyeti.uab.gov.tr/deniz (accessed on 17 February 2025).
51. AASHTO. Guide Specifications and Commentary for Vessel Collision Design of Highway Bridges with Interim Revisions; AASHTO: Washington, DC, USA, 2009; ISBN 978-1-56051-495-4. Available online: https://store.transportation.org/Item/PublicationDetail?ID=1740 (accessed on 3 February 2024).
52. Küçükosmanoğlu, A.; Arlı Küçükosmanoğlu, Ö. Estimation of Vessel Passage Through Bosphorus. Int. J. Eng. Appl. Sci. 2021, 13, 56–70.
53. Küçükosmanoğlu, A. Maritime Accidents Forecast Model for Bosphorus. Ph.D. Thesis, Middle East Technical University, Ankara, Turkey, 2012.
54. Mazen, F.; Mazen, A. Airbus Ship Classification, Detection and Segmentation using Cutting-Edge Deep Learning Techniques. Fayoum Univ. J. Eng. 2025, 8, 68–78.
55. El Mekkaoui, S.; Benabbou, L.; Caron, S.; Berrado, A. Deep Learning-Based Ship Speed Prediction for Intelligent Maritime Traffic Management. J. Mar. Sci. Eng. 2023, 11, 191.
56. Zhang, X.; Lim, H.; Fu, X.; Wang, K.; Xiao, Z.; Qin, Z. Maritime-Context Text Identification for Connecting Artificial Intelligence (AI) Models. In Proceedings of the 2024 IEEE Conference on Artificial Intelligence (CAI), Singapore, 25–27 June 2024; pp. 899–904, ISBN 979-8-3503-5409-6.
57. Li, Y.; Yu, Q.; Yang, Z. Vessel Trajectory Prediction for Enhanced Maritime Navigation Safety: A Novel Hybrid Methodology. J. Mar. Sci. Eng. 2024, 12, 1351.
58. Jiang, X.; Dai, Y.; Li, S.; Ma, R.; Du, T.; Zou, Y.; Zhang, P.; Zhang, Y.; Sun, P. Research on ship speed prediction based on time series imaging and deep convolutional network fusion method. Appl. Ocean Res. 2025, 154, 104384.
59. Turkish Chamber of Shipping. Maritime Sector Report; Turkish Chamber of Shipping: İstanbul, Turkey, 2025; Available online: https://www.denizticaretodasi.org.tr/en/publications/sectorreport (accessed on 11 March 2024).
60. Ministry of Transportation and Infrastructure. Turkish Straits Ship Passage Statistics. Available online: https://denizcilikistatistikleri.uab.gov.tr/turk-bogazlari-gemi-gecis-istatistikleri (accessed on 3 May 2025).
61. European Maritime Safety Agency. Annual Overview of Marine Casualties and Incidents 2024; European Maritime Safety Agency: Lisbon, Portugal, 2024.
62. Australian Maritime Safety Authority. National Plan to Combat Pollution of the Sea by Oil and Other Noxious and Hazardous Substances Annual Reports; Australian Maritime Safety Authority: Canberra, Australia, 2020.
63. Det Norske Veritas. Final Report Assessment of the Risk of Pollution from Marine Oil Spills in Australian Ports and Waters, Model of Offshore Oil Spill Risks, Appendix V—Offshore Oil Spill Risk Models; Det Norske Veritas: Sydney, Australia, 2011.
64. Australian Maritime Safety Authority. Great Barrier Reef Pilotage Fatigue Risk Assessment [Electronic Resource]; Australian Maritime Safety Authority: Canberra, Australia, 2001.
65. ITOPF. Oil Tanker Spill Statistics; International Tanker Owners Pollution Federation (ITOPF): London, UK, 2025; Available online: https://www.itopf.org/knowledge-resources/data-statistics/ (accessed on 4 February 2024).
66. Sackeyfio, N. Oil Tanker Spill Statistics Figures; Creative Commons Attribution License permission granted on 16 April 2025; ITOPF: London, UK, 2025.
67. Balas, E.A.; Genç, A.N.; Balas, C.E. Strategic Adaptation to Climate Change through Monte Carlo-Based Multi-Criteria Decision Model in Marine Spatial Planning. J. Coast. Res. 2024, 113, 169–174.
68. Balas, E.A.; Uğurlu, A.; Balas, C.E. A Hybrid Probabilistic Design Model of Riverine Jetties Incorporating Three-Dimensional Numerical Simulations of Transport Phenomena in the Context of Emerging Climate Change. J. Coast. Res. 2024, 113, 220–224.
69. Buyruk, T.; Balas, E.A.; Genç, A.N.; Balas, L. Exploring Renewable Energy on the Coastline of Türkiye: Wind and Wave Power Potential. J. Coast. Res. 2024, 113, 788–792.
70. Balas, E.A. Numerical Modelling of Microplastics Transport in Mersin Bay. Master’s Thesis, Middle East Technical University, Ankara, Turkey, 2023.
71. WAMDI Group. The WAM Model—A Third Generation Ocean Wave Prediction Model. J. Phys. Ocean. 1988, 18, 1775–1810.
72. Choi, J.E.; Shin, J.W.; Shin, D.W. Vector SHAP Values for Machine Learning Time Series Forecasting. J. Forecast. 2025, 44, 635–645.
73. Shintani, A.; Taniguchi, N.; Nakayama, Y.; Tanaka, T.; Hamada, K. Development of marine accident probability prediction model for pleasure boats using ship accident database in central part of Seto Inland Sea. Ocean Eng. 2025, 322, 120460.
74. Kriuchkova, A.; Toloknova, V.; Drin, S. Predictive model for a product without history using LightGBM. Pricing model for a new product. Mohyla Math. J. 2024, 6, 6–13.
75. Mariano Trainiti, G.; Piscopo, V.; Cianelli, D.; Zambianchi, E. Modelling Oil Spill and Dispersion at Sea from a Double Hull Oil Tanker Following Collision Events. In Proceedings of the 2024 IEEE International Workshop on Metrology for the Sea; Learning to Measure Sea Health Parameters (MetroSea), Portorose, Slovenia, 14–16 October 2024; pp. 181–185, ISBN 979-8-3503-7900-6.
76. Zhu, G.; Xie, Z.; Xu, H.; Wang, N.; Zhang, L.; Mao, N.; Cheng, J. Oil Spill Environmental Risk Assessment and Mapping in Coastal China Using Automatic Identification System (AIS) Data. Sustainability 2022, 14, 5837.
77. Vidmar, P.; Perkovič, M. Update on Risk Criteria for Crude Oil Tanker Fleet. J. Mar. Sci. Eng. 2023, 11, 695.
78. Inan, A.; Balas, L. Numerical Modelling of Oil Spill. In Proceedings of the 5th IASME/WSEAS Int Conf on Water Resources, Hydraulics & Hydrology/Proceedings of the 4th IASME/WSEAS Int Conf on Geology and Seismology 2010, Cambridge, UK, 23–25 February 2010; pp. 62–73.
79. Republic of Turkey Energy Market Regulatory Authority. Oil Market Industry Report of Türkiye; Annual Report; Republic of Turkey Energy Market Regulatory Authority: Ankara, Turkey, 2024. Available online: https://www.epdk.gov.tr/Detay/Icerik/2-14417/2024-yili-subat-ayi-petrol-piyasasi-sektor-raporu (accessed on 26 April 2025).
80. Ülker, D. GIS-Based Modelling for Advection and Diffusion of Oil Pollution in Coastal Waters. Ph.D. Thesis, İstanbul University, İstanbul, Turkey, 2022.
81. Boranbayeva, L.; Boiko, G.; Sharifullin, A.; Lubchenko, N.; Sarmurzina, R.; Kozhamzharova, A.; Mombekov, S. Analysis of the Processes of Paraffin Deposition of Oil from the Kumkol Group of Fields in Kazakhstan. Processes 2024, 12, 1052.
82. Shrestha, A.; Mahmood, A. Review of Deep Learning Algorithms and Architectures. IEEE Access 2019, 7, 53040–53065.
83. Haber Global National Newspaper; İhlas News Agency. Two Dry Cargo Ships Collided in the Gulf of Izmit. Habertakvimi.com. 2020. Available online: https://haberglobal.com.tr/gundem/izmit-korfezi-nde-iki-kuru-yuk-gemisi-carpisti-33455 (accessed on 17 April 2025).
84. Bülbül, F. Unpublished Nature Photographs; Altınordu, Türkiye, 2025.
85. Daily Sabah. Turkish Lagoon Gives Shelter to Flamingos Fleeing Drought. Editorial Daily Sabah. 2022. Available online: https://www.dailysabah.com/turkey/turkish-lagoon-gives-shelter-to-flamingosfleeing-drought/news (accessed on 12 March 2025).
86. Hisli, O.; Balkıs, H.; Mülayim, A. Macrobenthic Invertebrates of the Hersek Lagoon (Marmara Sea, Turkey) under Pollution Pressure. Acta Zool. Bulg. 2022, 74, 445–453.
87. Akay, E.; Dalkıran, N. Assessing biological water quality of Yalakdere stream (Yalova, Turkey) with benthic macroinvertebrate-based metrics. Biologia 2020, 75, 1347–1363.
88. Koseoglu, B.; Buber, M.; Toz, A.C. Optimum site selection for oil spill response center in the Marmara Sea using the AHP-TOPSIS method. Arch. Environ. Prot. 2018, 44, 38–49.

Figure 1. The integrated approach of the Hybrid Risk Assessment Model.

Figure 2. Flowchart of the AML models for accident prediction.

Figure 3. The detailed sub-dataset of maritime accident categories by the İstanbul Port Authority.

Figure 4. Flowchart of the Convolutional Neural Network (CNN) Deep Learning Model.

Figure 5. Flowchart of the Multi-Layer Perceptron Deep Learning (MLPDL) Model.

Figure 6. (a) The distribution of oil spills greater than seven tons from 1970 to 2024; (b) their causes categorized by spill size [65,66].

Figure 7. Distribution of the amount of oil spilled from tanker incidents that occurred between 1970 and 2024 [65,66].

Figure 8. The AIS vessel traffic density map of İzmit Bay [32].

Figure 9. Yearly and seasonal wind (V, m/s) roses of ECMWF-OA 40.7° N–29.6° E for 2000–2024: (Y) annual, (A) winter, (B) spring, (C) summer, and (D) autumn [41].

Figure 10. Monthly distribution of mean and extreme winds.

Figure 11. Yearly and seasonal wave roses of significant wave heights (ECMWF-OA 40.7° N–29.6° E 2000–2024) [41].

Figure 12. Extreme significant wave height cumulative probability F(H_s) distribution of ECMWF OA 40.7° N–29.6° E for 2000–2024 [41].

Figure 13. Current roses at the (a) surface and (b) sea bottom [41].

Figure 14. The effect of 17 maritime factors on accident risks according to the tree-based models (a) HistGB, (b) Random Forest (RF), (c) XGBoost, and (d) LightGBM.

Figure 15. SHAP plot of influence for the baseline RF model with 17 parameters.

Figure 16. SHAP plot of influence for the baseline LightGBM model with 17 parameters.

Figure 17. Comparison of the true vs. predicted plots across baseline AML and DL models: (a) LightGBM, (b) MLP, (c) XGBoost, (d) RF, (e) HistGB, and (f) Deep Learning MLP.

Figure 18. Comparison of the confusion matrices for baseline AML and DL models: (a) LightGBM, (b) MLP, (c) XGBoost, (d) RF, (e) HistGB, and (f) Deep Learning MLP.

Figure 19. Deep learning model comparisons: (a) the training history, (b) error distribution (RMSE), (c) F1-score, and (d) balanced accuracy.

Figure 20. MCS spill assessment sub-model.

Figure 21. The key input parameters in MCS for the coaster vessels and calm sea state.

Figure 22. The key input parameters in the sensitivity study of MCS for international trade and storm sea state.

Figure 23. The pre-trained Random Forest AML within the MCS framework, which generated the accident probability density function and corresponding box plot.

Figure 24. The annual accident probabilities in İzmit Bay according to the MCS of AML and corresponding box plots.
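
The workflow summarized in the captions of Figures 23 and 24, in which a pre-trained model is evaluated repeatedly on randomly sampled inputs to build an accident probability distribution and its box plot, can be illustrated with a minimal Python sketch. Everything numerical here is a hypothetical placeholder: the function `toy_accident_probability` merely stands in for a trained regressor, and the input distributions, coefficients, and feature names are assumptions chosen for illustration, not the values used in the study.

```python
import random
import statistics


def toy_accident_probability(wind_speed: float, fog_days: float, difficulty: float) -> float:
    """Hypothetical stand-in for a pre-trained regressor (not the paper's model)."""
    base = 2.0e-4  # assumed baseline probability, for illustration only
    return base * (1.0 + 0.05 * wind_speed + 0.02 * fog_days + 0.5 * difficulty)


def run_mcs(n_trials: int = 10_000, seed: int = 42) -> list:
    """Draw random inputs and collect the predicted accident probabilities."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    samples = []
    for _ in range(n_trials):
        wind = rng.weibullvariate(6.0, 2.0)  # assumed wind speed (m/s): scale 6, shape 2
        fog = rng.uniform(0.0, 10.0)         # assumed foggy days per month
        difficulty = rng.random()            # assumed maneuver difficulty index in [0, 1)
        samples.append(toy_accident_probability(wind, fog, difficulty))
    return samples


def box_plot_summary(samples: list) -> dict:
    """Five-number summary plus mean, i.e., the statistics a box plot displays."""
    q1, median, q3 = statistics.quantiles(samples, n=4)
    return {
        "min": min(samples),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(samples),
        "mean": statistics.fmean(samples),
    }


if __name__ == "__main__":
    summary = box_plot_summary(run_mcs())
    for name, value in summary.items():
        print(f"{name}: {value:.3e}")
```

In an actual application, the placeholder function would be replaced by the prediction call of the trained model, and the sampled inputs would follow distributions fitted to the site data.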

Figure 25. The annual probability distribution of oil spills for collisions in tons by MCS.

Figure 26. The black spot location of collision risk at the spill coordinates of 40.7546° N–29.5443° E.

Figure 27. The spill trajectory driven by currents generated under moderate breeze conditions (7 m/s, Beaufort Scale 4) from NE six hours after the discharge of 20 bbl (2.8 tons) of Kazakhstan oil in the outer basin of the bay (arrows indicate current vectors).

Figure 28. Variations in the Kazakhstan oil plume characteristics under moderate breeze conditions (7 m/s, Beaufort Scale 4) from NE: (a) evaporation and (b) density.
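
The pairing of 7 m/s with Beaufort Scale 4 in the captions of Figures 27 and 28 can be checked with the widely used empirical relation v = 0.836 B^(3/2), where v is the wind speed in m/s and B is the Beaufort number. The Python sketch below inverts this relation and rounds to the nearest integer. The function name is illustrative, and rounding is an assumed convention that can differ from the tabulated class limits near class boundaries.

```python
def beaufort_from_wind_speed(speed_m_s: float) -> int:
    """Approximate Beaufort number from wind speed (m/s) via v = 0.836 * B**1.5."""
    if speed_m_s < 0:
        raise ValueError("wind speed must be non-negative")
    beaufort = (speed_m_s / 0.836) ** (2.0 / 3.0)
    # The scale is capped at force 12 (hurricane force).
    return min(12, round(beaufort))


if __name__ == "__main__":
    # Moderate breeze scenario used for the spill simulation
    print(beaufort_from_wind_speed(7.0))
```

For 7 m/s the inverted relation gives a value slightly above 4, which rounds to Beaufort 4 (moderate breeze).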

Figure 29. Comparison of AML models by evaluation metrics at the testing stage: (a) MSE and (b) accuracy.

Figure 30. Comparison of the speed and accuracy of AML models according to the time taken to make 1000 predictions.

Figure 31. (a) Bow damage of cargo vessel and (b) tear in the hull above the waterline of a container ship [83].

Figure 32. The vulnerable Hersek Lagoon, spill location, and projected position of the main response center.

Photograph 1. Flamingos in the Hersek Lagoon (Bülbül, F. Nature Photographs, 2025) [84].

Table 1. Tier 1 spill scenario of 20 bbl (2.8 tons) crude oil.

Crude Oil	API	Density (g/cm³) at 17 °C	Viscosity (cSt) at 17 °C	Wind
Kazakhstan Light Oil	42.5	0.819	14.9	NE 7 m/s

Table 2. The evaporation rate, density, and viscosity assessed six hours after the 20 bbl spill.

Evaporation (%)	Density (g/cm³)	Viscosity (cSt)	Spill Length (m)
19.6	0.863	144	2000

Table 3. (a) Maritime Traffic and Accident Categorization in the Gulf of İzmit for 2019–2023 [49]. (b) Accident Types and Probabilities in the Gulf of İzmit for 2019–2023 [49].

(a)
Year	Vessel Movements	Cargo Handled (Tons)	Maritime Accidents	Vessel Failures	Emergency Incidents	Rule Violation	Collision
2023	43,145	90	12	33	66	126	2
2022	42,391	93	9	43	37	114	2
2021	42,167	85	4	36	53	143	0
2020	40,502	78	6	49	40	124	1
2019	39,150	75	6	43	27	120	3
(b)
Year	Fire	Flooding	Grounding	Man Overboard	Sinking	Conflict	Accident Probability
2023	1	1	1	1	1	5	2.78 × 10⁻⁴
2022	2	0	1	4	0	0	2.12 × 10⁻⁴
2021	0	1	1	0	0	2	9.49 × 10⁻⁵
2020	2	0	0	0	0	3	1.48 × 10⁻⁴
2019	0	0	0	0	0	3	1.53 × 10⁻⁴
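
The Accident Probability column of Table 3(b) appears to be the ratio of Maritime Accidents to Vessel Movements in Table 3(a); this reading is an assumption, but it reproduces the 2019, 2020, 2022, and 2023 entries to three significant figures. The Python sketch below recomputes the ratio for each year and a pooled five-year value. The variable and function names are illustrative. For 2021, the ratio 4/42,167 evaluates to about 9.49 × 10⁻⁵.

```python
# (vessel movements, maritime accidents) per year, copied from Table 3(a)
TABLE_3A = {
    2023: (43_145, 12),
    2022: (42_391, 9),
    2021: (42_167, 4),
    2020: (40_502, 6),
    2019: (39_150, 6),
}


def accident_probability(accidents: int, movements: int) -> float:
    """Accidents per vessel movement (assumed definition of the table column)."""
    if movements <= 0:
        raise ValueError("vessel movements must be positive")
    return accidents / movements


# Annual ratios, one per year
annual = {
    year: accident_probability(accidents, movements)
    for year, (movements, accidents) in TABLE_3A.items()
}

# Pooled ratio over the whole 2019-2023 period
total_movements = sum(movements for movements, _ in TABLE_3A.values())
total_accidents = sum(accidents for _, accidents in TABLE_3A.values())
pooled = accident_probability(total_accidents, total_movements)

if __name__ == "__main__":
    for year in sorted(annual):
        print(f"{year}: {annual[year]:.2e}")
    print(f"2019-2023 pooled: {pooled:.2e}")
```

The pooled value weights each year by its traffic volume, so it differs slightly from a simple average of the five annual ratios.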

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Balas, E.A.; Balas, C.E. Maritime Risk Assessment: A Cutting-Edge Hybrid Model Integrating Automated Machine Learning and Deep Learning with Hydrodynamic and Monte Carlo Simulations. J. Mar. Sci. Eng. 2025, 13, 939. https://doi.org/10.3390/jmse13050939

