Review

AI in Maritime Security: Applications, Challenges, Future Directions, and Key Data Sources

1 Department of Science and Engineering, Southampton Solent University, Southampton SO14 0YN, UK
2 Warsash Maritime School, Southampton Solent University, Southampton SO31 9ZL, UK
* Authors to whom correspondence should be addressed.
Information 2025, 16(8), 658; https://doi.org/10.3390/info16080658
Submission received: 30 June 2025 / Revised: 26 July 2025 / Accepted: 29 July 2025 / Published: 31 July 2025
(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)

Abstract

The growth and sustainability of today’s global economy rely heavily on smooth maritime operations. Marine environments face increasingly complex security challenges, such as smuggling, illegal fishing, human trafficking, and environmental threats, which exceed the capabilities of traditional surveillance methods. Artificial intelligence (AI), particularly deep learning, offers strong capabilities for automating object detection, anomaly identification, and situational awareness in maritime environments. In this paper, we review state-of-the-art deep learning models proposed mainly in the recent literature (2020–2025), including convolutional neural networks, recurrent neural networks, Transformers, and multimodal fusion architectures. We highlight their success in processing diverse data sources such as satellite imagery, AIS, SAR, radar, and sensor inputs from UxVs. Multimodal data fusion techniques further enhance robustness by integrating complementary data, yielding higher detection accuracy. Challenges remain in detecting small or occluded objects, handling cluttered scenes, and interpreting unusual vessel behaviours, especially under adverse sea conditions. Explainability and real-time deployment of AI models in operational settings are also open research areas. Overall, our review of the maritime literature suggests that deep learning is rapidly transforming maritime domain awareness and response, with significant potential to improve global maritime security and operational efficiency. We also provide key datasets for deep learning models in the maritime security domain.

1. Introduction

The vastness of the world’s oceans and seas presents both a vital conduit for global trade and a complex frontier for national and international security. With more than 80% of international trade conducted through maritime routes, and up to 99% of global data transmission taking place through undersea cables, maritime security has become an indispensable component of global stability [1]. Despite its critical importance, maritime security faces a growing number of threats arising from illicit activities at sea, competition for natural resources, and geopolitical rivalries. The most pressing threats today include drug trafficking [2], human smuggling [3], unauthorised fishing [4], espionage and spy vessels [5], and oil spills and other environmental violations [6].
The increasing sophistication and scale of these illegal activities pose threats that are extremely difficult for coast guards and other law enforcement agencies to monitor, predict, and mitigate using traditional surveillance methods. These methods, which rely mainly on in-person observation, coastal radars, patrol vessels, satellite imagery, and aerial reconnaissance, suffer from high costs, limited coverage, and delayed response. Moreover, the sheer volume of maritime data from AIS (Automatic Identification System), SAR (Synthetic Aperture Radar), and drone footage cannot feasibly be monitored manually in a timely manner [7].
The recent developments in artificial intelligence (AI) and deep learning techniques present a promising solution for automated intelligent systems to monitor and detect maritime security threats and act promptly [7]. Applications of deep learning, a specialised subfield of machine learning, are emerging as a scalable transformative force in maritime domain awareness and threat detection that enables the processing of vast amounts of multimodal data coming from various sources with high accuracy.
Deep learning focuses on learning patterns in large datasets using hierarchical structures known as artificial neural networks (ANNs). These networks, inspired by the human brain, are implemented as layers of interconnected nodes, where each layer transforms the input data in increasingly abstract ways. ANNs come in different types, for example, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks, to name a few. Each type has its pros and cons, and the choice depends on the application at hand. For example, a CNN can detect vessels in satellite and drone imagery [8,9,10], while an RNN can analyse temporal patterns in a video feed of vessel movements to detect anomalies [11]. Various object detection algorithms, such as “You Only Look Once” (YOLO), RetinaNet, Vision Transformers (ViT), and DeepSORT, enable real-time detection of illegal fishing boats, unflagged ships, or vessels engaged in smuggling [12,13,14,15,16].
In a typical deep learning model, such as a CNN for image data or an RNN for sequential or video data, the network is trained using large datasets and various optimisation algorithms to maximise the prediction accuracy. As the network processes data through multiple hidden layers, it automatically learns relevant features, for example, edges, shapes, or motion patterns. This ability to self-learn complex, nonlinear relationships makes deep learning a powerful tool for maritime security tasks such as detecting suspicious vessels or an object on the sea surface [3,16]. In addition to threat detection, deep learning can also be used for a number of related objectives, e.g., optimising patrolling routes for coastguards, and simulating potential threat scenarios for training purposes.
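To make the idea of automatic feature learning concrete, the following illustrative sketch (not drawn from any of the surveyed systems; all values are toy data) applies a single hand-crafted vertical-edge kernel to a tiny "image" of dark water and a bright hull. A CNN's first convolutional layers learn kernels of exactly this kind from training data rather than having them specified by hand.

```python
# Illustrative sketch: one convolution with a vertical-edge kernel,
# the kind of low-level feature a CNN's first layer typically learns.

def conv2d(image, kernel):
    """Valid 2D cross-correlation of a 2D list `image` with `kernel`."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(s)
        out.append(row)
    return out

# Toy "image": dark water (0) on the left, a bright vessel hull (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

# Sobel-like kernel: responds where brightness changes left-to-right.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

edges = conv2d(image, kernel)
print(edges)  # [[3, 3, 0]]: strong response at the water/hull edge, none inside
```

Stacking many such learned kernels, interleaved with nonlinearities and pooling, is what lets deep networks progress from edges to shapes to whole-vessel features.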

1.1. Maritime Security Threats and Illegal Activities

Modern maritime threats are multifaceted, often transnational, and highly dynamic. In the following, we review the most pressing threats.

1.1.1. Illegal Migration and Border Crossings

A number of maritime borders, including the Mediterranean Sea and the English Channel, have seen a rise in illegal migration using overcrowded and unsafe boats [17]. The lack of effective, real-time surveillance systems makes it difficult for authorities to intervene promptly, which often leads to humanitarian crises and political tensions.

1.1.2. Drug Trafficking and Human Smuggling

Organised crime groups in many parts of the world are increasingly using maritime routes to smuggle illegal drugs and migrants. Often, small and unregistered boats are used to avoid detection and exploit the limitations of manually operated surveillance systems. These unlawful activities not only put human lives at risk but also finance other criminal networks [2].

1.1.3. Illegal Fishing

Illegal, Unreported, and Unregulated (IUU) fishing activities in coastal and deep-sea waters threaten fish stocks and challenge food security, costing the global economy billions of USD annually [18]. These activities often occur in protected or disputed waters, making detection and law enforcement difficult. Illegal fishing boats also use sophisticated evasion techniques to deceive or completely avoid tracking and monitoring systems [19].

1.1.4. Environmental Threats and Marine Pollution

Environmental threats and marine pollution have become a major concern in today’s industrialised era. Oil spills and illegal dumping of waste products and hazardous materials into seawaters represent a critical environmental concern. These activities not only damage marine ecosystems but also threaten coastal economies and public health. Since these activities are often carried out far from shore, it becomes a challenge to detect such activities using traditional observation methods [20,21].

1.2. Limitations of Traditional Surveillance Approaches

Maritime security and monitoring have traditionally relied on mechanisms comprising coastal radar systems, satellite imagery, coastguard patrol vessels, AIS, and human observation and intelligence. These approaches have served well over time but are no longer sufficient against the growing sophistication of criminal actors. The main limitations of traditional systems are:
  • Coverage Gaps: Traditional surveillance systems offer limited coverage, particularly in the deep sea where patrolling infrastructure is insufficient.
  • Evasion Tactics: Various malicious actors working in crime groups leverage technology to manipulate AIS data, operate without transponders, or exploit blind spots in satellite coverage.
  • Data Overload and Latency: The volume and speed of data generated by various sensors mounted in the sea and satellite systems are too much to be processed by human operators, causing delayed responses to threats or even missing them completely.

1.3. Scope and Objectives of the Paper

This survey paper aims to provide a comprehensive overview of state-of-the-art deep learning techniques applied to maritime security. The objectives are threefold. Firstly, we explore applications of deep learning in detecting and mitigating maritime threats, including illegal trafficking, unauthorised fishing, environmental violations, and other anomalies at sea. Secondly, we examine the challenges and limitations associated with deploying deep learning models in operational maritime contexts, such as data scarcity, model generalisation, real-time performance, and sensor fusion. Finally, we identify future research directions and technological advancements that can enhance the robustness and scalability of deep learning systems for maritime security.
To extract the relevant literature of recent years, we utilised major databases such as ScienceDirect, IEEE Xplore, and Scopus. We mainly selected the period from 2020 to 2025 and used search terms including artificial intelligence, deep learning, maritime security, ship/vessel detection, object detection, maritime surveillance, anomaly detection, migrant boat, marine suspicious movement, IUU fishing detection, smuggling, and marine environment threat. We performed searches using several combinations of these terms joined with the Boolean operators “AND” and “OR”. Once all records were collected, we removed duplicates and identified over 258 records related to AI applications in maritime security. After analysing the full-text articles, we selected 88 publications (original articles, conference proceedings, and book chapters) for further discussion in this paper.
The rest of the paper is organised as follows: Section 2 reviews the key deep learning techniques and data sources used for object and anomaly detection and tracking. Section 3 discusses deep learning techniques for maritime surveillance and situational awareness. Section 4 outlines examples of maritime security applications, and Section 5 lists key data sources for training and testing deep learning methods in the maritime domain; these datasets are either publicly available or available on request from the relevant authors. Section 6 outlines the main challenges and future directions of deep learning in maritime security. Finally, a discussion and conclusions are given in Section 7.

2. Deep Learning for Maritime Object Detection and Tracking

Maritime security is a critical global concern that demands advanced technological solutions to strengthen surveillance and response capabilities. In maritime research, deep learning has emerged as a promising tool for enhancing maritime object detection and tracking. Deep learning techniques enable automated analysis of extensive maritime datasets, integrating sensor data, unmanned aerial/surface/underwater vehicles (UxVs) and satellite imagery, and video feeds to detect ships, vessels, and anomalies in real-time with outstanding accuracy [22,23]. This capability not only enhances situational awareness but also supports timely decision-making crucial for maritime operations, security, and environmental protection [24].
This section explores the use of deep learning applications for problems such as vessel detection, anomaly detection in vessel behaviour, and tracking of maritime objects.

2.1. Vessel Detection

The use of deep learning for vessel detection is an important research area. Researchers have developed deep learning techniques using several types of data, for example, optical imagery from satellites, drones, and surface vehicles, as well as SAR, radar, and AIS data.

2.1.1. Satellite Imagery

Optical satellite imagery (RGB) captures visible light and is effective at detecting vessels based on their visual appearance and characteristics. It provides detailed spatial information and can differentiate between vessel types. SAR is equally valuable for vessel detection, especially in adverse weather conditions and at night. It operates by transmitting microwave signals and capturing reflections, which are then processed to generate images. SAR can detect vessels based on their radar cross-section, allowing for all-weather surveillance, and its wide coverage makes it a valuable source of maritime surveillance data [25]. In general, optical imagery captures visual details, while SAR detects vessels regardless of weather or lighting conditions [26].
Figure 1 illustrates how SAR and optical imagery may be achieved through satellites during daylight and nighttime or under dark or unclear visual conditions. SAR uses radio waves, which can penetrate through clouds and operate both during the day and night. This makes SAR ideal for consistent ship detection under all weather and lighting conditions. On the other hand, optical imagery (RGB) relies on sunlight and visible spectrum. It provides high-resolution, colour images, but performance drops in cloudy or nighttime conditions.

2.1.2. Aerial Imagery

Aerial platforms, such as drones, unmanned aerial vehicles (UAVs), and other aircraft, offer high-resolution imagery suitable for detailed vessel detection and tracking [27]. These platforms enhance spatial resolution and temporal coverage, enabling real-time surveillance over vast maritime areas [28].

2.1.3. Surface Imagery

Images obtained from unmanned surface vehicles (USVs) and coastguard ships also play an important role in detecting ships or situations that may pose a risk to maritime safety (e.g., oil spills).

2.1.4. Radar Data

Radar systems play a crucial role in maritime surveillance by detecting vessels through radio waves [29]. Radar data integration with other sensor data improves detection accuracy and allows comprehensive situational awareness in maritime environments [30].

2.1.5. AIS Data

An AIS (Figure 2) is required to autonomously transmit and receive essential navigational and safety-related data, including the identity, type, position, course, and speed of the vessel, so that maritime traffic can be managed effectively and safely even when vessels are not within visual range of each other. Furthermore, AIS facilitates seamless data exchange between ships and ports, thereby enhancing maritime situational awareness and operational safety [31].
Deep learning techniques using AIS data, often combined with imagery and radar, have been extensively developed to enhance vessel tracking and identification capabilities for maritime surveillance [19,32].
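A simple kinematic consistency check illustrates how AIS position reports can be exploited before any learning is applied. The sketch below, using hypothetical coordinates and timestamps, computes the speed implied by two consecutive fixes via the haversine formula; a large gap between implied speed and the vessel's reported speed over ground is a classic feature fed into learned anomaly detectors.

```python
import math

def haversine_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles between two positions."""
    r_nm = 3440.065  # mean Earth radius expressed in nautical miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r_nm * math.asin(math.sqrt(a))

def implied_speed_knots(fix_a, fix_b):
    """Speed implied by two (lat, lon, unix_time) AIS fixes, in knots."""
    dist = haversine_nm(fix_a[0], fix_a[1], fix_b[0], fix_b[1])
    hours = (fix_b[2] - fix_a[2]) / 3600.0
    return dist / hours

# Hypothetical fixes 30 minutes apart off Southampton: implied speed is
# ~12 kn, so a reported SOG of 5 kn would flag the track for inspection.
a = (50.90, -1.40, 0)
b = (51.00, -1.40, 1800)
print(round(implied_speed_knots(a, b), 1))
```

Such derived features (implied speed, course change rate, report gaps) are typically what trajectory models consume rather than raw AIS messages.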

2.1.6. Integration of Data from Different Sources

In deep learning techniques, multimodal data fusion approaches (as illustrated in Figure 3) for efficient maritime security can be used where satellite, UxV, radar, and AIS data are combined [33]. It enables insights to be drawn from multiple data sources, resulting in more accurate and valid predictions.

2.1.7. Challenges in Vessel Detection

Vessel detection in maritime security presents several challenges, including the accurate identification of small objects, adverse weather conditions, cluttered backgrounds, and reflections on the sea surface [34]. These factors can significantly reduce the performance of detection algorithms [35]. Effectively addressing these issues is essential to enhance the robustness and reliability of deep learning-based maritime surveillance systems.

2.2. Anomaly Detection in Vessel Behaviour

Anomaly detection in vessel behaviour—distinguishing between legitimate deviations and high-risk anomalies—has emerged as a critical component of maritime security, particularly given the growing volume of global maritime traffic and the sophistication of illicit activities. As highlighted in [36], the detection of abnormal vessel movements is essential to identify potential threats such as smuggling, illegal fishing, and piracy. Traditionally, rule-based systems were used to define what constitutes “normal” behaviour, but such methods often lack the flexibility to adapt to dynamic maritime environments [37]. Recent advances in deep learning have opened new avenues for modelling vessel trajectories and behavioural patterns with improved accuracy and robustness [38].
It is important to mention that, in anomaly detection with deep learning, the relevant datasets may be imbalanced, since anomalous events are rare (the minority class) compared to normal operations. This may bias deep learning models towards the majority class of normal behaviour. To mitigate this, several strategies have been employed. For example, data augmentation can generate synthetic samples for the minority class using techniques such as the synthetic minority over-sampling technique (SMOTE) [39]. Where data augmentation is infeasible, other techniques may be used. Isolation forests (IF) are effective in identifying anomalies without requiring balanced datasets; these algorithms isolate anomalies based on their distinct characteristics, making them suitable for sparse and imbalanced data scenarios [40]. Autoencoders and variational autoencoders (VAEs) learn representations of normal behaviour and can detect deviations indicative of anomalies; these models are particularly useful when labelled anomaly data is scarce [41]. Finally, cost-sensitive learning assigns higher misclassification costs to minority classes, helping models focus on correctly identifying anomalies.
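The core idea of SMOTE can be sketched in a few lines: new minority samples are synthesised by interpolating between a minority point and a nearby minority neighbour. The following simplified, self-contained version (hypothetical 2-D track features; the real SMOTE samples among k nearest neighbours rather than only the single nearest) shows the mechanism.

```python
import random

def smote_like(minority, n_new, seed=0):
    """Generate synthetic minority samples by interpolating each chosen
    point toward its nearest minority neighbour (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # nearest neighbour within the minority class, excluding x itself
        nn = min((p for p in minority if p is not x),
                 key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(x, nn)))
    return synthetic

# Hypothetical 2-D features of rare anomalous tracks (speed, turn rate).
anomalies = [(18.0, 0.9), (19.5, 1.1), (17.2, 0.8)]
new_samples = smote_like(anomalies, n_new=4)
print(len(new_samples))  # 4 synthetic anomalies to rebalance the training set
```

Because each synthetic point lies on a segment between two real anomalies, the augmented set stays within the minority class's feature region rather than injecting arbitrary noise.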
Implementing these strategies enhances the capability of maritime anomaly detection systems to accurately identify rare events, thereby improving maritime safety and security.

2.2.1. Analysing AIS Data for Unusual Patterns

Unusual vessel behaviours may include abrupt speed variations, deviations from established routes, and loitering. In this regard, AIS data may be leveraged to detect anomalous vessel behaviours for maritime surveillance and security. These behavioural anomalies often signal potential threats, including illegal fishing, smuggling, or unauthorised maritime activities.
Speed anomalies, characterised by sudden accelerations or decelerations, can indicate evasive manoeuvres or mechanical issues. Route deviations, where vessels diverge from customary maritime paths, may suggest illicit intentions or navigational errors. Loitering behaviour, defined by prolonged stationary periods or repetitive movements within confined areas, is frequently associated with suspicious activities, especially when exhibited by cargo or tanker ships, which typically maintain consistent transit patterns [42].
To effectively identify these anomalies, researchers have developed various computational methods. For instance, [38] introduced GeoTrackNet, a deep learning model that learns probabilistic representations of AIS tracks to detect deviations from normal maritime behaviour. The authors [43] developed a semi-supervised deep learning approach for ship trajectory classification based on AIS data. They utilised both kinematic and static information of AIS messages to extract vessel trajectories for the classification task. Similarly, [44] employed RNN to model vessel trajectories, enabling the detection of outlier movements indicative of anomalous behaviour. These advanced methodologies underscore the importance of leveraging AIS data for real-time anomaly detection.
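Loitering, one of the behavioural anomalies described above, reduces to a simple geometric test that rule-based and learned systems alike can build on. The sketch below (hypothetical track, crude planar distance valid only for small areas, thresholds chosen for illustration) flags a vessel that remains within a small radius for a prolonged period.

```python
def is_loitering(track, radius_nm=0.5, min_hours=6.0):
    """Flag loitering: some window of at least `min_hours` keeps all fixes
    inside a circle of `radius_nm` around the window's first fix.
    `track` is a list of (lat, lon, unix_time), assumed time-sorted."""
    deg_per_nm = 1.0 / 60.0  # ~1 nm per minute of latitude (coarse)
    for i, (lat0, lon0, t0) in enumerate(track):
        j = i
        while j < len(track):
            lat, lon, t = track[j]
            # crude planar distance, adequate for sub-degree windows
            d_nm = (((lat - lat0) / deg_per_nm) ** 2 +
                    ((lon - lon0) / deg_per_nm) ** 2) ** 0.5
            if d_nm > radius_nm:
                break
            if (t - t0) / 3600.0 >= min_hours:
                return True
            j += 1
    return False

# Hypothetical track: a tanker holding position for 8 hours.
track = [(36.0, 5.0 + 0.001 * (k % 2), k * 3600) for k in range(9)]
print(is_loitering(track))  # True
```

In practice, such hand-written rules serve as baselines or label generators for the deep models cited above, which learn the notion of "prolonged stationarity" directly from trajectory data.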

2.2.2. Combining AIS with Contextual Information

While AIS data provides essential insights into vessel movements, its standalone application may lead to false positives, as it does not account for external factors influencing vessel behaviour [45]. Incorporating contextual data, such as meteorological conditions and port activities, allows for a more nuanced understanding of maritime operations, thereby improving anomaly detection efficacy [46]. For example, weather conditions significantly impact vessel navigation and behaviour. Adverse weather events, such as storms or high winds, can cause deviations in vessel speed and course, which, if unaccounted for, may be misclassified as anomalous. The authors in [47] investigated the integration of tide and weather data into vessel track prediction models and found that while the inclusion of such data did not universally enhance predictive accuracy, it provided valuable context for interpreting vessel movements, particularly in complex waterways.
Furthermore, port activity data offers another layer of contextual information that can refine anomaly detection. Vessels may exhibit atypical behaviour, such as loitering or sudden speed changes, due to port congestion, scheduling delays, or operational requirements. The authors in [48] developed metrics for assessing vessel and port efficiency by validating AIS data against port activity records, demonstrating that contextualising AIS data with port operations can elucidate reasons behind seemingly anomalous vessel behaviours [46]. The authors in [49] proposed a framework that incorporates environmental and contextual data to verify anomalies detected in AIS data, thereby filtering out false positives caused by benign factors such as weather-induced deviations.

2.2.3. Using Sequence-Based Models

In the context of maritime anomaly detection, sequence-based models such as RNN, LSTM, Gated Recurrent Units (GRUs), and Transformers have demonstrated significant efficacy in modelling the temporal dynamics inherent in AIS data [50]. These models are adept at capturing complex temporal dependencies, thereby facilitating the identification of anomalous vessel behaviours. For instance, [51] employed an encoder–decoder LSTM architecture to predict vessel trajectories, demonstrating improved performance over baseline models in capturing the spatiotemporal patterns of maritime movements.
Transformers, characterised by their self-attention mechanisms, have emerged as powerful tools for time-series analysis, particularly in capturing global dependencies within sequences. The authors in [47] utilised a Transformer model to predict vessel tracks in inland waterways, incorporating environmental factors such as tide and weather data to improve predictive performance.
Furthermore, hybrid models that integrate Transformers with variational autoencoders (VAEs) have been proposed to leverage the strengths of both architectures. Such models have demonstrated superior performance in unsupervised anomaly detection by effectively reconstructing normal vessel trajectories and identifying deviations indicative of anomalous behaviour [38].
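The common thread in these sequence-based approaches is scoring anomalies by prediction (or reconstruction) error. The sketch below substitutes a trivial constant-velocity extrapolator for the learned LSTM/Transformer predictor, purely to show the residual-scoring mechanism on a hypothetical track; a real system would replace the predictor with a trained model.

```python
def prediction_residuals(track):
    """Residuals of a constant-velocity one-step-ahead predictor.
    In practice a learned sequence model (LSTM/Transformer) plays this
    role; large residuals suggest anomalous manoeuvres.
    `track` is a list of (x, y) positions at equal time steps."""
    residuals = []
    for k in range(2, len(track)):
        # predict the next position by extrapolating the last displacement
        px = 2 * track[k - 1][0] - track[k - 2][0]
        py = 2 * track[k - 1][1] - track[k - 2][1]
        err = ((track[k][0] - px) ** 2 + (track[k][1] - py) ** 2) ** 0.5
        residuals.append(err)
    return residuals

# Hypothetical track: straight transit, then an abrupt course change.
track = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 2)]
res = prediction_residuals(track)
print(res)  # near-zero errors, then a spike at the manoeuvre
```

Thresholding such residuals (or their learned-model equivalents) is what converts a trajectory predictor into an anomaly detector.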

3. Deep Learning for Maritime Surveillance and Situational Awareness

Surveillance and situational awareness across oceans with increasingly complex maritime activity are difficult to achieve with traditional surveillance methods. Advanced deep learning techniques, applied to diverse maritime data sources such as UxV videos, satellite imagery, radar signals, and AIS data, have enabled automated, scalable, and highly accurate systems.

3.1. Maritime Image and Video Analysis

Deep learning has dramatically improved ship and object classification in maritime imagery. Modern CNN-based detectors (YOLO, Single Shot MultiBox Detector–SSD, RCNN, etc.) learn feature hierarchies from raw images. Two-stage detectors (e.g., Faster RCNN family) achieve high accuracy by generating region proposals and then classifying them, while one-stage detectors (YOLO, SSD) fuse detection and classification in one pass for real-time performance. For example, recent surveys note that YOLOv3/v4/v5 have been widely used for maritime ship detection and counting. In practical tests, YOLO-based detectors can reliably find multiple vessel types (e.g., bulk cargo, container, fishing, general cargo, passenger) with high accuracy [52].
The choice between these architectures involves a critical trade-off between speed, accuracy, and computational cost, which is paramount for real-time maritime operations. While one-stage detectors like the YOLO family offer high inference speeds (e.g., 20–50 FPS on modern GPUs) suitable for deployment on resource-constrained platforms like USVs and drones, this performance can come at the cost of reduced accuracy for detecting very small or partially occluded vessels compared to two-stage detectors. Conversely, architectures like the Faster R-CNN family, while often more accurate, have higher latency and are better suited for shore-based command centers with powerful processing capabilities. These benchmarks, however, are often established in clear conditions. As noted by several studies, real-world operational performance can be significantly impacted by adverse environmental factors, with accuracy known to drop in conditions of heavy clutter, fog, or high sea states.
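Both one-stage and two-stage detectors end their pipeline with non-maximum suppression (NMS), which prunes duplicate boxes over the same vessel by intersection-over-union (IoU). The following self-contained sketch, using hypothetical boxes and confidence scores, shows the standard greedy procedure.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep

# Two overlapping detections of one vessel plus a distant second vessel.
boxes = [(10, 10, 50, 40), (12, 12, 52, 42), (200, 80, 240, 110)]
scores = [0.9, 0.75, 0.8]
print(nms(boxes, scores))  # [0, 2]: the duplicate box is suppressed
```

The IoU threshold is itself part of the speed/accuracy trade-off: a low threshold risks suppressing genuinely adjacent small vessels, which is one reason densely packed or occluded targets are hard for these detectors.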
Object classification must cope with challenging conditions. Studies report that lighting changes, sun glare, waves, and reflections often obscure vessels and cause false positives [34,35]. For instance, the authors in [52] observe that night/day illumination variance and water reflections “obscure the outlines of vessels and lead to false positives, significantly reducing the effectiveness of image capture and analysis”. Consequently, state-of-the-art methods augment raw CNNs with attention modules or multi-scale features. One recent system, YOLOv5-ASC, adds an “attention-based receptive field enhancement” module and deformable convolutions to focus on multi-scale vessel features, yielding improved detection accuracy on remote-sensing maritime images [10]. Another study [9] introduced a multi-level, multi-head attention architecture on YOLOv8 to fuse EO (electro-optical) and infrared (IR) imagery; this scale-sensitive attention approach explicitly models different scales and lighting conditions, giving more robust detection of ships of various sizes.

3.1.1. Event and Activity Recognition

Beyond static objects, deep learning models are used to recognise events (e.g., collisions, loitering) and anomalous behaviour. Early work on maritime anomaly detection often relied on AIS trajectories, but recent research explores vision-based methods. For example, a graph-based anomaly detector was developed in [53] that tracks vessels via CCTV (using YOLOv7+StrongSORT) and constructs trajectory graphs to distinguish normal vs. anomalous motion. This approach identifies deviations in learned “normal” paths, flagging abnormal maneuvers (e.g., erratic course changes) that may indicate accidents or illicit activity.

3.1.2. Scene Understanding and Context

Deep neural networks are also used for semantic segmentation of the maritime scene, aiding contextual analysis (e.g., separating water, sky, land, and obstacles). For example, the research in [54] proposes a CNN encoder–decoder with superpixel refinement to segment water, sky, and obstacles for USV navigation. Their model (trained on publicly available maritime datasets) delivers precise obstacle masks even on complex backgrounds. Semantic segmentation yields a richer “understanding” of the scene—beyond classifying ships, it identifies navigable water vs. hazards—which can assist path planning and collision avoidance.

3.1.3. Video Surveillance for Tracking and Anomaly

Video analytics further extends image-based models with temporal processing. Deep trackers (e.g., DeepSORT) can maintain identity over frames after CNN detection, enabling counting and speed estimation. For example, a real-time port monitoring system combined YOLO detection with Kalman-tracking to count vessel types on land-based cameras [52]. Anomaly detection in video often uses deep autoencoders or graph models on trajectories as noted above [53]. For activity recognition (e.g., identifying suspicious boarding, search-and-rescue), some studies adapt human action networks (e.g., 3D-CNNs, LSTMs) to maritime scenes, although this remains underexplored. Case studies from port security exercises show CNN outputs overlaid on video feeds (bounding boxes, class labels) to alert operators about potential collisions or unauthorised vessels. In summary, deep video surveillance systems integrate detectors and trackers to continuously monitor maritime zones, with the goal of raising alerts on unusual or unsafe events.
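Trackers in the DeepSORT family couple per-frame detections with a Kalman filter that smooths and predicts object state between frames. As a simplified illustration (1-D constant-position model, hypothetical noise parameters and measurements; real trackers use a multi-dimensional constant-velocity state), the sketch below shows the predict/update cycle.

```python
def kalman_1d(measurements, q=1e-3, r=0.5):
    """1-D Kalman filter smoothing noisy measurements, e.g. a tracked
    vessel's horizontal pixel coordinate across video frames.
    q: process-noise variance, r: measurement-noise variance."""
    x, p = measurements[0], 1.0  # initial state estimate and its variance
    estimates = [x]
    for z in measurements[1:]:
        p += q                    # predict: uncertainty grows over time
        k = p / (p + r)           # Kalman gain balances prior vs measurement
        x += k * (z - x)          # update state with the innovation
        p *= (1 - k)              # posterior uncertainty shrinks
        estimates.append(x)
    return estimates

# Noisy per-frame detections jittering around the true position 100.
noisy = [100.4, 99.7, 100.9, 99.5, 100.2, 100.6]
smoothed = kalman_1d(noisy)
print([round(v, 2) for v in smoothed])
```

The filtered state also provides the motion prediction that lets a tracker re-associate a vessel after short occlusions, which is how identity is maintained across frames.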
Table 1 provides a comparative overview of state-of-the-art deep learning models applied to maritime image and video analysis in the context of surveillance and situational awareness. It summarises the model architecture, intended application domain, key strengths, known limitations, and the reported performance metrics.

3.2. Fusion of Multi-Sensor Data

Maritime surveillance increasingly relies on fusing heterogeneous sensors. No single sensor is sufficient in all conditions; for example, electro-optical (EO) cameras provide detailed imagery but struggle at night or in fog, whereas thermal cameras and radar are effective in poor light. Radar+vision fusion can track objects beyond visual range and refine classification (radar velocity plus camera shape). Hence, to exploit complementary strengths, modern systems fuse AIS, radar, EO/IR cameras, and even acoustic or LiDAR sensors via deep learning [61]. The general principle is that multi-modal fusion improves robustness: as one review notes, “no single sensor can guarantee sufficient reliability or accuracy in all different situations. Therefore, sensor fusion, which combines data from different sensors, is used to provide complementary information about the surrounding environment” [62]. The empirical results of [61] confirm the gains: an RGB+IR fusion CNN significantly outperformed single-sensor models in detecting ships under varying weather. In GMvA, spatiotemporal attention across AIS+CCTV improved target association by 10–20% over baselines [63].
However, effectively fusing heterogeneous sensor data presents significant practical and technical challenges. These include temporal synchronization (ensuring that data from a fast-scanning radar and a lower-frame-rate camera are aligned to the same moment in time), spatial alignment (accurately co-registering pixel data from a camera with spatial coordinates from AIS or radar), and handling sensor-specific noise and outages (e.g., AIS spoofing or signal loss in remote areas). Furthermore, the computational overhead of processing and fusing multiple high-bandwidth data streams in real-time places significant demands on both hardware and software, making it a non-trivial engineering task.

3.2.1. Fusion Architecture

Deep learning enables fusion at different stages. Fusion approaches are typically categorised as early (data/pixel-level), middle (feature-level), or late (decision-level). For example, [61] implemented three CNN architectures for RGB+IR fusion: early fusion (stacking raw images), feature fusion (merging CNN feature maps), and late fusion (combining separate detector outputs). Attention mechanisms can also learn to weight modalities. One maritime study [61] fused radar range-Doppler maps with optical images via a deep network: region proposals from radar and EO were combined in a CNN for final detection, improving accuracy under clutter. Graph neural networks have been applied to fuse AIS and video: a graph learning-driven multi-vessel association model (GMvA) integrated AIS time-series with CCTV detections, using spatiotemporal graph attention to associate tracks [63]. By learning a graph embedding of vessel trajectories, GMvA achieved more reliable matching in high-traffic scenarios than either sensor alone. Similarly, multi-head attention modules have been designed to merge EO and IR features in YOLO, making detection sensitive to multi-scale objects [9,10].
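The three fusion levels can be illustrated on toy arrays. The sketch below (using NumPy, with stand-in data and a trivial stand-in "backbone"; it is not the actual networks of [61]) shows where the merge happens in each case:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inputs: a 3-channel RGB image and a 1-channel IR image (H x W x C).
rgb = rng.random((32, 32, 3))
ir = rng.random((32, 32, 1))

# Early fusion: stack raw modalities into one 4-channel input
# that a single CNN would consume.
early_input = np.concatenate([rgb, ir], axis=-1)  # shape (32, 32, 4)

def toy_backbone(x):
    # Stand-in for a CNN feature extractor: global average per channel.
    return x.mean(axis=(0, 1))

# Feature (middle) fusion: run separate backbones, then merge feature vectors.
fused_features = np.concatenate([toy_backbone(rgb), toy_backbone(ir)])  # (4,)

# Late (decision-level) fusion: combine per-modality detection scores.
score_rgb, score_ir = 0.82, 0.64  # hypothetical detector confidences
late_score = 0.5 * score_rgb + 0.5 * score_ir

print(early_input.shape, fused_features.shape, late_score)
```

The practical trade-off is that early fusion lets the network learn cross-modal correlations from raw data but requires pixel-level registration, whereas late fusion tolerates misalignment at the cost of discarding low-level cross-modal cues.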

3.2.2. Sensor-Specific Examples

AIS data broadcast by vessels provide useful information such as vessel identity, type, and route. Fusing AIS with imagery yields richer situational awareness. For instance, [64] fused YOLOv5-based camera detections with AIS data: the visual system estimated a ship’s bearing and distance from a monocular camera, then matched these with AIS-reported positions to confirm vessel identity. This hybrid approach corrected misclassifications under poor visibility and achieved 75% association accuracy on real data. In other research [65], MIT’s AUV Lab released a dataset (the “Philos” series) combining AIS, radar, and video of a ferry; deep learning was used to project AIS fixes onto radar imagery, aiding object tracking.
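A minimal sketch of such AIS-to-detection association, assuming vessel positions have already been estimated from the camera's bearing/distance output (all names, coordinates, and the 500 m gate are illustrative, not taken from [64]):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS-84 positions."""
    R = 6371000.0
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * R * asin(sqrt(a))

def associate(detections, ais_reports, gate_m=500.0):
    """Match each camera-derived position to the nearest AIS report
    within a distance gate; detections with no AIS report inside the
    gate are flagged as potential 'dark' contacts (identity None)."""
    matches = []
    for det in detections:
        best = min(ais_reports,
                   key=lambda r: haversine_m(det["lat"], det["lon"],
                                             r["lat"], r["lon"]))
        d = haversine_m(det["lat"], det["lon"], best["lat"], best["lon"])
        matches.append((det["id"], best["mmsi"] if d <= gate_m else None))
    return matches

dets = [{"id": "cam-1", "lat": 50.90, "lon": -1.40},
        {"id": "cam-2", "lat": 50.80, "lon": -1.10}]
ais = [{"mmsi": 235000001, "lat": 50.901, "lon": -1.401}]
print(associate(dets, ais))  # cam-2 has no AIS match within the gate
```

Real systems extend this with temporal gating, track-level (rather than single-fix) matching, and uncertainty-aware gates, since monocular distance estimates degrade with range.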
In the maritime literature, radar and LiDAR fusion with vision has also been studied. Radar excels at long-range, all-weather detection; [61] developed a CNN that combines EO and radar proposals to classify vessels, showing that decision-level fusion (a CNN on merged proposals) outperforms single-modality detection. LiDAR is less common on ships but is used on USVs; it provides 3D structure that complements camera imagery. Acoustic sensors (buoys or hydrophones) are used for passive surveillance of fast vessels or submarines. A recent platform (OpenEar™ buoys) uses deep sound classification to detect propeller noise; network outputs are sent in real time to a cloud system that correlates tracks with AIS and radar for full maritime domain awareness (MDA) [13].
Table 2 categorises the main fusion strategies used in maritime surveillance, detailing sensor combinations, deep learning techniques employed, application contexts, and key performance highlights as reported in recent literature.

3.3. Maritime Domain Awareness Systems Using Deep Learning

Maritime domain awareness (MDA) refers to the integrated understanding of vessel activities for security and safety. Modern MDA platforms tightly couple deep learning analytics with operational decision support. Architecturally, an MDA system ingests streams from satellites, sensors, AIS, and social/OSINT data into a processing pipeline. Deep learning models are run onboard or in the cloud to detect/track vessels, classify behaviours, and flag anomalies in real time [62]. For example, USV platforms have sensor suites (cameras, LiDAR, sonar, radar) feeding a CNN+SLAM module that outputs a continuous situational map. These outputs (e.g., bounding boxes on video, classification labels, ship trajectories) are fused with AIS and Geographic Information System (GIS) databases to yield a unified maritime picture. In addition, real-time analysis is critical. Systems often use edge computing: lightweight CNNs (e.g., Tiny YOLO variants) run on embedded GPUs to detect objects at high frame rates, while heavier analysis (graph association, behaviour classification) can run on shore or cloud servers. Low-latency fusion and messaging frameworks (e.g., Data Distribution Service—DDS, Robot Operating System—ROS) synchronise multi-sensor data, as in [67], so that deep learning algorithms use consistent inputs. An example platform (“eM/S Salama USV”) collects synchronised RGB, thermal, stereo, and LiDAR data, storing it for model training and processing it in situ for navigational awareness.
A critical aspect of modern MDA systems is the ability to perform analysis at the edge. In practical deployments on USVs, buoys, or aerial drones, real-time constraints and communication bottlenecks are paramount. It is often infeasible to transmit high-bandwidth raw video or radar data to a shore-based server. Therefore, systems increasingly rely on lightweight models (e.g., Tiny YOLO variants) running on specialized edge computing hardware (e.g., NVIDIA Jetson, Google Coral). These edge devices perform inference in situ, sending only low-bandwidth metadata—such as object coordinates, classifications, and critical alerts—back to a central command center. An exemplary case study is the deployment of AI-enabled acoustic buoys, which can detect fast-moving vessels by classifying propeller noise in real time and transmit only the final alert, demonstrating an effective edge-computing strategy for persistent surveillance.
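The bandwidth argument can be made concrete with a toy example. The metadata schema below is purely illustrative (the field names are our own, not a standard message format); it compares the size of a compact JSON alert against one uncompressed 1080p video frame:

```python
import json

def make_alert(track_id, label, confidence, lat, lon):
    """Compact, low-bandwidth alert message an edge device might send
    instead of streaming raw video (field names are illustrative)."""
    return json.dumps({
        "track_id": track_id,
        "label": label,
        "conf": round(confidence, 2),
        "lat": round(lat, 5),
        "lon": round(lon, 5),
    }, separators=(",", ":"))  # no whitespace -> smallest JSON encoding

alert = make_alert(17, "fast_boat", 0.91, 50.89321, -1.39402)
raw_frame_bytes = 1920 * 1080 * 3  # one uncompressed 1080p RGB frame
print(len(alert.encode()), "bytes vs", raw_frame_bytes, "bytes per frame")
```

Even allowing for video compression, the per-contact alert is several orders of magnitude smaller than the sensor stream it summarises, which is what makes persistent surveillance over satellite or cellular backhaul feasible.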

Decision Support and Visualisation

The final stage is to present deep learning outputs to human operators. Automated alerts (e.g., collision warnings, no-AIS vessel detected) are generated when models detect critical events or anomalies. These are visualised on geo-referenced displays, e.g., live maps showing all tracked vessels with colour-coded statuses. Operators can click on a contact to view the latest camera feed, object classification (ship type), and behaviour history. Some systems overlay CNN detections on video (highlighting the vessel, its type, and confidence), or generate heatmaps of dense traffic. Importantly, explainability is considered to build operator trust and enable validation of alerts. Instead of treating models as “black boxes”, systems are beginning to integrate specific Explainable AI (XAI) techniques. For instance, class activation maps (CAMs) or Grad-CAM can be overlaid on video feeds to highlight the specific visual features, such as an unusual wake or objects being transferred between vessels, that led a CNN to flag an activity as anomalous. For trajectory data, techniques like SHAP (SHapley Additive exPlanations) can quantify which aspects of a vessel’s movement (e.g., a sudden speed change versus a deviation from a known route) contributed most to a high-risk score. This moves beyond simple rule-checking (e.g., “vessel exhibited a 3-knot speed change in 10 s”) to provide nuanced, evidence-based justifications for automated alerts. For instance, a NATO naval system (VATOZ) integrates video analysis by displaying bounding boxes and text labels from deep learning models on a common operational picture.
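As an illustration of the CAM idea, the sketch below computes a plain class activation map in NumPy (a simplified textbook form, not Grad-CAM and not any specific system's implementation): each spatial feature map from the last convolutional layer is weighted by the corresponding class weight, summed over channels, rectified, and normalised to produce an overlay heatmap.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """Plain CAM: weight each spatial feature map by its class weight,
    sum over channels, and clip negatives (ReLU), yielding a heatmap
    of which regions drove the class score.

    feature_maps: (C, H, W) array from the last conv layer.
    class_weights: (C,) weights linking pooled features to the class logit.
    """
    cam = np.tensordot(class_weights, feature_maps, axes=(0, 0))  # (H, W)
    cam = np.maximum(cam, 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()  # normalise to [0, 1] for overlay on video
    return cam

rng = np.random.default_rng(1)
fmaps = rng.random((8, 7, 7))      # toy conv features
weights = rng.standard_normal(8)   # toy class weights
heatmap = class_activation_map(fmaps, weights)
print(heatmap.shape, float(heatmap.max()))
```

Grad-CAM generalises this by deriving the channel weights from gradients of the class score, which removes the requirement for a global-average-pooling head.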
By integrating deep learning outputs into user-friendly dashboards, track management graphical user interfaces (GUIs), and MDA platforms, systems ensure that situational insights are actionable by controllers and maritime law enforcement [68]. Empirical case studies show the benefits: enhanced MDA trials (e.g., AIS+satellite+UAV) report that deep learning reduced “contacts of interest” from hundreds of targets to a concise short list requiring human review. An experimental deployment of AI acoustic buoys detected fast boats with 90% precision, sending alerts to a command center [13]. In summary, deep learning enriches MDA architectures by automating perception and providing decision support, while interactive visualisation of CNN results helps operators monitor complex maritime domains.

4. Deep Learning for Specific Maritime Security Applications

The maritime literature presents numerous applications of deep learning techniques across maritime security problem areas such as vessel detection, piracy prevention, illegal fishing, smuggling, and anomaly detection in ship behaviour. This section presents deep learning techniques applied to specific maritime security domains, highlighting their effectiveness.

4.1. Illegal Fishing Detection

On the high seas and in nationally controlled waters, illegal, unreported, and unregulated (IUU) fishing is a pervasive global problem, causing the decline of fish stocks and the disruption of marine ecosystems—undermining the ecological balance of the oceans [18]. The United Nations’ Food and Agriculture Organisation (FAO) estimates that IUU fishing accounts for up to 26 million tons of fish annually, worth USD 23 billion [69].
In the maritime literature, deep learning techniques have been applied to existing data on legal and illegal fishing activities to classify and identify vessels, detect suspicious behaviour, and predict potential violations of fishing regulations. Usually, spatiotemporal (fishing events over time and in different geographic locations) data with historical records of fishing vessel trajectories are analysed for fishing behaviour classification to detect patterns indicative of illegal activities. AIS and SAR imagery data are also widely used for the same purpose. The following are some of the exemplary research works that utilise deep learning techniques.
The study [19] proposed the so-called BiLSTM-CNN-Attention—a hybrid deep learning method combining BiLSTM, CNN, and an attention mechanism—to classify fishing vessels, including those fishing illegally, using AIS data from China’s offshore waters from 2018 to 2022. When compared with other machine learning techniques, such as support vector machines, random forest, XGBoost, and BiGRU, the authors found the proposed method has enhanced generalisation ability. In terms of accuracy, the proposed method was 18% and 10% more accurate than the base models and deep learning methods, respectively. BiLSTM-CNN-Attention is superior to the alternatives because it utilises BiLSTM for capturing both past and future context in the sequential data, CNN for extracting local patterns and hierarchical features from input sequences, and an attention mechanism for focusing on key features to perform the classification task more effectively. However, the model failed to solve the data bias problem and misclassified stow-net vessels and gillnetters as illegal fishing trawlers due to the dominance of trawler data.
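The role attention plays in such hybrids—focusing the classifier on key time steps of the encoded trajectory—can be sketched in a few lines. The following is a toy NumPy illustration of attention pooling over per-timestep features (not the actual architecture, dimensions, or learned weights of [19]):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

def attention_pool(hidden_states, query):
    """Score each time step of a sequence encoder's output against a
    learned query vector, softmax the scores into weights, and return
    the weighted sum -- the 'focus on key time steps' role attention
    plays in BiLSTM-CNN-Attention-style classifiers.

    hidden_states: (T, D) per-timestep features (e.g., BiLSTM outputs).
    query: (D,) learned attention query.
    """
    scores = hidden_states @ query           # (T,) relevance scores
    weights = softmax(scores)                # attention distribution over time
    return weights @ hidden_states, weights  # context (D,), weights (T,)

rng = np.random.default_rng(2)
H = rng.random((5, 4))  # 5 AIS time steps, 4 features per step
q = rng.random(4)
context, attn = attention_pool(H, q)
print(context.shape, attn.sum())
```

The resulting context vector is what the final classification layer would consume; inspecting the attention weights also offers a lightweight form of interpretability, showing which portions of a trajectory drove the decision.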
Using SAR data, the authors in [4] attempted to classify fishing vessels, with a focus on small vessels with minor inter-class differences, using a deep learning technique they called FishNet. It is a novel combination of four modules: multipath feature extraction (MUL)—a DenseNet-inspired module—for feature extraction from SAR images; feature fusion (FF)—a CNN-based module—for combining the extracted features from different paths into a comprehensive representation; multilevel feature aggregation (MFA)—a top-down pathway—for aggregating features at multiple levels of the network; and parallel channel and spatial attention (PCSA)—an attention mechanism—allowing the model to focus on the most relevant parts of the image, thereby improving classification performance. In comparison with 27 other deep learning models, FishNet achieved the highest classification performance in terms of accuracy, precision, recall, and F1-score. Overall, FishNet achieved the highest accuracy of 89.79%, which is 6.77% higher than the second-best method. On the other hand, the study faced the challenges of a small dataset with imbalanced class labels and the longer training time of FishNet, owing to its deeper and wider network structure compared to the simpler models used in the experiments.
In the absence of an AIS system on some boats (“dark vessels”), combating illegal fishing by small fishing boats, especially at nighttime, is a challenge. The authors in [15] employed a Stacked-YOLOv5 model—augmented with a small-target detection layer—to detect lit fishing boats in satellite imagery from Luojia1-01 (LJ-01) with over 96% accuracy. The authors combined single-band nighttime light images with stretching methods to form a triband image for improved feature extraction and detection performance, while radiation correction and the masking of offshore oil and gas platforms further enhanced the quality of the input data. However, the sample dataset for target detection is relatively small, and the presence of lights from non-fishing vessels may introduce noise and affect detection accuracy. Additionally, the use of single-band images may pose difficulties in feature extraction.

4.2. Piracy and Armed Robbery Prevention

Piracy attacks and armed robberies of ships are among the most significant security threats to the shipping industry, with serious consequences for global economic activity. Using data on piracy attacks between 1994 and 2017, the study [70] suggests that small vessels, open-registry vessels, and vessels at berth or anchor, at night, in territorial waters, and in port areas are more vulnerable to piracy attacks and robbery than others. A proactive approach by the Master and crew may improve situational awareness, enhance surveillance, and detect threats early enough to determine the appropriate response under time-sensitive, stressful conditions.
When it comes to leveraging deep learning techniques for the detection and prevention of piracy and robbery threats, this area represents a significant research gap. Due to the lack of data—be it AIS, SAR, or any other form—there is an immense demand for research efforts [71].
Researchers in [71] performed spatiotemporal pattern mining to propose a novel dataset of maritime piracy incidents, merging data from three sources: ASAM (Anti-Shipping Activity Messages), IMO GISIS (International Maritime Organisation–Global Integrated Shipping Information System), and IMB (International Maritime Bureau). A comprehensive framework for analysing spatiotemporal, time-series, and clustering data was developed. They visualised and analysed piracy incidents over the past three decades, developed the fast adaptive dynamic time warping (FADTW) method for uncovering hidden temporal and spatiotemporal patterns, and applied the density-based clustering technique DBSCAN (density-based spatial clustering of applications with noise) to extract spatial distribution patterns and identify high-risk areas. The resulting dataset is robust but still focuses on high-risk areas, potentially overlooking other regions. Additionally, the inherent complexity of spatiotemporal patterns may require further refinement and validation.
Focusing on the Straits of Malacca and Singapore waters, the authors in [72] analysed piracy and armed robbery reports, obtained from GISIS, IMO, and Annual Reports on Piracy and Armed Robbery from the Regional Cooperation Agreement on Combating Piracy and Armed Robbery against Ships in Asia (ReCAAP), using BERTopic—an advanced natural language processing (NLP)-based transformer model. BERTopic generated 13 main topics from the incident reports, which were clustered into three categories: ship’s risk, geographical constraints, and crew and authorities’ responses. The research identified several key factors influencing piracy and armed robbery, including ship type, geographical constraints, crew response, and authorities’ actions. Even though BERTopic is an effective technique, its application in maritime piracy research is relatively new, and further studies are needed to confirm its reliability.

4.3. Smuggling and Trafficking Detection

The sheer volume of maritime transportation underpinning today’s global economy also facilitates the illicit movement of goods and people, often across international borders, using sea routes—raising growing concerns about maritime security breaches, human rights abuses, and violations of international law.
The authors in [3] used YOLOv10s for detecting dark vessels using Sentinel-1 SAR, Sentinel-2 optical imagery, and spatiotemporal AIS data. The goal of the research was to optimise YOLOv10s for detecting small ships by removing unnecessary Conv and C2f layers from the network architecture. At the same time, they used AIS matchmaking techniques to cross-reference image-based detections with AIS data. For dark vessels, this approach helps flag suspicious activity by vessels that are visible in imagery but are not broadcasting AIS signals. When compared with state-of-the-art YOLO models across different types of satellite imagery, YOLOv10s achieved superior performance with AP@50 = 0.8588, AP@50:95 = 0.6631, precision = 0.9370, recall = 0.9381, and specificity = 0.9869. Another major contribution of this research is a curated custom dataset, HS3-S2, combining six open-source single-modality datasets, which helps improve detection capability.
The ship-to-ship exchange of illicit goods is another form of smuggling. To detect such activity, [73] utilised AIS data and PlanetScope satellite imagery to train and test the deep learning model YOLOv8m. According to the authors, a ship-to-ship transfer between two vessels identified in satellite imagery can be flagged as potential smuggling when AIS signals from fewer than two of the vessels are available. With this approach, the research identified 400 such events in the Kerch Strait between 2021 and 2023, with an F1-score of 97%.
The authors in [73] critically evaluated the use of AIS data alone for detecting suspicious ship movements and proposed fusing information from more than one source. The study developed an information fusion framework, MSIF-SSTR, which leverages high-speed radar trajectories and corresponding meteorological data, particularly at nighttime and during poor weather, when the potential for smuggling activity is high. It is a decision-level fusion in which the temporal features extracted by a TCN network are input into an LSTM to determine whether a ship trajectory appears suspicious. The proposed model achieved a classification accuracy of over 94% on the collected dataset.
To address the problem of detecting small inflatable smuggling boats, [2] employed advanced deep learning models such as YOLOv2, YOLOv3, and Faster R-CNN, alongside feature extraction backbones such as GoogLeNet, ResNet18, ResNet50, and ResNet101. The authors collected IR thermal images of small inflatable boats and people on the Elblag and Bug rivers in Poland under different weather conditions. The experiments suggested that, among the deep learning algorithms tested, Faster R-CNN with ResNet101 achieved the highest detection rate but required significant processing time. As limitations, the authors highlighted reduced detection capability under varying environmental conditions and the limited variety of scenarios covered by the data sources.

4.4. Maritime Environmental Monitoring

Scientists have widely used AI and deep learning techniques to detect marine pollution, monitor marine life, and improve our understanding of marine ecosystems, using data from sensors, satellites, and other sources.
Oil spill incidents have devastating effects on the marine ecosystem. Accurate and timely detection of oil spills helps protect the aquatic biome and minimise harmful effects. Extensive literature explores deep learning applications for oil spill detection using computer vision and image data. For instance, [74] employed a variety of deep learning algorithms, including UNet, BiSeNetV2, and DeepLabV3+, alongside attention mechanisms such as Squeeze-and-Excitation (SE), the Convolutional Block Attention Module (CBAM), and the Simple Attention Module (SimAM), on optical images from Sentinel-2 MSI, Landsat-8 OLI, and Landsat-9 OLI2. The experiments suggested that UNet with CBAM was more accurate than the others in detecting oil slicks. While the model performed well on common slicks, it struggled to fully capture thick crude and emulsified oil slicks, which are underrepresented in the dataset, limited to 143 optical images.
The study [6] used CNN, Multilayer Perceptron (MLP), and U-Net models for oil spill classification and segmentation using Sentinel-1 SAR imagery. They trained and tested these models with 685 Sentinel-1 SAR images in dual-polarisation (VV, VH) mode. The results suggested that a CNN with 6 convolutional layers, 32 filters, and 2 hidden layers achieved 99% classification accuracy. U-Net with the Focal Loss and the Intersection over Union (IoU) metric achieved 96% IoU for training and 90% for validation. Overall, the two-stage CNN and U-Net framework achieved an overall accuracy of 95% and an IoU of 90%. Regarding limitations, the authors inferred that look-alikes in SAR images can cause false positives; thus, increasing data variability may improve detection accuracy.
The authors in [21] developed a comprehensive pipeline for detecting and classifying oil spills using aerial imagery and deep learning. Collecting RGB image data from various sources, including Kaggle, GitHub, and the Korean Coast Guard, the authors trained and tested a deep learning model with a ResNet101V1c backbone and a DaNet segmentor for image segmentation and classification. The model achieved a mean Intersection over Union (mIoU) of 72.49% and an accuracy of 94.22%. Furthermore, the research used a conditional generative adversarial network (GAN) to generate synthetic images and annotations, which improved model accuracy by 2.56% and balanced the contribution proportions of different oil types. However, the research noted a challenge in detecting silver and brown oils due to their resemblance to natural elements and imbalances in the training dataset.
Marine ecosystem conservation is another critical research area, and deep learning techniques have effectively served this purpose in the existing literature. The authors in [75] used deep learning models such as ResNet, DenseNet, Inception, and Inception-ResNet for the automated classification of deep-sea biota using remotely operated vehicle (ROV) imagery. According to the findings of this research, Inception-ResNet achieved a mean classification accuracy of 65%, with AUC scores exceeding 0.8 for each class.

4.5. Safety, Search and Rescue Operations

Safety, search and rescue operations in challenging maritime environments require prompt and accurate decision-making. In this regard, advanced deep learning techniques have been trained and tested on complex data sources such as satellite imagery, aerial surveillance, sonar, and vessel tracking systems.
The authors in [16] developed a sea surface object detection system called Sw-YoloX, which combines transformer and anchor-free mechanisms to improve detection accuracy. It integrates a convolutional block attention module (CBAM) and atrous spatial pyramid pooling (ASPP) to address two critical issues: the slow convergence rate and high training overhead of transformer architectures, and the inefficient feature extraction of CNNs. With this approach, the authors were able to reduce computational cost and increase detection accuracy. Additionally, they used data augmentation, multi-scale training, and a self-training classifier to enhance performance. The model achieved an F1-score of 78%, an mAP of 54%, and a recall of 72% when detecting objects in complex and undulating sea surface environments.
The study [76] attempted to detect a person in open water during search and rescue operations using UAV images and YOLOv4 with a MobileNetV3Small backbone. Using image frames from videos captured by UAVs over lakes and seas in Turku, Finland, the authors trained and tested the deep learning model. The experimental analyses suggest that lowering the IoU threshold (e.g., to 0.10) may yield better detection metrics, with more true positives (TPs) and fewer false negatives (FNs). Additionally, bounding-box size and the use of multiple consecutive frames affect detection accuracy, in terms of improved precision and recall, as well as computational cost.
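The IoU metric underlying these thresholds is straightforward to compute; a minimal sketch (boxes as (x1, y1, x2, y2) corner tuples) shows why a lower threshold admits more loosely localised detections of small targets:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width/height clamp to 0 when the boxes are disjoint.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

# A slightly offset prediction of a small target: IoU is only ~0.14, so it
# counts as a true positive at a 0.10 threshold but as a miss at 0.50.
pred, truth = (0, 0, 10, 10), (5, 5, 15, 15)
print(iou(pred, truth))
```

For person-in-water detection, where the target may occupy only a handful of pixels, a small localisation offset can collapse the IoU, which is why relaxed thresholds trade localisation precision for recall.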
Another deep learning-based framework for efficient search and rescue operations has been proposed in [77]. Here, the authors first pre-processed UAV images to identify potential victim locations using simple features and then used a pre-trained CNN to verify the presence of victims in the identified region. The two-step process improved the efficiency of detecting a person in water and achieved 95% accuracy, with a true positive rate (P_TP) of 0.9778 and a false alarm rate (P_FA) of 0.0769.
Table 3 provides a comparative overview of state-of-the-art deep learning models employed on specific maritime security applications. It summarises the model architecture, intended application domain, key strengths, known limitations, and the reported performance metrics.

5. Key Sources for Deep Learning in Maritime Domain

In this section, some of the key datasets suitable for deep learning research in the maritime domain are presented. These datasets can be obtained directly from a publicly accessible online venue or by requesting the data from the authors, as stated in their research publications. The datasets cover research problems such as vessel detection, classification, behaviour analysis, oil spill detection, security threat detection, and SAR imagery.
When using these datasets in the context of deep learning for maritime security, key considerations should be taken into account, including annotation quality, diversity of vessel types, and environmental variability. It is observed that many earlier datasets are limited in scale or completeness; for example, comprising as few as 14 videos [78]. More recent datasets such as Deepdive [75], SeaDronesSee [79], and SAR-HumanDetection-FinlandProper [76] attempt to address these limitations by offering broader coverage. However, significant gaps persist: few large-scale datasets combine EO and IR modalities, and acoustic data remains particularly scarce. While AIS data is widely available (e.g., through platforms like AISHub [80]), there is a lack of publicly labeled datasets for anomaly detection or vessel behaviour classification.
Table 4 presents computer vision datasets, and Table 5 lists the datasets according to their specific application in maritime domain.

6. Challenges and Future Directions

The domain of maritime security presents several challenges that hinder the meaningful implementation and deployment of deep learning techniques in real-world use cases. It is evident from the existing literature that data robustness and availability is the foremost challenge when it comes to training and testing deep learning techniques. In various other domains, such as healthcare, finance, environment, and government, the ready availability of open-access public datasets facilitates experimentation and the expansion of research into a variety of problem areas. By contrast, because of the specificity and sensitivity of the maritime field, access to maritime affairs and data is limited—especially in maritime security, where data confidentiality matters—curbing large-scale deep learning research. To address the data scarcity problem, researchers have attempted to generate synthetic data to expand existing datasets. To this end, data augmentation techniques, simulation environments that generate artificial data for selected maritime security scenarios, and generative adversarial networks (GANs) have been explored; however, ensuring realism and diversity for practical deep learning applications remains crucial. For example, the authors in [89] achieved an mAP improvement of approximately 12% over the baseline method using pseudo-SAR images for the ship detection problem. Similarly, [90] generated synthetic data by adding elements such as ocean textures, human models, boats, and rocks to address the data scarcity problem. They achieved over 28% improvement over the best-performing state-of-the-art baseline model trained on real data.
Apart from data availability challenges, there are issues with the quality of available datasets, such as the scarcity of labeled data, class imbalance (e.g., unequal representation of security scenarios), and bias (e.g., regional bias, and common scenarios vastly outnumbering critical security events). This adversely impacts model generalisability and yields biased deep learning models that struggle to detect anomalies or rare threats effectively. Model performance degradation and a lack of robustness in unseen conditions are therefore inevitable, because maritime environments are inherently dynamic. For example, the Airbus Ship Detection dataset [81] exhibits an imbalance in the number of ships per image: some images have only one ship, while others contain a dense group of ships. This impacts a model’s ability to accurately detect and segment all ships in crowded scenarios. The authors in [91,92] have rightly raised the issue of recall or precision dropping due to class imbalance in maritime datasets. For instance, there is an imbalance between positive (ship) and negative (background) samples for ship detection. Moreover, small and inshore ships are underrepresented in datasets, resulting in models missing these targets, ultimately leading to significantly lower recall and a bias toward detecting larger, offshore vessels.
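One common training-time remedy for such ship/background imbalance is the focal loss, used with U-Net in [6]. The sketch below is a minimal NumPy version of the binary form (the hyperparameter values gamma = 2 and alpha = 0.25 are the usual defaults from the focal loss literature, not taken from any study reviewed here):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: down-weights easy, abundant background pixels
    so that rare, hard ship pixels dominate the gradient.

    p: predicted probabilities of the positive (ship) class.
    y: 0/1 ground-truth labels.
    """
    p = np.clip(p, 1e-7, 1 - 1e-7)            # avoid log(0)
    pt = np.where(y == 1, p, 1 - p)           # probability of the true class
    w = np.where(y == 1, alpha, 1 - alpha)    # class-balance weight
    return float(np.mean(-w * (1 - pt) ** gamma * np.log(pt)))

# An easy, correctly rejected background pixel contributes far less loss
# than a hard, misclassified ship pixel.
easy = focal_loss(np.array([0.05]), np.array([0]))
hard = focal_loss(np.array([0.10]), np.array([1]))
print(easy, hard)
```

The (1 - pt)^gamma modulating factor is what distinguishes this from plain weighted cross-entropy: confident, correct predictions are suppressed multiplicatively, not just rescaled.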
There are several natural and man-made sources of variation, for example, variations in weather and sea state, vessel appearance and forged status, and crafted perturbations in input data. Deep learning models trained in specific contexts often fail to generalise to new or adverse conditions, leading to reduced accuracy in the field. To this end, there are examples in the literature where authors have employed transfer learning strategies or domain adaptation techniques in maritime settings. For instance, the authors in [89] proposed an end-to-end generative knowledge transfer framework using a GAN to synthesise SAR images. They significantly improved ship detection accuracy by generating pseudo-SAR images that enhanced the generalisation performance of the detection model in unseen conditions. Moreover, as discussed earlier in this paper, existing state-of-the-art transfer learning models, such as YOLO (various versions), FishNet, and DenseNet, originally trained on non-maritime imagery, have shown significant promise in maritime security threat detection scenarios when trained on limited maritime data.
Additionally, because of the criticality of maritime security matters, where a timely decision is paramount to planning and executing a response, the real-time application of deep learning models demands onboard deployment on maritime platforms such as ships, drones, or autonomous vehicles (aerial, surface, and underwater). Advanced deep learning architectures require substantial computational resources, forming a key bottleneck for real-time onboard processing. It is observed in the existing literature that several deep learning solutions proposed for maritime security are complex (often combining multiple deep learning architectures) and computationally heavy. The limited onboard computing capability, processing power, and bandwidth-constrained environments require optimised models that operate under limited resources. To this end, researchers have proposed approaches such as model pruning, optimisation, and lightweight architectures (e.g., MobileNet, EfficientNet) that reduce the computational footprint of deep learning models without significantly compromising performance.
Another critical challenge is algorithmic bias. It is evident from the existing literature that deep learning models are often trained on biased datasets, resulting in AI models that may not be fully trustworthy when incorporated into the decision-making processes of maritime security agencies. A biased algorithm may unfairly target specific vessel types, regions, or scenarios, leading to discriminatory outcomes that can introduce systemic inequities in maritime law enforcement, with significant operational and legal consequences. This is where explainable AI (XAI) plays a role, helping stakeholders understand and validate model outputs and fostering greater acceptance and accountability. Techniques such as saliency maps, class activation mapping (CAM), layer-wise relevance propagation (LRP), and SHAP (SHapley Additive exPlanations) help visualise and interpret deep learning results by identifying which features or regions of an input contributed most to a prediction. For example, to explain why a certain vessel was flagged as a security threat, SHAP may be applied to trajectory-based models. Additionally, associating scenarios of sudden course deviations, speed variations, or prolonged loitering near an exclusive economic zone may also support more transparent decision-making. Furthermore, graph-based AI models may be used to integrate maritime-specific knowledge graphs for examining vessel behaviour alongside temporal, legal, and operational factors. Moreover, federated learning (FL) is another AI-related area of exploration in the maritime security domain. Specifically, to address the issue of an AI model trained on a dataset from one region being tested and deployed in another, federated learning may enable collaborative training of deep learning models without sharing raw data between security agencies such as coast guards, navies, and satellite providers.
For example, the authors in [93] suggested that FL enables machine learning in a decentralised manner by training the AI model across vessels, regions, and scenarios.
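A minimal federated-averaging (FedAvg) sketch in plain Python conveys the idea: each agency trains locally and shares only model weights, which a coordinator averages in proportion to local dataset size. The client weights and sizes below are illustrative, not drawn from [93].

```python
# Minimal FedAvg sketch: raw data (e.g., AIS tracks) never leaves each
# agency; only locally trained weight vectors are shared and averaged.
# Client weights and dataset sizes below are illustrative.

def fed_avg(local_weights, local_sizes):
    """Average client weight vectors, weighted by local dataset size."""
    total = sum(local_sizes)
    dim = len(local_weights[0])
    global_w = [0.0] * dim
    for w, n in zip(local_weights, local_sizes):
        for i in range(dim):
            global_w[i] += w[i] * n / total
    return global_w

# e.g., a coast guard, a navy, and a satellite provider each contribute
# a locally trained model from their own region's data
clients = [[0.2, 1.0], [0.4, 0.8], [0.3, 0.9]]
sizes = [1000, 3000, 2000]
print(fed_avg(clients, sizes))  # shared global model
```

In practice, this aggregation would run over full neural-network parameter tensors across many communication rounds, but the privacy property is the same: the global model benefits from all regions' data without any agency exposing its raw observations.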
The existing maritime literature makes clear that advanced deep learning techniques have proven their potential. The latest architectures have been widely leveraged in maritime security research, including object detection and classification techniques from the YOLO family; feature learning techniques from the ResNet, EfficientNet, DenseNet, and Vision Transformer (ViT) families; natural language processing techniques from the BERT, XLNet, and FLAN-T5 families; and various techniques for multimodal data fusion. For future research, avenues for combining these architectures are yet to be explored for robust and effective deep learning-based maritime security applications. Another potential research area is integrating knowledge graphs or symbolic reasoning to enhance situational awareness and contextual understanding in maritime security scenarios. Lastly, because the decisions taken by security operators have critical consequences, it is important to develop XAI models whose outputs human users can effectively interpret. The review performed in this research highlights a clear gap in this particular area, making it a promising avenue for future research.

7. Discussion and Conclusions

The smooth functioning of international affairs at sea ensures global stability and economic sustainability. It depends heavily on secure and peaceful maritime domains; safeguarding maritime assets and borders is therefore a critical mission, especially in strategic maritime regions such as the Mediterranean Sea, the Gulf of Aden, and the English Channel, which are frequent focal points of illegal migration, trafficking, and smuggling.
Given the growing complexity and scale of maritime security threats, which range from illegal fishing and smuggling to environmental damage and human trafficking, the demand for more intelligent and automated surveillance solutions is greater than ever. Traditional methods rely heavily on manual monitoring and physical patrolling and are insufficient against the volume, stealth, and sophistication of modern maritime threats. In this context, the application of AI, especially deep learning techniques, has emerged as a transformative approach for enhancing maritime domain awareness (MDA) and operational response capabilities. Deep learning provides capabilities vital for enabling proactive security interventions and efficient resource allocation.
In this review study, we examined how recent advances in deep learning have contributed to object detection and tracking, situational awareness, anomaly detection, and activity recognition in maritime environments. Across these security challenges, deep learning models have demonstrated notable performance due to their ability to process and learn from vast, heterogeneous datasets that include optical and radar imagery, AIS signals, SAR data, and real-time sensor inputs from UxVs and satellites.
Deep learning models such as CNNs (YOLO, SSD, RCNN), sequence models (LSTM, GRU, Transformers), and hybrid architectures (CNN+RNN, CNN+Attention, VAE+Transformer) have shown significant capability in detection accuracy and temporal modelling of maritime threats. Particularly in vessel detection and classification, YOLO-based models have shown high reliability in identifying multiple vessel types with precision under various environmental conditions.
It is also observed in the literature that combining data from various sources enhances comprehensive situational awareness in maritime environments. In this case, multimodal data fusion approaches have successfully integrated EO/IR imagery, radar, AIS, and acoustic data to enhance robustness and accuracy. These deep fusion architectures (e.g., multi-stream CNNs, BiLSTM-CNN-Attention, and sensor-level fusion) offer a unified situational picture that compensates for the limitations of individual data streams. For example, SAR proves effective during adverse weather and nighttime, while optical sensors provide rich visual detail during clear conditions. Radar and AIS data further complement these sources, especially when used in combination with CNNs or sequence models to detect trajectory anomalies and behavioural patterns.
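As a simple illustration of decision-level fusion, the sketch below combines per-sensor detection confidences for a single candidate target, up-weighting SAR on the assumption of adverse weather; the weights and threshold are illustrative only and would be calibrated on validation data in a real system.

```python
# Hedged sketch of decision-level (late) fusion: independent SAR and
# optical detectors score the same candidate target, and a weighted
# average decides the final detection. Weights/threshold are illustrative.

def fuse_confidences(sar_conf, optical_conf, sar_weight=0.6):
    """Weighted average of per-sensor confidences for one target.
    SAR is up-weighted here, assuming adverse weather or night-time."""
    return sar_weight * sar_conf + (1.0 - sar_weight) * optical_conf

def fused_detection(sar_conf, optical_conf, threshold=0.5):
    """Declare a detection if the fused confidence clears the threshold."""
    return fuse_confidences(sar_conf, optical_conf) >= threshold

# Night-time case: the optical detector is weak (0.3) but SAR is
# confident (0.8); the fused score 0.6*0.8 + 0.4*0.3 = 0.60 still detects.
print(fused_detection(0.8, 0.3))
```

Even this trivial scheme shows the complementarity argument: a target that neither sensor alone would confidently report can still be declared when the fused evidence is strong, while conflicting weak cues are suppressed.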
Despite these advancements, the maritime literature also highlights persistent challenges. For example, the detection of small or partially occluded objects, especially against cluttered or reflective maritime backgrounds, remains difficult and requires significant research effort. Similarly, distinguishing between legitimate vessel behaviour and malicious activity poses a challenge for anomaly detection algorithms, especially in congested waterways. Moreover, in practical scenarios where adverse sea conditions arise, deep learning models suffer degraded performance.
Explainability and real-time deployment also demand more research. While saliency maps, rule-based logic overlays, and interpretive dashboards provide some transparency, fully interpretable deep learning systems for maritime security are yet to be researched. Lightweight models (e.g., Tiny YOLO) and onboard processing strategies are increasingly being employed to achieve high frame-rate performance, especially in UxV and edge-computing scenarios.

Author Contributions

Conceptualisation and methodology, K.T. and S.A.; writing—original draft preparation, K.T., R.H., I.G. and S.A.; writing—review and editing, K.T., S.A. and Z.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. European External Action Service. Maritime Security. 2025. Available online: https://www.eeas.europa.eu/marsec25-eu-maritime-security_en (accessed on 24 June 2025).
  2. Kowalski, M.; Pałka, N.; Młyńczak, J.; Karol, M.; Czerwińska, E.; Życzkowski, M.; Ciurapiński, W.; Zawadzki, Z.; Brawata, S. Detection of inflatable boats and people in thermal infrared with deep learning methods. Sensors 2021, 21, 5330. [Google Scholar] [CrossRef] [PubMed]
  3. Galdelli, A.; Narang, G.; Pietrini, R.; Zazzarini, M.; Fiorani, A.; Tassetti, A.N. Multimodal AI-enhanced ship detection for mapping fishing vessels and informing on suspicious activities. Pattern Recognit. Lett. 2025, 191, 15–22. [Google Scholar] [CrossRef]
  4. Guan, Y.; Zhang, X.; Chen, S.; Liu, G.; Jia, Y.; Zhang, Y.; Gao, G.; Zhang, J.; Li, Z.; Cao, C. Fishing vessel classification in SAR images using a novel deep learning model. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5215821. [Google Scholar] [CrossRef]
  5. Ventikos, N.P.; Koimtzoglou, A.; Michelis, A.; Stouraiti, A.; Kopsacheilis, I.; Podimatas, V. A Bayesian network-based tool for crisis classification in piracy or armed robbery incidents on passenger ships. Proc. Inst. Mech. Eng. Part M J. Eng. Marit. Environ. 2024, 238, 251–261. [Google Scholar] [CrossRef]
  6. Trujillo-Acatitla, R.; Tuxpan-Vargas, J.; Ovando-Vázquez, C.; Monterrubio-Martínez, E. Marine oil spill detection and segmentation in SAR data with two steps deep learning framework. Mar. Pollut. Bull. 2024, 204, 116549. [Google Scholar] [CrossRef]
  7. Gamage, C.; Dinalankara, R.; Samarabandu, J.; Subasinghe, A. A comprehensive survey on the applications of machine learning techniques on maritime surveillance to detect abnormal maritime vessel behaviors. WMU J. Marit. Aff. 2023, 22, 447–477. [Google Scholar] [CrossRef]
  8. Bentes, C.; Velotto, D.; Tings, B. Ship classification in TerraSAR-X images with convolutional neural networks. IEEE J. Ocean. Eng. 2017, 43, 258–266. [Google Scholar] [CrossRef]
  9. Wang, S.; Kim, B. Scale-Sensitive Attention for Multi-Scale Maritime Vessel Detection Using EO/IR Cameras. Appl. Sci. 2024, 14, 11604. [Google Scholar] [CrossRef]
  10. Jiang, X.; Liu, T.; Song, T.; Cen, Q. Optimized Marine Target Detection in Remote Sensing Images with Attention Mechanism and Multi-Scale Feature Fusion. Information 2025, 16, 332. [Google Scholar] [CrossRef]
  11. Mujtaba, D.F.; Mahapatra, N.R. Deep Learning for Spatiotemporal Modeling of Illegal, Unreported, and Unregulated Fishing Events. In Proceedings of the 2022 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 14–16 December 2022; pp. 423–425. [Google Scholar] [CrossRef]
  12. Yang, D.; Solihin, M.I.; Ardiyanto, I.; Zhao, Y.; Li, W.; Cai, B.; Chen, C. A streamlined approach for intelligent ship object detection using EL-YOLO algorithm. Sci. Rep. 2024, 14, 15254. [Google Scholar] [CrossRef]
  13. Karst, J.; McGurrin, R.; Gavin, K.; Luttrell, J.; Rippy, W.; Coniglione, R.; McKenna, J.; Riedel, R. Enhancing Maritime Domain Awareness Through AI-Enabled Acoustic Buoys for Real-Time Detection and Tracking of Fast-Moving Vessels. Sensors 2025, 25, 1930. [Google Scholar] [CrossRef] [PubMed]
  14. Yan, H.; Chen, C.; Jin, G.; Zhang, J.; Wang, X.; Zhu, D. Implementation of a Modified Faster R-CNN for Target Detection Technology of Coastal Defense Radar. Remote Sens. 2021, 13, 1703. [Google Scholar] [CrossRef]
  15. Hu, H.; Zhou, W.; Jiang, B.; Zhang, J.; Cheng, T. Exploring deep learning techniques for the extraction of lit fishing vessels from Luojia1-01. Ecol. Indic. 2024, 159, 111682. [Google Scholar] [CrossRef]
  16. Ding, J.; Li, W.; Pei, L.; Yang, M.; Ye, C.; Yuan, B. Sw-YoloX: An anchor-free detector based transformer for sea surface object detection. Expert Syst. Appl. 2023, 217, 119560. [Google Scholar] [CrossRef]
  17. Walsh, P.W.; Cuibus, M.V. People Crossing the English Channel in Small Boats; Briefing, Migration Observatory, University of Oxford: Oxford, UK, 2025. [Google Scholar]
  18. Government of Canada. Illegal, Unreported and Unregulated (IUU) Fishing; Government of Canada: Ottawa, ON, Canada, 2019.
  19. Cheng, X.; Wang, J.; Chen, X.; Zhang, F. Attention-enhanced and integrated deep learning approach for fishing vessel classification based on multiple features. Sci. Rep. 2025, 15, 8642. [Google Scholar] [CrossRef]
  20. Burgherr, P. In-depth analysis of accidental oil spills from tankers in the context of global spill trends from all sources. J. Hazard. Mater. 2007, 140, 245–256. [Google Scholar] [CrossRef]
  21. Bui, N.A.; Oh, Y.; Lee, I. Oil spill detection and classification through deep learning and tailored data augmentation. Int. J. Appl. Earth Obs. Geoinf. 2024, 129, 103845. [Google Scholar] [CrossRef]
  22. Qu, J.; Gao, Y.; Lu, Y.; Xu, W.; Liu, R.W. Deep learning-driven surveillance quality enhancement for maritime management promotion under low-visibility weathers. Ocean Coast. Manag. 2023, 235, 106478. [Google Scholar] [CrossRef]
  23. Baswaid, M.H.; Darir, F.F.F.; Qin, C.Y.; Sofian, A.P.; Amin, N. Deep Learning-Based Ship Detection: Enhancing Maritime Surveillance with Convolutional Neural Networks. Preprints 2025. [Google Scholar] [CrossRef]
  24. Dimitrov, T. Applying Artificial Intelligence for improving Situational awareness and Threat monitoring at sea as key factor for success in Naval operation. In Proceedings of the Environment, Technologies, Resources, Proceedings of the International Scientific and Practical Conference, Rezekne, Latvia, 27–28 June 2024; Volume 4, pp. 49–55. [Google Scholar]
  25. Kanjir, U.; Greidanus, H.; Oštir, K. Vessel detection and classification from spaceborne optical images: A literature survey. Remote Sens. Environ. 2018, 207, 1–26. [Google Scholar] [CrossRef]
  26. Ahmed, M.; El-Sheimy, N.; Leung, H. Dual-Modal Approach for Ship Detection: Fusing Synthetic Aperture Radar and Optical Satellite Imagery. Sensors 2025, 25, 329. [Google Scholar] [CrossRef]
  27. Jeon, I.; Ham, S.; Cheon, J.; Klimkowska, A.M.; Kim, H.; Choi, K.; Lee, I. A real-time drone mapping platform for marine surveillance. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, 42, 385–391. [Google Scholar] [CrossRef]
  28. Park, J.J.; Park, K.A.; Kim, T.S.; Oh, S.; Lee, M. Aerial hyperspectral remote sensing detection for maritime search and surveillance of floating small objects. Adv. Space Res. 2023, 72, 2118–2136. [Google Scholar] [CrossRef]
  29. Zhang, Z.; Lu, X.; Cao, G.; Yang, Y.; Jiao, L.; Liu, F. ViT-YOLO: Transformer-based YOLO for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 2799–2808. [Google Scholar]
  30. PhiliP-Kpae, F.O.; Ogbondamati, L.E.; Ebri, K.F.E. Evaluating Marine Radar Object Detection System Using Yolo-Based Deep Learning Algorithm. Direct Res. J. Eng. Inf. Technol. 2025, 13, 7–15. [Google Scholar] [CrossRef]
  31. International Maritime Organization. Automatic Identification Systems (AIS) Transponders. 2025. Available online: https://www.imo.org/en/OurWork/Safety/Pages/AIS.aspx (accessed on 9 June 2025).
  32. Murray, B.; Perera, L.P. An AIS-based deep learning framework for regional ship behavior prediction. Reliab. Eng. Syst. Saf. 2021, 215, 107819. [Google Scholar] [CrossRef]
  33. V7 Labs. Multimodal Deep Learning: Definition, Examples, Applications. 2025. Available online: https://www.v7labs.com/blog/multimodal-deep-learning-guide (accessed on 9 June 2025).
  34. Zhang, Q.; Wang, L.; Meng, H.; Zhang, Z.; Yang, C. Ship Detection in Maritime Scenes under Adverse Weather Conditions. Remote Sens. 2024, 16, 1567. [Google Scholar] [CrossRef]
  35. Chen, X.; Wei, C.; Xin, Z.; Zhao, J.; Xian, J. Ship Detection under Low-Visibility Weather Interference via an Ensemble Generative Adversarial Network. J. Mar. Sci. Eng. 2023, 11, 2065. [Google Scholar] [CrossRef]
  36. Riveiro, M.J. Visual Analytics for Maritime Anomaly Detection. Ph.D. Thesis, Örebro Universitet, Örebro, Sweden, 2011. [Google Scholar]
  37. Riveiro, M.; Pallotta, G.; Vespe, M. Maritime anomaly detection: A review. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2018, 8, e1266. [Google Scholar] [CrossRef]
  38. Nguyen, D.; Vadaine, R.; Hajduch, G.; Garello, R.; Fablet, R. GeoTrackNet—A maritime anomaly detector using probabilistic neural network representation of AIS tracks and a contrario detection. IEEE Trans. Intell. Transp. Syst. 2021, 23, 5655–5667. [Google Scholar] [CrossRef]
  39. Pradipta, G.A.; Wardoyo, R.; Musdholifah, A.; Sanjaya, I.N.H.; Ismail, M. SMOTE for Handling Imbalanced Data Problem: A Review. In Proceedings of the 2021 Sixth International Conference on Informatics and Computing (ICIC), Virtual Conference, 3–4 November 2021; pp. 1–8. [Google Scholar] [CrossRef]
  40. Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 413–422. [Google Scholar] [CrossRef]
  41. González-Muñiz, A.; Díaz, I.; Cuadrado, A.A.; García-Pérez, D.; Pérez, D. Two-step residual-error based approach for anomaly detection in engineering systems using variational autoencoders. Comput. Electr. Eng. 2022, 101, 108065. [Google Scholar] [CrossRef]
  42. Wijaya, W.M.; Nakamura, Y. Loitering behavior detection by spatiotemporal characteristics quantification based on the dynamic features of Automatic Identification System (AIS) messages. PeerJ Comput. Sci. 2023, 9, e1572. [Google Scholar] [CrossRef] [PubMed]
  43. Duan, H.; Ma, F.; Miao, L.; Zhang, C. A semi-supervised deep learning approach for vessel trajectory classification based on AIS data. Ocean Coast. Manag. 2022, 218, 106015. [Google Scholar] [CrossRef]
  44. Maganaris, C.; Protopapadakis, E.; Doulamis, N. Outlier detection in maritime environments using AIS data and deep recurrent architectures. In Proceedings of the 17th International Conference on PErvasive Technologies Related to Assistive Environments, Crete, Greece, 26–28 June 2024; pp. 420–427. [Google Scholar]
  45. Rong, H.; Teixeira, A.; Guedes Soares, C. A framework for ship abnormal behaviour detection and classification using AIS data. Reliab. Eng. Syst. Saf. 2024, 247, 110105. [Google Scholar] [CrossRef]
  46. Wolsing, K.; Roepert, L.; Bauer, J.; Wehrle, K. Anomaly Detection in Maritime AIS Tracks: A Review of Recent Approaches. J. Mar. Sci. Eng. 2022, 10, 112. [Google Scholar] [CrossRef]
  47. Minßen, F.M.; Klemm, J.; Steidel, M.; Niemi, A. Predicting Vessel Tracks in Waterways for Maritime Anomaly Detection. Trans. Marit. Sci. 2024, 13. [Google Scholar] [CrossRef]
  48. Martinčič, T.; Štepec, D.; Costa, J.P.; Čagran, K.; Chaldeakis, A. Vessel and Port Efficiency Metrics through Validated AIS data. In Proceedings of the Global Oceans 2020: Singapore–U.S. Gulf Coast, Virtual, 5–14 October 2020; pp. 1–6. [Google Scholar] [CrossRef]
  49. Radon, A.N.; Wang, K.; Glässer, U.; Wehn, H.; Westwell-Roper, A. Contextual verification for false alarm reduction in maritime anomaly detection. In Proceedings of the 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 29 October–1 November 2015; pp. 1123–1133. [Google Scholar] [CrossRef]
  50. Su, L.; Zuo, X.; Li, R.; Wang, X.; Zhao, H.; Huang, B. A systematic review for transformer-based long-term series forecasting. Artif. Intell. Rev. 2025, 58, 80. [Google Scholar] [CrossRef]
  51. Capobianco, S.; Millefiori, L.M.; Forti, N.; Braca, P.; Willett, P. Deep learning methods for vessel trajectory prediction based on recurrent neural networks. IEEE Trans. Aerosp. Electron. Syst. 2021, 57, 4329–4346. [Google Scholar] [CrossRef]
  52. Petković, M. Enhancing Maritime Video Surveillance Through Deep Learning and Hybrid Distance Estimation. Ph.D. Thesis, University of Split, Split, Croatia, 2024. [Google Scholar]
  53. Seong, N.; Kim, J.; Lim, S. Graph-Based Anomaly Detection of Ship Movements Using CCTV Videos. J. Mar. Sci. Eng. 2023, 11, 1956. [Google Scholar] [CrossRef]
  54. Xue, H.; Chen, X.; Zhang, R.; Wu, P.; Li, X.; Liu, Y. Deep learning-based maritime environment segmentation for unmanned surface vehicles using superpixel algorithms. J. Mar. Sci. Eng. 2021, 9, 1329. [Google Scholar] [CrossRef]
  55. Bilous, N.; Malko, V.; Frohme, M.; Nechyporenko, A. Comparison of CNN-Based Architectures for Detection of Different Object Classes. AI 2024, 5, 2300–2320. [Google Scholar] [CrossRef]
  56. Matasci, G.; Plante, J.; Kasa, K.; Mousavi, P.; Stewart, A.; Macdonald, A.; Webster, A.; Busler, J. Deep learning for vessel detection and identification from spaceborne optical imagery. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2021, 3, 303–310. [Google Scholar] [CrossRef]
  57. Huang, Y.; Han, D.; Han, B.; Wu, Z. ADV-YOLO: Improved SAR ship detection model based on YOLOv8. J. Supercomput. 2025, 81, 34. [Google Scholar] [CrossRef]
  58. Li, X. Ship segmentation via combined attention mechanism and efficient channel attention high-resolution representation network. J. Mar. Sci. Eng. 2024, 12, 1411. [Google Scholar] [CrossRef]
  59. Xing, Z.; Ren, J.; Fan, X.; Zhang, Y. S-DETR: A transformer model for real-time detection of marine ships. J. Mar. Sci. Eng. 2023, 11, 696. [Google Scholar] [CrossRef]
  60. Guo, L.; Wang, Y.; Guo, M.; Zhou, X. YOLO-IRS: Infrared Ship Detection Algorithm Based on Self-Attention Mechanism and KAN in Complex Marine Background. Remote Sens. 2024, 17, 20. [Google Scholar] [CrossRef]
  61. Farahnakian, F.; Heikkonen, J. Deep learning based multi-modal fusion architectures for maritime vessel detection. Remote Sens. 2020, 12, 2509. [Google Scholar] [CrossRef]
  62. Kalliovaara, J.; Jokela, T.; Asadi, M.; Majd, A.; Hallio, J.; Auranen, J.; Seppänen, M.; Putkonen, A.; Koskinen, J.; Tuomola, T.; et al. Deep learning test platform for maritime applications: Development of the em/s salama unmanned surface vessel and its remote operations center for sensor data collection and algorithm development. Remote Sens. 2024, 16, 1545. [Google Scholar] [CrossRef]
  63. Lu, Y.; Yang, K.; Yang, D.; Ding, H.; Weng, J.; Liu, R.W. Graph Learning-Driven Multi-Vessel Association: Fusing Multimodal Data for Maritime Intelligence. arXiv 2025, arXiv:2504.09197. [Google Scholar]
  64. Lu, Y.; Ma, H.; Smart, E.; Vuksanovic, B.; Chiverton, J.; Prabhu, S.R.; Glaister, M.; Dunston, E.; Hancock, C. Fusion of camera-based vessel detection and ais for maritime surveillance. In Proceedings of the 2021 26th International Conference on Automation and Computing (ICAC), Portsmouth, UK, 2–4 September 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–6. [Google Scholar]
  65. MIT Sea Grant Autonomous Underwater Vehicles Lab. AUV Lab – Marine Perception Datasets (AUVLab). 2022. Available online: https://seagrant.mit.edu/auvlab-datasets-marine-perception-2-3/ (accessed on 9 June 2025).
  66. Protopapadakis, E.; Voulodimos, A.; Doulamis, A.; Doulamis, N.; Dres, D.; Bimpas, M. Stacked autoencoders for outlier detection in over-the-horizon radar signals. Comput. Intell. Neurosci. 2017, 2017, 5891417. [Google Scholar] [CrossRef]
  67. Yao, S.; Guan, R.; Wu, Z.; Ni, Y.; Huang, Z.; Liu, R.W.; Yue, Y.; Ding, W.; Lim, E.G.; Seo, H.; et al. Waterscenes: A multi-task 4d radar-camera fusion dataset and benchmarks for autonomous driving on water surfaces. IEEE Trans. Intell. Transp. Syst. 2024, 25, 16584–16598. [Google Scholar] [CrossRef]
  68. Varga, M.; Liggett, K.K.; Bivall, P.; Lavigne, V. Exploratory Visual Analytics (STO-TR-IST-141); Technical Report; NATO: Brussels, Belgium, 2023. [Google Scholar]
  69. FAO. The State of World Fisheries and Aquaculture 2020; Food and Agriculture Organization of the United Nations: Rome, Italy, 2020. [Google Scholar]
  70. Jin, M.; Shi, W.; Lin, K.C.; Li, K.X. Marine piracy prediction and prevention: Policy implications. Mar. Policy 2019, 108, 103528. [Google Scholar] [CrossRef]
  71. Li, H.; Yang, Z. Towards safe navigation environment: The imminent role of spatio-temporal pattern mining in maritime piracy incidents analysis. Reliab. Eng. Syst. Saf. 2023, 238, 109422. [Google Scholar] [CrossRef]
  72. Fahreza, M.I.; Hirata, E. Maritime piracy and armed robbery analysis in the Straits of Malacca and Singapore through the utilization of natural language processing. Marit. Policy Manag. 2024, 52, 709–722. [Google Scholar] [CrossRef]
  73. Hu, Z.; Sun, Y.; Zhao, Y.; Wu, W.; Gu, Y.; Chen, K. Msif-Sstr: A Ship Smuggling Trajectory Recognition Method Based on Multi-Source Information Fusion. Appl. Ocean. Res. 2025. [Google Scholar] [CrossRef]
  74. Sun, Z.; Yang, Q.; Yan, N.; Chen, S.; Zhu, J.; Zhao, J.; Sun, S. Utilizing deep learning algorithms for automated oil spill detection in medium resolution optical imagery. Mar. Pollut. Bull. 2024, 206, 116777. [Google Scholar] [CrossRef]
  75. Deo, R.; John, C.M.; Zhang, C.; Whitton, K.; Salles, T.; Webster, J.M.; Chandra, R. Deepdive: Leveraging Pre-trained Deep Learning for Deep-Sea ROV Biota Identification in the Great Barrier Reef. Sci. Data 2024, 11, 957. [Google Scholar] [CrossRef]
  76. Taipalmaa, J.; Raitoharju, J.; Queralta, J.P.; Westerlund, T.; Gabbouj, M. On automatic person-in-water detection for marine search and rescue operations. IEEE Access 2024, 12, 52428–52438. [Google Scholar] [CrossRef]
  77. Wang, S.; Han, Y.; Chen, J.; Zhang, Z.; Wang, G.; Du, N. A deep-learning-based sea search and rescue algorithm by UAV remote sensing. In Proceedings of the 2018 IEEE CSAA Guidance, Navigation and Control Conference (CGNCC), Xiamen, China, 10–12 August 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–5. [Google Scholar]
  78. Moosbauer, S.; Konig, D.; Jakel, J.; Teutsch, M. A benchmark for deep learning based object detection in maritime environments. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019. [Google Scholar]
  79. Kiefer, B.; Kristan, M.; Perš, J.; Žust, L.; Poiesi, F.; Andrade, F.; Bernardino, A.; Dawkins, M.; Raitoharju, J.; Quan, Y.; et al. 1st workshop on maritime computer vision (macvi) 2023: Challenge results. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 265–302. [Google Scholar]
  80. AISHub. AIS Data Sharing and Vessel Tracking by AISHub. 2025. Available online: https://www.aishub.net/ (accessed on 27 June 2025).
  81. Al-Saad, M.; Aburaed, N.; Panthakkan, A.; Al Mansoori, S.; Al Ahmad, H.; Marshall, S. Airbus ship detection from satellite imagery using frequency domain learning. In Proceedings of the Image and Signal Processing for Remote Sensing XXVII, Online, 13–18 September 2021; SPIE: Bellingham, WA, USA, 2021; Volume 11862, pp. 279–285. [Google Scholar]
  82. Shao, Z.; Wu, W.; Wang, Z.; Du, W.; Li, C. Seaships: A large-scale precisely annotated dataset for ship detection. IEEE Trans. Multimed. 2018, 20, 2593–2604. [Google Scholar] [CrossRef]
  83. Huang, W.; Feng, H.; Xu, H.; Liu, X.; He, J.; Gan, L.; Wang, X.; Wang, S. Surface Vessels Detection and Tracking Method and Datasets with Multi-Source Data Fusion in Real-World Complex Scenarios. Sensors 2025, 25, 2179. [Google Scholar] [CrossRef]
  84. Nanda, A.; Cho, S.W.; Lee, H.; Park, J.H. KOLOMVERSE: Korea Open Large-Scale Image Dataset for Object Detection in the Maritime Universe. IEEE Trans. Intell. Transp. Syst. 2024, 25, 20832–20840. [Google Scholar] [CrossRef]
  85. Wei, S.; Zeng, X.; Qu, Q.; Wang, M.; Su, H.; Shi, J. HRSID: A high-resolution SAR images dataset for ship detection and instance segmentation. IEEE Access 2020, 8, 120234–120254. [Google Scholar] [CrossRef]
  86. Chen, S.Q.; Zhan, R.H.; Zhang, J. Robust single stage detector based on two-stage regression for SAR ship detection. In Proceedings of the 2nd International Conference on Innovation in Artificial Intelligence, London, UK, 26 July 2018; pp. 169–174. [Google Scholar]
  87. Watson, R.A. A database of global marine commercial, small-scale, illegal and unreported fisheries catch 1950–2014. Sci. Data 2017, 4, 170039. [Google Scholar] [CrossRef]
  88. Blondeau-Patissier, D.; Schroeder, T.; Suresh, G.; Li, Z.; Diakogiannis, F.I.; Irving, P.; Witte, C.; Steven, A.D. Detection of marine oil-like features in Sentinel-1 SAR images by supplementary use of deep learning and empirical methods: Performance assessment for the Great Barrier Reef marine park. Mar. Pollut. Bull. 2023, 188, 114598. [Google Scholar] [CrossRef] [PubMed]
  89. Lou, X.; Liu, Y.; Xiong, Z.; Wang, H. Generative knowledge transfer for ship detection in SAR images. Comput. Electr. Eng. 2022, 101, 108041. [Google Scholar] [CrossRef]
  90. Martinez-Esteso, J.P.; Castellanos, F.J.; Rosello, A.; Calvo-Zaragoza, J.; Gallego, A.J. On the use of synthetic data for body detection in maritime search and rescue operations. Eng. Appl. Artif. Intell. 2025, 139, 109586. [Google Scholar] [CrossRef]
  91. Zhao, T.; Wang, Y.; Li, Z.; Gao, Y.; Chen, C.; Feng, H.; Zhao, Z. Ship detection with deep learning in optical remote-sensing images: A survey of challenges and advances. Remote Sens. 2024, 16, 1145. [Google Scholar] [CrossRef]
  92. Guo, Y.; Zhou, L. MEA-Net: A lightweight SAR ship detection model for imbalanced datasets. Remote Sens. 2022, 14, 4438. [Google Scholar] [CrossRef]
  93. Giannopoulos, A.; Gkonis, P.; Bithas, P.; Nomikos, N.; Kalafatelis, A.; Trakadas, P. Federated learning for maritime environments: Use cases, experimental results, and open issues. J. Mar. Sci. Eng. 2024, 12, 1034. [Google Scholar] [CrossRef]
Figure 1. An illustration of SAR and optical (RGB) imagery data through satellite.
Figure 2. An illustration of working system of AIS.
Figure 3. Multimodal data fusion approach to deep learning application of maritime security.
Table 1. Summary of deep learning models for maritime image and video analysis in the context of surveillance and situational awareness.
Model Architecture TypeApplicationStrengthsWeaknessesLatency/Comp. CostOperational Context/Key LimitationReported Performance
YOLO (v4–v8) [55]CNN (single-shot, anchor-based)Ship/Object DetectionVery fast inference; high accuracy; real-time capable.Anchor design requires tuning; may miss very small objects.Low (e.g., 20–50 FPS on modern GPUs). Suitable for real-time edge deployment on USVs or drones.Performance degrades with heavy sun glare or reflective water, which can obscure small vessel outlines. Not ideal for highly cluttered port scenes without fine-tuning.mAP = ∼ 0.88 , F1 = ≈ 0.88 (on general object benchmarks)
RetinaNet [56]CNN (one-stage, FPN)Ship/Object DetectionHandles class imbalance with focal loss; high detection accuracy.Comparatively heavy; slower than YOLO; may struggle on tiny targets.Medium. Slower than pure single-stage detectors, may not be suitable for high-frame-rate video surveillance without powerful hardware.Focal loss is effective for the data imbalance of rare vessel types (e.g., specific illegal fishing boats), but still challenged by severe weather or high sea states which can obscure small targets.F1 = ≈ 0.795 on multiscale spaceborne dataset
CNN-MR [8]CNN (multi-resolution input)SAR Ship ClassificationUtilizes multi-scale SAR inputs for richer features; excellent classification.Requires multi-resolution SAR data; more complex input.High. Processing and fusing multi-resolution SAR is computationally intensive and not suited for real-time edge applications.Specialized for SAR imagery; invaluable for all-weather, night-time surveillance where optical sensors fail. Not applicable to standard optical/IR data streams.F1 = 0.94
EL-YOLO [12] | CNN (YOLOv8 variant) | Ship/Object Detection (RGB) | Lightweight YOLOv8 variant; improved bounding box regression (AWIoU, SMFN); better small-object performance. | Still CNN-heavy; many components to tune. | Low to Medium. Optimized for a reduced footprint suitable for edge devices, but added components can be more demanding than the simplest YOLO variants. | Specifically tuned for maritime scenes with many small vessels. As an RGB model, its performance is entirely dependent on good visibility and lighting conditions. | mAP@0.5 = 0.672, mAP@0.5:0.9 = 0.348 on SeaShips (significant gain over YOLOv3-tiny)
ADV-YOLO [57] | CNN (YOLOv8 variant) | SAR Ship Detection | Enhanced for SAR: space-to-depth and dilation modules; uses WIoU loss. | May be heavyweight; specialised to SAR imagery. | High. The specialised modules add significant computational overhead compared to a baseline YOLOv8. | A highly specialised model designed to extract better features from SAR images. Excellent for overcoming adverse weather, but not general-purpose for other sensor types. | HRSID: AP@50:95 ≈ 70% (+4.5% vs. YOLOv8n); SSDD: AP@50 +1.1%.
CA2HRNet [58] | CNN (HRNet with attention) | Ship Segmentation/Detection | High-resolution feature extraction with combined channel/spatial attention; achieves very high accuracy and IoU. | Computationally heavy (segmentation network); specialised. | Very High. The segmentation component adds significant overhead, making it unsuitable for real-time detection tasks. | Designed for high-precision segmentation, not just detection. Useful for tasks like precise spill-area estimation or docking assistance, but too slow for general real-time tracking. | Accuracy = 99.77%, F1 = 97.0%, IoU = 96.97%
S-DETR [59] | Transformer (DETR-based) | Ship/Object Detection | End-to-end detection; built-in scale attention and dense queries for multi-scale ships; comparable speed to single-stage models. | Higher complexity; slow convergence; needs many epochs. | High (Training), Medium (Inference). Requires significantly more data and longer training than CNNs. Inference speed can approach real-time. | Built-in attention is theoretically more robust for scenes with vessels of vastly different sizes. However, it is less mature in operational maritime deployments than the well-established YOLO family. | Achieves state-of-the-art multi-scale detection in trials (real-time capable)
YOLO-IRS [60] | CNN + Transformer (Swin) | IR Ship Detection | YOLOv10-based IR model with Swin Transformer backbone; better small/weak-target detection; anti-interference. | Slightly higher complexity; still emerging research. | Medium. The Swin Transformer backbone adds computational overhead compared to a pure CNN backbone, but is optimized for efficiency. | Specialised for infrared (IR) data. Highly effective at night or for detecting vessels with thermal signatures (e.g., running engines) against a cooler water background. May struggle in daytime. | +1.3% precision, +0.5% mAP@50, +1.7% mAP@50:95 vs. YOLOv10
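The detection scores reported above (mAP@0.5, AP@50:95) all derive from intersection-over-union (IoU) between predicted and ground-truth boxes: a detection counts as correct when its IoU clears a threshold (0.5 for mAP@0.5; averaged over thresholds from 0.5 to 0.95 for AP@50:95). A minimal sketch with hypothetical box coordinates:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Hypothetical prediction vs. ground truth: a 2-pixel offset
# still clears the 0.5 threshold used by mAP@0.5.
pred, truth = [10, 10, 50, 50], [12, 12, 52, 52]
print(round(iou(pred, truth), 3))  # 0.822
```

For small vessels the same pixel offset costs far more IoU than for large ones, which is one reason small-object detection dominates the weaknesses column above.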
Table 2. Deep learning architectures for multi-sensor fusion in maritime surveillance.
Fusion Type | Sensors Combined | Techniques Used | Applications | Practical Challenges and Considerations | Performance Highlights
Early Fusion [61] | RGB (EO) + IR imagery | CNN (concatenated inputs) | Vessel detection in visible/thermal | Requires precise pixel-level alignment and calibration between sensors, which is very difficult to maintain on a moving, vibrating platform. Any misalignment can corrupt the input data and degrade model performance. | Fusing raw pixel data allows the CNN to learn combined features; robust in mixed lighting.
Mid Fusion [61] | RGB + IR imagery | CNN (feature-level fusion) | Vessel detection across modalities | Architecturally complex. Balancing and normalising features from different modalities (e.g., visual texture vs. thermal intensity) before fusion is crucial to avoid one sensor's features dominating the other. Requires careful network design. | Multi-modal mid-fusion gave the highest accuracy: AP ≈ 79.1% (daytime) and 61.6% (night), outperforming uni-modal baselines.
Late Fusion [61] | RGB + IR imagery | CNN (separate branches) | Ensemble detection/classification | Can be less efficient, as it requires running multiple full models. The primary challenge lies in designing the decision-level logic to effectively associate or resolve conflicting detections from the different sensor streams. | Decision-level fusion improves robustness; effectively integrates complementary IR and RGB cues.
Mid Fusion [66] | AIS + Marine Radar | RNN, CNN | Vessel behaviour classification | The major challenge is robust data association between sparse, high-latency AIS signals and continuous radar tracks. Prone to failure if AIS signals are spoofed, delayed, or lost (e.g., 'dark vessels'), making it difficult to reliably link a radar blip to a vessel identity. | Learns spatiotemporal patterns from trajectories and radar; showed moderate precision (data-limited) in identifying vessel status.
Association (graph) [63] | AIS + EO Video (CCTV) | GNN with attention | Multi-target vessel association | Requires complex and continuous temporal and spatial alignment: matching sparse AIS pings to continuous video frames and co-registering world coordinates with pixel coordinates. High vessel density can lead to incorrect associations. | Graph-based fusion with spatiotemporal attention improved association accuracy and robustness.
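The three image-level fusion strategies in Table 2 differ only in where the RGB and IR streams meet. A minimal NumPy sketch (the array shapes, the average-pool stand-in for a backbone, and the score-averaging rule are illustrative assumptions, not the cited architectures):

```python
import numpy as np

rgb = np.random.rand(3, 64, 64)   # RGB image: 3 channels
ir  = np.random.rand(1, 64, 64)   # IR image: 1 channel, pixel-aligned to RGB

# Early fusion: concatenate raw pixels channel-wise; one network sees both,
# which is why pixel-level registration errors are so damaging here.
early_input = np.concatenate([rgb, ir], axis=0)     # shape (4, 64, 64)

# Mid fusion: run each modality through its own feature extractor, then
# merge the features (a global average pool stands in for a real backbone).
feat_rgb = rgb.mean(axis=(1, 2))                    # (3,) feature vector
feat_ir  = ir.mean(axis=(1, 2))                     # (1,) feature vector
mid_features = np.concatenate([feat_rgb, feat_ir])  # (4,)

# Late fusion: each branch produces its own detection confidence and a
# decision rule (here a simple average) merges them afterwards.
score_rgb, score_ir = 0.8, 0.6   # hypothetical per-branch confidences
late_score = (score_rgb + score_ir) / 2

print(early_input.shape, mid_features.shape, late_score)
```

The trade-off visible even in this toy version: early fusion exposes the most raw information but demands alignment; late fusion needs no alignment but discards cross-modal detail before the decision step.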
Table 3. Summary of deep learning models for specific maritime security applications.
Model | Architecture Type | Application | Strengths | Weaknesses | Reported Performance
BiLSTM-CNN-Attention [19] | BiLSTM, CNN, and attention mechanism | Illegal Fishing Detection | High accuracy; real-time capable; captures both past and future context in the sequential data | Data bias problems; misclassifies stow-net vessels and gillnetters as illegal fishing trawlers | Accuracy ≈ 74%, Precision = 0.7562, Recall = 0.7410, F1 Score = 0.7408
FishNet [4] | A combination of DenseNet, Feature Fusion (CNN-based module), and Multilevel Feature Aggregation | Fishing vessel classification | High accuracy | Longer training time | Accuracy ≈ 90%, Precision = 0.9017, Recall = 0.8981, F1 Score = 0.8971
Stacked-YOLOv5 [15] | CNN (YOLOv5) | Lit fishing boat detection | Improved feature extraction and detection performance | Poor detection accuracy when lights from non-fishing vessels introduce noise | Precision = 0.966, Recall = 0.930, mAP@0.5 = 0.931, F1 Score = 0.948
YOLOv10s [3] | CNN (YOLOv10 small) | Dark vessel detection | Able to detect small ships; reduced architecture with unnecessary Conv and C2f layers removed | The proposed pipeline demands high computational resources. | AP@50 = 0.8588, AP@50:95 = 0.6631, precision = 0.9370, recall = 0.9381, specificity = 0.9869
YOLOv8m [73] | CNN (YOLOv8m) | Ship-to-ship smuggling detection | High accuracy; fusion of radar trajectories and the corresponding meteorological data | Higher complexity | F1 = 0.97, accuracy = 94%
Faster R-CNN with ResNet101 [2] | CNN detectors (YOLOv2–v3, Faster R-CNN) with feature extractors (GoogLeNet, ResNet18, ResNet50, and ResNet101) | Small inflatable smuggling boat detection | Faster R-CNN with ResNet101 achieves a high detection rate | Higher complexity; slow convergence; needs many epochs; reduced detection capability in varying environmental conditions | Accuracy = 95%, mIoU = 79%
Sw-YoloX [16] | CNN (Convolutional Block Attention Module, Atrous Spatial Pyramid Pooling) | Search and Rescue Operations | High accuracy | Requires pruning for lower weights to reduce memory overhead | F1 = 0.78, mAP = 54, recall = 0.72
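Sequence models such as the BiLSTM-based illegal-fishing detector above consume per-step features derived from raw AIS points; changes in speed over ground and signed course changes are typical inputs, since trawling often shows as slow speeds with frequent turns. A hedged sketch (the tuple layout and the example track are hypothetical, not the cited pipeline):

```python
def ais_features(track):
    """Turn a list of AIS points (lat, lon, speed-over-ground in knots,
    course-over-ground in degrees) into per-step features suitable as
    input to a sequence classifier such as a BiLSTM."""
    feats = []
    for prev, cur in zip(track, track[1:]):
        d_sog = cur[2] - prev[2]                      # speed change (acceleration proxy)
        d_cog = (cur[3] - prev[3] + 180) % 360 - 180  # signed turn, wrapped to [-180, 180)
        feats.append([cur[2], d_sog, d_cog])
    return feats

# Hypothetical track: a vessel slowing sharply and turning hard,
# the kind of pattern trawling-detection models key on.
track = [(54.1, 7.2, 10.0, 90.0), (54.1, 7.3, 4.0, 120.0), (54.1, 7.4, 3.5, 350.0)]
print(ais_features(track))  # [[4.0, -6.0, 30.0], [3.5, -0.5, -130.0]]
```

The modulo wrap matters: without it, the 120° → 350° change would read as a +230° turn instead of the true −130°, corrupting exactly the feature such models rely on.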
Table 4. Image datasets for deep learning in maritime domain, either open access or available on request from the concerned authors.
Dataset Name | Sensor/Modality | Data Type | Annotations | Size/Scale | Limitations
WaterScenes [67] | Camera (RGB), 4D Radar, GPS/IMU | Image sequences (video) | 2D bounding boxes (camera), 3D point clusters (radar) | 54,120 RGB frames + radar scans; ~200k object instances | Same locale (Singapore); limited weather range.
SeaDronesSee [79] | UAV RGB video | Images & video | Bounding boxes (boats, people, flares); track IDs (multi/SOT) | 8930 train + 3750 test images (drones); includes full video clips for tracking | Mostly temperate marine conditions; daytime imagery
Airbus Ship Detection [81] | Satellite optical (SPOT) | Image chips | Pixel-wise ship masks (RLE) | 231,723 images, 81,723 contain at least one ship | Primarily daylight RGB; many empty frames; oriented masks
SeaShips [82] | Shore-based cameras (RGB) | Images | Bounding boxes + ship type (6 classes) | 31,455 images of coastal traffic | Fixed coastal perspectives; limited environmental diversity
SPSCD [83] | Port surveillance (RGB) | Images | Bounding boxes + ship class (12 types) | 19,337 images, 27,849 labeled ship instances | Focused on port environments; no AIS tracking
KOLOMVERSE [84] | UAV 4K images | Images | Bounding boxes (vessels) | 100,000+ 4K images of one class ("boat") | Single object class ("boat"); access upon request
HRSID [85] | SAR imagery | Images | Bounding boxes (ships) | 5604 high-res SAR images, 16,951 ship instances | SAR-only modality (requires specialised processing)
SSDD [86] | SAR imagery (Sentinel-1, TerraSAR-X) | Images | Bounding boxes (ships) | 2752 SAR image chips (ships/non-ships) | Limited to SAR; chip-based (small images)
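The Airbus Ship Detection masks in Table 4 are distributed as run-length-encoded (RLE) strings of 1-indexed "start length" pairs over a column-major pixel ordering, following the Kaggle release convention. A minimal decoder sketch (the 4 × 4 shape is illustrative; the actual chips are 768 × 768):

```python
import numpy as np

def rle_decode(rle, shape=(768, 768)):
    """Decode a run-length-encoded mask string of 1-indexed "start length"
    pairs into a 2-D binary mask, assuming column-major pixel ordering
    (the Airbus Ship Detection / Kaggle convention)."""
    mask = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    nums = list(map(int, rle.split()))
    for start, length in zip(nums[0::2], nums[1::2]):
        mask[start - 1 : start - 1 + length] = 1
    return mask.reshape(shape, order="F")  # column-major -> 2-D

# Toy mask on a 4x4 grid: runs of 3 and 2 pixels.
mask = rle_decode("1 3 10 2", shape=(4, 4))
print(mask.sum())  # 5
```

Getting the ordering wrong (row-major instead of column-major) produces masks that look plausible per-pixel but are transposed relative to the imagery, a common silent bug when training on this dataset.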
Table 5. Deep learning datasets for specific applications in maritime domain, either open access or available on request from the concerned authors.
Dataset Name | Application | Size/Scale | Limitations
Global Fisheries Catch 1950–2014 [87] | A database of global marine commercial, small-scale, illegal, and unreported fisheries catch, 1950–2014 | Nearly 868 million records with 12 descriptive fields, structured in 5-year blocks starting from 1950 | Data can be heavily skewed toward certain regions or time periods, undermining representativeness
FishingVesselSAR [4] | SAR images for fishing vessel classification | 369 high-resolution SAR images (116 gillnetters, 72 seiners, and 181 trawlers) | Data can be heavily skewed toward certain regions or time periods, undermining representativeness
Luojia1-01 [15] | Nighttime SAR images for fishing vessel classification | 1364 high-resolution SAR images of 1281 lit fishing vessels | The sample dataset is relatively small, and lights from non-fishing vessels may introduce noise.
Maritime Piracy Incidents [70] | Structured data of piracy incidents | 8369 records of piracy incidents from 1990–2021 | The dataset primarily focuses on high-risk areas, potentially overlooking other regions.
HS3-S2 [3] | SAR, Sentinel-2, and high-resolution optical images for detecting suspicious maritime activities | 69,331 images | Integrating multiple sources of satellite imagery increases the complexity of pre-processing and model training; the varying resolutions of images from different sources can also pose challenges in standardising the input data for the detection model.
HN_BF [73] | Ship trajectories near the Qiongzhou Strait in China from March to May 2024 | 5337 labeled trajectories, including 1473 labeled "Big flyer" and the rest "Normal" | Focused on one particular region, which may limit model generalisation when deployed outside the specified region.
CSIRO [88] | Oil spill detection dataset | 5630 image chips: 3725 chips of class 0 (no oil features) and 1905 chips of class 1 (containing oil features) | Look-alike features such as wind shadows, reef structures, or biogenic slicks may increase the false-positive rate of oil-like feature detection.
Oil spill [21] | Oil spill segmentation and classification dataset | 19,544 RGB images: 8376 cropped images, 3168 resized images, and 8000 synthetic images | The dataset is imbalanced, with certain types of oil spills underrepresented compared to others; the images come from various sources with different resolutions, which can affect model performance.
Deepdive [75] | Deep-sea biota images captured by a remotely operated vehicle (ROV) | 4158 images of deep-sea biota belonging to 62 different classes | The manual labeling process, despite rigorous quality control, may still introduce errors due to the complexity of deep-sea biota shapes and overlapping boundaries.
SeaDronesSee [79] | UAV videos for maritime surveillance, rescue operations, human detection in aquatic environments, and drone-based vision research | 54,000 images with 400,000 instances, with class labels such as boats, people, and buoys | It is a synthetic dataset; however, the effectiveness of computer vision algorithms relies heavily on real-case training data.
SAR-HumanDetection-FinlandProper [76] | UAV images for maritime surveillance, rescue operations, human detection in aquatic environments, and drone-based vision research | 72,000 images of instances with the positive class label swimming/floating person | The dataset lacks complex scenarios and weather conditions, as the images were captured in daylight and clear summer weather; it may be ineffective for detection tasks in real-world cases.
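Several datasets above (e.g., the CSIRO and Oil spill sets) list class imbalance as a limitation; a common mitigation is inverse-frequency class weighting in the training loss. A minimal sketch using the CSIRO chip counts from Table 5 (the mean-1 normalisation is one convention among several in use):

```python
def inverse_frequency_weights(counts):
    """Per-class loss weights proportional to 1/frequency, normalised
    so the weights average to 1 (one common convention; others exist)."""
    total = sum(counts)
    raw = [total / c for c in counts]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]

# Class counts from the CSIRO oil spill entry in Table 5:
# 3725 chips without oil features, 1905 chips with oil features.
weights = inverse_frequency_weights([3725, 1905])
print(weights)  # minority "oil" class receives the larger weight
```

These weights would typically be passed to a weighted cross-entropy loss so that errors on the rare "oil" class cost roughly twice as much as errors on the majority class, counteracting the skew noted in the limitations column.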

Share and Cite

MDPI and ACS Style

Talpur, K.; Hasan, R.; Gocer, I.; Ahmad, S.; Bhuiyan, Z. AI in Maritime Security: Applications, Challenges, Future Directions, and Key Data Sources. Information 2025, 16, 658. https://doi.org/10.3390/info16080658
