Article

RAFS-Net: A Robust Adversarial Fusion Framework for Enhanced Maritime Surveillance in Hostile Environments

1
Naval Architecture and Shipping College, Guangdong Ocean University, Zhanjiang 524005, China
2
Guangdong Provincial Key Laboratory of Intelligent Equipment for South China Sea Marine Ranching, Zhanjiang 524005, China
3
Technical Research Center for Ship Intelligence and Safety Engineering of Guangdong Province, Zhanjiang 524005, China
4
School of Civil and Transportation Engineering, Guangdong University of Technology, Guangzhou 510006, China
*
Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2025, 13(11), 2195; https://doi.org/10.3390/jmse13112195
Submission received: 30 October 2025 / Revised: 12 November 2025 / Accepted: 14 November 2025 / Published: 18 November 2025
(This article belongs to the Topic Coastal Engineering: Past, Present and Future)

Abstract

Deep learning-based intelligent ship surveillance technology has become an indispensable component of modern maritime intelligent perception, with its adversarial defense capabilities serving as a crucial guarantee for reliable and stable monitoring. However, current research on deep learning-based ship surveillance primarily focuses on minimizing the discrepancy between predicted labels and ground truth labels, overlooking the equal importance of enhancing defense capabilities in the adversarial technology-laden maritime environment. To address this challenge and improve model robustness and stability, this study proposes a novel framework termed the Robust Adversarial Fusion Surveillance Net Framework (RAFS-Net). Utilizing ResNet as the backbone network foundation, the framework constructs a ship adversarial attack chain through an adversarial generation module. An adversarial training module enables the model to comprehensively learn adversarial perturbation features. These dual modules effectively rectify abnormal decision boundaries via a synergistic mechanism, compelling the model to learn robust feature representations resilient to malicious interference. Experimental results demonstrate that the framework maintains stable and efficient detection capabilities even in marine environments saturated with interfering information. By systematically integrating gradient-driven adversarial sample generation and an end-to-end training mechanism, it achieves a performance breakthrough of 9.1% in mean Average Precision (mAP) on the ship adversarial benchmark dataset, providing technical support for maritime surveillance models in complex adversarial environments.

1. Introduction

The shipping industry, serving as the lifeblood of the global economic system, undertakes over 90% of global trade volume, with vessels constituting the core carriers within this vast transportation network [1]. The burgeoning development of artificial intelligence has propelled deep learning-based intelligent ship surveillance technology to achieve qualitative breakthroughs in maritime monitoring, owing to its exceptional data-fitting capabilities. Consequently, it has been widely adopted in vessel surveillance [2], significantly enhancing the efficiency and accuracy of ship management tasks. However, distinct from conventional surveillance tasks, maritime surveillance serves critical functions in civilian contexts [3]—such as ocean resource regulation, illegal fishing identification, and maritime search and rescue coordination—while at the military application level [4], it provides vital support for territorial waters patrol, maritime rights protection, and strategic port facility monitoring. The stable operation of maritime surveillance systems forms a crucial line of defense for safeguarding shipping safety and maintaining the orderly functioning of global supply chains. Its reliability directly impacts trillions of dollars in maritime trade, the safety of millions of crew members, and the sustainability of marine ecosystems.
Therefore, ship surveillance based on deep learning technology is an indispensable component of next-generation maritime strategic planning, with its core requirement being to ensure high-level safety and stability across diverse environments. For instance, incidents of vessel signal spoofing and jamming in the Red Sea region increased significantly in 2025 [5]. Such adversarial interference not only substantially escalates navigational complexity but also directly undermines vessels’ ability to transmit emergency distress signals. In today’s increasingly adversarial maritime environment, the robustness of automated ship surveillance models faces severe challenges.
Although existing deep learning ship surveillance models excel in vessel detection across numerous scenarios, the inherent black-box nature of deep learning models renders their intermediate features non-interpretable. This results in weak adversarial attack and defense capabilities, leaving subtle vulnerabilities that can be readily exploited. Such vulnerabilities provide opportunities for malicious actors to evade surveillance through opportunistic means. For example, in the field of Synthetic Aperture Radar (SAR) ship detection, studies reveal that adversaries can exploit gradient information of deep learning models to generate visually imperceptible adversarial perturbations, which are superimposed on original SAR images [6]. These meticulously crafted disturbances can successfully “deceive” advanced detection models. Furthermore, in optical surveillance tasks under visible-light scenarios, malicious actors can distort vessels’ imaging characteristics in sensors through camouflage [7], causing deep learning models to misidentify them as floating debris or background noise. Such flaws lead to critical failures, including missed detections of illegal vessels disguised as merchant ships or false alarms in harmless areas. This not only creates covert channels for smuggling, illegal fishing, and unauthorized approaches to sensitive waters but may also trigger collision risks in critical zones like busy shipping lanes or coastal areas. Moreover, it can disrupt the normal operation of maritime management and safety systems, directly threatening lives, property, and regional maritime order. Despite significant progress in precision enhancement, the stability and reliability of existing models remain inadequately validated in mission-critical scenarios—such as maritime law enforcement encounters involving electronic jamming or signal spoofing—where robustness is most essential. 
This gap in robustness research may propagate maritime safety risk chains in practical operations.
In summary, singular improvements in accuracy do not equate to a secure and reliable surveillance model; instead, they highlight substantial room for robustness enhancement. In environments plagued by adversarial attacks, decision boundary rectification and defense capability fortification emerge as pivotal challenges, further underscoring the imperative to enhance the safety and stability of ship surveillance models in complex marine environments.
The critical flaw in current automated vessel surveillance models lies in their extreme vulnerability to adversarial perturbations, which directly causes reliability collapse under malicious attacks. On the one hand, attacks involving occlusion or tampering of sample data severely corrupt the foundational information that models rely on for learning, triggering instability issues such as prediction inaccuracies and behavioral logic confusion and ultimately leading to surveillance failure. On the other hand, carefully crafted adversarial inputs can effectively deceive models, causing severe detection evasion problems (manifested as failures to correctly identify or track target vessels) that result in sharp declines in prediction accuracy, drastic trajectory deviations, and significantly escalated control risks. Therefore, there is an urgent need to develop a vessel type detection model based on an innovative training strategy to fundamentally counteract these two types of adversarial interference. By revolutionizing the training paradigm, the model’s robustness can be substantially elevated, ensuring high-precision and highly reliable surveillance capabilities across diverse complex scenarios, including data corruption and adversarial inputs.
To address the aforementioned challenges regarding the unreliable deployment of current maritime surveillance models, this study proposes a novel vessel surveillance framework based on adversarial training strategies. Its core innovation lies in proactively introducing and learning to counteract simulated adversarial interference data—namely adversarial examples—during model training, thereby significantly enhancing model stability (robustness) in complex jamming environments. Specifically, we simulate various subtle, imperceptible data perturbations that attackers might generate and incorporate these disruptive data alongside normal data for model training. This compels the model not only to recognize standard vessel features but also to learn to ignore or resist such interference. Analogous to stress testing, this approach forces the model to adapt to harsher conditions, ensuring stable recognition performance when encountering similar disruptions in real-world scenarios, thereby reducing misjudgments. Such enhanced stability is paramount for guaranteeing the reliability of maritime regulatory tasks.
Experimental validation on standard vessel image datasets and adversarial interference datasets demonstrates that the RAFS-Net framework achieves high efficiency through its dual-module synergistic mechanism. Its adversarial generation module successfully synthesizes multimodal perturbation samples that disrupt benchmark models, significantly increasing their false detection rates. Concurrently, the adversarial training module designed in this study effectively maintains surveillance stability against heavily perturbed samples. Additionally, a series of supplementary experiments were conducted to evaluate the efficacy of the strategies employed in RAFS-Net.
In summary, the primary contributions of this study are as follows:
  • This study introduces an innovative robust adversarial fusion training framework, whose core lies in systematically integrating adversarial training strategies. Gradient-based adversarial perturbation samples are iteratively generated and injected during training, so that the perturbed samples and the original training data jointly optimize the model parameters. This forces the model to learn more discriminative and robust feature representations under the optimization objectives, effectively enhancing its generalization capability and stability in marine monitoring environments plagued by complex adversarial interference such as digital perturbations and physical camouflage. Compared to baseline models, the framework achieves significant improvements in key metrics: the mean Average Precision (mAP) increases by an average of 9.103%, and the Area Under the Curve (AUC) also improves. This substantially strengthens the system’s overall performance and reliability in adversarial scenarios.
  • This study constructes and provides a novel adversarial benchmark dataset specifically designed for evaluating the robustness of ship surveillance models. The dataset encompasses images from five typical vessel categories. Its core innovation involves applying multiple adversarial attack algorithms with controllable intensity levels to generate corresponding adversarial samples for off-target ship images within the dataset. Compared to existing general ship detection datasets, this offers a standardized adversarial evaluation environment capable of precisely quantifying models’ robustness degradation under attack. It thereby establishes a rigorous benchmark for assessing model generalization performance in adversarial interference environments.
  • This study proposes and elaborates on an adversarial sample generation and training integration method tailored for the Robust Adversarial Fusion Surveillance Network (RAFS-Net). This method not only defines the complete adversarial sample generation workflow within the RAFS-Net framework (including critical hyperparameters such as attack algorithm selection) but, more importantly, seamlessly integrates it into the model’s end-to-end training loop. Through extensive experiments on ship datasets of varying scales and characteristics, the method provides key empirical insights into hyperparameter configurations for generating effective adversarial samples to enhance model robustness. It addresses core parameter setting challenges in adversarial training, such as perturbation magnitude control and balancing attack strength with computational efficiency, ensuring the method’s reproducibility and generalizability.
The remaining content will be presented in the following sections: First, Section 2 elucidates the research status of traditional vessel monitoring methods and deep learning-based visible-light ship monitoring models. Second, Section 3 provides a detailed description of the proposed framework, RAFS-Net, encompassing the problem formulation, adversarial attack generation phase, and adversarial training phase. Subsequently, Section 4 meticulously discusses the experimental setup and analysis of results. Finally, Section 5 summarizes the principal contributions of this study and introduces supplementary research on maritime management to be pursued subsequently.

2. Literature Review

Current ship monitoring technologies are broadly categorized into two classes: traditional vessel monitoring methods and automated ship surveillance models based on visible-light imaging integrated with deep learning techniques.

2.1. Conventional Passive Vessel Monitoring Methods

In the domain of traditional ship monitoring, radar technology and Automatic Identification System (AIS) applications are most prevalent. Data from these technologies are applied through multiple fusion schemes for real-time vessel tracking, collision warning, and behavioral analysis. As core systems for automated vessel identification and tracking, radar and AIS effectively acquire real-time dynamic vessel information, providing critical support for maritime management.
Regarding radar technology, Anderson et al. [8] pioneered the exploration of next-generation High-Frequency Skywave Radar capabilities in ocean surveillance, analyzing both the functionalities it shares with High-Frequency Surface Wave Radar (HFSWR) and its unique characteristics. Through modeling and experimental data, they demonstrated its value in civil, commercial, and scientific applications. However, radar signal processing often suffers from aliasing effects and information loss. Addressing this challenge, Zhou et al. [9] proposed a radar target pole extraction method based on Standard Particle Swarm Optimization (SPSO) and Autoregressive Moving Average (ARMA) modeling. By optimizing model parameters and adopting sliding-window calibration techniques, this approach significantly enhances ship magnetic signature extraction performance in high-frequency radar systems. Concurrently, Shi et al. [10] focused on micro-Doppler (m-D) signals in radar echoes, developing an effective method to extract motion parameters of vessels on time-varying sea surfaces for ship detection and identification in marine environments. Meanwhile, Liu et al. [11] introduced a “power fitting-difference” method: after coordinate transformation and denoising of X-band radar images, automatic extraction is achieved via power function fitting, mean filtering, and connected component analysis, attaining accuracy equivalent to visual interpretation without manual intervention. More recently, Woo-García et al. [12] developed a practical sailboat monitoring and positioning system based on GPRS/GSM and GPS technologies, capable of accurately tracking sailboats within 10 km offshore.
Nevertheless, subsequent studies reveal inherent limitations of radar-dependent systems, such as restricted coverage areas. Particularly in high-density shipping lanes or regions with poor communication, system failures may occur [13]. To mitigate these shortcomings, researchers actively explore fusing AIS with radar data. For example, Chen et al. [14] proposed a weighted trajectory fusion algorithm model that significantly improves vessel target detection accuracy and real-time performance by integrating AIS and radar data. Separately, Li et al. [15] utilized X-band marine radar images combined with machine learning (SVM-FCM) to develop an intelligent marine oil spill detection method, enabling rapid and accurate data support for emergency response. Similarly advancing fusion techniques, Kim et al. [16] integrated AIS with broadband 3G™ radar based on Frequency-Modulated Continuous Wave (FMCW) technology. After preprocessing FMCW radar data and cross-validating with AIS information, they applied this approach to vessel position detection and tracking, demonstrating excellent matching performance, particularly for enhanced monitoring of small vessels. Further advancing integration, Kim et al. [17] constructed a comprehensive monitoring system combining satellite Synthetic Aperture Radar (SAR), High-Frequency (HF) Radar, unmanned aerial vehicles, and AIS for ship detection. Through field experiments, they validated its effectiveness in detecting illegal vessels and explored the application potential of multi-platform data in routine surveillance and emergency response scenarios.
In summary, although traditional monitoring methods based on shipborne equipment like radar and AIS have achieved significant success, they exhibit notable deficiencies in specific scenarios. For instance, traditional passive monitoring systems—highly dependent on vessel communication equipment—often fail to effectively detect non-SOLAS vessels lacking AIS systems or vessels deliberately disabling AIS to evade surveillance [18]. This constitutes a critical gap in maritime safety oversight, severely limiting proactive monitoring capabilities of maritime authorities and posing substantial challenges to ensuring safe and effective maritime management in complex real-world environments.

2.2. Visible Light-Based Automatic Ship Monitoring Modeling Approach Using Deep Learning Techniques

Driven by the rapid advancement of deep learning technologies, their application in ship detection has expanded significantly [2], substantially enhancing task efficiency [19]. Focusing on detection accuracy improvements, researchers have proposed numerous innovative approaches. Ma et al. [20] pioneered a two-stage detection framework based on ship center and orientation prediction. By constructing a center region prediction network and a ship orientation classification network, this method generates rotated region proposals and predicts rotated bounding boxes, enabling accurate detection of arbitrarily oriented ships in optical remote sensing imagery. Similarly, Nie et al. [21] employed a Convolutional Neural Network (CNN) architecture to precisely localize vessels and infer approximate orientations by identifying salient features such as ship bows. To enhance generalization to unknown vessel types, Sun et al. [22] proposed a Siamese network leveraging multi-angle metric learning, improving real-world adaptability with limited samples. For infrared small target detection, Guo et al. [23] designed the FCNet flexible convolution network, integrating dilated and deformable convolutions, multi-stage feature enhancement, and channel attention mechanisms to achieve precise detection and background suppression on their Maritime-SIRST dataset. Addressing small vessel detection challenges, Yin et al. [24] introduced VIOS-Net—a visible-infrared dual-source multi-task framework with shared underlying structures. It employs dual-path feature extraction (shared and exclusive) to fuse day-night spectral information, achieving all-weather vessel monitoring with 96.20% recognition accuracy and halved parameters. Concurrently, Chen et al. [25] adopted a hybrid deep learning approach combining enhanced Generative Adversarial Networks (GANs) to synthesize informative small vessel samples and optimized CNNs for real-time detection, significantly boosting accuracy and robustness. 
Furthermore, Li et al. [26] developed a curriculum learning strategy based on difficulty progression, creating high-precision ship detection models adaptable to diverse weather and illumination conditions. Similarly targeting complex environments, Tian et al. [27] constructed a Lightweight Marine Ship Detector (LMSD-Net) for extreme weather by incorporating improved ELA-C3 modules, WGC-PANet, and CoT blocks, enabling real-time detection while meeting lightweight requirements. Finally, Feng et al. [28] explored knowledge transfer and multi-source information fusion (e.g., Bridging Optical, Infrared, and Satellite Insights) to further enhance monitoring precision.
Significant progress has also been made in accelerating model training and inference. For SAR satellite remote sensing, Zuo et al. [29] proposed TPNet, which rapidly localizes ships using only central and diagonal points, achieving leading accuracy with minimal computational overhead. Validation across three datasets confirmed its suitability for real-time, all-weather maritime monitoring. Similarly pursuing efficiency, Liu et al. [30] introduced a novel deep CNN-based vessel detection and classification method utilizing residual networks to address classifier divergence and Inception layers to increase depth without substantial complexity, effectively improving detection and training speeds. To tackle small targets in complex marine environments, Cai et al. [31] established a “whole-image annotation then slicing” workflow, constructing the fine-grained SDFSD dataset with 15 vessel types from sub-meter SAR imagery and evaluating it via rotated detection models, providing a high-quality benchmark for SAR ship detection. Zeng et al. [32] designed the YOLO-ssboat algorithm, integrating strategies including C2f-DCNv3, MSWPN, Dyhead-v3, and gradient flow-wake detection joint adversarial training, significantly boosting small target vessel recognition accuracy and robustness. Concurrently, Zhang et al. [33] proposed a deep learning-based vessel detection method that enhances small target detection in satellite imagery by integrating DCNv2 modules, introducing DSConv, and employing Dyhead detection heads while maintaining low computational burden.
In summary, current research predominantly focuses on the dual optimization of model accuracy and speed. Nevertheless, studies on the security and robustness of deep learning-based ship detection models remain nascent, presenting a significant research gap requiring urgent attention.

3. Robust Adversarial Fusion Framework for Maritime Surveillance

The adversarial training fusion network framework proposed in this study, termed RAFS-Net (as illustrated in Figure 1), comprises two critical phases: the Adversarial Attack Generation Phase and the Adversarial Training Phase. During the first phase (Adversarial Attack Generation), the target dataset is fed into the framework. Adversarial attack algorithms, such as the Fast Gradient Sign Method (FGSM) [34] and Projected Gradient Descent (PGD) [35], are employed to apply meticulously crafted, visually imperceptible perturbations to original images. These perturbations, though subtle, effectively disrupt the model’s decision-making process because they follow the gradient direction of the model’s loss function.
Subsequently, the second phase (Adversarial Training) commences. Initially, the network model is trained on unperturbed original images to learn fundamental feature representations of the task data. At this stage, different backbone networks can be substituted within the framework based on task requirements. Next, adversarially perturbed samples generated in Phase 1 (Section 3.2) are incorporated into the training dataset and mixed with original images for model training. The network updates its parameters by minimizing a composite loss function that includes both classification loss for original images and classification loss for adversarial samples. This approach continuously exposes the model to adversarial examples during training, thereby compelling it to learn more generalized and robust feature representations while rectifying its classification decision boundaries.
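One way to write the composite objective described above is sketched below. The source does not state how the two terms are weighted, so the weighting factor $\lambda$ and the notation $f_\theta$ for the network are assumptions made here purely for illustration.

```latex
\mathcal{L}_{\text{total}}(\theta)
  = \mathcal{L}_{\text{CE}}\big(f_\theta(x),\, y\big)
  + \lambda \, \mathcal{L}_{\text{CE}}\big(f_\theta(x_{\text{adv}}),\, y\big)
```

With $\lambda = 1$, clean and adversarial samples would contribute equally, which corresponds to simply mixing both kinds of samples in the same training set.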
Through this training strategy, models developed under the RAFS-Net framework maintain high classification performance not only on benign samples but also when confronting maliciously crafted adversarial samples, significantly enhancing practical stability and reliability. The subsequent sections comprehensively elaborate on the RAFS-Net framework, including problem formulation, adversarial attack generation, and adversarial training phases.

3.1. Problem Statement

In this study, we formulate the task as a classification problem under supervised learning, aiming to identify vessel types within images. The network framework is designed to classify ship outboard images captured via optical surveillance cameras in nearshore environments, predicting the vessel category contained in an input image. For the input dataset $D = \{(x_k, y_k)\}_{k=1}^{N}$, $x_k$ represents the profile images of the $k$-th vessel from varying viewing angles, with label $y_k$ denoting the vessel category. Here, $N$ indicates the total number of ship outboard images in the dataset. The objective is to enable the model to output predicted vessel categories, formalized as the mapping function $f_{\text{ship-match}}: x_k \rightarrow \{y_k\}$.

3.2. Adversarial Attack Generation Stage

This phase commences by applying adversarial perturbation generation preprocessing algorithms to perform essential data augmentation on the Deep Learning Ship Dataset $D$. The process initiates with resizing each image $x_i$ to 224 × 224 pixels via a resize function, converting it into tensor format, and normalizing pixel values using a data normalization function that sets the mean and standard deviation of the red, green, and blue channels to 0.5. Subsequently, for each sample $x_i$ in the dataset, its adversarial counterpart $x_{\text{adv}}^{\theta}$ is initialized as a copy of the original image $x_i$.
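The tensor conversion and normalization steps described above can be illustrated with a minimal sketch. It assumes 8-bit input intensities and that tensor conversion rescales them to [0, 1], as is common in image pipelines. The function names are hypothetical and resizing is omitted.

```python
def to_unit_range(pixel_8bit: int) -> float:
    """Tensor-conversion step (assumed): map an 8-bit intensity 0..255 to [0, 1]."""
    return pixel_8bit / 255.0


def normalize(value: float, mean: float = 0.5, std: float = 0.5) -> float:
    """Per-channel normalization with mean 0.5 and standard deviation 0.5."""
    return (value - mean) / std


def preprocess_pixel(rgb):
    """Apply both steps to one (R, G, B) pixel given as 8-bit integers."""
    return tuple(normalize(to_unit_range(channel)) for channel in rgb)


# A black channel maps to -1.0 and a fully saturated channel maps to 1.0.
EXAMPLE = preprocess_pixel((0, 255, 51))
```

Under these assumptions every channel ends up in [-1, 1], which is the range in which the perturbation bound would then be applied.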
Over $T$ iterations, the algorithm iteratively generates perturbations $r_{\text{adv}}^{k}$ for each adversarial sample through a perturbation calculation function. This function computes the gradient of the loss function with respect to the input and adjusts sample generation accordingly. After each perturbation step, the adversarial sample $x_{\text{adv}}^{k}$ undergoes perturbation range trimming via a clipping function, constraining introduced perturbations within specified $\epsilon$-bounds. This prevents excessive deviation from original images and ensures visual imperceptibility of perturbations.
Upon completing iterations, the final outputs are aggregated into an Adversarial Attack Training Dataset $D_t = \{(x_k, x_{\text{adv}}^{k})\}_{k=1}^{i}$, containing adversarial samples $x_{\text{adv}}^{k}$ paired with corresponding originals $x_i$. This dataset is subsequently utilized in the next phase for neural network training, significantly enhancing robustness against potential adversarial interference, a critical requirement for maintaining model reliability in computer vision tasks such as ship detection. The detailed implementation of this phase is presented in Algorithm 1.
Algorithm 1 Adversarial Perturbation Generation for Training Dataset.
Require:
  • Deep learning vessel dataset $D = \{(x_i, y_i)\}_{i=1}^{N}$
  • Learning rate $\alpha$
  • Perturbation range $\varepsilon$
  • Number of iterations $T$
  • Number of perturbation steps $K$
  • Neural network parameters $\theta$
Require:
  • CALCULATEPERTURBATION($x, y, \theta$)  ▷ Adversarial perturbation calculator
  • RESIZE($\cdot$)  ▷ Image resizing function (224 × 224)
  • TOTENSOR($\cdot$)  ▷ Data tensor conversion
  • NORMALIZE($\cdot$)  ▷ Per-channel normalization ($\mu = 0.5$, $\sigma = 0.5$)
  • CLIPPERTURBATION($x_{\text{adv}}, x, \varepsilon$)  ▷ Perturbation clipping
Ensure:
  • All images standardized to 224 × 224 resolution
  • Tensor format conversion completed
  • RGB channels normalized per specification
  • Perturbations constrained within $\varepsilon$-ball
  • Valid adversarial training pairs generated
$\tilde{D} \leftarrow$ RESIZE($D$)  ▷ Resize to 224 × 224
$\tilde{D} \leftarrow$ TOTENSOR($\tilde{D}$)  ▷ Convert to tensor
$\tilde{D} \leftarrow$ NORMALIZE($\tilde{D}$)  ▷ Apply normalization ($\mu = 0.5$, $\sigma = 0.5$) per channel
$D_t \leftarrow \emptyset$  ▷ Initialize adversarial dataset
for iter $= 1$ to $T$ do  ▷ Outer training iterations
    for each $(x_i, y_i) \in \tilde{D}$ do  ▷ Process each sample
        $x_{\text{adv}} \leftarrow x_i$  ▷ Initialize adversarial sample
        for $k = 1$ to $K$ do  ▷ Perturbation steps
            $r_{\text{adv}} \leftarrow$ CALCULATEPERTURBATION($x_{\text{adv}}, y_i, \theta$)
            $x_{\text{adv}} \leftarrow x_{\text{adv}} + \alpha \cdot r_{\text{adv}}$  ▷ Apply perturbation
            $x_{\text{adv}} \leftarrow$ CLIPPERTURBATION($x_{\text{adv}}, x_i, \varepsilon$)
        end for
        $D_t \leftarrow D_t \cup \{(x_i, x_{\text{adv}})\}$  ▷ Store clean-adversarial pair
    end for
end for
return $D_t$  ▷ Adversarial training dataset
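The inner loops of Algorithm 1 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation. A toy linear softmax classifier stands in for the network. The perturbation calculator is assumed to return the sign of the input gradient, which the source does not specify. The sketch makes a single pass over the data (T = 1), and the weights and samples are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(x, weights, bias):
    """Class probabilities of a toy linear classifier (stand-in for the network)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


def cross_entropy(x, y, weights, bias):
    """Cross-entropy loss of sample x with true class index y."""
    return -math.log(predict_proba(x, weights, bias)[y])


def calculate_perturbation(x, y, weights, bias):
    """Sign of the loss gradient with respect to the input (assumed choice)."""
    probs = predict_proba(x, weights, bias)
    err = [p - (1.0 if c == y else 0.0) for c, p in enumerate(probs)]
    grad = [sum(err[c] * weights[c][j] for c in range(len(weights)))
            for j in range(len(x))]
    return [(g > 0) - (g < 0) for g in grad]


def clip_perturbation(x_adv, x, eps):
    """Keep every coordinate of x_adv within eps of the clean input x."""
    return [min(max(a, o - eps), o + eps) for a, o in zip(x_adv, x)]


def generate_adversarial_dataset(dataset, weights, bias, alpha, eps, steps):
    """K perturbation steps per sample, then store the clean-adversarial pair."""
    pairs = []
    for x, y in dataset:
        x_adv = list(x)  # initialize as a copy of the clean sample
        for _ in range(steps):
            r_adv = calculate_perturbation(x_adv, y, weights, bias)
            x_adv = [a + alpha * r for a, r in zip(x_adv, r_adv)]
            x_adv = clip_perturbation(x_adv, x, eps)
        pairs.append((x, x_adv))
    return pairs


# Hypothetical two-class, two-feature example.
W = [[1.0, -1.0], [-1.0, 1.0]]
B = [0.0, 0.0]
DATA = [([0.5, -0.5], 0), ([-0.2, 0.4], 1)]
PAIRS = generate_adversarial_dataset(DATA, W, B, alpha=0.05, eps=0.1, steps=5)
```

In this toy setting each adversarial sample stays within the chosen bound of its clean counterpart while its loss rises, which is the behavior the clipping and perturbation steps are meant to guarantee.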

3.3. Adversarial Training Stage

To ensure effective monitoring of nearshore vessels in adversarial environments, we feed the Adversarial Attack Training Dataset obtained in the first stage into our network architecture for feature extraction and representation learning. We employ ResNet-34 as the backbone network in our implementation while maintaining flexibility to substitute other network models for ship image feature extraction. The information contained in ship images is processed through convolutional layers to obtain higher-dimensional representations. As the core component of Convolutional Neural Networks (CNNs), convolutional layers extract features from both original and adversarially perturbed images by kernel scanning. The operation of convolutional layers is formalized in Equation (1):
$$s_{i,j} = \sum_{m} \sum_{n} (x_k, x_{\text{adv}}^{k})_{i+m,\, j+n} \cdot w_{m,n} \qquad (1)$$
where $s$ denotes the extracted features, $x$ represents the input images to the convolutional layer, $w$ signifies the kernel weights, $i, j$ index the feature dimensions, and $m, n$ index the kernel dimensions. Within maritime surveillance systems, adversarial attacks simulate malicious interference such as electronic spoofing and signal camouflage to evaluate model robustness. First, the FGSM (Fast Gradient Sign Method) [34] employs single-step bounded perturbations with high computational efficiency but limited attack strength, suitable for simulating transient signal interference. Second, FGM (Fast Gradient Method) [36] introduces L2-norm constraints to generate directionally precise perturbations, effectively simulating radar scattering interference, though it is vulnerable to gradient saturation. Both are single-step attacks; the core distinction lies in their norm constraints and gradient utilization strategies.
Third, the advanced PGD (Projected Gradient Descent) [35] method utilizes multi-step iterations with random initialization, generating the strongest adaptive attacks for maritime monitoring that accurately simulate persistent vessel trajectory spoofing. While computationally costlier than FGSM/FGM, it achieves higher attack success rates. Finally, the complementary FreeLB (Free Large-Batch Training) [37] adopts a stochastic perturbation ensemble strategy by accumulating gradients multiple times within a single batch to maximize perturbation loss, functioning inherently as a defensive training strategy. We implement these four perturbation methods as formalized in Equations (2)–(5):
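$$ \text{FGM}: \quad x_{\mathrm{adv}} = x + \epsilon \cdot \frac{\nabla_{x} L(\theta, x, y)}{\left\lVert \nabla_{x} L(\theta, x, y) \right\rVert_{2}} \tag{2} $$
$$ \text{FGSM}: \quad x_{\mathrm{adv}} = x + \epsilon \cdot \mathrm{sign}\left(\nabla_{x} L(\theta, x, y)\right) \tag{3} $$
$$ \text{PGD}: \quad x_{\mathrm{adv}}^{0} = x + U(-\epsilon, \epsilon), \qquad x_{\mathrm{adv}}^{t+1} = \Pi_{B_{\epsilon}(x)}\left(x_{\mathrm{adv}}^{t} + \alpha \cdot \mathrm{sign}\left(\nabla_{x} L(\theta, x_{\mathrm{adv}}^{t}, y)\right)\right) \tag{4} $$
$$ \text{FreeLB}: \quad x_{\mathrm{adv}}^{t} = x + \delta_{t}, \qquad \delta_{t} \sim U(-\epsilon, \epsilon) \tag{5} $$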
where $x$ denotes the original ship image, $x_{\mathrm{adv}}$ represents the perturbed input image, $\epsilon$ controls perturbation intensity, and $L$ is the cross-entropy loss. FGM applies L2-norm constrained perturbations along the gradient direction, while FGSM uses the gradient sign function $\mathrm{sign}(\cdot)$ to impose $L_{\infty}$-bounded perturbations ensuring imperceptibility. In PGD, $B_{\epsilon}(x)$ defines the $\epsilon$-radius perturbation ball, $\Pi(\cdot)$ denotes the projection operator onto that ball, and $\alpha$ is the per-step size listed among the attack hyperparameters of Algorithm 2, so that each step of the multi-step iterative attack is projected back to the feasible region. FreeLB generates uniformly distributed random perturbation vectors $\delta_{t}$ within $[-\epsilon, \epsilon]$.
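The norm constraints that distinguish the two single-step attacks can be checked on a toy model. The following is a minimal sketch, assuming a hypothetical logistic classifier p = sigmoid(w · x + b) in place of the ResNet backbone, so that the input gradient of the cross-entropy loss has the closed form (p − y) · w. The weights, input, and function names are illustrative and not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Binary cross-entropy of the toy model p = sigmoid(w . x + b)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_gradient(w, b, x, y):
    """Gradient of the loss with respect to the input x: (p - y) * w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def fgm(x, g, eps):
    """Equation (2): step of L2 length eps along the gradient direction."""
    norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0  # guard against a zero gradient
    return [xi + eps * gi / norm for xi, gi in zip(x, g)]

def fgsm(x, g, eps):
    """Equation (3): step of size eps along the sign of each gradient component."""
    return [xi + eps * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, g)]

# Hypothetical toy model and sample (assumptions for illustration only).
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.2, 0.4, 0.6], 1
eps = 0.1

g = input_gradient(w, b, x, y)
x_fgm = fgm(x, g, eps)
x_fgsm = fgsm(x, g, eps)

l2 = math.sqrt(sum((a - c) ** 2 for a, c in zip(x_fgm, x)))   # L2 size of the FGM perturbation
linf = max(abs(a - c) for a, c in zip(x_fgsm, x))             # L-infinity size of the FGSM perturbation
print(round(l2, 6), round(linf, 6))
print(loss(w, b, x, y) < loss(w, b, x_fgm, y), loss(w, b, x, y) < loss(w, b, x_fgsm, y))
```

In this sketch the FGM perturbation has L2 norm ϵ and the FGSM perturbation has L∞ norm ϵ, and both raise the loss on the toy sample. For the same ϵ, FGSM moves every input component by the full ϵ, whereas FGM shares a budget of ϵ across all components.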
The adversarial training mechanics are formalized as Equations (6)–(8):
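$$ L_{\mathrm{adv}} = \mathrm{CrossEntropy}(y, \hat{y}) \tag{6} $$
$$ \nabla_{\theta} L_{\mathrm{adv}} = \frac{\partial L_{\mathrm{adv}}}{\partial \theta} \tag{7} $$
$$ \theta = \theta - \alpha \cdot \nabla_{\theta} L_{\mathrm{adv}} \tag{8} $$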
Equation (6) computes the adversarial loss $L_{\mathrm{adv}}$ by measuring the discrepancy between the predicted labels $\hat{y}$ and the ground truth $y$. Equation (7) calculates the gradient $\nabla_{\theta} L_{\mathrm{adv}}$ of this loss with respect to the network parameters. Finally, Equation (8) updates the network parameters by subtracting the product of the learning rate $\alpha$ and the gradient $\nabla_{\theta} L_{\mathrm{adv}}$ from the current parameters $\theta$. Note that Algorithm 2 writes the learning rate as $\eta$ and reserves $\alpha$ for the attack step size. The framework outputs the trained parameters $\theta$ and the predicted vessel detection labels $\hat{y}$. Detailed procedures for feature extraction, gradient computation, and adversarial loss calculation during the adversarial training phase are specified in Algorithm 2.
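One parameter update following Equations (6)–(8) can be traced by hand on a small model. The sketch below assumes a hypothetical two-feature logistic classifier rather than the ResNet backbone, so the gradient in Equation (7) is available in closed form; the sample values and function names are illustrative only.

```python
import math

def predict(theta, x):
    """Toy model: theta = [w1, w2, bias]; returns the positive-class probability."""
    z = theta[0] * x[0] + theta[1] * x[1] + theta[2]
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(y, y_hat):
    """Equation (6) for a binary label y in {0, 1}."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

def grad_theta(theta, x, y):
    """Equation (7): for this toy model the gradient is (y_hat - y) * [x1, x2, 1]."""
    err = predict(theta, x) - y
    return [err * x[0], err * x[1], err]

def update(theta, grad, lr):
    """Equation (8): theta <- theta - lr * gradient."""
    return [t - lr * g for t, g in zip(theta, grad)]

theta = [0.0, 0.0, 0.0]        # hypothetical initial parameters
x_adv, y = [1.0, -2.0], 1      # hypothetical adversarial sample and its label
lr = 0.1                       # learning rate (alpha in Equation (8))

loss_before = cross_entropy(y, predict(theta, x_adv))
theta_new = update(theta, grad_theta(theta, x_adv, y), lr)
loss_after = cross_entropy(y, predict(theta_new, x_adv))
print(theta_new, loss_before > loss_after)
```

With these values the initial prediction is 0.5, the gradient is [−0.5, 1.0, −0.5], and the updated parameters are [0.05, −0.1, 0.05], which lowers the loss on the adversarial sample.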
Algorithm 2 Adversarial Training with Fixed Attack Method for Ship Detection.
Require:
  • Clean training dataset: D = {(x_k, y_k)}_{k=1}^{N}
  • Chosen attack method: attack ∈ {FGM, FGSM, PGD, FreeLB}
  • Attack hyperparameters: ϵ, α, T
  • Learning rate η
  • Initial network parameters θ
Ensure:
  • Trained network parameters θ, and ship detection labels ŷ
function ShipResNetClassifier(x_adv; θ)
    z_0 ← Conv(x_adv)                                  ▷ Initial convolution
    z_0 ← MaxPool(z_0)                                 ▷ Down-sampling
    blocks ← [3, 4, 6, 3]                              ▷ Residual blocks per stage
    for s = 1 to 4 do
        for j = 1 to blocks[s] do
            z_s ← Residual(z_{s−1}, 0)                 ▷ Basic residual block
        end for
    end for
    v ← GlobalAvgPool(z_4)                             ▷ Feature vector
    ŷ ← Softmax(W v + b)                               ▷ Ship detection logits
    return ŷ
end function
for epoch = 1 to E do                                  ▷ Training loop over epochs
    for (x_b, y_b) in D do                             ▷ Mini-batch loop
        Generate adversarial examples:
        if attack = FGM then                           ▷ Fast Gradient Method
            g ← ∇_x L(θ, x_b, y_b)
            x_adv ← x_b + ϵ · g / ‖g‖_2                ▷ L2-norm perturbation
        else if attack = FGSM then                     ▷ Fast Gradient Sign Method
            g ← ∇_x L(θ, x_b, y_b)
            x_adv ← x_b + ϵ · sign(g)                  ▷ L∞ perturbation
        else if attack = PGD then                      ▷ Projected Gradient Descent
            x_adv^0 ← x_b + U(−ϵ, ϵ)                   ▷ Random initialization
            for t = 0 to T − 1 do
                g_t ← ∇_x L(θ, x_adv^t, y_b)
                x_adv^{t+1} ← Clip_{x_b, ϵ}(x_adv^t + α · sign(g_t))
            end for
            x_adv ← x_adv^T
        else                                           ▷ Free Large-Batch Training (FreeLB)
            for t = 1 to T do
                δ_t ∼ U(−ϵ, ϵ)                         ▷ Random perturbation
                x_adv^t ← x_b + δ_t
                L_t ← L(θ, x_adv^t, y_b)
            end for
            L_adv ← (1/T) Σ_{t=1}^{T} L_t
            ∇_θ L_adv ← ∂L_adv / ∂θ                    ▷ Compute gradient
        end if
        if attack ≠ FreeLB then
            Forward pass: ŷ ← f_θ(x_adv)               ▷ Network forward propagation
            L_adv ← CrossEntropy(ŷ, y_b)
            ∇_θ L_adv ← ∂L_adv / ∂θ                    ▷ Compute gradient
        end if
        Update parameters: θ ← θ − η · ∇_θ L_adv
    end for
end for
return θ, ŷ                                            ▷ Trained network parameters θ, and ship detection labels ŷ
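As a sanity check on the stage layout used in Algorithm 2, the sketch below counts the weighted layers implied by the block configuration [3, 4, 6, 3]. It assumes the standard basic residual block with two convolutions each, which Algorithm 2 does not spell out, plus the initial convolution and the final fully connected layer. The function name is hypothetical.

```python
def weighted_layers(blocks_per_stage, convs_per_block=2):
    """Count weighted layers: stem convolution + convolutions inside the
    residual blocks + final fully connected classifier.
    convs_per_block=2 is an assumption (standard basic residual block)."""
    stem_conv = 1
    classifier_fc = 1
    return stem_conv + convs_per_block * sum(blocks_per_stage) + classifier_fc

resnet34_blocks = [3, 4, 6, 3]   # stage layout from Algorithm 2
depth = weighted_layers(resnet34_blocks)
print(depth)
```

Under these assumptions the count is 1 + 2 × 16 + 1 = 34, which matches the ResNet-34 backbone named in the text.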

4. Experiments and Results

4.1. Experimental Datasets

In this study, three primary datasets were used to train and evaluate deep learning models for ship image classification. First, the Deep Learning Ship Dataset was employed (https://www.kaggle.com/datasets/arpitjain007/game-of-deep-learning-ship-datasets, Figure 2) (accessed on 15 February 2025), providing extensive ship imagery covering five major vessel types: Cargo, Carrier, Cruise, Military, and Tanker. As illustrated, these images exhibit diverse visual characteristics across various environments and viewing angles. The category distribution is balanced, with approximately equal image counts: 1334 for Cargo, 1327 for Carrier, 1331 for Cruise, 1328 for Military, and 1329 for Tanker, as shown in Table 1. Such diversity establishes a solid foundation for training robust classification models.
Subsequently, the Original Validation Dataset was used for model evaluation, as shown in Figure 3. This dataset similarly contains images of all five ship categories but with smaller sample sizes of 147 to 148 images per category, as shown in Table 2. This configuration makes it suitable for assessing model generalization on limited data. Though smaller in scale than the training dataset, its high image quality and balanced category distribution make it valuable for performance benchmarking.
Finally, the Ship Outboard Adversarial Attack Dataset was introduced to test model robustness against adversarial attacks, as shown in Figure 4. This dataset incorporates adversarially perturbed ship images, maintaining coverage of all five vessel types with 147 images per category, as shown in Table 3. These images demonstrate how adversarial manipulations alter visual characteristics and challenge model recognition capabilities. Testing on this dataset enables rigorous assessment of model stability and reliability when confronting potential adversarial attacks in real-world scenarios.

4.2. Evaluation Indicators

The maritime vessel recognition capability of RAFS-Net was quantitatively assessed using four principal metrics: accuracy, precision, recall, and F1-score. Within the confusion matrix framework, they are defined in Equation (9):
$$ \begin{aligned} \mathrm{Accuracy} &= \frac{TP + TN}{TP + FP + FN + TN}, & \mathrm{Recall} &= \frac{TP}{TP + FN}, \\ \mathrm{Precision} &= \frac{TP}{TP + FP}, & \mathrm{F1\text{-}Score} &= \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \end{aligned} \tag{9} $$
where TP denotes positive samples correctly predicted as positive and TN denotes negative samples correctly predicted as negative. FP denotes negative samples incorrectly predicted as positive, and FN denotes positive samples incorrectly predicted as negative.
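The definitions above are stated for a binary confusion matrix, whereas the task has five vessel classes. The sketch below shows one common way to extend them, assuming macro averaging (per-class one-vs-rest precision, recall, and F1 averaged with equal weight). The paper does not state which averaging scheme it used, and the confusion matrix here is hypothetical.

```python
def macro_metrics(cm):
    """cm[i][j] = number of samples of true class i predicted as class j.
    Returns (accuracy, macro precision, macro recall, macro F1)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[c][c] for c in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # class c samples predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp     # other classes predicted as class c
        p = tp / (tp + fp) if tp + fp else 0.0        # 0/0 is treated as 0 (a convention)
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical degenerate classifier: 100 samples per class, every sample predicted as class 0.
cm = [[100, 0, 0, 0, 0] for _ in range(5)]
acc, prec, rec, f1 = macro_metrics(cm)
print(round(acc, 4), round(prec, 4), round(rec, 4), round(f1, 4))
```

For this collapsed predictor on a balanced five-class set, accuracy and macro recall are both 20%, while macro precision (4%) and macro F1 (about 6.7%) are much lower, because four of the five classes contribute zeros.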

4.3. Experiment Environment and Parameters Setting

In this study, a non-pretrained ResNet-34 [38] model was employed as the backbone of the RAFS-Net surveillance framework. The entire model was optimized using the cross-entropy loss function. Based on experimental findings, we set the batch size to 32 and the learning rate to 0.001, with Adam as the optimization algorithm. All experiments were conducted on a workstation running a 64-bit Windows 11 operating system with a 12th Gen Intel® Core™ i7-12700 processor (Intel Corporation, Santa Clara, CA, USA), 32 GB RAM, and an NVIDIA GeForce RTX 3060 GPU (NVIDIA Corporation, Santa Clara, CA, USA). The framework was implemented in Python 3.9 with the PyTorch 1.7 deep learning framework, using PyCharm (PyCharm 2025.1, JetBrains s.r.o., Prague, Czech Republic) as the primary integrated development environment.

4.4. Results of Experiments

On the pristine validation dataset (without adversarial attacks), models without adversarial training generally exhibited high performance in accuracy, precision, recall, and F1-score, computed according to Equation (9). This indicates that while adversarial training enhances robustness against adversarial attacks, it may incur performance degradation in benign scenarios. However, when evaluated on the adversarial test set, models without adversarial training suffered significant performance deterioration, whereas adversarially trained models maintained superior performance levels. This validates the substantial efficacy of adversarial training in fortifying model resilience against attacks.
As detailed in Table 4, comprehensive performance metrics (accuracy, precision, recall, F1-score) of multiple neural network models are presented for both pristine and adversarial datasets. Overall, ResNet [38] and DenseNet series demonstrated exceptional performance on pristine data—ResNet-50 [38] achieved 94.158% accuracy and 94.099 F1-score. Under adversarial attacks, all models experienced performance declines with varying severity: VGG-series models (e.g., VGG-11, VGG-13) performed poorly on both datasets, with significantly lower accuracy and F1-scores than counterparts, suggesting inherent limitations even before attacks. The fundamental reason for VGG’s failure in adversarial training lies in its architectural design, which is ill-suited for tasks requiring stable gradient flow and high model robustness. Specifically, VGG is a “plain” network stacked with consecutive convolutional layers, lacking modern designs like residual connections. This defect causes two critical issues: first, during backpropagation essential for adversarial training, gradient signals in VGG’s deep layers become unstable or vanish, preventing effective weight updates based on adversarial loss and hindering both the generation of high-quality adversarial samples and learning from them; second, the impeded gradient flow obstructs the propagation of high-level semantic objectives (e.g., “resisting perturbations”) to underlying feature extractors, leading to a failure in learning robust features that remain stable under perturbations and instead promoting overfitting to non-robust superficial features of the training data. In contrast, modern architectures like ResNet and DenseNet, which performed excellently in our experiments, address these gradient issues through residual or dense connections, ensuring stable and efficient training. 
This enables them to successfully refine decision boundaries and learn more generalizable robust feature representations, thereby maintaining superior performance in adversarial environments.
Although ResNet [38] and DenseNet [39] models showed reduced performance under attacks, they retained relatively high levels: the adversarially trained ResNet-50 [38] declined to 73.505% accuracy and a 72.602 F1-score. Notably, the adversarially trained DenseNet-121 [39] reached an F1-score of 81.796 on the adversarial test set, indicating a smaller performance drop compared to pristine conditions, while DenseNet-169 [39] scored 71.051 without adversarial training and 73.097 with it (Table 4). MobileNetV3-Large [40] and MobileNetV3-Small [40] were also competitive under attacks, with F1-scores of 73.958 and 64.205 without adversarial training and 74.629 and 67.336 with it. These results confirm that the adversarial training framework is broadly applicable to most deep learning models, effectively enhancing adversarial robustness while maintaining high versatility.
In summary, while adversarial training may incur performance degradation in benign environments—necessitating careful consideration for practical maritime surveillance applications—it nevertheless demonstrates efficacy in enhancing model robustness. Crucially, it sustains high-level monitoring performance when confronted with malicious adversarial attacks, such as scenarios where adversaries deliberately camouflage vessels using adversarial perturbations.
Figure 5, Figure 6, Figure 7 and Figure 8 present the mean Average Precision (mAP) and Receiver Operating Characteristic (ROC) curves of various neural network models, including AlexNet [41], ResNet [38], DenseNet [39], and MobileNetV3, under both adversarial attack and attack-free conditions. Comparative analysis of these curves reveals that introducing adversarial attacks typically reduces mAP and Area Under the Curve (AUC) values, indicating performance degradation. However, significant disparities exist among models: certain architectures (e.g., ResNet-34 [38] and DenseNet-121 [39]) maintain relatively high curve coverage areas post-attack, demonstrating superior robustness. Conversely, others (e.g., VGG-series [42] models) exhibit sharp performance deterioration under attacks.
Table 4. A horizontal performance comparison of each model on the Original Validation Dataset and the Ship Outboard Adversary Attack Dataset.
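Columns 3–6 report results on the Original Validation Dataset; columns 7–10 report results on the Adversarial-Test Dataset. All values are percentages.

| Model | Adversarial training | Accuracy (original) | Precision (original) | Recall (original) | F1-Score (original) | Accuracy (adversarial) | Precision (adversarial) | Recall (adversarial) | F1-Score (adversarial) |
|---|---|---|---|---|---|---|---|---|---|
| AlexNet [41] | No | 63.043 | 63.358 | 63.069 | 62.397 | 50.679 | 53.705 | 50.724 | 47.948 |
| AlexNet [41] | Yes | 87.092 | 87.013 | 87.113 | 87.029 | 71.739 | 73.541 | 71.767 | 70.915 |
| VGG-11 [42] | No | 21.541 | 4.290 | 20.000 | 7.065 | 21.541 | 4.290 | 20.000 | 7.065 |
| VGG-11 [42] | Yes | 21.541 | 4.290 | 20.000 | 7.065 | 21.541 | 4.290 | 20.000 | 7.065 |
| VGG-13 [42] | No | 21.541 | 4.290 | 20.000 | 7.065 | 21.541 | 4.290 | 20.000 | 7.065 |
| VGG-13 [42] | Yes | 21.541 | 4.290 | 20.000 | 7.065 | 21.541 | 4.290 | 20.000 | 7.065 |
| ResNet-18 [38] | No | 92.120 | 92.716 | 92.120 | 92.183 | 69.973 | 70.522 | 69.992 | 69.576 |
| ResNet-18 [38] | Yes | 92.120 | 92.100 | 92.138 | 92.042 | 77.853 | 78.468 | 77.869 | 77.885 |
| ResNet-34 [38] | No | 90.353 | 90.458 | 90.369 | 90.364 | 71.603 | 73.758 | 71.621 | 71.075 |
| ResNet-34 [38] | Yes | 94.565 | 94.671 | 94.573 | 94.609 | 82.201 | 82.836 | 82.204 | 82.004 |
| ResNet-50 [38] | No | 86.141 | 87.061 | 86.173 | 86.012 | 65.082 | 68.996 | 65.088 | 63.975 |
| ResNet-50 [38] | Yes | 94.158 | 94.093 | 94.168 | 94.099 | 73.505 | 75.939 | 73.549 | 72.602 |
| ResNet-101 [38] | No | 85.598 | 88.406 | 85.638 | 85.448 | 62.908 | 64.454 | 62.960 | 61.581 |
| ResNet-101 [38] | Yes | 92.527 | 93.090 | 92.540 | 92.643 | 80.571 | 81.983 | 80.603 | 80.354 |
| DenseNet-121 [39] | No | 92.120 | 92.108 | 92.136 | 92.076 | 72.690 | 74.464 | 72.747 | 70.798 |
| DenseNet-121 [39] | Yes | 93.071 | 93.585 | 93.071 | 93.210 | 81.658 | 82.348 | 81.660 | 81.796 |
| DenseNet-169 [39] | No | 91.984 | 92.229 | 91.999 | 91.986 | 71.332 | 72.936 | 71.355 | 71.051 |
| DenseNet-169 [39] | Yes | 93.886 | 93.943 | 93.893 | 93.912 | 72.826 | 77.142 | 72.849 | 73.097 |
| MobileNetV3-Large [40] | No | 92.255 | 92.509 | 92.259 | 92.327 | 74.049 | 74.459 | 74.075 | 73.958 |
| MobileNetV3-Large [40] | Yes | 86.413 | 87.926 | 86.408 | 86.425 | 74.864 | 75.501 | 74.869 | 74.629 |
| MobileNetV3-Small [40] | No | 88.859 | 88.937 | 88.865 | 88.770 | 64.266 | 66.208 | 64.289 | 64.205 |
| MobileNetV3-Small [40] | Yes | 87.636 | 88.708 | 87.635 | 87.800 | 69.293 | 70.695 | 69.348 | 67.336 |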
In the original table, the optimal values are shown in bold. For each model, the upper row reports the model without adversarial training and the lower row reports the model with adversarial training.
Additionally, these visualizations show minimal performance variation among models in attack-free scenarios, highlighting the substantial impact of adversarial attacks on model behavior. Collectively, the figures emphasize performance shifts under adversarial conditions, providing critical references for selecting and optimizing deep learning models in surveillance applications.
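The effect of adversarial training reported in Table 4 can be summarized as the change in accuracy on the adversarial test set. The short sketch below uses the accuracy values from Table 4; the dictionary layout and variable names are illustrative.

```python
# Accuracy (%) on the adversarial test set from Table 4:
# (without adversarial training, with adversarial training)
adv_accuracy = {
    "AlexNet": (50.679, 71.739),
    "VGG-11": (21.541, 21.541),
    "VGG-13": (21.541, 21.541),
    "ResNet-18": (69.973, 77.853),
    "ResNet-34": (71.603, 82.201),
    "ResNet-50": (65.082, 73.505),
    "ResNet-101": (62.908, 80.571),
    "DenseNet-121": (72.690, 81.658),
    "DenseNet-169": (71.332, 72.826),
    "MobileNetV3-Large": (74.049, 74.864),
    "MobileNetV3-Small": (64.266, 69.293),
}

# Gain in percentage points attributable to adversarial training.
gains = {name: trained - plain for name, (plain, trained) in adv_accuracy.items()}
best_gain = max(gains, key=gains.get)           # largest improvement
best_final = max(adv_accuracy, key=lambda n: adv_accuracy[n][1])  # highest final accuracy

for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {gain:+.3f}")
print(best_gain, best_final)
```

On these figures, AlexNet shows the largest gain (about 21.1 points), ResNet-34 reaches the highest adversarial-test accuracy after adversarial training (82.201%, a gain of about 10.6 points), and the two VGG models do not change.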

4.5. Analysis of the Impact of Different Counter-Attack Methods Experiment

On the test set with applied adversarial perturbations, models lacking adversarial training exhibited suboptimal performance across metrics including accuracy, precision, recall, and F1 score, demonstrating significantly heightened vulnerability to such attacks, as illustrated in Figure 9. The impact of various attack methods (LinfPGD, FGM, FGSM, FreeLB, and no attack) on multiple neural network models (including AlexNet, ResNet, VGG, DenseNet, MobileNetV3, etc.) was summarized. Overall, VGG-series models displayed inferior performance under attacks, with significant declines in accuracy and precision metrics. Conversely, deeper architectures like ResNet and DenseNet exhibited higher robustness, maintaining relatively strong performance even under FreeLB attacks. Notably, AlexNet—an earlier model—showed adaptability in specific attack scenarios: its accuracy rose from 63.966% under no attack to 83.696% under LinfPGD attacks. These results indicate that model architectural complexity critically influences adversarial resistance, with substantial performance variations across attack scenarios. This underscores the necessity of selecting appropriate model architectures based on application-specific requirements and implementing targeted defense measures to enhance robustness.
In contrast, models trained with adversarial techniques demonstrated stronger adaptability to adversarial perturbations, exhibiting smaller performance degradation. Crucially, different adversarial training methods varied in their effectiveness. Notably, robust adversarial training strategies—exemplified by FreeLB-Attack—proved more effective in enhancing model robustness, thereby ensuring stability against adversarial assaults. When selecting adversarial training methods, performance in both non-adversarial and adversarial environments must be jointly considered. Thus, the chosen method should not only improve performance on pristine data but also effectively fortify the model’s resistance to adversarial intrusions.

4.6. Analysis of the Impact of Different Counter-Attack Parameters Experiment

This comprehensive study delineates the critical role of adversarial hyperparameters in ship surveillance systems through three targeted experiments. We systematically evaluated perturbation magnitude ( ϵ ), attack iterations (k), and gradient step size ( α ) within FreeLB and PGD attack paradigms, revealing fundamental trade-offs among attack efficiency, visual stealth, and model robustness. Key findings demonstrate a non-monotonic relationship where optimal parameterization balances maximal model vulnerability with minimal perceptual distortion—a crucial equilibrium for practical deployment in marine environments prone to electronic warfare (EW). Sensitivity analysis establishes quantitative design principles for adversarial training in surveillance applications.
In the curve graphs in this section (Figure 10), the vertical axis represents the accuracy of the model at the given point in training, and the horizontal axis represents the number of training rounds (epochs) completed. Epochs are closely related to batch size and iterations: one epoch is one full pass over the training set, so the number of samples processed in an epoch equals batch size × iterations per epoch.
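The link between epochs, batch size, and iterations can be made concrete with the figures given earlier. This is a minimal sketch, assuming the training set consists of exactly the per-class image counts listed in Section 4.1 and that the last, partially filled batch is kept; neither detail is stated in the paper.

```python
import math

# Per-class training image counts from Section 4.1 (Table 1).
class_counts = {"Cargo": 1334, "Carrier": 1327, "Cruise": 1331, "Military": 1328, "Tanker": 1329}
batch_size = 32  # from Section 4.3

num_samples = sum(class_counts.values())
# One epoch is one full pass over the data; the final batch may be smaller than batch_size.
iterations_per_epoch = math.ceil(num_samples / batch_size)
print(num_samples, iterations_per_epoch)
```

Under these assumptions there are 6649 training images, so one epoch corresponds to 208 iterations at a batch size of 32.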

4.6.1. Epsilon-Adversarial Perturbation Values

The ϵ parameter controls the magnitude of adversarial perturbations (Figure 11). In the FreeLB attack method, α = 0.0025 (the gradient step size for updating input samples at each step) and k = 5 were fixed while ϵ was varied. Experimental results show that as ϵ increases through small values (0.05, 0.075), the perturbations gradually approach the attack state of the test images, and prediction accuracy improves. This improvement occurs because moderate perturbations challenge the model to learn more robust feature representations without significantly distorting the underlying semantic content. Peak accuracy occurs at ϵ = 0.1, where perturbations are strong enough to enhance model robustness yet subtle enough to preserve image recognizability. At this value, the adversarial samples are potent enough to expose and rectify vulnerable decision boundaries while staying close to the original data distribution, which prevents the model from learning distorted features. Beyond this threshold (ϵ = 0.25, 0.5), excessive perturbations become perceptible and cause significant semantic distortion that deviates from realistic vessel appearances; classifiers then misidentify images, and accuracy falls despite higher attack success rates. This degradation occurs because overly aggressive perturbations force the model to overfit to unrealistic, artifact-laden samples, compromising its ability to generalize to both clean data and subtly perturbed inputs. For practical marine vessel surveillance, selecting an appropriate ϵ value is therefore critical: it requires a trade-off between prediction accuracy and acceptable perturbation levels so that adversarial samples attack target models effectively without human-perceptible artifacts.
Parameter selection must balance attack success rates against image fidelity and task performance, and in these experiments ϵ = 0.1 gave the largest robustness gain with the least semantic distortion.

4.6.2. K-Attack Iteration Number Value

The hyperparameter k controls the number of attack iterations. Since only the PGD method among the four selected attack methods requires an iteration count, experiments were conducted with k as the variable while holding the other parameters constant. As Figure 12 shows, prediction accuracy on the test image set peaks when k is set to 1. As k increases to 3, 5, 7, and 9, prediction accuracy progressively declines. Notably, the accumulated perturbations remain largely imperceptible to human vision across all tested k-values (k ≤ 9). This indicates that increasing the number of PGD iterations amplifies the perturbations in adversarial samples, thereby reducing prediction accuracy for adversarial inputs when other parameters remain unchanged.
This phenomenon likely occurs because higher iteration counts drive adversarial samples closer to decision boundaries, making models more susceptible to misdirection and consequently degrading prediction accuracy. This highlights a critical asymmetry in adversarial attacks: perturbations that are visually negligible to humans can be optimized through iterative refinement to become highly deceptive for machine learning models. While increasing PGD attack iterations enhances the divergence between adversarial and original samples, potentially improving adversarial sample quality, it simultaneously compromises the model's prediction accuracy.
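The effect of the iteration count can be reproduced qualitatively on a toy model. The sketch below assumes a hypothetical logistic classifier in place of the ResNet backbone and omits the random start of PGD so that the result is deterministic; the weights, the sample, and the values ϵ = 0.1 and α = 0.025 are illustrative and not the paper's settings.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Binary cross-entropy of the toy model p = sigmoid(w . x + b)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def grad_x(w, b, x, y):
    """Input gradient of the loss for the toy model: (p - y) * w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def pgd(w, b, x, y, eps, alpha, k):
    """k sign-gradient ascent steps of size alpha, each clipped back into the
    L-infinity ball of radius eps around x (random start omitted)."""
    x_adv = list(x)
    for _ in range(k):
        g = grad_x(w, b, x_adv, y)
        stepped = [xa + alpha * ((gi > 0) - (gi < 0)) for xa, gi in zip(x_adv, g)]
        x_adv = [min(max(s, xi - eps), xi + eps) for s, xi in zip(stepped, x)]
    return x_adv

w, b = [2.0, -1.0, 0.5], 0.1     # hypothetical model
x, y = [0.2, 0.4, 0.6], 1        # hypothetical sample and label
ks = (1, 3, 5, 7, 9)             # iteration counts tested in the text
losses = [loss(w, b, pgd(w, b, x, y, eps=0.1, alpha=0.025, k=k), y) for k in ks]
for k, value in zip(ks, losses):
    print(k, round(value, 4))
```

In this toy setting the loss on the adversarial sample rises with k and then levels off once the perturbation reaches the boundary of the ϵ-ball, which mirrors the direction of the accuracy trend reported for Figure 12.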

4.6.3. Alpha-Gradient Update Step Value

In experiments employing the FreeLB attack method with fixed ϵ = 0.1 and variable α, the impact of α on images is illustrated in Figure 13. This parameter governs both the gradient update step size during adversarial sample generation and the perturbation magnitude applied to the original samples at each step: higher α values intensify perturbations, making adversarial samples more aggressive but less natural. This creates a crucial asymmetry in which perturbations remain largely imperceptible to human vision yet become increasingly potent against machine learning models. As shown in Figure 14, at low α values (e.g., 0.00025), prediction accuracy on the test image set remains relatively high because small gradient steps produce insignificant perturbations, making models harder to mislead. Optimal accuracy (82%) occurs at moderate α (0.0025), where attackers successfully deceive models without excessive distortion. At high α (0.025 and 0.25), accuracy progressively declines as oversized gradient steps overcorrupt the images and hinder correct recognition.
Attack success rates can be modulated through α adjustment. Excessively small α values weaken attack effectiveness, while excessively large α values induce destructive over-perturbation—both scenarios impair attack success. Therefore, α must be contextually tuned in practical applications to achieve optimal attack performance.
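How α interacts with the iteration count and ϵ can be seen from a simple bound. This is a sketch under a stated assumption: the perturbation is built from sign-gradient steps of size α that are clipped to the ϵ-ball, as in the PGD branch of Algorithm 2, with no random start. In that case the largest per-pixel change after k steps is min(k · α, ϵ). The value k = 5 is taken from Section 4.6.1, and the function name is hypothetical.

```python
def max_linf_reach(alpha, k, eps):
    """Upper bound on the per-pixel change after k clipped sign-gradient steps."""
    return min(alpha * k, eps)

eps, k = 0.1, 5
alphas = (0.00025, 0.0025, 0.025, 0.25)   # step sizes tested in the text
reach = {alpha: max_linf_reach(alpha, k, eps) for alpha in alphas}
for alpha, value in reach.items():
    print(alpha, value, "clipped by eps" if alpha * k >= eps else "limited by alpha")
```

Under this assumption, the two smaller step sizes never use the full ϵ budget within five steps (reaching 0.00125 and 0.0125), while the two larger ones hit the ϵ = 0.1 limit, so further increases in α change the path of the attack rather than the size of the final perturbation.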

5. Conclusions and Future Prospects

This study proposes a novel vessel type detection network framework employing an innovative training strategy, designed to enhance the model’s resilience against adversarial perturbations and thereby improve the security of automated ship surveillance systems. Specifically, the method consists of two key phases: the adversarial attack generation phase and the adversarial training phase. During the adversarial attack generation phase, gradient information of the model is computed to identify directions that induce misclassification. Perturbations are then added based on the loss function’s gradients to generate adversarial examples. These adversarial examples are subsequently incorporated into the training dataset. In the adversarial training phase, the model learns by training on both original and adversarial samples, continuously refining the classification decision boundary to ensure correct classification when confronted with perturbed samples. This process effectively enhances the model’s defense capability against concealed adversarial attacks and reduces misclassifications caused by minor perturbations.
Experimental results demonstrate that RAFS-Net, owing to its distinct training methodology compared to traditional ship surveillance models, excels across multiple vessel detection tasks with varying perturbation intensities. It not only maintains high recognition efficiency in disturbed environments but also effectively performs detection tasks in pristine conditions. Furthermore, ablation experiments conducted on the Ship Outboard Adversary Attack Image Dataset enabled RAFS-Net to achieve optimal parameter configurations. This verifies that the model reaches peak performance under these settings, yielding significant enhancements for vessel safety management.
To improve nearshore maritime management efficiency within the shipping safety context, future research will prioritize two directions. First, enhancing the model’s capability to recognize local vessel features will enable accurate identification without relying on excessive features, thereby minimizing the impact of disturbed information and further boosting the robustness and safety of ship surveillance models. Additionally, in real-world marine environments, visible-light image data acquisition is often compromised by non-adversarial perturbations such as rain/fog, varying illumination conditions, or lens distortions. These factors adversely affect model performance. Consequently, subsequent research will comprehensively analyze model behavior under diverse perturbation scenarios, providing valuable references for maritime management studies.
In summary, owing to its training strategy, the RAFS-Net framework proficiently handles ship monitoring and identification tasks in both perturbed and non-perturbed environments while maintaining efficient and precise recognition performance. This not only addresses the current research gap but also tangibly enhances the management security of vessel traffic activity information in complex marine environments characterized by multiple interference sources.

Author Contributions

Conceptualization, J.L. and J.S.; methodology, J.L. and J.S.; software, J.S. and Q.S.; validation, J.S.; formal analysis, J.S.; investigation, J.S.; resources, Q.S.; data curation, Q.S.; writing—original draft preparation, J.L. and J.S.; writing—review and editing, Q.S. and M.S.; visualization, J.L., Q.S. and M.S.; supervision, J.L.; project administration, J.L.; funding acquisition, J.L. and M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Young Innovative Talents Grants Programme of Guangdong Province (Grant No. 2022KQNCX024), the Ocean Young Talent Innovation Programme of Zhanjiang City (Grant No. 2022E05002), the National Natural Science Foundation of China (Grant No. 52171346), the Natural Science Foundation of Guangdong Province (Grant No. 2021A1515012618), the special projects of key fields (Artificial Intelligence) of Universities in Guangdong Province (Grant No. 2019KZDZX1035), the Natural Science Foundation of Guangdong Province of China (Grant No. 2023A1515010684), the program for scientific research start-up funds of Guangdong Ocean University, the College Student Innovation Training Program (Grant No. CXXL2025190, S202510566022, 202510566027), the College Student Innovation Team of Guangdong Ocean University (Grant No. CXTD2021013), the China Transportation Education Research Association JTYB20-28, Guangdong Province Education Teaching Reform Research Project (Grant No. 010202132201), the Zhanjiang Federation of Social Science Circles (ZJ20YB0), Guangdong Ocean University PX-128223315, PX-131223629, 010302132103, 580320013, 010201132207, 580420014, Guangdong Ocean University Humanities and Social Sciences Research Project—Research on the Integration of English Course Construction and Civic and Political Teaching of Nautical Professionals under the View of Strategy of a Stronger Country by the Sea, Guangdong Ocean University Education and Teaching Reform Project: Research on the Construction of Artificial Intelligence Curriculum System for Nautical Disciplines in the Context of “Intelligent Shipping”.

Data Availability Statement

Data are available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. UN Trade and Development (UNCTAD). Review of Maritime Transport 2024: Navigating Maritime Chokepoints; UN Trade and Development (UNCTAD): Geneva, Switzerland, 2024. [Google Scholar]
  2. Zha, M.; Qian, W.; Yang, W.; Xu, Y. Multifeature Transformation and Fusion-Based Ship Detection With Small Targets and Complex Backgrounds. IEEE Geosci. Remote Sens. Lett. 2022, 19, 4511405. [Google Scholar] [CrossRef]
  3. Wu, P.; Huang, H.; Qian, H.; Su, S.; Sun, B.; Zuo, Z. SRCANet: Stacked Residual Coordinate Attention Network for Infrared Ship Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5003614. [Google Scholar] [CrossRef]
  4. Li, Y.; Xu, Q.; He, Z.; Li, W. Progressive Task-Based Universal Network for Raw Infrared Remote Sensing Imagery Ship Detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5610013. [Google Scholar] [CrossRef]
  5. United Kingdom Maritime Trade Operations. JMIC Weekly Dashboard–14 July 2024 to 20 July 2024. Weekly Report. Available online: https://www.ukmto.org/partner-products/jmic-products/weekly-dashboard/2024 (accessed on 31 July 2024).
  6. Li, B.; Qi, H.; Tang, C.; Liu, Y.; Gao, Y.; Lian, J. Sea Clutter Suppression Method Based on Neural Networks. In Proceedings of the 2023 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), Zhengzhou, China, 4–17 November 2023; pp. 1–5. [Google Scholar] [CrossRef]
  7. Thys, S.; Ranst, W.V.; Goedemé, T. Fooling Automated Surveillance Cameras: Adversarial Patches to Attack Person Detection. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–17 June 2019; pp. 49–55.
  8. Anderson, S. Societal Applications of HF Skywave Radar. Remote Sens. 2022, 14, 6287.
  9. Zhou, S.; Gao, H.; Ren, F. Pole Feature Extraction of HF Radar Targets for the Large Complex Ship Based on SPSO and ARMA Model Algorithm. Electronics 2022, 11, 1644.
  10. Shi, F.; Li, Z.; Zhang, M.; Li, J. Analysis and Simulation of the Micro-Doppler Signature of a Ship With a Rotating Shipborne Radar at Different Observation Angles. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1504405.
  11. Liu, P.; Zhao, Y.; Liu, B.; Li, Y.; Chen, P. Oil spill extraction from X-band marine radar images by power fitting of radar echoes. Remote Sens. Lett. 2021, 12, 345–352.
  12. Woo-García, R.M.; Herrera-Nevraumont, V.; Osorio-de-la Rosa, E.; Vázquez-Valdés, S.E.; López-Huerta, F. Location Monitoring System for Sailboats by GPS Using GSM/GPRS Technology. IEEE Embed. Syst. Lett. 2023, 15, 69–72.
  13. Vesecky, J.F.; Laws, K.E.; Paduan, J.D. Using HF surface wave radar and the ship Automatic Identification System (AIS) to monitor coastal vessels. In Proceedings of the 2009 IEEE International Geoscience and Remote Sensing Symposium, Cape Town, South Africa, 12–17 July 2009; Volume 3, pp. III-761–III-764.
  14. Chen, Y.; Qi, X.; Huang, C.; Zheng, J. A data fusion method for maritime traffic surveillance: The fusion of AIS data and VHF speech information. Ocean Eng. 2024, 311, 12.
  15. Li, B.; Xu, J.; Pan, X.; Ma, L.; Zhao, Z.; Chen, R.; Liu, Q.; Wang, H. Marine Oil Spill Detection with X-Band Shipborne Radar Using GLCM, SVM and FCM. Remote Sens. 2022, 14, 3715.
  16. Kim, T.H.; Yang, C.S. Ship Monitoring around the Ieodo Ocean Research Station Using FMCW Radar and AIS: November 23–30, 2013. Korean J. Remote Sens. 2022, 38, 45–56.
  17. Kim, S.W.; Kim, D.; Lee, Y.K.; Lee, I.; Lee, S.; Kim, J.; Kim, K.; Ryu, J.H. Operational ship monitoring based on multi-platforms (satellite, UAV, HF radar, AIS). Korean J. Remote Sens. 2020, 36, 379–399.
  18. Zhang, T.; Zhao, S.; Cheng, B.; Chen, J. Detection of AIS Closing Behavior and MMSI Spoofing Behavior of Ships Based on Spatiotemporal Data. Remote Sens. 2020, 12, 702.
  19. Qin, C.; Wang, X.; Li, G.; He, Y. An Improved Attention-Guided Network for Arbitrary-Oriented Ship Detection in Optical Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6514805.
  20. Ma, J.; Zhou, Z.; Wang, B.; Zong, H.; Wu, F. Ship Detection in Optical Satellite Images via Directional Bounding Boxes Based on Ship Center and Orientation Prediction. Remote Sens. 2019, 11, 2173.
  21. Nie, G.H.; Zhang, P.; Niu, X.; Dou, Y.; Xia, F.; Long, L.; Li, Y.; Li, X.; Dai, Y.; Yang, H. Ship Detection Using Transfer Learned Single Shot Multi Box Detector. ITM Web Conf. 2017, 12, 01006.
  22. Sun, J.; Li, J.; Li, R.; Wu, L.; Cao, L.; Sun, M. Addressing unfamiliar ship type recognition in real-scenario vessel monitoring: A multi-angle metric networks framework. Front. Mar. Sci. 2025, 11, 1516586.
  23. Guo, F.; Ma, H.; Li, L.; Lv, M.; Jia, Z. FCNet: Flexible Convolution Network for Infrared Small Ship Detection. Remote Sens. 2024, 16, 2218.
  24. Zhan, J.; Li, J.; Wu, L.; Sun, J.; Yin, H. VIOS-Net: A Multi-Task Fusion System for Maritime Surveillance Through Visible and Infrared Imaging. J. Mar. Sci. Eng. 2025, 13, 913.
  25. Chen, Z.; Chen, D.; Zhang, Y.; Cheng, X.; Wu, C. Deep learning for autonomous ship-oriented small ship detection. Saf. Sci. 2020, 130, 104812.
  26. Li, J.; Sun, J.; Li, X.; Yang, Y.; Jiang, X.; Li, R. LFLD-CLbased NET: A Curriculum-Learning-Based Deep Learning Network with Leap-Forward-Learning-Decay for Ship Detection. J. Mar. Sci. Eng. 2023, 11, 1388.
  27. Tian, Y.; Wang, X.; Zhu, S.; Xu, F.; Liu, J. LMSD-Net: A Lightweight and High-Performance Ship Detection Network for Optical Remote Sensing Images. Remote Sens. 2023, 15, 4358.
  28. Feng, Y.; Yin, H.; Zhang, H.; Wu, L.; Dong, H.; Li, J. Independent Tri-Spectral Integration for Intelligent Ship Monitoring in Ports: Bridging Optical, Infrared, and Satellite Insights. J. Mar. Sci. Eng. 2024, 12, 2203.
  29. Zuo, W.; Fang, S. TPNet: A High-Performance and Lightweight Detector for Ship Detection in SAR Imagery. Remote Sens. 2025, 17, 1487.
  30. Liu, Y.; Cui, H.; Li, G. A Novel Method for Ship Detection and Classification on Remote Sensing Images. In Artificial Neural Networks and Machine Learning—ICANN 2017; Springer: Cham, Switzerland, 2017.
  31. Cai, P.; Liu, B.; Wang, P.; Liu, P.; Yuan, Y.; Li, X.; Chen, P.; Li, Y. SDFSD-v1.0: A Sub-Meter SAR Dataset for Fine-Grained Ship Detection. Remote Sens. 2024, 16, 3952.
  32. Zeng, Y.; Wang, X.; Zou, J.; Wu, H. YOLO-Ssboat: Super-Small Ship Detection Network for Large-Scale Aerial and Remote Sensing Scenes. Remote Sens. 2025, 17, 1948.
  33. Zhang, S.; Zang, S.; Liu, S. Deep Learning-Based Ship Detection in Maritime Environments. In Proceedings of the 2024 7th International Conference on Computer Information Science and Artificial Intelligence, Shaoxing, China, 13–15 September 2024.
  34. Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and Harnessing Adversarial Examples. arXiv 2015, arXiv:1412.6572.
  35. Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards Deep Learning Models Resistant to Adversarial Attacks. arXiv 2019, arXiv:1706.06083.
  36. Miyato, T.; Dai, A.M.; Goodfellow, I. Adversarial Training Methods for Semi-Supervised Text Classification. arXiv 2021, arXiv:1605.07725.
  37. Zhu, C.; Cheng, Y.; Gan, Z.; Sun, S.; Goldstein, T.; Liu, J. FreeLB: Enhanced Adversarial Training for Natural Language Understanding. arXiv 2020, arXiv:1909.11764.
  38. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  39. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269.
  40. Howard, A.; Sandler, M.; Chen, B.; Wang, W.; Chen, L.C.; Tan, M.; Chu, G.; Vasudevan, V.; Zhu, Y.; Pang, R.; et al. Searching for MobileNetV3. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324.
  41. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
  42. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
Figure 1. The accuracy curves of the query set for different optimizers for the rough classification nearshore detection task.
Figure 2. Example images from Deep Learning Vessel Dataset.
Figure 3. Example images from Original Validation Dataset.
Figure 4. Example images from Ship Outboard Adversary Attack Image Dataset.
Figure 5. Comparison of ROC curves of prediction results for each model on Ship Outboard Adversary Attack Dataset with and without training using adversarial attack methods: (a) AlexNet; (b) ResNet-18; (c) ResNet-34; (d) ResNet-50; (e) ResNet-101; (f) DenseNet121; (g) DenseNet169; (h) MobileNetV3-Large; (i) MobileNetV3-Small.
Figure 6. Comparison of ROC curves of prediction results for each model on Original Validation Dataset with and without training using adversarial attack methods: (a) AlexNet; (b) ResNet-18; (c) ResNet-34; (d) ResNet-50; (e) ResNet-101; (f) DenseNet121; (g) DenseNet169; (h) MobileNetV3-Large; (i) MobileNetV3-Small.
Figure 7. Comparison of Average-PR curves of prediction results for each model on Ship Outboard Adversary Attack Dataset with and without training using adversarial attack methods: (a) AlexNet; (b) ResNet-18; (c) ResNet-34; (d) ResNet-50; (e) ResNet-101; (f) DenseNet121; (g) DenseNet169; (h) MobileNetV3-Large; (i) MobileNetV3-Small.
Figure 8. Comparison of Average-PR curves of prediction results for each model on Original Validation Dataset with and without training using adversarial attack methods: (a) AlexNet; (b) ResNet-18; (c) ResNet-34; (d) ResNet-50; (e) ResNet-101; (f) DenseNet121; (g) DenseNet169; (h) MobileNetV3-Large; (i) MobileNetV3-Small.
Figure 9. Comparison of the longitudinal prediction effect between different adversarial methods based on the two datasets (green is the worst value, yellow is the moderate value, and red is the optimal value).
Figure 10. Comparison of the effect of different adversarial perturbation values (ϵ) on prediction accuracy.
Figure 11. A comparison of the visualization effects of different adversarial perturbation values (ϵ).
Figure 12. A comparison of the effect of different gradient update step values (k) on prediction accuracy.
Figure 13. A comparison of the visualization effects of different gradient update step values (α).
Figure 14. A comparison of the effect of different gradient update step values (α) on prediction accuracy.
Table 1. Deep learning vessel dataset statistics.
Dataset | Category | Number
Deep learning vessel dataset | Cargo | 1334
Deep learning vessel dataset | Carrier | 1327
Deep learning vessel dataset | Cruise | 1331
Deep learning vessel dataset | Military | 1328
Deep learning vessel dataset | Tankers | 1329
Table 2. Original Validation Dataset statistics.
Dataset | Category | Number
Original Validation Dataset | Containership | 148
Original Validation Dataset | Carrier | 147
Original Validation Dataset | Cruise | 147
Original Validation Dataset | Military | 147
Original Validation Dataset | Tankers | 147
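Table 3. Ship Outboard Adversary Attack Dataset statistics.
Dataset | Category | Number
Ship Outboard Adversary Attack Dataset | Cargo | 148
Ship Outboard Adversary Attack Dataset | Carrier | 147
Ship Outboard Adversary Attack Dataset | Cruise | 147
Ship Outboard Adversary Attack Dataset | Military | 147
Ship Outboard Adversary Attack Dataset | Tankers | 147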
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, J.; Sun, J.; Shi, Q.; Sun, M. RAFS-Net: A Robust Adversarial Fusion Framework for Enhanced Maritime Surveillance in Hostile Environments. J. Mar. Sci. Eng. 2025, 13, 2195. https://doi.org/10.3390/jmse13112195

AMA Style

Li J, Sun J, Shi Q, Sun M. RAFS-Net: A Robust Adversarial Fusion Framework for Enhanced Maritime Surveillance in Hostile Environments. Journal of Marine Science and Engineering. 2025; 13(11):2195. https://doi.org/10.3390/jmse13112195

Chicago/Turabian Style

Li, Jiawen, Jiahua Sun, Qiqi Shi, and Molin Sun. 2025. "RAFS-Net: A Robust Adversarial Fusion Framework for Enhanced Maritime Surveillance in Hostile Environments" Journal of Marine Science and Engineering 13, no. 11: 2195. https://doi.org/10.3390/jmse13112195

APA Style

Li, J., Sun, J., Shi, Q., & Sun, M. (2025). RAFS-Net: A Robust Adversarial Fusion Framework for Enhanced Maritime Surveillance in Hostile Environments. Journal of Marine Science and Engineering, 13(11), 2195. https://doi.org/10.3390/jmse13112195
