LGNet: A Lightweight Ghost-Enhanced Network for Efficient SAR Ship Detection

Chen, Jiawei; Huang, Junyu; Tan, Yuna; Wu, Zhifeng; Luo, Renbo

doi:10.3390/rs17233800

Open AccessArticle

LGNet: A Lightweight Ghost-Enhanced Network for Efficient SAR Ship Detection

by

Jiawei Chen

^1,†

,

Junyu Huang

^1,†

,

Yuna Tan

¹

,

Zhifeng Wu

^1,2

and

Renbo Luo

^1,2,*

¹

The School of Geography and Remote Sensing, Guangzhou University, Guangzhou 510006, China

²

The Institute of Aerospace Remote Sensing Innovations, Guangzhou University, Guangzhou 510006, China

^*

Author to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Remote Sens. 2025, 17(23), 3800; https://doi.org/10.3390/rs17233800 (registering DOI)

Submission received: 24 October 2025 / Revised: 18 November 2025 / Accepted: 21 November 2025 / Published: 23 November 2025

(This article belongs to the Special Issue Ship Imaging, Detection and Recognition for High-Resolution SAR)

Download

Browse Figures

Versions Notes

Highlights

What are the main findings?

A novel SAR-adapted Ghost-enhanced architecture (GHBlock + HGStem) systematically exploits inherent feature redundancy in SAR imagery, effectively addressing domain-specific challenges including speckle noise suppression and sea clutter discrimination.
The Layer-wise Adaptive Magnitude-based Pruning (LAMP) strategy intelligently assigns layer-specific sparsity levels based on multi-scale detection contributions, achieving substantial compression (75.3% parameter reduction and 59.3% FLOPs reduction) while preserving superior detection performance.

What is the implication of the main finding?

This work demonstrates that SAR-specific lightweight architectures can simultaneously achieve extreme model compression and maintain high detection accuracy, thereby enabling practical deployment on resource-constrained edge computing platforms (135.39 FPS on hardware-constrained devices).
The proposed dual lightweight methodology establishes a comprehensive framework for edge-based maritime surveillance systems, providing theoretical and practical foundations for real-time SAR ship detection in distributed monitoring networks.

Abstract

Current SAR ship detection methods face a critical trade-off between accuracy and computational efficiency, severely limiting their deployment on resource-constrained edge devices that are essential for distributed maritime surveillance systems. This paper presents LGNet, a novel ultra-lightweight network specifically designed for edge deployment that achieves extreme model compression while maintaining detection performance through two core innovations. First, we develop a SAR-adapted Ghost-enhanced architecture that exploits inherent feature redundancy in SAR imagery through systematic integration of Ghost convolutions and hierarchical GHBlock modules, reducing redundant computation while preserving discriminative capabilities. Second, we introduce Layer-wise Adaptive Magnitude-based Pruning (LAMP) that assigns layer-specific sparsity levels based on multi-scale detection contributions, enabling intelligent compression with minimal accuracy loss. LGNet achieves remarkable efficiency gains: 75.3% parameter reduction and 59.3% FLOPs reduction compared to YOLOv8n baseline (from 3.0 M/8.1 G to 0.74 M/3.3 G) while delivering superior accuracy on SSDD (mAP@50: 97.9%, mAP@95: 71.9%) and strong generalization on RSDD-SAR (mAP@50: 94.4%). Extensive edge deployment validation demonstrates genuine real-time capability with 135.39 FPS performance on Huawei Atlas AIpro-20T edge computing platform, confirming practical viability for autonomous maritime systems and remote surveillance applications where computational resources are critically constrained. This work establishes that extreme model compression and high detection accuracy can coexist through principled SAR-specific lightweight design, enabling new paradigms for edge-based maritime monitoring networks.

Keywords:

Synthetic Aperture Radar (SAR); ship detection; lightweight model; Ghost convolutions; GHBlock module; Layer-wise Adaptive Magnitude-based Pruning (LAMP)

1. Introduction

Synthetic Aperture Radar (SAR) systems provide excellent maritime surveillance capabilities through all-weather, day-and-night imaging [1]. Unlike optical sensors, SAR actively transmits microwave signals and processes returned echoes to create high-resolution images, making it invaluable for ship monitoring, navigation safety, and detecting illegal maritime activities [2]. However, SAR imagery introduces unique challenges, including multiplicative speckle noise, sea clutter interference, and geometry-dependent distortions that differentiate it from optical ship detection [3].

SAR ship detection faces several distinct challenges compared to optical imagery. Speckle noise and sea clutter degrade image contrast, leading to poorly defined ship boundaries and ambiguous target shapes. The radar imaging geometry and polarization effects also cause significant variations in ship appearance across different incidence angles and sea conditions, which complicates feature extraction. Dense harbor environments add to these difficulties, where closely spaced vessels require robust multi-scale representation and precise localization amid strong background interference. As a result, optical detection models often perform poorly when applied directly to SAR data, requiring specialized architectures that handle noise suppression, multi-scale fusion, and orientation robustness [4,5,6].

Traditional SAR ship detection relied primarily on constant false alarm rate (CFAR) algorithms and statistical thresholding approaches [7,8]. Although these methods adapt to local background statistics, they struggle with heterogeneous sea clutter and low signal-to-noise ratio conditions. Deep learning has transformed SAR ship detection by enabling automatic hierarchical feature learning from large datasets, eliminating the need for hand-crafted features [9,10]. Two-stage detectors like Faster R-CNN [11] and Cascade R-CNN [12] deliver high accuracy at the cost of computational complexity, while single-stage approaches such as YOLO [13] and RetinaNet [14] prioritize efficiency for real-time deployment. The challenge of arbitrary ship orientations in SAR imagery has been addressed through rotation-aware detectors, including RoI Transformer [15], R3Det [16], and S2ANet [17].

Recent research has explored multi-scale and attention-driven architectures to improve SAR ship detection. Notable examples include AIS-FCANet [18], which integrates frequency and spatial attention mechanisms for sea clutter suppression and small target enhancement, and MSRIHL-CNN [19], which incorporates rotation-invariant Haar-like features for orientation-robust learning. Other significant contributions include the Dense Attention Pyramid Network [20], which employs dense connections for progressive feature aggregation across scales; Non-local Channel Attention Network [21], which captures long-range spatial dependencies through non-local operations; and Feature Balancing and Refinement Network [14], which addresses feature imbalance in multi-scale detection through adaptive refinement. Beyond these attention-driven approaches, the community has explored diverse innovative directions: sea–land awareness methods (SLA-Net [22]) explicitly model maritime–terrestrial distinction to reduce coastal false alarms, traditional feature fusion techniques (Laplace and LBP [23]) combine hand-crafted with learned features for speckle robustness, bio-inspired methods [24] mimic biological attention mechanisms for target localization, and weakly supervised approaches [25] enable training with limited annotations to reduce labeling costs. Despite these diverse innovations, spanning attention architectures (SwER [26] with efficient residual connections, FASC-Net [27] using feature augmentation for small target detection), rotation-aware detectors (RoI Transformer [15] for rotated bounding box prediction, S2ANet [17] with alignment convolution for oriented detection), and specialized methodologies, existing methods share fundamental limitations: high model complexity, insufficient small-object discrimination, and weak SAR-specific design rationale. These approaches rely heavily on dense convolutions and large parameter sets, creating computational bottlenecks that are unsuitable for resource-constrained edge platforms. Furthermore, they lack an explicit design rationale connecting architectural choices to SAR-specific characteristics such as inherent feature redundancy and the critical balance between spatial detail preservation and semantic understanding. This creates a research gap for SAR-oriented lightweight networks achieving competitive accuracy through principled exploitation of SAR image properties rather than brute-force model scaling.

To address these limitations, we develop a lightweight yet accurate SAR ship detection network that directly handles SAR-specific challenges. In this context, “lightweight” is defined by three key metrics:

Small model parameters (typically <1 M) to minimize memory footprint.
Low computational complexity measured in FLOPs (floating-point operations) for efficient processing.
High inference speed (FPS) on resource-constrained edge devices for real-time deployment.

Our design is based on two key observations grounded in SAR image characteristics. First, SAR ship targets typically appear as compact, high-intensity regions within relatively homogeneous backgrounds, creating substantial feature redundancy across convolutional maps. This characteristic allows feature reuse through efficient operations that maintain semantic integrity while significantly reducing computation. Second, effective multi-scale SAR detection requires careful preservation of both fine-grained spatial details and high-level semantic representations. Conventional uniform compression strategies often disrupt this balance, making layer-adaptive optimization necessary for maintaining both local precision and global contextual understanding.

We present LGNet (Lightweight Ghost-enhanced Detection Network), a novel SAR-oriented ship detection framework designed for edge deployment. Our key contributions include

SAR-adapted Ghost-enhanced lightweight architecture: We develop a novel Ghost-enhanced backbone with an efficient stem structure (GHBlock + HGStem) that systematically exploits inherent feature redundancy in SAR imagery, addressing specific SAR challenges, including speckle noise suppression and sea clutter discrimination, while maintaining robust representational capacity.
Layer-wise Adaptive Magnitude-based Pruning (LAMP) strategy: We introduce an intelligent pruning approach that assigns layer-specific sparsity levels based on each layer’s contribution to multi-scale ship detection, enabling extreme model compression while preserving both fine-grained spatial information and high-level semantic features.
Comprehensive efficiency validation with practical deployment: LGNet achieves over 75% reduction in parameters and 59% reduction in FLOPs compared to YOLOv8n while maintaining superior detection performance, with real-world edge deployment validation achieving 135.39 FPS on resource-constrained hardware for practical maritime surveillance applications.

Through these contributions, LGNet demonstrates that model compactness and high detection accuracy can coexist, paving the way for practical deployment of SAR-based maritime surveillance systems on resource-constrained platforms.

2. Materials and Methods

2.1. Overall Design of LGNet

LGNet aims for extreme parameter efficiency to work on resource-constrained edge platforms (typically <2 GB memory, <10 GFLOPS compute capability), where conventional CNN detectors (3–100 M parameters, 8–200 GFLOPs) are computationally prohibitive. The architecture follows a three-stage detector paradigm with systematic lightweight operator integration, as shown in Figure 1. The Ghost-enhanced backbone extracts hierarchical features using HGStem, alternating GHBlock/DWConv stages, and SPPF pooling. A feature pyramid neck then fuses multi-scale representations through upsampling and concatenation operations. Finally, three parallel detection heads process different spatial resolutions (20 × 20, 40 × 40, 80 × 80) to capture ships across various scales.

Backbone Design. The backbone architecture replaces YOLOv8n’s Conv-C2f structure with a lightweight pipeline comprising HGStem, alternating GHBlock/DWConv stages, and SPPF. The network begins with HGStem (640 × 640 × 3 → 160 × 160 × 48), which performs efficient shallow feature extraction via depthwise separable convolutions. GHBlock and DWConv modules then alternate to extract hierarchical features with progressive spatial downsampling. Early stages (160 × 160 → 80 × 80) capture mid-level features, intermediate layers (80 × 80 → 40 × 40) balance spatial details with semantic abstraction, and deeper stages (40 × 40 → 20 × 20) extract high-level contextual information essential for robust SAR ship detection. SPPF pooling concludes the backbone by expanding receptive fields and aggregating multi-scale context. This design achieves substantial compression compared to YOLOv8n’s baseline: parameters decrease from 3.0 M to 2.3 M (a 23.3% reduction) and FLOPs decrease from 8.1 G to 6.8 G (a 16.0% reduction) while preserving feature representation quality.

Neck Design. The neck employs a feature pyramid architecture that integrates semantic information from deep layers with spatial details from shallow layers. Bidirectional feature pathways enable upsampling (top-down) and downsampling (bottom-up) flows with concatenation and C2f fusion across multiple scales (20 × 20, 40 × 40, 80 × 80). This ensures each detection scale benefits from both high-level contextual understanding and fine-grained spatial information for effective multi-scale ship detection.

Head Design. Three parallel detection heads process different spatial resolutions (80 × 80, 40 × 40, 20 × 20) to accommodate ships of varying sizes. Each head predicts bounding box coordinates, objectness scores, and class probabilities through decoupled convolutional branches, following the established YOLO detection paradigm.

SAR-Specific Design Rationale. The lightweight architectural design explicitly addresses SAR-specific challenges through the following design considerations:

Speckle noise robustness: Ghost convolution-based feature reuse reduces sensitivity to local intensity variations by generating multiple feature maps through cheap linear transformations, effectively averaging out speckle noise inherent in SAR imagery.
Multi-scale ship detection: The alternating GHBlock–DWConv structure preserves both shallow fine-detail features (for small ship boundaries) and deep semantic features (for large ship classification), preventing the uniform compression that degrades multi-scale performance in standard lightweight networks.
Computational efficiency: By systematically replacing all standard convolutions with Ghost convolutions and depthwise separable convolutions, LGNet reduces redundant computation arising from SAR imagery’s characteristic feature redundancy, where ship targets appear as compact high-intensity regions against relatively homogeneous backgrounds.
Robustness under compression: The lightweight design does not degrade robustness; in low-SNR, cluttered SAR imagery, LGNet’s efficient operators demonstrate superior stability compared to parameter-heavy alternatives, validating that extreme compression and robust detection are compatible when architectural lightweighting is carefully orchestrated.

This architectural lightweighting establishes a strong foundation for subsequent LAMP-based pruning, enabling the dual lightweighting strategy that achieves >70% compression while maintaining state-of-the-art accuracy.

2.2. GhostConv Mechanism

Ghost convolution addresses the inherent feature redundancy in SAR ship imagery. While optical images require dense feature representations due to rich textural and spectral variations, SAR imagery has properties that allow substantial compression. Three key characteristics support this approach: spatial compactness, where ship targets appear as localized high-intensity regions with simple geometric patterns against homogeneous backgrounds; limited spectral diversity, as single-channel SAR intensity lacks the multi-spectral complexity of optical imagery; and cross-channel similarity, where speckle noise and sea clutter create correlated features across channels. These properties show that core intrinsic features can generate sufficient discriminative representations through efficient linear transformations, providing the theoretical foundation for Ghost convolution in SAR applications.

As illustrated in Figure 2, the key difference between standard convolution and Ghost convolution lies in their feature generation strategies. In standard convolution (Figure 2a), all output feature channels are generated through expensive full convolution operations, requiring

C \times K \times K \times N

parameters to produce N output channels from C input channels with kernel size K. Ghost convolution (Figure 2b) adopts a two-stage decomposition: first, a reduced set of primary features (

N / s

channels, where s is the Ghost ratio, typically

s = 2

) are extracted via standard convolution; subsequently, these primary features are transformed through a series of cheap linear operations (

Φ_{1}, Φ_{2}, \dots, Φ_{k}

) such as depthwise convolutions to generate additional Ghost features; finally, primary and Ghost features are concatenated along the channel dimension and combined with an identity connection to produce the output. This decomposition exploits the observation that many feature channels can be approximated as linear transformations of a smaller intrinsic feature set, dramatically reducing computational cost while maintaining representational capacity.

The computational complexity and parameterization are summarized as follows. Let the input be

X \in R^{H \times W \times C}

and the output size be

H^{'} \times W^{'}

with N channels. Denote the standard kernel size by K and the linear-transform kernel by

D \times D

, and let s be the Ghost ratio (typically s = 2).

{FLOPs}_{std} = H^{'} W^{'} \cdot C \cdot K^{2} \cdot N,

(1)

{Params}_{std} = C \cdot K^{2} \cdot N .

(2)

Primary features (

N / s

channels) via partial standard convolution:

{FLOPs}_{primary} = H^{'} W^{'} \cdot C \cdot K^{2} \cdot \frac{N}{s} .

(3)

Ghost features via cheap linear transforms:

{FLOPs}_{Ghost} = H^{'} W^{'} \cdot \frac{N}{s} \cdot D^{2} (s - 1) .

(4)

Total complexity of GhostConv:

{FLOPs}_{total} = H^{'} W^{'} \cdot \frac{N}{s} \cdot (C K^{2} + D^{2} (s - 1)) .

(5)

Complexity ratio:

r_{s} = \frac{{FLOPs}_{std}}{{FLOPs}_{total}} = \frac{C K^{2}}{\frac{1}{s} C K^{2} + \frac{s - 1}{s} D^{2}} .

(6)

Parameter count of GhostConv:

{Params}_{Ghost} = \frac{N}{s} \cdot C K^{2} + \frac{N}{s} \cdot D^{2} (s - 1) .

(7)

Parameter ratio:

r_{c} = \frac{{Params}_{std}}{{Params}_{Ghost}} = \frac{s C K^{2}}{C K^{2} + D^{2} (s - 1)} .

(8)

For D = 3, K = 3, s = 2 and sufficiently large C, we have

r_{s} \approx 2

and

r_{c} \approx 2

.

Implementation Details. The Ghost convolution module operates through a three-stage computational pipeline:

Y_{primary} = Conv (X, W_{primary}), W_{primary} \in R^{\frac{N}{s} \times C \times K \times K},

(9)

Y_{Ghost} = Φ (Y_{primary}),

(10)

Y_{out} = Concat (Y_{primary}, Y_{Ghost}, \dots) .

(11)

In the LGNet architecture, Ghost convolution is configured with the following hyperparameters: (i) Primary convolution: kernel size

K = 3

, stride

s = 1

, padding

p = 1

, generating

N / 2

intrinsic feature channels through standard

3 \times 3

convolution; (ii) Ghost transformation: the cheap operation

Φ

is implemented as a

3 \times 3

depthwise convolution (

D = 3

) followed by batch normalization and ReLU activation, producing an additional

N / 2

Ghost feature channels with minimal computational overhead (only

9 N / 2

parameters compared to

9 C N / 2

for standard convolution); (iii) Ghost ratio: fixed at

s = 2

to balance compression rate and representational capacity, achieving theoretical 2 × speedup while maintaining feature expressiveness. The concatenated output

Y_{out} \in R^{H^{'} \times W^{'} \times N}

combines both intrinsic and Ghost features, forming the complete feature representation with approximately 50% parameter reduction compared to standard convolution.

2.3. GHBlock: Ghost-Enhanced Hierarchical Block

We introduce GHBlock as the core building unit of LGNet’s backbone, extending the Ghost convolution paradigm through hierarchical feature refinement. GHBlock integrates Ghost-based feature extraction with efficient channel attention mechanisms, enabling multi-scale feature aggregation with minimal computational cost.

The block architecture comprises four sequential operations, as shown in Figure 3: hierarchical Ghost convolution for multi-level feature extraction, dense-style concatenation to preserve intermediate representations, dual-stage channel refinement through squeeze-and-excitation operations, and optional residual connections for enhanced gradient flow.

Module Design and Integration. GHBlock replaces YOLOv8n’s C2f modules across four hierarchical stages, as depicted in Figure 1, with the following configurations:

Stage 1 (160 × 160): GHBlock[48, 128, 3]— shallow feature extraction.
Stage 2 (80 × 80): GHBlock[96, 512, 3]—mid-level semantics.
Stage 3 (40 × 40): GHBlock[192, 1024, 1, True, *]—aggressive compression (18 total blocks).
Stage 4 (20 × 20): GHBlock[384, 2048, 1, True, False]—deep semantic extraction.

Mathematical Formulation. Given input

X \in R^{H \times W \times C_{in}}

and target output

Z \in R^{H \times W \times C_{out}}

, where

C_{m}

represents intermediate channel width, n denotes the number of GhostConv layers, and t specifies the channel squeeze ratio, the forward pass proceeds through four stages:

Hierarchical Ghost Feature Extraction. Multiple GhostConv layers are stacked to extract hierarchical features:

Y_{0} = X, Y_{i} = GhostConv (Y_{i - 1}; W_{i}^{Ghost}), i = 1, \dots, n

(12)

where each

GhostConv

operation follows the decomposition strategy defined in Equations (9)–(11), producing

C_{m}

output channels per layer.

Dense-like Feature Aggregation. All intermediate features are concatenated to preserve multi-scale information:

Y_{cat} = Concat (X, Y_{1}, \dots, Y_{n}) \in R^{H \times W \times (C_{in} + n C_{m})}

(13)

Channel Squeeze-and-Excitation. Two successive 1 × 1 convolutions perform channel refinement: first, a squeeze operation reduces channels by factor t, then an excitation operation restores to

C_{out}

:

Y_{s} = ReLU (BN ({Conv}_{1 \times 1} (Y_{cat}))) \in R^{H \times W \times \frac{C_{out}}{t}}

(14)

Z_{temp} = ReLU (BN ({Conv}_{1 \times 1} (Y_{s}))) \in R^{H \times W \times C_{out}}

(15)

Optional Residual Connection. When channel dimensions match (

C_{in} = C_{out}

) and shortcut=True, a residual connection enhances gradient flow:

Z = \{\begin{matrix} Z_{temp} + X, & C_{in} = C_{out} and shortcut = True, \\ Z_{temp}, & otherwise . \end{matrix}

(16)

Computational Complexity. The total computational cost of GHBlock comprises three components: (i) hierarchical feature extraction via n GhostConv layers, (ii) channel squeeze operation, and (iii) channel excitation operation:

{FLOPs}_{total} = \sum_{i = 1}^{n} {FLOPs}_{GhostConv, i} + H W \cdot (C_{in} + n C_{m}) \cdot \frac{C_{out}}{t} + H W \cdot \frac{C_{out}}{t} \cdot C_{out}

(17)

where each GhostConv layer contributes

H W \cdot C_{i - 1} \cdot K^{2} \cdot \frac{C_{m}}{s} + H W \cdot \frac{C_{m}}{s} \cdot D^{2} (s - 1)

(18)

FLOPs. The total parameter count is

{Params}_{total} = \sum_{i = 1}^{n} {Params}_{GhostConv, i} + (C_{in} + n C_{m}) \cdot \frac{C_{out}}{t} + \frac{C_{out}}{t} \cdot C_{out}

(19)

With typical configurations (n = 6, s = 2, t = 2, K = D = 3), GHBlock achieves approximately 40–50% parameter reduction compared to an equivalent standard convolution-based block while maintaining comparable feature representation capacity through hierarchical Ghost feature aggregation and dense-like connectivity.

2.4. LAMP Pruning and Adaptive Fine-Tuning

We implement Layer-wise Adaptive Magnitude-based Pruning (LAMP) with adaptive fine-tuning to achieve extreme compression while preserving detection performance. The complete pipeline, illustrated in Figure 4, encompasses seven sequential stages: importance estimation, layer adaptation, global pruning, iterative optimization with speed validation, structured sparsity generation, and adaptive fine-tuning for accuracy recovery.

LAMP Importance Estimation. The pruning process begins by calculating layer-wise importance scores that guide the subsequent pruning decisions. The importance score combines magnitude-based and depth-aware factors:

I_{LAMP} (W^{(l)}) = {∥W^{(l)}∥}_{2} \times α^{(l)}

(20)

where

W^{(l)}

represents the weights of layer l and

α^{(l)}

is the layer-adaptive factor.

Layer Adaptive Scaling. The layer-adaptive factor

α^{(l)}

accounts for the hierarchical importance of different network depths:

α^{(l)} = \frac{depth (l)}{\sum_{i = 1}^{L} depth (i)} \times β_{global}

(21)

where

β_{global}

= 1.5 is the global tuning coefficient, L is the total number of layers, and

depth (l)

denotes the depth weight. This scaling strategy preserves critical semantic features in deeper layers while allowing aggressive compression of redundant shallow features.

Global Pruning. Based on the importance scores, global pruning determines which weights to remove across all layers simultaneously:

min_{S} L (M_{S}, D_{val}) + λ \sum_{l = 1}^{L} |S^{(l)}|

(22)

subject to the FLOPs constraint:

\frac{FLOPs (M_{S})}{FLOPs (M_{0})} \leq \frac{1}{s_{target}}

(23)

where S is the pruning mask,

M_{S}

is the pruned model,

D_{val}

is the validation set,

λ

is the regularization coefficient, and

s_{target}

= 2.0 is the target compression ratio.

Iterative Optimization & Speed Target Check. The pruning process follows a progressive schedule with continuous speed monitoring:

r_{t} = r_{max} (1 - e^{- t / τ})

(24)

where

r_{t}

is the pruning ratio at iteration t,

r_{max}

is the maximum pruning ratio, and

τ

controls the pruning speed. The speedup is continuously checked against the target:

{SpeedUp}_{t} = \frac{{FLOPs}_{0} \times s_{target}}{{FLOPs}_{t}}

(25)

The iterative process continues until the target 2× speedup is achieved while maintaining acceptable accuracy.

Structured Sparsity. The global pruning results in structured sparsity patterns that eliminate entire channels or filters, ensuring hardware-friendly acceleration.

Adaptive Fine-tuning. The compressed model undergoes adaptive fine-tuning to recover performance using a composite loss function:

L_{finetune} = L_{detection} + λ_{KD} L_{KD} + λ_{reg} L_{reg}

(26)

where

L_{detection}

is the standard detection loss,

L_{KD}

is the knowledge distillation loss from the original model, and

L_{reg}

provides regularization.

Performance Recovery. The final stage evaluates and ensures performance recovery through comprehensive metrics, including accuracy retention and efficiency gains.

Implementation Configuration. The pipeline is configured with:

β_{global}

= 1.5 for deeper layer emphasis,

s_{target}

= 2.0 for 2× acceleration,

λ_{KD}

= 0.1 for knowledge distillation, and

λ_{reg}

= 0.0005 for regularization. Fine-tuning employs 300 epochs with an SGD optimizer, batch size 8, and a close-mosaic strategy in the final 20 epochs for enhanced small-target detection.

Evaluation Metrics. The compression effectiveness is measured through multiple metrics:

Recovery Rate:

Recovery Rate = \frac{{mAP}_{compressed}}{{mAP}_{original}} \times 100 %

(27)

Compression Ratio:

Compression Ratio = \frac{{FLOPs}_{original}}{{FLOPs}_{compressed}}

(28)

Parameter Reduction:

Parameter Reduction = \frac{{Params}_{original} - {Params}_{compressed}}{{Params}_{original}} \times 100 %

(29)

This comprehensive pipeline achieves >98.5% performance retention while delivering 2× speedup and significant memory reduction.

3. Results

3.1. Experimental Settings and Dataset Description

Experiments were performed on a workstation with an NVIDIA GeForce RTX 2080 Ti GPU (22 GB memory) using a PyTorch 2.2.2, CUDA 11.8, Python 3.12.11, and Ultralytics YOLOv8 framework (v8.1.9). Training employed 100 epochs with batch size 8, SGD optimization (momentum 0.9), and initial learning rate 0.01. Input images were standardized to

640 \times 640

resolution for both training and inference unless noted otherwise.

We utilize the SAR Ship Detection Dataset (SSDD) [28] as our primary evaluation benchmark for ablation studies, component analysis, and hyperparameter validation. SSDD comprises 1160 SAR images with horizontal bounding box ship annotations, acquired from three different sensors across varied viewing geometries and sea conditions. The dataset encompasses C and X frequency bands with spatial resolutions from 1 to 15 m. We adopt a standard 4:1 train-test split, yielding 928 training and 232 test images containing a total of 2587 ship instances. SSDD categorizes scenes into Inshore (coastal environments with complex clutter) and Offshore (open-ocean scenarios with homogeneous backgrounds), as illustrated in Figure 5. Table 1 details the characteristics of both SSDD and our cross-dataset validation benchmark RSDD-SAR [29].

SSDD serves as the primary benchmark for systematic component analysis and hyperparameter validation. RSDD-SAR is employed exclusively for cross-dataset generalization evaluation with state-of-the-art oriented detection methods.

3.2. Evaluation Metrics

Primary metrics:

Precision:

Precision = \frac{TP}{TP + FP}

(30)

Recall:

Recall = \frac{TP}{TP + FN}

(31)

Mean Average Precision (mAP):

mAP = \frac{1}{K} \sum_{k = 1}^{K} {AP}_{k}

(32)

where mAP@[0.5:0.95] averages AP over IoU thresholds

{0.50, 0.55, \dots, 0.95}

.

Frames Per Second (FPS):

FPS = \frac{N_{frames}}{T_{total}}

(33)

where

N_{frames}

is the total number of processed frames and

T_{total}

is the total inference time in seconds. FPS measures real-time inference speed, with higher values indicating faster processing capability for practical deployment. Additional metrics include Params (learnable parameters in millions) and GFLOPs (floating-point operations for computational cost assessment).

3.3. Ablation Study on Lightweight Components

We evaluate the contributions of lightweight modules on SSDD, starting from the YOLOv8n baseline. Table 2 presents both individual and combined component analysis.

Parameters are measured in millions; GFLOPs are measured at 640 × 640 input resolution. LGNet backbone integrates all proposed components with Ghost convolutions.

The full LGNet backbone achieves 23.3% parameter reduction (3.0 M→2.3 M) and 16.0% computational savings (8.1 G→6.8 G) compared to YOLOv8n while maintaining competitive accuracy (mAP@95: 0.671 vs. 0.683, −1.2 pp).

Individual Component Analysis: Each lightweight module shows distinct characteristics. HGStem achieves the highest precision (0.963) but with slight mAP@95 degradation (0.676). Both GHBlock and DWConv deliver strong performance with the highest mAP@50 (0.974) among individual components, where DWConv also provides the best mAP@95 (0.687) with minimal computational overhead.

Component Combination Strategy: The combination analysis shows complementary benefits. GHBlock + DWConv achieves the most aggressive compression (2.3 M parameters, 6.3 G FLOPs) while maintaining reasonable accuracy. The complete LGNet backbone provides optimal integration with the strongest recall (0.923), excellent mAP@50 (0.972), and efficient parameter usage (2.3 M), showing the effectiveness of the comprehensive lightweight architecture design.

Visual Validation: Figure 6 presents activation heatmaps generated using HiResCAM (High-Resolution Class Activation Mapping) to visualize model attention patterns. The heatmap technique employs gradient-weighted class activation mapping to highlight discriminative regions by computing gradients of the target class score with respect to feature maps from specific network layers. Compared to the YOLOv8n baseline (detection score: 0.87), LGNet achieves comparable target localization capability (detection score: 0.85) despite significant parameter reduction (3.0 M→2.3 M, −23.3%) and computational savings (8.1 G→6.8 G, −16.0%). The slight accuracy decrease demonstrates the acceptable trade-off between model efficiency and detection performance in the lightweight architecture design.

3.4. LAMP Pruning Results

Experimental Performance Analysis. Building on the lightweight LGNet architecture, we apply LAMP-based structured channel pruning to achieve extreme model compression while maintaining detection accuracy. Table 3 demonstrates the complete three-stage pipeline: baseline performance, post-pruning degradation, and fine-tuning recovery.

The pruning process achieves 2.03 × speedup (FLOPs reduction: 51.47%) and 67.90% parameter reduction.

The results reveal that Global LAMP pruning achieves 2.03 × speedup with 67.90% parameter reduction (2.3 M→0.74 M) and 51.47% FLOP reduction (6.8 G→3.3 G). Despite a significant initial accuracy drop (mAP@50: 0.972→0.590), the 200-epoch adaptive fine-tuning with knowledge distillation successfully recovers and even improves performance beyond the original baseline: P: 0.945→0.971 (+2.6 pp), R: 0.923→0.936 (+1.3 pp), mAP@50: 0.972→0.979 (+0.7 pp), mAP@95: 0.671→0.719 (+4.8 pp).

Hyperparameter Configuration and Rationale Analysis. The success of LAMP pruning depends critically on carefully chosen hyperparameters that balance compression aggressiveness with accuracy preservation. Table 4 presents our optimized configuration.

This configuration is optimized specifically for SAR ship detection tasks with validation on SSDD and cross-dataset evaluation on RSDD-SAR.

Three critical parameters govern the compression–accuracy trade-off:

Global pruning enables cross-layer importance comparison, allowing LAMP to allocate varying sparsity levels based on layer-wise semantic importance—achieving 67.90% compression compared to only 42% with uniform layer-wise pruning.

Knowledge distillation weight (

λ_{KD} = 0.1

) balances soft-target guidance from the teacher model with hard-target supervision, transferring crucial semantic relationships while preventing over-dependence.

Regularization (

λ_{reg} = 0.0005

) maintains optimization stability in the dramatically reduced parameter space, preventing catastrophic forgetting during extended fine-tuning.

LAMP Integration with SAR Ship Detection and Technical Contributions. The LAMP pruning strategy is specifically adapted for SAR imagery characteristics and ship detection requirements. Unlike natural images, SAR data exhibits unique challenges, including speckle noise, geometric distortions, and limited training samples, making aggressive pruning particularly risky for performance preservation.

Our key technical contributions for LAMP-SAR integration include the following: (1) layer-adaptive importance scoring that accounts for the hierarchical nature of SAR feature extraction—early layers capture fundamental radar backscattering patterns while deeper layers encode complex ship signatures and contextual information; (2) structured channel pruning that maintains the spatial coherence essential for SAR target detection, avoiding the fragmented feature maps that harm small ship detection accuracy; (3) knowledge distillation with SAR-specific loss weighting that preserves the subtle discriminative features crucial for distinguishing ships from clutter and wake interference.

The final compressed model achieves remarkable efficiency gains (2.03 × speedup, 67.90% parameter reduction) while delivering superior detection performance (+4.8 pp mAP@95 improvement). This demonstrates that LAMP pruning, when properly configured for SAR characteristics, not only reduces computational burden but can also act as an effective regularization mechanism, improving the model’s generalization capability on challenging maritime surveillance tasks. The combination of extreme compression with performance enhancement makes our approach particularly valuable for resource-constrained SAR applications such as real-time maritime monitoring and autonomous vessel detection systems (Figure 7).

Training dynamics analysis further confirms convergence stability. Pre-pruning training exhibits rapid initial learning (epochs 1–50) followed by refinement (epochs 51–80) and plateau (epochs 81–100), with validation mAP@50 stabilizing at >0.96 and loss converging smoothly without oscillation. Post-pruning fine-tuning demonstrates two-phase behavior: rapid recovery (epochs 1–50, mAP@95: 0.391→0.65) through knowledge distillation guidance, followed by gradual refinement (epochs 51–200) with a close-mosaic strategy contributing to final improvement. The smooth convergence without significant oscillation validates hyperparameter stability across both training stages.

3.5. Comparative Analysis

LGNet produces concentrated target responses while suppressing background interference. Compared with YOLOv5n, Faster R-CNN, FASC-Net, and RLE-YOLO, LGNet better preserves small targets while maintaining precise localization (Figure 8).

As shown in Figure 8, qualitative comparison across six representative scenarios reveals LGNet’s superior performance in handling SAR-specific challenges. In the first scenario (column 1), featuring large vessels in harbor environments with strong land clutter, all methods achieve successful detection, but LGNet (row f) demonstrates the tightest bounding box alignment with ground truth (row a), indicating more precise localization capability. The second scenario (column 2) presents a critical small-target detection challenge where YOLOv5n (row b) produces a noticeably larger bounding box with reduced confidence, while Faster R-CNN (row c), FASC-Net (row d), and RLE-YOLO (row e) maintain reasonable detection quality. LGNet achieves the most compact and accurate bounding box, demonstrating superior small-target discrimination ability.

The third scenario (column 3) represents the most challenging case with severe speckle noise and heterogeneous sea clutter that creates multiple false-alarm candidates. Here, YOLOv5n exhibits slight localization deviation, while Faster R-CNN shows marginal bounding box drift. FASC-Net and RLE-YOLO maintain stable detection, but LGNet provides the most precise localization with minimal background interference, validating its robust noise suppression capability through Ghost-enhanced feature extraction. In the fourth scenario (column 4), featuring another small ship target, all methods successfully detect the vessel, yet LGNet consistently delivers the tightest bounding box fit, demonstrating its advantage in preserving fine-grained spatial details through hierarchical multi-scale feature aggregation.

The second scenario (column 2) tests medium-sized ship detection in relatively clean backgrounds where all methods perform well, confirming baseline detection capability. The fifth scenario (column 5) presents a complex case with oblique ship orientation and strong land–background interference. Here, YOLOv5n shows slight bounding box looseness, Faster R-CNN maintains reasonable accuracy, while FASC-Net and RLE-YOLO achieve competitive results. LGNet again demonstrates the most precise boundary delineation, effectively suppressing complex background interference while maintaining tight target localization.

Overall, LGNet exhibits three key advantages: (1) superior small-target preservation with consistently tighter and more accurate bounding boxes across all scenarios; (2) enhanced robustness to speckle noise and sea clutter through Ghost convolution-based feature redundancy exploitation, which reduces sensitivity to local intensity variations; (3) precise localization capability in complex backgrounds via hierarchical GHBlock modules and multi-scale feature pyramid integration that balances fine-grained spatial details with high-level semantic understanding. These qualitative observations align with the quantitative superiority demonstrated in Table 5, where LGNet achieves the highest mAP@95 (71.9%) while maintaining the smallest model footprint (0.74 M parameters, 3.3 GFLOPs) and fastest inference speed (681 FPS).

On SSDD, LGNet achieves the best mAP@95 (71.9%) with the smallest size (0.74 M), the lowest computational cost (3.3 G), and the highest inference speed (681 FPS), outperforming YOLOv8n/RLE-YOLO by +3.5–3.6 pp in accuracy while being 13% faster than YOLOv8n (Table 5).

LGNet achieves the best mAP@95 (71.9%) on SSDD with the smallest model size (0.74 M parameters) and lowest computational cost (3.3 GFLOPs), demonstrating optimal accuracy–efficiency balance. Parameters are measured in millions; GFLOPs are measured at 640 × 640 input resolution. FPS measurements were conducted on an NVIDIA GeForce RTX 2080 Ti GPU with batch size 8, representing inference speed under consistent hardware conditions for a fair comparison across all methods.

Cross-dataset evaluation on RSDD-SAR (Table 1) uses identical hyperparameters from SSDD (Table 4) and training protocol (100+200 epochs). Results (Table 6) validate architecture robustness across dataset scales and annotation types.

LGNet achieves the highest mAP@50 (94.4%) among all evaluated methods on RSDD-SAR, surpassing the second-best-performing method LSD-Det (+4.1 pp), Rotated-RTMDet-s (+4.2 pp), Oriented RepPoints (+4.9 pp), R3Det (+5.5 pp), S2ANet (+6.0 pp), SPG-OSD (+25.1 pp), and YOLOv8n baseline (+28.5 pp). Although LSD-Det achieves higher precision (92.4%) and recall (90.3%), LGNet demonstrates superior overall detection performance with the best mAP@50, validating its effectiveness for cross-dataset generalization in SAR ship detection tasks (Figure 9).

4. Discussion

4.1. Experimental Validation and Performance Analysis

Our experimental evaluation shows that LGNet achieves superior performance across multiple benchmarks and deployment scenarios. Our validation approach used comprehensive ablation studies on SSDD (Table 2 and Table 3) and cross-dataset evaluation on RSDD-SAR (Table 6) to verify architecture robustness and generalization capability.

The architectural design validation confirms our lightweight approach works effectively. Individual component analysis shows that GHBlock achieves optimal balance between accuracy and efficiency (mAP@95: 0.682, 2.7 M parameters, 7.3 GFLOPs), while DWConv delivers the highest individual accuracy (0.687). The systematic integration of Ghost convolutions throughout the backbone provides significant computational benefits: replacing standard C2f modules yields 45.2% FLOPs reduction and 45.1% parameter reduction with 1.89 × actual speedup, while maintaining minimal accuracy degradation (<1 pp mAP). The complete LGNet backbone achieves 23.3% parameter reduction and 16.0% computational savings compared to the YOLOv8n baseline.

The LAMP pruning strategy works highly effectively for extreme model compression. Despite the initial accuracy drop after pruning (mAP@50: 0.970→0.590), the 200-epoch adaptive fine-tuning with knowledge distillation successfully recovers and surpasses baseline performance (final mAP@95: 0.719, +4.8 pp improvement). This aggressive compression achieves 67.90% parameter reduction (2.3 M→0.74 M) and 51.47% FLOPs reduction (6.8 G→3.3 G), showing that intelligent pruning can maintain detection capability while significantly reducing model complexity.

Performance benchmarking validates LGNet’s superiority across different evaluation scenarios. On SSDD, LGNet achieves the best mAP@95 (71.9%) with the smallest model size (0.74 M parameters) and lowest computational cost (3.3 GFLOPs), outperforming traditional two-stage detectors by +10.1–12.6 pp with 98.2% parameter reduction. Cross-dataset evaluation on RSDD-SAR confirms robust generalization, where LGNet achieves the highest mAP@50 (94.4%) among all evaluated methods, surpassing recent oriented detectors including LSD-Det (+4.1 pp), Rotated-RTMDet-s (+4.2 pp), and R3Det (+5.5 pp).

4.2. Edge Deployment Validation

Real-world deployment validation on Huawei Atlas AIpro-20T confirms LGNet’s practical feasibility for edge computing scenarios. The model achieves exceptional performance with 135.39 FPS (7.39 ms latency) on the SSDD test set (640 × 640 input), demonstrating genuine real-time capability on resource-constrained hardware (Figure 10). This performance validates the effectiveness of our dual-strategy lightweight approach for practical maritime surveillance applications, where computational resources and power consumption are critical constraints.

The edge deployment results highlight LGNet’s suitability for autonomous systems such as unmanned aerial vehicles (UAVs), autonomous surface vessels, and satellite-based monitoring platforms. The ultra-compact model size (0.74 M parameters) enables deployment on devices with limited memory capacity (<2 GB), while the low computational requirement (3.3 GFLOPs) ensures efficient operation on edge computing hardware with <10 GFLOPS processing capability.

4.3. Limitations and Failure Case Analysis

While LGNet achieves strong overall performance, certain limitations deserve acknowledgment. Failure case analysis reveals challenges in specific SAR detection scenarios. Figure 9 demonstrates typical error patterns: false positive detection in image (b) at the second row, fourth column, where non-ship targets are misclassified as vessels, and false negative detection in image (d) at the first row, second column, where actual ship targets remain undetected.

The lightweight architecture faces particular difficulties in densely clustered harbor environments, where overlapping vessel signatures complicate individual target delineation. Small or partially occluded targets also challenge the compressed model, as reduced feature capacity may limit detection of fine-grained signatures essential for identifying diminutive or partially visible vessels.

The model occasionally produces false positives in complex coastal environments where man-made structures (piers, oil platforms, large debris) exhibit SAR signature similarities to ships. False negatives occur primarily in low signal-to-noise ratio conditions or when ships appear at the image boundaries with partial visibility. These limitations reflect the inherent trade-off between model compression and representational capacity, suggesting that certain detection scenarios may require specialized handling or ensemble approaches.

Cross-sensor and cross-domain generalization requires further evaluation, particularly for different frequency bands (L-band, S-band), polarization configurations (HH, VV, VH, HV), and extreme weather conditions (heavy precipitation, high sea states). The current evaluation focuses on C-band and X-band SAR data under typical maritime conditions, limiting the assessment of broader operational robustness.

4.4. Contribution to SAR-Based Ship Detection

LGNet’s integration with SAR imagery advances maritime surveillance capabilities significantly. While SAR sensors offer unique advantages—all-weather operation, day–night independence, and cloud penetration—they introduce computational challenges through complex preprocessing requirements, large data volumes, and specialized feature patterns.

LGNet addresses these SAR-specific challenges through targeted innovations. Ghost convolution efficiently exploits redundant feature patterns characteristic of maritime SAR scenes while minimizing computational overhead. GHBlock’s hierarchical aggregation enables robust multi-scale detection essential for SAR imagery, where vessel appearances vary dramatically with imaging geometry and target characteristics.

The LAMP pruning strategy demonstrates remarkable compatibility with SAR detection tasks, where the preserved semantic information remains sufficient for accurate ship classification despite aggressive model compression. This is particularly valuable for SAR applications, where traditional deep learning models often require substantial computational resources that exceed the capabilities of maritime edge computing platforms.

The successful deployment on edge hardware represents a paradigm shift for SAR-based maritime surveillance, enabling distributed monitoring networks where individual sensors can perform intelligent detection without relying on ground-based processing centers. This capability is especially critical for applications in remote maritime areas, autonomous vessel navigation, and real-time threat assessment scenarios where communication bandwidth and latency constraints prohibit cloud-based processing.

Future research directions should focus on expanding the model’s robustness across diverse SAR acquisition parameters, incorporating temporal and polarimetric information for enhanced detection reliability, and developing specialized architectures for emerging maritime surveillance challenges such as small vessel trafficking detection and illegal fishing monitoring in protected waters.

5. Conclusions

We present LGNet, an ultra-lightweight SAR ship detection network that combines architectural innovation with intelligent compression to achieve high efficiency without sacrificing accuracy. The dual-strategy approach—Ghost-enhanced architecture with adaptive pruning—shows that substantial model compression and high detection performance can coexist in SAR applications.

Comprehensive evaluation validates LGNet’s effectiveness across diverse SAR scenarios, achieving superior performance on SSDD (mAP@95: 71.9%) and RSDD-SAR (mAP@50: 94.4%) with remarkable efficiency (0.74 M parameters, 3.3 GFLOPs). This represents a significant advance in computational efficiency for maritime surveillance without compromising detection quality.

LGNet shows robust generalization across different SAR datasets and annotation formats, maintaining consistent performance for both horizontal (SSDD) and oriented (RSDD-SAR) bounding box detection tasks. This adaptability confirms the architecture’s suitability for diverse SAR imaging conditions and operational requirements.

Real-time deployment on edge hardware (Huawei Atlas AIpro-20T, 135.39 FPS) confirms LGNet’s practical viability for resource-constrained maritime surveillance systems. The model’s exceptional efficiency makes it particularly suited for autonomous platforms and distributed monitoring networks operating in challenging maritime environments.

Author Contributions

Conceptualization, R.L., J.C. and J.H.; methodology, R.L., J.C. and J.H.; software, J.C. and J.H.; validation, J.H. and J.C.; formal analysis, J.H. and J.C.; investigation, R.L., J.C. and J.H.; resources, R.L. and Z.W.; data curation, Y.T., J.H. and J.C.; writing—original draft preparation, J.C. and J.H.; writing—review and editing, R.L., J.C. and J.H.; visualization, J.C. and J.H.; supervision, R.L.; project administration, R.L.; funding acquisition, R.L. Specifically, J.C. was responsible for network architecture design and code implementation, including the development of GHBlock modules, integration of Ghost convolutions, and implementation of the LAMP pruning algorithm. J.H. conducted the experimental processes, including comprehensive experiments on the SSDD dataset, ablation studies, performance evaluation, and result analysis. Y.T. was responsible for literature review and reference collection, providing comprehensive background research support. R.L. supervised and guided the entire research process, providing direction on research methodology, experimental design validation, and overall project coordination. Z.W. provided the necessary research platform and computational resources for the study. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Grants by National Natural Science Foundation of China (Grant Nos. 42571372, 42271345).

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the reported results in this study are based on two publicly available datasets: SAR Ship Detection Dataset (SSDD) [28]: the SSDD dataset is released under the Apache-2.0 license and provides comprehensive SAR ship detection data with horizontal bounding box annotations. The dataset can be accessed at https://github.com/TianwenZhang0825/Official-SSDD/tree/main (accessed on 17 November 2025). Rotating SAR Ship Detection Dataset (RSDD-SAR) [29]: the RSDD-SAR dataset provides 7000 SAR images with rotating bounding box annotations for oriented ship detection. The dataset can be downloaded from https://github.com/makabakasu/RSDD-SAR-OPEN (accessed on 17 November 2025). Both datasets are publicly accessible, and all experimental results reported in this paper can be reproduced using these publicly available datasets.

Acknowledgments

During the preparation of this manuscript/study, the author(s) used ChatGPT version GPT-5 (OpenAI) for the purposes of language polishing and textual optimization. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Rocca, F.; Prati, C.; Monti Guarnieri, A.; Ferretti, A. SAR interferometry and its applications. Surv. Geophys. 2000, 21, 159–176. [Google Scholar] [CrossRef]
Zhang, Z.; Zhang, L.; Wu, J.; Guo, W. Optical and synthetic aperture radar image fusion for ship detection and recognition: Current state, challenges, and future prospects. IEEE Geosci. Remote Sens. Mag. 2024, 12, 132–168. [Google Scholar] [CrossRef]
Lang, P.; Fu, X.; Dong, J.; Yang, H.; Yin, J.; Yang, J.; Martorella, M. Recent Advances in Deep-Learning-Based SAR Image Target Detection and Recognition. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 6884–6915. [Google Scholar] [CrossRef]
Chen, Y.; Shen, Y.; Duan, C.; Wang, Z.; Mo, Z.; Liang, Y.; Zhang, Q. Robust and Efficient SAR Ship Detection: An Integrated Despecking and Detection Framework. Remote Sens. 2025, 17, 580. [Google Scholar] [CrossRef]
Cao, R.; Sui, J. A Dynamic Multi-Scale Feature Fusion Network for Enhanced SAR Ship Detection. Sensors 2025, 25, 5194. [Google Scholar] [CrossRef] [PubMed]
Zhao, S.; Liu, Q.; Yu, W.; Lv, J. A Single-Stage Arbitrary-Oriented Detector Based on Multiscale Feature Fusion and Calibration for SAR Ship Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 8179–8198. [Google Scholar] [CrossRef]
Fuhrmann, D.R.; Kelly, E.J.; Nitzberg, R. A CFAR adaptive matched filter detector. IEEE Trans. Aerosp. Electron. Syst 1992, 28, 208–216. [Google Scholar]
Gierull, C. Statistical analysis of multilook SAR interferograms for CFAR detection of ground moving targets. IEEE Trans. Geosci. Remote Sens. 2004, 42, 691–701. [Google Scholar] [CrossRef]
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; Pereira, F., Burges, C., Bottou, L., Weinberger, K., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2012; Volume 25. [Google Scholar]
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015. [Google Scholar]
Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving Into High Quality Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018. [Google Scholar]
Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017. [Google Scholar]
Ding, J.; Xue, N.; Long, Y.; Xia, G.S.; Lu, Q. Learning RoI Transformer for Oriented Object Detection in Aerial Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019. [Google Scholar]
Yang, X.; Yan, J.; Feng, Z.; He, T. R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object. Proc. AAAI Conf. Artif. Intell. 2021, 35, 3163–3171. [Google Scholar] [CrossRef]
Liu, Y.; Sun, X.; Shao, W.; Yuan, Y. S2ANet: Combining local spectral and spatial point grouping for point cloud processing. Virtual Real. Intell. Hardw. 2024, 6, 267–279. [Google Scholar] [CrossRef]
Xue, W.; Ai, J.; Zhu, Y.; Chen, J.; Zhuang, S. AIS-FCANet: Long-Term AIS Data Assisted Frequency-Spatial Contextual Awareness Network for Salient Ship Detection in SAR Imagery. IEEE Trans. Aerosp. Electron. Syst. 2025, 61, 15166–15171. [Google Scholar] [CrossRef]
Ai, J.; Tian, R.; Luo, Q.; Jin, J.; Tang, B. Multi-Scale Rotation-Invariant Haar-Like Feature Integrated CNN-Based Ship Detection Algorithm of Multiple-Target Environment in SAR Imagery. IEEE Trans. Geosci. Remote Sens. 2019, 57, 10070–10087. [Google Scholar] [CrossRef]
Cui, Z.; Li, Q.; Cao, Z.; Liu, N. Dense Attention Pyramid Networks for Multi-Scale Ship Detection in SAR Images. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8983–8997. [Google Scholar] [CrossRef]
Zhang, Y.; Li, K.; Li, K.; Zhong, B.; Fu, Y. Residual Non-local Attention Networks for Image Restoration. arXiv 2019, arXiv:1903.10082. [Google Scholar] [CrossRef]
Ke, H.; Ke, X.; Zhang, Z.; Chen, X.; Xu, X.; Zhang, T. SLA-Net: A Novel Sea–Land Aware Network for Accurate SAR Ship Detection Guided by Hierarchical Attention Mechanism. Remote Sens. 2025, 17, 3576. [Google Scholar] [CrossRef]
Ke, H.; Ke, X.; Yan, Y.; Luo, D.; Cui, F.; Peng, H.; Hu, Y.; Liu, Y.; Zhang, T. Laplace & LBP feature guided SAR ship detection method with adaptive feature enhancement block. In Proceedings of the 2024 IEEE 6th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), Chongqing, China, 24–26 May 2024; Volume 6, pp. 780–783. [Google Scholar] [CrossRef]
Zhang, T.; Zhang, X.; Gao, G. Divergence to Concentration and Population to Individual: A Progressive Approaching Ship Detection Paradigm for Synthetic Aperture Radar Remote Sensing Imagery. IEEE Trans. Aerosp. Electron. Syst. 2025, 1–13, early access. [Google Scholar] [CrossRef]
Gao, F.; Fan, C.; He, X.; Wang, J.; Sun, J.; Hussain, A. Weakly Supervised SAR Ship Oriented-Detection Algorithm Based on Pseudo-Label Generation Optimization and Guidance. Remote Sens. 2025, 17, 3663. [Google Scholar] [CrossRef]
Zhou, Y.; Wang, S.; Ren, H.; Hu, J.; Zou, L.; Wang, X. Multi-Level Feature-Refinement Anchor-Free Framework with Consistent Label-Assignment Mechanism for Ship Detection in SAR Imagery. Remote Sens. 2024, 16, 975. [Google Scholar] [CrossRef]
Yu, J.; Zhou, G.; Zhou, S.; Qin, M. A Fast and Lightweight Detection Network for Multi-Scale SAR Ship Detection under Complex Backgrounds. Remote Sens. 2022, 14, 31. [Google Scholar] [CrossRef]
Zhang, T.; Zhang, X.; Li, J.; Xu, X.; Wang, B.; Zhan, X.; Xu, Y.; Ke, X.; Zeng, T.; Su, H.; et al. SAR Ship Detection Dataset (SSDD): Official Release and Comprehensive Data Analysis. Remote Sens. 2021, 13, 3690. [Google Scholar] [CrossRef]
Xu, C.; Su, H.; Li, J.; Liu, Y.; Yao, L.; Gao, L.; Yan, W.; Wang, T. RSDD-SAR: Rotated Ship Detection Dataset in SAR Images. J. Radars 2022, 11, 581–596. [Google Scholar] [CrossRef]
Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef]
Pang, J.; Chen, K.; Shi, J.; Feng, H.; Ouyang, W.; Lin, D. Libra R-CNN: Towards Balanced Learning for Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 821–830. [Google Scholar] [CrossRef]
Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Liu, Y.; Chen, J. DETRs Beat YOLOs on Real-time Object Detection. arXiv 2024, arXiv:2304.08069. [Google Scholar] [CrossRef]
Jocher, G.; Chaurasia, A.; Qiu, J. YOLOv5: A State-of-the-Art Real-Time Object Detection System. 2022. Available online: https://github.com/ultralytics/yolov5 (accessed on 15 November 2025).
Jocher, G.; Chaurasia, A.; Qiu, J. Ultralytics YOLOv8. 2023. Available online: https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models/v8 (accessed on 15 November 2025).
Jocher, G.; Chaurasia, A.; Qiu, J. Ultralytics YOLO11. 2024. Available online: https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models/11 (accessed on 15 November 2025).
Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: A Simple and Strong Anchor-Free Object Detector. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 1922–1933. [Google Scholar] [CrossRef]
Xu, Y.; Xue, X.; Li, C.; Zhao, S.; Xu, X.; Zeng, C. RLE-YOLO: A Lightweight and Multiscale SAR Ship Detection Based on Improved YOLOv8. IEEE Access 2025, 13, 46584–46600. [Google Scholar] [CrossRef]
Guo, H.; Yang, X.; Wang, N.; Gao, X. A CenterNet++ Model for Ship Detection in SAR Images. Pattern Recognit. 2021, 112, 107787. [Google Scholar] [CrossRef]
Xu, Y.; Fu, M.; Wang, Q.; Wang, Y.; Chen, K.; Xia, G.S.; Bai, X. Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 1452–1459. [Google Scholar] [CrossRef] [PubMed]
Yang, X.; Yan, J. Arbitrary-Oriented Object Detection with Circular Smooth Label. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 677–694. [Google Scholar] [CrossRef]
Li, J.; Chen, M.; Hou, S.; Wang, Y.; Luo, Q.; Wang, C. An Improved S2A-Net Algorithm for Ship Object Detection in Optical Remote Sensing Images. Remote Sens. 2023, 15, 4559. [Google Scholar] [CrossRef]
Yang, X.; Zhang, G.; Li, W.; Wang, X.; Zhou, Y.; Yan, J. KFIoU: Kalman Filter based Intersection over Union for Oriented Object Detection. arXiv 2022, arXiv:2201.12558. [Google Scholar]
Li, W.; Chen, Y.; Hu, K.; Zhu, J. Oriented RepPoints for Aerial Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 1829–1838. [Google Scholar] [CrossRef]
Xie, X.; Cheng, G.; Wang, J.; Yao, X.; Han, J. Oriented R-CNN for Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 3520–3529. [Google Scholar] [CrossRef]
Zhang, Y.; Lu, D.; Qiu, X.; Li, F. Scattering-Point-Guided RPN for Oriented Ship Detection in SAR Images. Remote Sens. 2023, 15, 1411. [Google Scholar] [CrossRef]
Lyu, C.; Zhang, W.; Huang, H.; Zhou, Y.; Wang, Y.; Liu, Y.; Zhang, S.; Chen, K. RTMDet: An Empirical Study of Designing Real-Time Object Detectors. arXiv 2022, arXiv:2212.07784. [Google Scholar] [CrossRef]
Wang, Z.; Qin, B.; Gao, S. LSD-Det: A Lightweight Detector for Small Ship Targets in SAR Images. IEEE Access 2025, 13, 133483–133496. [Google Scholar] [CrossRef]

Figure 1. Overall architecture of LGNet. The backbone progressively extracts hierarchical features through HGStem, alternating GHBlock/DWConv stages, and SPPF. The neck fuses multi-scale features via upsampling and concatenation. Three detection heads operate at different resolutions (20 × 20, 40 × 40, 80 × 80) for multi-scale ship detection.

Figure 2. Comparison of standard convolution and Ghost convolution mechanisms. (a) Standard convolution: all output channels generated via dense convolution operations. (b) Ghost convolution: intrinsic features are first extracted via partial standard convolution, then cheap linear transformations (

Φ_{1}, Φ_{2}, \dots

) generate additional Ghost features, which are concatenated and fused with identity mapping to form the output. This decomposition reduces computation by exploiting feature redundancy.

Figure 2. Comparison of standard convolution and Ghost convolution mechanisms. (a) Standard convolution: all output channels generated via dense convolution operations. (b) Ghost convolution: intrinsic features are first extracted via partial standard convolution, then cheap linear transformations (

Φ_{1}, Φ_{2}, \dots

) generate additional Ghost features, which are concatenated and fused with identity mapping to form the output. This decomposition reduces computation by exploiting feature redundancy.

Figure 3. Architecture of GHBlock. The block employs n stacked GhostConv layers for hierarchical feature extraction (with repetition factor indicated), followed by dense-like concatenation, channel refinement via two successive 1 × 1 convolutions (squeeze and excitation), and an optional residual shortcut connection for feature reuse.

Figure 4. LAMP pruning pipeline for LGNet compression. The process flows from input model through importance estimation, layer-adaptive scaling, global pruning with iterative optimization, and structured sparsity generation to adaptive fine-tuning for performance recovery.

Figure 5. Representative samples from SSDD. Inshore (top): near-shore scenes with strong background clutter and man-made structures. Offshore (bottom): open-sea scenes with relatively homogeneous backgrounds but wake interference.

Figure 6. Heatmap comparison between YOLOv8n and LGNet backbone: (a) YOLOv8n baseline; (b) LGNet before pruning.

Figure 7. LAMP pruning effect visualization: (a) LGNet before pruning; (b) LGNet after LAMP pruning and fine-tuning. The compressed model maintains a strong target detection capability while achieving 67.90% parameter reduction.

Figure 8. Visual comparison of detection results across different methods. Row (a): Ground truth annotations. Row (b): YOLOv5n detection results. Row (c): faster R-CNN detection results. Row (d): FASC-Net detection results. Row (e): RLE-YOLO detection results. Row (f): LGNet detection results. LGNet maintains concentrated activations with precise localization while suppressing background clutter and preserving small targets.

Figure 9. Detection result citatts on the RSDD-SAR dataset. The results demonstrate LGNet’s capability to accurately detect ships of various sizes and orientations in complex SAR scenes. (a) Ground truth labels—batch 0. (b) LGNet predictions—batch 0. (c) Ground truth labels—batch 1. (d) LGNet predictions—batch 1. Red arrows in (b) and (c) indicate regions with detection errors.

Figure 10. Edge deployment on Huawei Atlas AIpro-20T: 135.39 FPS (7.39 ms latency), real-time multi-scale ship detection with low computational overhead.

Table 1. Comparison of SSDD and RSDD-SAR datasets.

Characteristic	SSDD	RSDD-SAR
Image number	1160	7000
Ship instances	2587	10,263
Image size (pixels)	190–668 (variable)	$512 \times 512$ (fixed)
Number of sensors	3	2
Resolution (m)	1–15	2–20
Frequency bands	C, X	C, X
Annotation type	Horizontal bounding box	Rotating bounding box
Train/Test split	928/232 (4:1)	5600/1400 (4:1)
Usage in this work	Primary benchmark	Cross-dataset validation
	Ablation & hyperparameter	Generalization capability
	analysis (Section 3.3 and Section 3.4)	evaluation (Section 3.5)

Table 2. Comprehensive ablation study of lightweight components on YOLOv8n.

Method	P	R	mAP@50	mAP@95	Params	GFLOPs
YOLOv8n (Baseline)	0.940	0.921	0.971	0.683	3.0	8.1
+ HGStem	0.963	0.897	0.969	0.676	3.0	7.9
+ GHBlock	0.950	0.911	0.974	0.682	2.7	7.3
+ DWConv	0.961	0.901	0.974	0.687	2.6	7.2
+ GHBlock + DWConv	0.931	0.918	0.969	0.674	2.3	6.3
+ HGStem + DWConv	0.949	0.916	0.969	0.666	2.6	7.2
+ HGStem + GHBlock	0.958	0.912	0.970	0.671	2.7	7.0
+ LGNet backbone (Full)	0.945	0.923	0.972	0.671	2.3	6.8

Table 3. Performance evolution through LAMP pruning and fine-tuning stages.

Method	P	R	mAP@50	mAP@95	Params	GFLOPs
LGNet backbone (Pre-prune)	0.945	0.923	0.972	0.671	2.3	6.8
+ LAMP (Before finetune)	0.543	0.621	0.590	0.391	0.74	3.3
LGNet (After finetune)	0.971	0.936	0.979	0.719	0.74	3.3

Table 4. Hyperparameter configuration for LAMP pruning and adaptive fine-tuning.

Hyperparameter	Value	Rationale
Speedup target	2.0×	Optimal compression–accuracy balance for SAR applications
Global pruning	True	Cross-layer importance enables adaptive sparsity allocation
KD weight $λ_{KD}$	0.1	Transfers semantic knowledge while preserving independence
Regularization $λ_{reg}$	0.0005	Prevents overfitting in reduced-capacity networks
Fine-tuning epochs	200	Sufficient recovery time without diminishing returns
Close-mosaic epochs	20	Enhances small-target detection under realistic conditions
Batch size	8	Hardware-optimized for GPU memory constraints
Learning rate $η_{0}$	0.01	SGD with cosine decay for stable convergence

Table 5. Comprehensive comparison on SSDD. Best results are in bold.

Method	P	R	mAP	mAP	Params	GFLOPs	FPS
			@50	@95	(M)
Faster R-CNN [30]	0.824	0.905	0.893	0.593	41.4	134.4	18
Cascade R-CNN [12]	0.842	0.956	0.963	0.618	67.2	153.2	16
Libra R-CNN [31]	0.836	0.947	0.948	0.598	42.8	141.3	17
RT-DETR [32]	0.938	0.942	0.972	0.639	20.1	60.9	40
YOLOv5n [33]	0.951	0.940	0.956	0.599	1.8	4.1	628
YOLOv5s [33]	0.953	0.946	0.969	0.607	7.1	16.4	147
YOLOv8n [34]	0.940	0.921	0.971	0.683	3.0	8.1	602
YOLOv8s [34]	0.948	0.967	0.976	0.660	11.1	28.4	90
YOLO11s [35]	0.969	0.959	0.972	0.671	9.4	21.3	113
FCOS [36]	0.847	0.955	0.958	0.596	32.1	126.0	21
SwFR [26]	0.951	0.940	0.972	0.620	58.7	167.2	14
RLE-YOLO [37]	0.976	0.968	0.982	0.684	6.2	18.1	133
FASC-Net [38]	0.956	0.945	0.974	0.655	9.2	17.7	136
LGNet (ours)	0.971	0.936	0.979	0.719	0.74	3.3	681

Table 6. Performance comparison on the RSDD-SAR dataset. Results demonstrate cross-dataset generalization capability.

Method	P	R	F1	mAP@50
Rotated-RetinaNet [14]	0.898	0.843	0.852	0.852
Gliding Vertex [39]	0.298	0.620	0.403	0.532
RoI Transformer [15]	0.439	0.673	0.532	0.561
R3Det [16]	0.906	0.865	0.889	0.889
CSL [40]	0.047	0.639	0.087	0.276
S2ANet [41]	0.898	0.864	0.884	0.884
KFIoU [42]	0.083	0.830	0.151	0.530
Oriented RepPoints [43]	0.902	0.891	0.895	0.895
Oriented-RCNN [44]	0.367	0.741	0.490	0.645
Baseline (YOLOv8n)	0.389	0.772	0.517	0.659
SPG-OSD [45]	0.467	0.786	0.586	0.693
Rotated-RTMDet-s [46]	0.916	0.885	0.902	0.902
LSD-Det [47]	0.924	0.903	0.913	0.903
LGNet (ours)	0.931	0.870	0.900	0.944

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, J.; Huang, J.; Tan, Y.; Wu, Z.; Luo, R. LGNet: A Lightweight Ghost-Enhanced Network for Efficient SAR Ship Detection. Remote Sens. 2025, 17, 3800. https://doi.org/10.3390/rs17233800

AMA Style

Chen J, Huang J, Tan Y, Wu Z, Luo R. LGNet: A Lightweight Ghost-Enhanced Network for Efficient SAR Ship Detection. Remote Sensing. 2025; 17(23):3800. https://doi.org/10.3390/rs17233800

Chicago/Turabian Style

Chen, Jiawei, Junyu Huang, Yuna Tan, Zhifeng Wu, and Renbo Luo. 2025. "LGNet: A Lightweight Ghost-Enhanced Network for Efficient SAR Ship Detection" Remote Sensing 17, no. 23: 3800. https://doi.org/10.3390/rs17233800

APA Style

Chen, J., Huang, J., Tan, Y., Wu, Z., & Luo, R. (2025). LGNet: A Lightweight Ghost-Enhanced Network for Efficient SAR Ship Detection. Remote Sensing, 17(23), 3800. https://doi.org/10.3390/rs17233800

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Article metric data becomes available approximately 24 hours after publication online.

Article Menu

LGNet: A Lightweight Ghost-Enhanced Network for Efficient SAR Ship Detection

Highlights

Abstract

1. Introduction

2. Materials and Methods

2.1. Overall Design of LGNet

2.2. GhostConv Mechanism

2.3. GHBlock: Ghost-Enhanced Hierarchical Block

2.4. LAMP Pruning and Adaptive Fine-Tuning

3. Results

3.1. Experimental Settings and Dataset Description

3.2. Evaluation Metrics

3.3. Ablation Study on Lightweight Components

3.4. LAMP Pruning Results

3.5. Comparative Analysis

4. Discussion

4.1. Experimental Validation and Performance Analysis

4.2. Edge Deployment Validation

4.3. Limitations and Failure Case Analysis

4.4. Contribution to SAR-Based Ship Detection

5. Conclusions

Author Contributions

Funding

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI