Article

Oriented SAR Ship Detection Based on Edge Deformable Convolution and Point Set Representation

1 Space Microwave Remote Sensing System Department, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China
2 School of Electronics, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(9), 1612; https://doi.org/10.3390/rs17091612
Submission received: 28 March 2025 / Revised: 23 April 2025 / Accepted: 29 April 2025 / Published: 1 May 2025

Abstract

Ship detection in synthetic aperture radar (SAR) images holds significant importance for both military and civilian applications, including maritime traffic supervision, marine search and rescue operations, and emergency response initiatives. Although extensive research has been conducted in this field, the interference of speckle noise in SAR images and the potential discontinuity of target contours continue to pose challenges for the accurate detection of multi-directional ships in complex scenes. To address these issues, we propose a novel ship detection method for SAR images that leverages edge deformable convolution combined with point set representation. By integrating edge deformable convolution with backbone networks, we learn the correlations between discontinuous target blocks in SAR images. This process effectively suppresses speckle noise while capturing the overall offset characteristics of targets. On this basis, a multi-directional ship detection module utilizing radial basis function (RBF) point set representation is developed. By constructing a point set transformation function, we establish efficient geometric alignment between the point set and the predicted rotated box, and we impose constraints on the penalty term associated with point set transformation to ensure accurate mapping between point set features and directed prediction boxes. This methodology enables the precise detection of multi-directional ship targets even in dense scenes. The experimental results derived from two publicly available datasets, RSDD-SAR and SSDD, demonstrate that our proposed method achieves state-of-the-art performance when benchmarked against other advanced detection models.

1. Introduction

Ships are vital tools and carriers in maritime activities, playing a crucial role in ensuring maritime safety and facilitating maritime transportation. Synthetic aperture radar (SAR) serves as a primary data source for maritime ship detection, offering continuous operation under adverse conditions such as cloud cover or nighttime, along with high-resolution imaging capabilities [1,2,3,4,5]. The development of a reliable SAR ship detection method not only enhances the real-time monitoring capabilities of maritime traffic conditions, illegal fishing, and smuggling activities but also provides essential technical support to safeguard national maritime rights and interests, as well as for maritime law enforcement and rescue operations [6,7,8,9]. Consequently, research on SAR image-based ship detection methods holds significant practical value in maintaining marine security and promoting the sustainable use of marine resources.
Although researchers have conducted extensive studies on ship detection in SAR images and proposed numerous detection models [10,11,12], the technology for detecting ships in SAR imagery still encounters several challenges [13,14,15]. The imaging process is influenced by various factors, including imaging geometry, sensor parameters, and electromagnetic wave scattering. Consequently, the resulting images often exhibit significant speckle noise, leading to blurred or incomplete target edges. The shapes of ships within these images can appear distinctly discrete and intermittent, posing substantial difficulties for feature extraction. Furthermore, ships in SAR images display characteristics such as multi-directionality, multi-scale variations, and proximity to port facilities. These factors can easily lead to deviations in the predicted bounding boxes generated by detection algorithms, resulting in missed detections and false alarms, as well as failing to meet the requirements for the precise localization of densely arranged ship targets within ports and maritime environments. Figure 1 illustrates two representative scenarios of SAR ship detection, where the green boxes denote the ground-truth, the yellow circles indicate potential false alarms, and the red circles represent possible missed detections. The ships depicted on the left in Figure 1a exhibit a discontinuity in target representation, whereas the ship targets shown in Figure 1b are presented from multiple angles at various scales and are densely arranged. Due to the aforementioned challenges, it is imperative to develop an effective strategy for extracting ships from SAR images and to establish a reliable correlation between target features and algorithmic prediction boxes.
To address these challenges, we propose a ship detection method for SAR images that leverages edge deformable convolution and point set representation. Specifically, to tackle the issues of discontinuous characteristics and speckle noise interference in target blocks within SAR images, we integrate the strengths of traditional convolution with deformable convolution techniques. This leads to the introduction of edge deformable convolution, which facilitates an exploration of the correlations among discontinuous target blocks in SAR imagery. Our approach enables the learning of overall deformation characteristics while effectively suppressing speckle noise, resulting in a preliminary feature map. Building upon this foundation, we present a multi-directional ship detection model based on radial basis function (RBF) point set representation. By constructing an RBF point set transformation function, we establish efficient geometric alignment between the point set and the predicted rotation box. Furthermore, we impose constraints on the penalty term associated with point set transformation to ensure accurate mapping between point set features and directed prediction boxes. This methodology achieves the precise detection of multi-directional ship targets even in dense scenes. The main contributions of this work can be summarized as follows:
  • An SAR ship detection framework via edge deformable convolution and point set representation is proposed, which can achieve the accurate detection of densely arranged multi-directional ships in port scenes.
  • A feature extraction module based on edge deformable convolution is proposed, which explores the correlation between discontinuous target blocks in SAR images, suppressing speckle noise while learning the overall deformation features of ship targets.
  • The RBF point set transformation function and the associated point set transformation penalty term are introduced, establishing efficient and accurate mapping between point set features and predicted rotation boxes. This approach enables the precise detection of ship target direction and position in complex scenes.
The subsequent sections of this paper are structured as follows: Section 2 gives a thorough analysis of work related to the topic; Section 3 elaborates on the proposed methodology; Section 4 presents the experimental outcomes; and Section 5 is the conclusion.

2. Related Work

2.1. Feature Extraction of Ship Targets in SAR Images

In SAR images, methods for ship feature extraction can be systematically categorized into three primary approaches: statistical feature-based methods, frequency-domain analysis, and deep learning techniques. Statistical methods exploit differences in gray-level distributions, textural patterns, and gradient characteristics between targets and backgrounds. A representative algorithm within this category is the gray-level co-occurrence matrix (GLCM), which quantifies textural features through metrics such as energy, contrast, and correlation. Khesali et al. [16] enhanced this approach by employing variable window sizes to compute co-occurrence matrices, thereby capturing spatial relationships between pixels while analyzing gray-level distributions. Hanbay et al. [17] further improved feature discrimination by integrating Gaussian derivative filters to calculate gradient magnitudes for GLCM construction.
Frequency-domain methods offer significant advantages in computational efficiency and background suppression, primarily through the application of Fourier transforms and wavelet analysis. For instance, He et al. [18] developed an automatic ship detection system utilizing a modified hypercomplex Fourier transform (MHFT) saliency model, which effectively suppresses clutter while preserving target signatures. Wavelet-based approaches excel in multi-scale feature extraction, as demonstrated by Wu et al. [19], who proposed a wavelet-driven feature enhancement network that incorporates wavelet cascade residual (WCR) modules to improve detection capabilities in complex backgrounds.
Recent advancements have increasingly focused on end-to-end deep learning frameworks. Convolutional neural networks (CNNs) are at the forefront of this domain, and Yang et al. [20] proposed a lightweight CNN architecture featuring split bidirectional feature pyramids (S-BiFPN) to optimize multi-scale feature representation. Li et al. [21] introduced an attention-guided balanced feature pyramid network (A-BFPN) with enhanced refinement modules (ERMs) to learn channel-dependent and spatial-dependent features more effectively. Further contributions by Zhang et al. [22] advanced this area by proposing a quadruple feature pyramid network (Quad-FPN) to strengthen SAR ship feature extraction. In addition to CNNs, Generative Adversarial Networks (GANs) have emerged as a significant research frontier, particularly in applications related to data augmentation and domain adaptation. Song et al. [23] addressed cross-modal discrepancies by employing a domain transfer adversarial metric (DTAM) network, which integrates adversarial training with metric learning techniques to enhance feature extraction in SAR ship imagery.
Compared with traditional methods, deep learning networks have significant advantages in feature extraction. Unlike other deep learning networks, we propose an edge deformable convolution approach to extract ship features from SAR images. By flexibly adjusting the position and distribution of the convolution kernel sampling points, we improve the adaptive ability of SAR image ship feature extraction and accurately capture the overall contour information of ships, effectively solving problems such as speckle noise and the segmentation of ship targets into multiple discontinuous feature blocks in the image.

2.2. Ship Detection Method in SAR Images

In the field of SAR ship detection, detection methods can be categorized into traditional algorithms and deep learning-based approaches. Traditional SAR ship detection relies on handcrafted feature extraction, with key steps including image preprocessing, candidate region extraction, target recognition, and post-processing. Due to the distinct scattering distributions of ships and sea backgrounds, constant false alarm rate (CFAR) detection is widely employed. For instance, Leng et al. [24] proposed a bilateral CFAR algorithm that integrates the intensity and spatial distributions of SAR images to mitigate the effects of SAR ambiguity and sea clutter, introducing a novel kernel density estimation method for spatial distribution modeling. Gao and Shi [25] developed a statistical model for the Geometric Perturbation-Polarization Notch Filter (GP-PNF), using the inverse gamma distribution to characterize sea clutter texture and enabling CFAR detection. Later, Li et al. [26] introduced an adaptive CFAR algorithm based on fused intensity–texture features and an attention contrast mechanism, enhancing detection performance in complex scenarios through local feature enhancement and generalized gamma distribution modeling. Despite its widespread use, CFAR exhibits limitations in complex sea conditions, low signal-to-noise ratios, and small target detection.
With advances in deep learning, data-driven SAR ship detection methods have emerged as a research focus. Li et al. [27] pioneered the application of deep learning in SAR ship detection by refining feature fusion in Faster R-CNN and incorporating transfer learning and hard negative mining to improve performance. Chai et al. [28] enhanced the Cascade R-CNN algorithm for small ship detection in complex backgrounds by introducing a multi-scale backbone network and a spatial enhancement module, achieving higher accuracy for densely distributed targets. While two-stage detectors excel in performance, their serial processing of region proposal generation and classification–regression leads to high computational costs, limiting real-time applicability. In contrast, single-stage networks, with their end-to-end architecture, are better suited for real-time applications. Zhou et al. [29] improved the Single Shot MultiBox Detector (SSD) by redesigning anchor box aspect ratios and leveraging saliency maps to augment training data, boosting performance in few-shot scenarios. Guo et al. [30] proposed a lightweight convolutional unit and an adaptive spatial feature fusion module for YOLO (You Only Look Once) to enable multi-scale feature fusion.
Anchor-based methods often struggle with the diverse scales and complex backgrounds in remote sensing imagery. Anchor-free approaches address this by directly regressing target locations, eliminating predefined anchor constraints and adapting flexibly to targets of varying scales, shapes, and densities. For example, Guo et al. [31] localized ships via center points and predicted their size and shape, reducing background interference. Chen et al. [32] proposed SAD-Det, an anchor-free detection method based on Transformers and adaptive features: a spatially pooled pyramid Transformer backbone strengthens long-range dependency and context modeling, an adaptive feature pyramid neck optimizes multi-scale feature fusion, and a deformable detection head dynamically adapts to the spatial sampling positions of targets, significantly improving the robustness and accuracy of multi-directional ship detection in complex scenarios. Shen et al. [33] proposed ELLK-Net, an efficient and lightweight network for SAR ship detection that fuses global features and long-range dependencies through large-kernel convolution decomposition and an adaptive multi-scale attention mechanism (LKMA), and accelerates inference through structural reparameterization, providing a high-precision, low-latency solution for edge computing deployment in complex SAR scenarios. Li et al. [34] proposed a lightweight edge feature perception and fusion method that combines a multi-scale channel attention module (EFA) with a pruned, optimized bidirectional feature pyramid (FP-BiFPN) and achieves efficient model compression through an adaptive bit-width selection quantization strategy, balancing detection accuracy and inference speed. Zhao et al. [35] introduced a supervised guidance paradigm using Kullback–Leibler Divergence (KLD) loss to integrate the structural and positional attributes of scatter points in RepPoints.
Unlike existing methods, we propose a point set-based multi-oriented ship detection model. By adaptively transforming point sets, our approach learns and regresses ship contours or keypoint layouts, enabling the accurate detection of densely arranged, multi-oriented ships in complex scenarios.

3. Methodology

3.1. Method Overview

A flowchart of the proposed method is shown in Figure 2. Similar to the classic Oriented Reppoints model [36], the proposed method consists of two parts: a feature extraction module based on edge deformable convolution (EDConv) and a multi-directional ship detection module based on point set transformation. Through the first module, we explore the correlation between discontinuous target blocks in SAR images and learn the overall deformation features of the target while suppressing speckle noise, thus generating preliminary feature maps. On this basis, to improve the detection performance for densely arranged ships, the second module establishes mapping relationships between point sets and prediction boxes and builds loss constraints, achieving high-probability, low-false-alarm detection of targets in complex scene images.

3.2. Feature Extraction Module via Edge Deformable Convolution

Due to the limitations of SAR imaging systems and resolution, ship targets typically exhibit strong local reflections and incomplete overall contours in SAR images. Moreover, a ship may appear as multiple discontinuous feature blocks in the image. These characteristics can easily lead to missed or false detections by ship detection algorithms. The limited local structure perception capability of typical convolution operations makes it difficult to effectively capture and extract highly discrete and fragmented local features, thereby restricting further improvements in detection performance.
To address these issues, we propose an edge deformable convolution (EDConv) structure to enhance the backbone network’s ability to focus on local target features. By organically integrating the advantages of both standard and deformable convolutions, EDConv enables the flexible and efficient extraction of local target features, thereby significantly improving feature representation quality. A schematic diagram of the proposed edge deformable convolution structure is illustrated in Figure 3.
For a standard 2D convolution with a 3 × 3 kernel, the output at each position $p_0$ on the feature map is given by

$$y(p_0) = \sum_{p_n \in R} w(p_n) \cdot x(p_0 + p_n)$$

where $p_n$ enumerates the positions in $R = \{(-1,-1), (-1,0), \ldots, (1,0), (1,1)\}$.
To enhance the convolution kernel’s ability to flexibly perceive the complex geometric features of ships, we draw inspiration from deformable convolutions [37] and introduce deformation offsets. In traditional deformable convolutions, the deformation offsets allow the kernel to freely adjust within the entire receptive field. While this fully flexible deformation mechanism improves model adaptability, it may also cause the receptive field to deviate from the target region, especially when the target’s weak features are unevenly distributed, thereby affecting the effectiveness of feature extraction. To better capture the local features of ships, we propose a hybrid strategy that combines standard convolution and deformable convolution.
Specifically, we design an edge deformable convolutional kernel structure consisting of both standard and deformable convolution components. In the central cross-shaped region of the kernel, we retain standard convolution operations to ensure the stable linear extraction of local cross-shaped features from the input image. In contrast, the remaining regions of the kernel are assigned deformable convolution properties, enabling them to flexibly capture the geometric features of targets through adaptive spatial transformations. This design not only enhances the perception of local target features but also mitigates the potential divergence of the receptive field caused by fully flexible deformations.
In EDConv, we consider a 3 × 3 convolution kernel and introduce offsets $\Delta p$ at its four corner positions. In other words, we sample at both the regular positions $p_i$ and the irregular offset positions $p_j + \Delta p_j$. This calculation process can be characterized as

$$y(p_0) = \sum_{p_i \in R_s} w(p_i) \cdot x(p_0 + p_i) + \sum_{p_j \in R_d} w(p_j) \cdot x(p_0 + p_j + \Delta p_j)$$

where $R_s$ represents the central cross-shaped region, and $R_d$ denotes the edge region. The term $\Delta p_j$ represents the learnable offset at the $j$-th position. To prevent excessive displacement, we constrain $\Delta p_j$ within a predefined range of $[-2, 2]$. Since $K = p_0 + p_j + \Delta p_j$ is generally not an integer, the computation of $x(K)$ is performed using bilinear interpolation:

$$x(K) = \sum_{K'} B(K', K) \cdot x(K')$$

where $K'$ enumerates all integer spatial positions in the feature map $x$, and $B$ represents the bilinear interpolation kernel, which can be decomposed into two one-dimensional kernels:

$$B(K', K) = b(K'_x, K_x) \cdot b(K'_y, K_y) = \max\big(0,\, 1 - |K'_x - K_x|\big) \cdot \max\big(0,\, 1 - |K'_y - K_y|\big).$$
As illustrated in Figure 3, to generate the offsets, we introduce an additional convolution layer that takes the input feature map and outputs offsets with the same spatial resolution. The channel dimension of the output offsets is $2k$, corresponding to a total of $k$ two-dimensional offsets in the x and y directions. These offsets are then weighted by a channel attention mechanism, followed by batch normalization and a tanh activation to obtain the normalized output offsets. This computation not only keeps the offset values within a controlled range but also enhances the model’s adaptability to discontinuous local features of the target.
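To make this concrete, the following is a minimal PyTorch sketch of EDConv built on torchvision’s deform_conv2d, which performs the bilinear interpolation above internally. It is an illustration under stated assumptions rather than the authors’ implementation: the module name, the squeeze-and-excitation-style channel attention over the offset maps, and the initialization choices are ours; only the four corner taps of the 3 × 3 kernel receive learnable offsets (kept in [−2, 2] by a scaled tanh), while the central cross keeps fixed standard sampling.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class EDConv(nn.Module):
    """Edge deformable convolution: fixed cross taps + deformable corner taps."""

    CORNERS = [0, 2, 6, 8]  # row-major indices of the 3 x 3 kernel corners

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, 3, 3))
        nn.init.kaiming_normal_(self.weight)
        # Offset branch: 2 offsets per corner tap -> 8 channels.
        self.offset_conv = nn.Conv2d(in_ch, 8, 3, padding=1)
        nn.init.zeros_(self.offset_conv.weight)
        nn.init.zeros_(self.offset_conv.bias)
        # Channel attention over the offset maps (assumed SE-style gating).
        self.att = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                 nn.Conv2d(8, 8, 1), nn.Sigmoid())
        self.bn = nn.BatchNorm2d(8)

    def forward(self, x):
        raw = self.offset_conv(x)
        # Attention weighting, normalization, then tanh scaled to [-2, 2].
        corner_off = 2.0 * torch.tanh(self.bn(raw * self.att(raw)))
        n, _, h, w = corner_off.shape
        # Offsets for all 9 taps; the central cross keeps zero offset,
        # i.e., standard-convolution sampling.
        full = corner_off.new_zeros(n, 18, h, w)
        for i, k in enumerate(self.CORNERS):
            full[:, 2 * k:2 * k + 2] = corner_off[:, 2 * i:2 * i + 2]
        # deform_conv2d interpolates bilinearly at fractional positions.
        return deform_conv2d(x, full, self.weight, padding=1)
```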
After implementing the EDConv convolution operation, we mainly apply it to the feature extraction stage of SAR ship detection models to enhance the network’s ability to extract intermittent local features of ship targets in SAR images. Specifically, EDConv is applied to multiple stages of the detection model backbone network. In each stage, the input features are first split, and then one branch performs feature extraction through bottleneck and standard convolution calculations. The other branch is processed by EDConv to adaptively adjust the receptive field and achieve the flexible and efficient modeling of local discontinuous features of the ship. Finally, the features of the two branches are weighted and fused at the element level to obtain the output features of this stage.
It should be noted that we embed this module into the third, fourth, and fifth layers of the ResNet extraction branch, mainly because deep features emphasize the completeness of semantic information and context dependence, making it easier to capture the large-scale structure and potential location distribution of the target. EDConv can optimize the modeling of target deformation and scale changes in these branches.
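A short sketch of how such a stage could be wired, under the same assumptions, is given below. The split ratio, the bottleneck layout, the learnable scalar fusion weight, and the 1 × 1 projection that restores the channel count are our guesses; the text only specifies a split, a bottleneck branch, an EDConv branch, and element-wise weighted fusion.

```python
import torch
import torch.nn as nn

class EDConvStage(nn.Module):
    """One backbone stage with a bottleneck branch and an EDConv branch."""

    def __init__(self, ch):
        super().__init__()
        half = ch // 2
        self.bottleneck = nn.Sequential(            # standard-convolution branch
            nn.Conv2d(half, half // 2, 1), nn.ReLU(inplace=True),
            nn.Conv2d(half // 2, half, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.edconv = EDConv(half, half)             # EDConv branch (sketched above)
        self.alpha = nn.Parameter(torch.tensor(0.5)) # element-wise fusion weight
        self.proj = nn.Conv2d(half, ch, 1)           # restore the channel count

    def forward(self, x):
        a, b = torch.chunk(x, 2, dim=1)              # split the input features
        fused = self.alpha * self.bottleneck(a) + (1 - self.alpha) * self.edconv(b)
        return self.proj(fused)
```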
Through the above modules, we can enable the extraction branches to learn discontinuous target feature blocks and output feature maps for the subsequent detection head.

3.3. Multi-Directional Ship Detection Module via Point Set Transformation

After processing the previous module, we can obtain a feature map of ship targets. Given that ships may be densely arranged in port scenes, to enhance target representation and minimize missed detections, we propose a multi-directional ship detection module based on point set transformation, which can capture the shape and direction changes of densely arranged ships more flexibly.
The point set representation-based method enables the direct learning and regression of polygonal contours or keypoint layouts for ships. This approach not only captures the shape features and docking angles of various ships but also enhances localization accuracy for complex ship outlines. However, owing to their discrete nature, traditional point sets often lack the clear semantic information carried by bounding boxes and masks, which frequently causes non-convergence and susceptibility to outliers when translating them into target locations or regions. To address these challenges, we propose an RBF (radial basis function) point set transformation function that facilitates efficient geometric alignment between the point set and the predicted rotation box. Furthermore, a penalty term is introduced for point set transformation to ensure accurate mapping between the features of the point set and directed prediction boxes. The proposed module for transforming point sets primarily consists of four steps.
(1) Point set representation
The point set representation method achieves refined feature extraction by generating point sets on the surface or edges of objects, enabling a more precise capture of local structures and orientations of ships than traditional bounding boxes. Given a feature map $X \in \mathbb{R}^{H \times W \times C}$, a deformable convolutional network (DCN) is dynamically employed to generate a point set comprising adaptive sample points $R = \{(x_k, y_k)\}_{k=1}^{n}$, where $n = 9$ is specifically designed to align with a 3 × 3 convolution kernel. By iteratively refining the candidate point set positions using the offsets $\Delta$ predicted by the DCN, this approach efficiently aligns semantic features with geometric deformations. This mechanism enhances both the precision and robustness of ship detection in remote sensing imagery.
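The following is a minimal PyTorch sketch of this step under stated assumptions: a plain convolution (standing in for the DCN offset branch) predicts 2 × 9 offsets per location, which displace an initial 3 × 3 grid of points. Module and variable names are illustrative.

```python
import torch
import torch.nn as nn

class PointSetHead(nn.Module):
    """Predicts n = 9 adaptive points per feature-map location."""

    def __init__(self, ch):
        super().__init__()
        self.offsets = nn.Conv2d(ch, 18, 3, padding=1)  # 2 offsets per point
        # Initial 3 x 3 grid of points matching the conv kernel layout.
        ys, xs = torch.meshgrid(torch.arange(-1.0, 2.0),
                                torch.arange(-1.0, 2.0), indexing="ij")
        base = torch.stack([xs, ys], dim=-1).reshape(1, 18, 1, 1)
        self.register_buffer("base", base)

    def forward(self, feats):
        # Each location's point set = base grid + predicted offsets.
        return self.base + self.offsets(feats)
```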
(2) Point set transformation function
To map a point set of arbitrary shape $P = \{p_1, p_2, \ldots, p_n\}$ to a standard oriented box while ensuring geometric consistency, we propose a nonlinear mapping model based on the radial basis function (RBF) to achieve efficient geometric alignment between point sets and oriented boxes.
First, we define an RBF mapping model to project input points into a new feature space:
$$\phi(x) = \sum_{i=1}^{k} \alpha_i \, \rho\!\left( \frac{\| x - c_i \|}{r_i} \right)$$

where $c_i$ represents the kernel center, $r_i$ denotes the kernel radius, $\alpha_i$ corresponds to the weight coefficient, and $\rho$ signifies the radial basis function. By adjusting the kernel parameters $\{c_i, r_i, \alpha_i\}$, the RBF network can learn the nonlinear mapping relationship between the input point set and the target ground-truth rotated bounding box.
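As a minimal numerical sketch of this mapping, the snippet below assumes a Gaussian kernel for ρ (the text does not fix the basis) and two-dimensional weight vectors α_i, so that 2-D input points map to 2-D output points.

```python
import numpy as np

def rbf_map(points, centers, radii, alphas):
    """points: (n, 2), centers: (k, 2), radii: (k,), alphas: (k, 2) -> (n, 2)."""
    # Pairwise distances between input points and kernel centers, scaled by radii.
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1) / radii
    rho = np.exp(-d ** 2)   # Gaussian radial basis (an assumption)
    return rho @ alphas     # weighted sum over the k kernels
```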
According to the definition of the target rotated bounding box, its geometric properties are characterized by the center coordinate $c_0$, the rotation angle $\theta$, and the semi-axis lengths $w/2$ and $h/2$.
The coordinates of the predicted points in the point set $P$ relative to the oriented bounding box must satisfy the following constraint:

$$(x - c_0)^{\top} R(\theta)^{\top} \Lambda \, R(\theta) (x - c_0) \le 1, \quad \text{where} \ R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

and $\Lambda = \mathrm{diag}\big((w/2)^{-2}, (h/2)^{-2}\big)$ scales the two axes by the semi-axis lengths.
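As a small worked example of this constraint, the following check (with illustrative values) tests whether a point lies inside an oriented box by rotating it into the box frame and comparing against the semi-axis lengths.

```python
import numpy as np

def inside_obb(q, c0, w, h, theta):
    """True if point q lies inside the oriented box (c0, w, h, theta)."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    local = R.T @ (np.asarray(q) - np.asarray(c0))  # rotate into the box frame
    return abs(local[0]) <= w / 2 and abs(local[1]) <= h / 2

# Illustrative values only.
print(inside_obb(q=(1.0, 0.5), c0=(0.0, 0.0), w=4.0, h=1.5, theta=np.pi / 6))
```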
(3) Adaptive point set quality assessment metric
To address the issue of ground-truth annotation deficiency in point set representation learning, we propose an adaptive point set quality assessment framework based on multidimensional feature fusion. Since ground-truth annotated point sets remain directly inaccessible during training, conventional methods encounter significant challenges in quantifying the confidence levels of predicted point sets. This limitation directly impacts the model’s ability to efficiently acquire high-quality point set prediction capabilities. To overcome this fundamental obstacle, our approach constructs a comprehensive quality metric system encompassing three core evaluation dimensions.
  • Classification quality assessment
    We measure the classification quality by utilizing the classification loss between the point set prediction $R_i^{\mathrm{cls}}$ and the ground-truth $b_j^{\mathrm{cls}}$. The specific formula is as follows:

    $$Q_{\mathrm{cls}}(R_i, b_j) = L_{\mathrm{cls}}\big(R_i^{\mathrm{cls}}(\theta),\, b_j^{\mathrm{cls}}\big).$$
  • Localization quality assessment
    The localization quality is measured by the localization loss between the polygon generated from the point set, $OB_i^{\mathrm{loc}}$, and the ground-truth $b_j^{\mathrm{loc}}$. The formula for localization quality is as follows:

    $$Q_{\mathrm{loc}}(R_i, b_j) = L_{\mathrm{loc}}\big(OB_i^{\mathrm{loc}}(\theta),\, b_j^{\mathrm{loc}}\big).$$
    The location quality of a point set primarily focuses on the overlap extent between the polygon generated by the point set and the ground-truth while being insensitive to directional changes. This phenomenon is particularly evident in ship targets within remote sensing images.
  • Oriented quality assessment
    We utilize the MinAreaRect point set transformation function to convert the predicted point set into oriented bounding boxes. Subsequently, we perform equidistant sampling along each edge of the oriented bounding boxes. Based on the sampled points, we compute the corner distance between the predicted box and the ground-truth box to determine the oriented quality of the point set (a minimal sketch of this distance follows the list). This process can be expressed as follows:

    $$Q_{\mathrm{ori}}(R_i, b_j) = CD\big(R_i^{v}(\theta),\, R_{b_j}^{g}\big)$$

    where $CD$ denotes the chamfer distance between two point sets.
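A minimal sketch of the chamfer distance between two sampled point sets, as used in the oriented quality term:

```python
import numpy as np

def chamfer_distance(A, B):
    """A: (m, 2), B: (n, 2) points sampled on predicted / ground-truth box edges."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)  # (m, n) pairwise
    # Symmetric mean of nearest-neighbor distances in both directions.
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```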
To sum up, based on the classification quality $Q_{\mathrm{cls}}$, the localization quality $Q_{\mathrm{loc}}$, and the orientation quality $Q_{\mathrm{ori}}$, we derive an adaptive point set quality assessment metric:

$$Q = Q_{\mathrm{cls}} + \mu_1 Q_{\mathrm{loc}} + \mu_2 Q_{\mathrm{ori}}$$

where $\mu_1$ and $\mu_2$ are the weighting factors.
(4) Loss function
The ship detection algorithm based on point sets involves two optimization stages. In the network initialization stage, adaptive point sets are generated from the object centers in the feature map, and a set of point sets is allocated to each object. In the refinement stage, high-quality predicted point sets are selected as positive samples based on a sample allocation scheme derived from quality assessment metrics of the adaptive point sets, followed by further refinement adjustments. These two point set optimization processes are implemented by minimizing the following loss function:
$$L = L_{\mathrm{cls}} + \lambda_1 L_{s1} + \lambda_2 L_{s2}$$

where $\lambda_1$ and $\lambda_2$ denote the balance coefficients, $L_{\mathrm{cls}}$ represents the classification loss, and $L_{s1}$ and $L_{s2}$ denote the spatial localization losses during the network initialization phase and the refinement phase, respectively.
The calculation formula for the classification loss $L_{\mathrm{cls}}$ is as follows:

$$L_{\mathrm{cls}} = \frac{1}{N_{\mathrm{cls}}} \sum_{i} F_{\mathrm{cls}}\big(R_i^{\mathrm{cls}}(\theta),\, b_j^{\mathrm{cls}}\big)$$

where $N_{\mathrm{cls}}$ denotes the number of point sets, $R_i^{\mathrm{cls}}$ denotes the classification confidence of the point set, and $b_j^{\mathrm{cls}}$ denotes the true class label.
Since $L_{s1}$ and $L_{s2}$ take the same form, they can be unified into a single formula, in which $L_{\mathrm{loc}}$ represents the loss between the oriented bounding boxes derived from the transformation function and the ground-truth boxes:

$$L_{\mathrm{loc}} = \frac{1}{N_{\mathrm{loc}}} \sum_{i} \mathbb{1}\big[b_j^{\mathrm{cls}} \ge 1\big] \, F_{\mathrm{loc}}\big(OB_i^{\mathrm{loc}}(\theta),\, b_j^{\mathrm{loc}}\big)$$

where $N_{\mathrm{loc}}$ denotes the number of positive samples in the point set, $b_j^{\mathrm{cls}}$ represents the ground-truth label, and $F_{\mathrm{loc}}$ signifies the GIoU loss.
Additionally, to optimize the RBF mapping model, we design a loss function that minimizes the distance from the mapped points to the bounding box boundaries.
$$L = \sum_{i=1}^{n} \min\!\left( \left| \frac{w}{2} - q_i^{x} \right|,\ \left| \frac{h}{2} - q_i^{y} \right|,\ \left| \frac{w}{2} - \big( q_i^{y}\cos\theta + q_i^{x}\sin\theta \big) \right|,\ \left| \frac{h}{2} - \big( q_i^{y}\sin\theta - q_i^{x}\cos\theta \big) \right| \right).$$
To ensure that the point set forms a convex polygon after transformation via the RBF mapping model, a combination of geometric constraint optimization and post-processing strategies is required. Specifically, during the optimization of RBF kernel parameters, convexity constraints are introduced by incorporating a convexity penalty term. This term calculates the average squared distance of the points to their convex hull, thereby enforcing the mapped point set to lie within the convex hull. The convexity penalty term is formulated as follows:
$$L_{\mathrm{convexity}} = \frac{1}{n} \sum_{q \in Q} \min_{h \in \mathrm{ConvexHull}(Q)} \| q - h \|^{2}.$$
Incorporating the convexity penalty term, the overall objective for optimizing the RBF mapping model can be written as

$$L_{\mathrm{rbf}} = L + \lambda L_{\mathrm{convexity}}$$

where $\lambda$ is a regularization parameter that controls the strength of the convexity constraint and must be tuned for the specific application scenario.
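One plausible reading of the convexity penalty is sketched below: the mean squared distance from each mapped point to a reference convex hull, zero for points already inside. Since the text does not fully specify how ConvexHull(Q) is built during training, the polygon input and the helper functions here are illustrative assumptions, not the authors’ implementation.

```python
import numpy as np

def _cross2(u, v):
    """z-component of the 2-D cross product."""
    return u[0] * v[1] - u[1] * v[0]

def _seg_dist(q, a, b):
    """Distance from point q to the segment a-b."""
    ab, aq = b - a, q - a
    t = np.clip(np.dot(aq, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(aq - t * ab)

def convexity_penalty(points, hull):
    """points: (n, 2) mapped points; hull: (m, 2) CCW hull vertices."""
    pen, m = 0.0, len(hull)
    for q in points:
        edges = [(hull[i], hull[(i + 1) % m]) for i in range(m)]
        # Inside test: for a CCW polygon, q is inside if it lies on the
        # left of (or on) every edge.
        if all(_cross2(b - a, q - a) >= 0 for a, b in edges):
            continue  # inside the hull: no penalty
        pen += min(_seg_dist(q, a, b) for a, b in edges) ** 2
    return pen / len(points)
```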
By combining the above losses and training the network, we can obtain the final ship detection result, achieving the accurate detection of targets in dense scenes.

4. Experiments

4.1. Datasets and Experimental Settings

To verify the effectiveness and superiority of the proposed method, we conducted extensive experiments on two publicly available datasets, RSDD-SAR [38] and SSDD [39]. Detailed information of the two datasets is shown in Table 1.
The RSDD-SAR dataset consists of images from the Gaofen-3 satellite and the European Space Agency’s TerraSAR-X satellite. Specifically, this dataset consists of data from 127 scenes, namely, 84 scenes from Gaofen-3 data, 41 scenes from TerraSAR-X data slices, and 2 uncropped large images. It includes 7000 slices with multiple imaging modes, polarization modes, and resolutions, as well as 10,263 ship instances. A total of 7000 data samples were divided into 5000 training samples and 2000 test samples. The RSDD-SAR dataset has the characteristics of an arbitrary rotation direction, a large aspect ratio, a high proportion of small targets, and rich scenes. It covers features such as multi-scale, multi-azimuth, and dense distributions of targets, as well as backgrounds of different complexity levels, such as sea surfaces and ports.
SSDD is a small-scale SAR ship detection dataset. The data sources are Radarsat-2, TerraSAR-X, and Sentinel-1 satellites. The polarization modes are HH, VV, HV, and VH, and the image resolutions range from 1 m to 15 m. SSDD contains 1160 SAR image slices, with a total of 2587 ships. The ratio of the training set to the test set is 8:2, that is, 928 training samples and 232 test samples.
The experiments are conducted on a workstation equipped with 64 GB of RAM and a Quadro RTX 4090 GPU, utilizing the PyTorch framework. Each model is trained for 36 epochs with a batch size of 4. We use stochastic gradient descent (SGD) with a weight decay of 0.0005 and a momentum of 0.9. Additionally, mosaic data augmentation is employed, which combines four training samples into one and reduces the need for a large mini-batch size. The intersection over union (IoU) threshold for prediction boxes is set to 0.5.
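For reference, a sketch of this optimizer configuration in PyTorch is given below. The base learning rate is not reported in the paper, so the value shown is a placeholder assumption, and the one-layer model is a stand-in for the detector built in Section 3.

```python
import torch

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3))  # stand-in for the detector
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,            # placeholder: the base learning rate is not reported
    momentum=0.9,       # as stated in the paper
    weight_decay=5e-4,  # as stated in the paper
)
EPOCHS, BATCH_SIZE, IOU_THRESHOLD = 36, 4, 0.5
```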

4.2. Evaluation Metrics

In order to measure the performance of different detection models, classic metrics such as precision, recall, and average precision (AP) are selected to measure the overall effectiveness of the algorithm. Precision (P) measures the proportion of correct positive predictions among all detections identified by the model. It is defined as the ratio of true-positive detections to the total number of detections made by the detector.
$$P = \frac{TP}{TP + FP}$$
where TP and FP represent the true positive and false positive, respectively.
Recall (R) measures the proportion of correctly identified positive instances out of all actual positive instances in the dataset.
$$R = \frac{TP}{TP + FN}$$
where FN represents the false negative.
The AP is a widely used metric for evaluating the overall performance of object detection algorithms. It is obtained by calculating the area under the precision–recall curve:

$$AP = \int_{0}^{1} P(R)\, dR.$$
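The snippet below sketches how AP can be computed in practice from detections sorted by confidence, using all-point interpolation of the precision–recall curve. It is a generic illustration, not the evaluation code used in the paper.

```python
import numpy as np

def average_precision(tp, n_gt):
    """tp: per-detection TP flags (sorted by descending score); n_gt: # ground truths."""
    tp = np.asarray(tp, dtype=float)
    fp = 1.0 - tp
    precision = np.cumsum(tp) / (np.cumsum(tp) + np.cumsum(fp))
    recall = np.cumsum(tp) / n_gt
    # All-point interpolation: area under the monotone precision envelope.
    prec = np.concatenate(([1.0], precision, [0.0]))
    rec = np.concatenate(([0.0], recall, [recall[-1]]))
    prec = np.maximum.accumulate(prec[::-1])[::-1]
    return np.sum(np.diff(rec) * prec[1:])

print(average_precision([1, 1, 0, 1, 0], n_gt=4))  # toy example -> 0.6875
```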

4.3. Ablation Experiments

To demonstrate the contribution of the proposed modules, ablation experiments are first conducted on the RSDD-SAR dataset. For a fair comparison, the original Oriented Reppoints method is chosen as the baseline. Table 2 shows the experimental results over multiple image scenes in the RSDD-SAR dataset. When Oriented Reppoints is used directly, the AP value obtained is 89.48%. When the EDConv module is added to the original architecture, the AP value rises to 90.23%, and when the point set transformation module is configured separately, the AP value is 90.74%. Furthermore, when both modules are configured simultaneously in the original network, the AP reaches 91.62%, an increase of 2.14% over the baseline network.
We also conducted ablation experiments on the SSDD dataset, with the results shown in Table 3. When Oriented Reppoints is used directly, the AP value obtained is 88.40%. When the EDConv module is added to the original architecture, the AP value is 90.66%, and when the point set transformation module is configured separately, the AP value is 91.85%, an increase of 3.45%. Furthermore, when both modules are configured simultaneously in the baseline network, the AP reaches 92.81%, an increase of 4.41%. These results show that both proposed modules yield positive gains in detection performance, with the point set transformation module contributing more than the EDConv module.
The results of the ablation experiment detection are illustrated in Figure 4. It is evident that the EDConv module plays a crucial role in minimizing false alarms, thereby demonstrating its effectiveness in interference suppression. Additionally, the point set transformation module significantly reduces missed detections and exhibits excellent detection performance for densely arranged ships.

4.4. Comparison Experiments

In order to comprehensively evaluate the effectiveness and advancement of the proposed method in applying images to different scenes, seven advanced object detection models are selected for comparative experiments, namely, Rotated-RetinaNet [40], Oriented Faster R-CNN [41], S2A-Net [42], R3Det [43], Oriented Reppoints [36], Rotated-RTMDet-s [44], and LSR-Det [45].
In order to quantitatively evaluate the performance of the proposed method in comparison with that of other algorithms, we present the quantitative index results for different methods on the RSDD-SAR and SSDD datasets in Table 4 and Table 5, respectively. The performance distribution of these algorithms across both datasets is largely consistent, primarily due to the similarities in image resolution and scene types between the two datasets, which leads to comparable robustness in algorithm application. Specifically, Rotated RetinaNet exhibits the poorest overall performance, while S2A-Net and R3Det demonstrate performance that is similar to that of our proposed method. Additionally, LSR-Det’s performance is also close to that of our approach. Our method outperforms other algorithms across several indices. Based on an analysis of both the quantitative and qualitative results presented above, it is evident that our method achieves superior detection performance compared to alternative approaches and effectively identifies multi-directionally distributed ship targets with dense docking amidst complex port facility interference.
To visually demonstrate the applicability of the proposed method to various image scenes, we show the detection results for several typical port scene images from the RSDD-SAR dataset in Figure 5. To show the distribution of the prediction boxes more clearly, we do not display their confidence labels. As can be observed in Figure 5, the selected port scenes contain multiple ship targets densely docked in multiple directions; the interference from shore-side port facilities is complex and variable, and some facilities have characteristics highly similar to those of ship targets. Note that the green boxes represent the true positions of the targets, the red boxes represent the true-positive prediction boxes of each algorithm, the yellow boxes represent false detections, and the blue circles indicate missed detections.
In the first scene of Figure 5, ships dock side by side near the port. The Oriented Faster R-CNN and Oriented Reppoints methods correctly detect most ships but produce multiple false detections, mainly because their target feature extraction is not precise enough to effectively distinguish targets from shore facilities. Although the overall detection result of LSR-Det is close to that of the proposed method, it fails to detect targets with occluded edges. Our method, in contrast, uses the point set idea to focus on and extract the features of the important parts of ships, allowing it to better detect targets with incomplete structures. In the second scene, port facilities and ship targets are highly similar in shape, brightness, and other characteristics. Oriented Faster R-CNN and Oriented Reppoints still produce multiple detection errors, and both miss small, densely docked ship targets. LSR-Det detects most of the targets but easily misidentifies two approaching ships as one target. Our method misses one small target but detects the rest. The third scene mainly features multiple targets distributed in multiple directions. Oriented Faster R-CNN and Oriented Reppoints miss targets with less distinct edge profiles, while LSR-Det, with limited adaptability to partially occluded edges, also misses such targets. The fourth scene suffers from more serious scattering interference. Oriented Faster R-CNN, Oriented Reppoints, and LSR-Det cannot distinguish docked ships that are seriously affected by scattering interference, and the first two methods show many false detections. Our method uses edge deformable convolution to better adapt to deformed and discontinuous targets; in addition, the adopted point set transformation strategy better concentrates on the core features of the target and captures its direction more accurately, showing the best detection performance across the above scenes. The experimental results for different scenes show that the proposed method is superior to the other methods in detecting targets while reducing false alarms and missed detections as much as possible.
We also show the detection results of the different algorithms on the SSDD dataset in Figure 6. As on the RSDD-SAR dataset, Oriented Faster R-CNN and Oriented Reppoints are prone to false detections. The overall detection result of LSR-Det is close to that of the proposed method, but it is poor at distinguishing densely docked targets and tends to identify two targets as one. Judging from the missed detection markers, the proposed method significantly improves detection outcomes for densely arranged nearshore ships and effectively mitigates missed detections.

5. Conclusions

In this article, we propose an SAR ship detection method based on edge deformable convolution and point set representation, which can achieve the accurate detection of densely arranged ships in complex port scenes. The proposed method mainly consists of two parts. Firstly, a structure of edge deformable convolution is proposed to learn the highly discrete and fragmented local features of ship targets, solving the interference of speckle noise and discontinuous target blocks in SAR images. Afterwards, a multi-directional ship detection module based on point set transformation is proposed, which generates high-quality prediction boxes by constructing point set transformation functions and loss constraints, thus flexibly capturing the shape and direction changes of densely arranged ships. The experimental results on two publicly available datasets, RSDD-SAR and SSDD, show that the proposed ship detection method exhibits better detection performance than seven typical detection models.

Author Contributions

Conceptualization, T.G. and C.W.; methodology, T.G.; software, T.G. and C.W.; validation, T.G., S.C. and X.J.; formal analysis, T.G. and S.C.; investigation, T.G.; resources, T.G.; data curation, T.G.; writing—original draft preparation, T.G., S.C. and X.J.; writing—review and editing, T.G. and C.W.; visualization, T.G. and S.C.; supervision, Y.D. and S.C.; project administration, S.C. and Y.D.; funding acquisition, S.C. and F.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China under Grant 62201548.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author. The data are not publicly available in order to avoid compromising future research outputs based on the dataset.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Moreira, A.; Prats-Iraola, P.; Younis, M.; Krieger, G.; Hajnsek, I.; Papathanassiou, K.P. A tutorial on synthetic aperture radar. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–43. [Google Scholar] [CrossRef]
  2. Chang, S.; Deng, Y.; Zhang, Y.; Zhao, Q.; Wang, R.; Zhang, K. An advanced scheme for range ambiguity suppression of spaceborne SAR based on blind source separation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5230112. [Google Scholar] [CrossRef]
  3. Chang, S.; Deng, Y.; Zhang, Y.; Wang, R.; Qiu, J.; Wang, W.; Zhao, Q.; Liu, D. An advanced echo separation scheme for space-time waveform-encoding SAR based on digital beamforming and blind source separation. Remote Sens. 2022, 14, 3585. [Google Scholar] [CrossRef]
  4. Liu, D.; Chang, S.; Deng, Y.; He, Z.; Wang, F.; Zhang, Z.; Han, C.; Yu, C. A Novel Spaceborne SAR Constellation Scheduling Algorithm for Sea Surface Moving Target Search Tasks. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 3715–3726. [Google Scholar] [CrossRef]
  5. Deng, Y.; Tang, S.; Chang, S.; Zhang, H.; Liu, D.; Wang, W. A novel scheme for range ambiguity suppression of spaceborne SAR based on underdetermined blind source separation. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5207915. [Google Scholar] [CrossRef]
  6. Gens, R. Oceanographic applications of SAR remote sensing. GIScience Remote Sens. 2008, 45, 275–305. [Google Scholar] [CrossRef]
  7. Gao, G. Statistical modeling of SAR images: A survey. Sensors 2010, 10, 775–795. [Google Scholar] [CrossRef] [PubMed]
  8. Renga, A.; Graziano, M.D.; Moccia, A. Segmentation of marine SAR images by sublook analysis and application to sea traffic monitoring. IEEE Trans. Geosci. Remote Sens. 2018, 57, 1463–1477. [Google Scholar] [CrossRef]
  9. Liu, D.; Deng, Y.; Chang, S.; Zhu, M.; Zhang, Y.; Zhang, Z. Orbital Design Optimization for Large-Scale SAR Constellations: A Hybrid Framework Integrating Fuzzy Rules and Chaotic Sequences. Remote Sens. 2025, 17, 1430. [Google Scholar] [CrossRef]
  10. Tao, L.; Ziyuan, Y.; Yanni, J.; Gui, G. Review of ship detection in polarimetric synthetic aperture imagery. J. Radar 2021, 10, 1–19. [Google Scholar]
  11. Li, J.; Xu, C.; Su, H.; Gao, L.; Wang, T. Deep learning for SAR ship detection: Past, present and future. Remote Sens. 2022, 14, 2712. [Google Scholar] [CrossRef]
  12. Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. Proc. IEEE 2023, 111, 257–276. [Google Scholar] [CrossRef]
  13. Zhang, X.; Zhang, S.; Sun, Z.; Liu, C.; Sun, Y.; Ji, K.; Kuang, G. Cross-sensor SAR image target detection based on dynamic feature discrimination and center-aware calibration. IEEE Trans. Geosci. Remote Sens. 2025; early access. [Google Scholar]
  14. Sun, Z.; Leng, X.; Zhang, X.; Zhou, Z.; Xiong, B.; Ji, K.; Kuang, G. Arbitrary-direction SAR ship detection method for multi-scale imbalance. IEEE Trans. Geosci. Remote Sens. 2025, 16, 156–169. [Google Scholar]
  15. Guan, T.; Chang, S.; Wang, C.; Jia, X. SAR Small Ship Detection Based on Enhanced YOLO Network. Remote Sens. 2025, 17, 839. [Google Scholar] [CrossRef]
  16. Khesali, E.; Enayati, H.; Modiri, M.; Mohseni Aref, M. Automatic ship detection in Single-Pol SAR Images using texture features in artificial neural networks. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2015, 40, 395–399. [Google Scholar] [CrossRef]
  17. Hanbay, K.; Özdemir, T.B. Ship Classification Based On Co-Occurrence Matrix and Support Vector Machines. Electrica 2024, 24, 812–817. [Google Scholar] [CrossRef]
  18. He, J.; Guo, Y.; Yuan, H. Ship target automatic detection based on hypercomplex flourier transform saliency model in high spatial resolution remote-sensing images. Sensors 2020, 20, 2536. [Google Scholar] [CrossRef]
  19. Wu, F.; Hu, T.; Xia, Y.; Ma, B.; Sarwar, S.; Zhang, C. WDFA-YOLOX: A wavelet-driven and feature-enhanced attention YOLOX network for ship detection in SAR images. Remote Sens. 2024, 16, 1760. [Google Scholar] [CrossRef]
  20. Yang, X.; Zhang, J.; Chen, C.; Yang, D. An efficient and lightweight CNN model with soft quantification for ship detection in SAR images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5230713. [Google Scholar] [CrossRef]
  21. Li, X.; Li, D.; Liu, H.; Wan, J.; Chen, Z.; Liu, Q. A-BFPN: An attention-guided balanced feature pyramid network for SAR ship detection. Remote Sens. 2022, 14, 3829. [Google Scholar] [CrossRef]
  22. Zhang, T.; Zhang, X.; Ke, X. Quad-FPN: A novel quad feature pyramid network for SAR ship detection. Remote Sens. 2021, 13, 2771. [Google Scholar] [CrossRef]
  23. Song, Y.; Li, J.; Gao, P.; Li, L.; Tian, T.; Tian, J. Two-stage cross-modality transfer learning method for military-civilian SAR ship recognition. IEEE Geosci. Remote Sens. Lett. 2022, 19, 4506405. [Google Scholar] [CrossRef]
  24. Leng, X.; Ji, K.; Yang, K.; Zou, H. A bilateral CFAR algorithm for ship detection in SAR images. IEEE Geosci. Remote Sens. Lett. 2015, 12, 1536–1540. [Google Scholar] [CrossRef]
  25. Gao, G.; Shi, G. CFAR ship detection in nonhomogeneous sea clutter using polarimetric SAR data based on the notch filter. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4811–4824. [Google Scholar] [CrossRef]
  26. Li, N.; Pan, X.; Yang, L.; Huang, Z.; Wu, Z.; Zheng, G. Adaptive CFAR method for SAR ship detection using intensity and texture feature fusion attention contrast mechanism. Sensors 2022, 22, 8116. [Google Scholar] [CrossRef]
  27. Li, J.; Qu, C.; Shao, J. Ship detection in SAR images based on an improved faster R-CNN. In Proceedings of the 2017 SAR in Big Data Era: Models, Methods and Applications (BIGSARDATA), Beijing, China, 13–14 November 2017; pp. 1–6. [Google Scholar]
  28. Chai, B.; Nie, X.; Zhou, Q.; Zhou, X. Enhanced cascade R-CNN for multi-scale object detection in dense scenes from SAR images. IEEE Sens. J. 2024, 24, 20143–20153. [Google Scholar] [CrossRef]
  29. Zhou, F.; He, F.; Gui, C.; Dong, Z.; Xing, M. SAR target detection based on improved SSD with saliency map and residual network. Remote Sens. 2022, 14, 180. [Google Scholar] [CrossRef]
  30. Guo, Y.; Chen, S.; Zhan, R.; Wang, W.; Zhang, J. LMSD-YOLO: A lightweight YOLO algorithm for multi-scale SAR ship detection. Remote Sens. 2022, 14, 4801. [Google Scholar] [CrossRef]
  31. Guo, H.; Yang, X.; Wang, N.; Gao, X. A CenterNet++ model for ship detection in SAR images. Pattern Recognit. 2021, 112, 107787. [Google Scholar] [CrossRef]
  32. Chen, B.; Yu, C.; Zhao, S.; Song, H. An anchor-free method based on transformers and adaptive features for arbitrarily oriented ship detection in SAR images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 17, 2012–2028. [Google Scholar] [CrossRef]
  33. Shen, J.; Bai, L.; Zhang, Y.; Momi, M.C.; Quan, S.; Ye, Z. ELLK-Net: An efficient lightweight large kernel network for SAR ship detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5221514. [Google Scholar] [CrossRef]
  34. Li, Y.; Liu, J.; Li, X.; Zhang, X.; Wu, Z.; Han, B. A Lightweight Network for Ship Detection in SAR Images Based on Edge Feature Aware and Fusion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 99, 1–15. [Google Scholar] [CrossRef]
  35. Zhao, W.; Huang, L.; Liu, H.; Yan, C. Scattering-Point-Guided Oriented RepPoints for Ship Detection. Remote Sens. 2024, 16, 933. [Google Scholar] [CrossRef]
  36. Li, W.; Chen, Y.; Hu, K.; Zhu, J. Oriented reppoints for aerial object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 1829–1838. [Google Scholar]
  37. Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 764–773. [Google Scholar]
  38. Congan, X.; Hang, S.; Jianwei, L.; Yu, L.; Libo, Y.; Long, G.; Wenjun, Y.; Taoyang, W. RSDD-SAR: Rotated ship detection dataset in SAR images. J. Radars 2022, 11, 581–599. [Google Scholar]
  39. Zhang, T.; Zhang, X.; Li, J.; Xu, X.; Wang, B.; Zhan, X.; Xu, Y.; Ke, X.; Zeng, T.; Su, H.; et al. SAR ship detection dataset (SSDD): Official release and comprehensive data analysis. Remote Sens. 2021, 13, 3690. [Google Scholar] [CrossRef]
  40. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
  41. Xie, X.; Cheng, G.; Wang, J.; Yao, X.; Han, J. Oriented R-CNN for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 3520–3529. [Google Scholar]
  42. Han, J.; Ding, J.; Li, J.; Xia, G.S. Align deep features for oriented object detection. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–11. [Google Scholar] [CrossRef]
  43. Yang, X.; Yan, J.; Feng, Z.; He, T. R3det: Refined single-stage detector with feature refinement for rotating object. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021; Volume 35, pp. 3163–3171. [Google Scholar]
  44. Lyu, C.; Zhang, W.; Huang, H.; Zhou, Y.; Wang, Y.; Liu, Y.; Zhang, S.; Chen, K. Rtmdet: An empirical study of designing real-time object detectors. arXiv 2022, arXiv:2212.07784. [Google Scholar]
  45. Meng, F.; Qi, X.; Fan, H. LSR-Det: A Lightweight Detector for Ship Detection in SAR Images Based on Oriented Bounding Box. Remote Sens. 2024, 16, 3251. [Google Scholar] [CrossRef]
Figure 1. Representative scenarios of SAR ship detection. (a) Discontinuity of ships. (b) Multi-directional, multi-scale, and densely arranged ship scenes.
Figure 2. The overall architecture of the proposed method.
Figure 3. The structure of the EDConv feature extraction module.
Figure 4. Visualization of the ablation experiment detection results. (a) Image examples of the SSDD dataset. (b) Examples of images from the RSDD-SAR dataset. From left to right: the ground-truth, the Oriented Reppoints algorithm, the detection results with EDConv added, and the proposed method. The green boxes represent the ground-truth, the red boxes represent the true-positive prediction boxes of the method, the yellow boxes represent the false alarms, and the blue circles represent the missed detections.
Figure 5. Visualization of the detection results of various methods on the RSDD-SAR dataset. (a) Ground-truth. (b) Oriented Faster R-CNN. (c) Oriented Reppoints. (d) LSR-Det. (e) The proposed method. The green boxes indicate the ground-truths, the red boxes indicate the prediction boxes of the method, the yellow boxes indicate false alarms, and the blue circles indicate missed detections.
Figure 6. Visualization of the detection results of various methods on the SSDD dataset. (a) Ground-truth. (b) Oriented Faster R-CNN. (c) Oriented Reppoints. (d) LSR-Det. (e) The proposed method. The green boxes indicate the ground-truths, the red boxes indicate prediction boxes of the method, the yellow boxes indicate false alarms, and the blue circles indicate missed detections.
Table 1. Detailed descriptions of SSDD and RSDD-SAR.

| Dataset | SSDD | RSDD-SAR |
|---|---|---|
| Image number | 1160 | 7000 |
| Ship number | 2587 | 10,263 |
| Image size | 190–668 | 512 × 512 |
| Number of sensors | 3 | 2 |
| Resolution (m) | 1–15 | 2–20 |
| Frequency bands covered | C, X | C, X |
Table 2. Results of ablation experiment on the RSDD-SAR dataset. Bold indicates the optimal value.

| Method | EDConv | Point Set Transformation | P (%) | R (%) | AP (%) |
|---|---|---|---|---|---|
| Baseline | × | × | 90.24 | 89.12 | 89.48 |
| | ✓ | × | 92.10 | 90.09 | 90.23 |
| | × | ✓ | 93.22 | 90.23 | 90.74 |
| Proposed | ✓ | ✓ | **94.91** | **91.35** | **91.62** |
Table 3. Results of ablation experiment on the SSDD dataset. Bold indicates the optimal value.

| Method | EDConv | Point Set Transformation | P (%) | R (%) | AP (%) |
|---|---|---|---|---|---|
| Baseline | × | × | 92.25 | 88.33 | 88.40 |
| | ✓ | × | 93.47 | 90.40 | 90.66 |
| | × | ✓ | 95.02 | 91.70 | 91.85 |
| Proposed | ✓ | ✓ | **95.45** | **92.12** | **92.81** |
Table 4. Results of comparison experiments on the RSDD-SAR dataset. Bold indicates the optimal value.

| Method | P (%) | R (%) | AP (%) |
|---|---|---|---|
| Rotated-RetinaNet | 89.77 | 84.30 | 85.22 |
| Oriented Faster R-CNN | 90.20 | 85.45 | 86.98 |
| S2A-Net | 89.82 | 86.45 | 88.37 |
| R3Det | 90.61 | 86.50 | 88.86 |
| Oriented Reppoints | 90.24 | 89.12 | 89.48 |
| Rotated-RTMDet-s | 91.60 | 88.50 | 90.16 |
| LSR-Det | 92.45 | 90.25 | 90.34 |
| Proposed | **94.91** | **92.35** | **91.62** |
Table 5. Results of comparison experiments on the SSDD dataset. Bold indicates the optimal value.

| Method | P (%) | R (%) | AP (%) |
|---|---|---|---|
| Rotated-RetinaNet | 89.35 | 83.52 | 85.58 |
| Oriented Faster R-CNN | 90.53 | 86.79 | 87.36 |
| S2A-Net | 88.80 | 90.39 | 89.78 |
| R3Det | 90.76 | 88.46 | 89.62 |
| Oriented Reppoints | 92.25 | 88.33 | 88.40 |
| Rotated-RTMDet-s | 92.43 | 89.94 | 90.65 |
| LSR-Det | 93.86 | 91.02 | 91.45 |
| Proposed | **95.45** | **92.12** | **92.81** |