1. Introduction
As an active microwave imaging system for earth observation, Synthetic Aperture Radar (SAR) can work in any weather and at any time and is widely used in the field of ship detection [
1,
2,
3]. Early ship detection research focused on manually selecting features that distinguish ships [
4,
5,
6,
7]. For instance, when detecting ships at the superpixel level, gray value differences between a ship’s area and neighboring regions are mainly used. Liu et al. used superpixels to segment object areas before applying a Constant False Alarm Rate (CFAR) detector [
8]. Wang et al. combined a superpixel segmentation with local contrast measurement [
9]. Deng et al. first used the Simple Linear Iterative Clustering (SLIC) algorithm to generate superpixel regions and then used the corners for ship detection [
10]. Wang et al. detected ship objects based on intensity differences between object and clutter pixels [
11]. Liu et al. proposed the CFAR detector. Its main principle is to slide a window over the SAR image and decide whether the window area contains a ship by comparing the signal against a threshold [
12]. Several extensions of CFAR were subsequently proposed, including Smallest-Of CFAR, Order-Statistic CFAR, Cell-Averaging CFAR, and Greatest-Of CFAR [
13,
14,
15,
16]. Yang et al. developed ship detection based on the initial CFAR detection algorithm and improved it with a geometric optimization framework [
17]. Wang et al. proposed a superpixel detector that improves the performance of ship target detection in SAR images via the local contrast of Fisher vectors [
18].
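The sliding-window principle behind CFAR can be illustrated with a minimal cell-averaging sketch. It is not the implementation of any cited work: the function name `ca_cfar`, the window sizes, and the false alarm probability are hypothetical, and the threshold factor assumes exponentially distributed clutter power, which real SAR sea clutter may not follow.

```python
def ca_cfar(image, guard=1, train=2, pfa=1e-3):
    """Cell-averaging CFAR on a 2-D intensity image given as a list of lists.

    For every pixel, the clutter level is estimated as the mean of the
    training cells in a square ring around it (guard cells excluded).
    The pixel is flagged when it exceeds alpha times that mean.
    """
    rows, cols = len(image), len(image[0])
    half = guard + train
    detections = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    if max(abs(dr), abs(dc)) <= guard:
                        continue  # skip the cell under test and its guard cells
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc]
                        count += 1
            if count == 0:
                continue
            # Scaling factor for the assumed exponential clutter model
            alpha = count * (pfa ** (-1.0 / count) - 1.0)
            detections[r][c] = image[r][c] > alpha * (total / count)
    return detections
```

Because the threshold scales with the local clutter estimate, a bright pixel over calm sea is flagged while the same intensity inside a bright region may not be. This is also why near-shore scenes, where land falls into the training ring, are difficult for such detectors.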
In the past few years, due to the rapid development of deep learning [
19], a series of object detection methods based on deep learning have been applied to SAR images. These detection methods can be categorized into one-stage and two-stage [
20]. Two-stage methods first extract regions of interest and then classify the candidate objects, which improves detection accuracy [
21,
22,
23,
24,
25]. Unlike two-stage algorithms, one-stage algorithms perform detection in a single step and are particularly suited to real-time detection. As one of the most advanced one-stage object detection algorithms, You Only Look Once V5 (YOLO V5) offers fast detection and high accuracy, so it is selected as the baseline for our algorithm.
Some researchers have adapted deep-learning-based object detection algorithms to the characteristics of SAR images. Li et al. used the K-means method to obtain the target scale distribution and optimize the selection of anchor boxes, reducing the difficulty of network learning [
26]. To reduce the memory consumption and computational complexity of convolutional neural networks, Xiong et al. proposed a lightweight detection model that uses channel pruning and knowledge distillation to reduce the number of YOLO V4 parameters [
27]. Zhou et al. extended YOLO V5 by applying K-means dimension clustering to the object bounding boxes, using mosaic augmentation with image scale transformation, and optimizing the loss function [
28]. Lin et al. introduced a Squeeze-and-Excitation Faster R-CNN that provides better detection results [
29]. The Feature Pyramid Network (FPN) improves the performance of the detection network by utilizing the information between different feature layers [
30]. Wei et al. proposed a high-resolution FPN for multiscale detection [
31]. Cui et al. extended the convolutional network with an attention module that can learn weight coefficients between different channels [
32]. Zhang et al. proposed using Neural Architecture Search (NAS) to automatically design a convolutional neural network structure on the dataset, which further improved the accuracy of ship detection [
33]. Ke et al. fused the network feature layers to optimize the detection process [
34].
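The K-means anchor selection mentioned above can be sketched as follows. This is an illustrative sketch rather than the procedure of the cited works: it assumes boxes are described only by (width, height) and uses IoU as the similarity measure, a convention popularized by the YOLO family. The names `iou_wh` and `kmeans_anchors` are hypothetical.

```python
import random

def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    return inter / (box[0] * box[1] + anchor[0] * anchor[1] - inter)

def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the anchor it overlaps most
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchors[i])  # keep an empty cluster's anchor
        if new_anchors == anchors:
            break  # assignments no longer change
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which matters for SAR datasets where most ships occupy only a few pixels.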
Overall, these detection networks have achieved considerable performance when applied to SAR images, but so far, SAR ship detection algorithms based on deep learning have not fully considered the differences between ships and the surrounding environment. For instance, differences in backscattering properties between ships and sea clutter cause the gray values of ships to be larger than those of the noise region. In addition, ships and the coast have distinct geometrical features. Based on these differences, we introduce a combination of two modules: the Feature Enhancement Module (FEM) and the Land Burial Module (LBM). The FEM is designed to enhance the characteristics of ships by eliminating noisy areas, while the LBM eliminates coastal areas. The main contributions of this paper are as follows:
We introduce a Salient Otsu (S-Otsu) threshold segmentation method to deal with noise in SAR images;
The FEM highlights ship features by suppressing background and noise information;
The LBM is added to YOLO V5, removing coastal areas, improving convolutional neural network training efficiency, and reducing the impact of coastal features on detection performance.
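As background for the first contribution, the sketch below shows the classic Otsu threshold, which picks the gray level that maximizes the between-class variance of the two resulting pixel groups. It is not the S-Otsu method of this paper, whose details are not given in this section. The function name `otsu_threshold` and the 8-bit gray-level assumption are illustrative.

```python
def otsu_threshold(pixels, levels=256):
    """Classic Otsu threshold for a flat list of integer gray values.

    Returns the level t that maximizes the between-class variance,
    where pixels <= t form one class and pixels > t form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of their gray values
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2  # unnormalized between-class variance
        if between > best_var:
            best_t, best_var = t, between
    return best_t
```

With a threshold `t` from this function, a mask such as `[p > t for p in pixels]` keeps the brighter class. This matches the observation above that ship pixels tend to have larger gray values than the surrounding clutter.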
2. Related Work
With the development of SAR technology, its application to object detection has gradually become a research hotspot. As an active microwave detection technology, SAR can work in any location and in any weather. Optical remote sensing images contain information from multiple bands, which is more conducive to object identification. In contrast, SAR images contain echo information from only a single band, which is usually recorded in binary form, so object detection in SAR images remains a significant challenge. SAR ship detection algorithms can be divided into traditional algorithms and deep-learning-based algorithms.
- (1) Traditional SAR ship detection algorithms
Most early ship detection algorithms rely on ship features to achieve detection. Gao et al. proposed a SAR ship detection algorithm that combines a notch filter, designed according to the characteristics of hybrid polarization SAR, with CFAR to achieve accurate detection of SAR ships [
35]. This algorithm applies mainly to hybrid polarization SAR images, which limits its robustness. Leng et al. introduced a ship detection model based on the difference in the distribution of complex signal kurtosis between object and non-object areas, which alleviates the effect of Radio Frequency Interference compared with CFAR [
36]. Lang et al. first selected pixels of specific values in a SAR image to form a high-dimensional space to enhance differences between ships and sea clutter and finally applied a spatial clustering algorithm for ship detection [
37]. Liu introduced a detection algorithm for identifying small ships in a SAR image. The algorithm first models the polarization scattering in the ship and ocean areas and then uses a two-parameter CFAR detector for ship detection [
38]. Zhang et al. proposed a Non-Window CFAR that combines superpixel segmentation with CFAR and performs ship detection at the superpixel level, reducing the dependence on sliding windows found in conventional CFAR [
39]. Compared with CFAR, this algorithm better distinguishes clutter from ship areas. However, it also has limitations, and it performs poorly on near-shore ships. Wang et al. first performed superpixel segmentation on SAR images to enhance the efficiency of ship detection. Then, Fisher vectors captured feature differences between ships and sea clutter using superpixels [
40]. Overall, these methods still depend on an accurate characterization of ships and the neighboring sea area, and on carefully chosen parameter settings, to perform well. However, complex and variable ocean environments make it difficult to build generic and successful modeling approaches.
- (2) Object detection algorithms based on deep learning
Object detection algorithms based on deep learning have shown significant advantages. Hu et al. designed the Squeeze-and-Excitation (SE) module in the Squeeze-and-Excitation Network, which suppresses background information by learning channel weights. Experiments show that the SE module can effectively improve the performance of the object detection network [
41]. Woo et al. designed a Convolutional Block Attention Module (CBAM), which, unlike the SE module, attends to spatial information as well as channel information and therefore suppresses background information more effectively [
42]. Wang et al. showed that capturing all channel dependencies, as SE and CBAM do, is inefficient, so they replaced the fully connected layers of these modules with one-dimensional convolutions in the Efficient Channel Attention (ECA) network. Compared with SE and CBAM, ECA increases the accuracy of the object detection network while reducing the number of parameters. The above attention mechanisms have all been added to ship detection models to suppress SAR background information [
43]. However, because these mechanisms were designed for natural images, they do not achieve their best results when applied directly to SAR images. Liu et al. added a top-to-bottom feature fusion path to the feature fusion pyramid, which alleviates the loss of feature details [
44]. Chen et al. proposed Atrous Spatial Pyramid Pooling (ASPP), which extracts features at different scales and increases detection capabilities for multi-scale objects [
45].
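The channel attention idea discussed above can be made concrete with a small sketch. It follows the ECA pattern described in the text (global average pooling, a one-dimensional convolution across channels, a sigmoid gate) in plain Python, and is not the reference implementation of any cited module. All function names are hypothetical, the kernel is assumed to have odd length, and in a real network its weights would be learned.

```python
import math

def global_avg_pool(fmap):
    """fmap: list of C channels, each an H x W list of lists. Returns C means."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]

def eca_weights(desc, kernel):
    """1-D convolution across neighboring channels (zero padded), then sigmoid."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(desc) + [0.0] * pad
    weights = []
    for i in range(len(desc)):
        z = sum(k * padded[i + j] for j, k in enumerate(kernel))
        weights.append(1.0 / (1.0 + math.exp(-z)))
    return weights

def apply_channel_attention(fmap, weights):
    """Rescale every channel by its attention weight."""
    return [[[w * v for v in row] for row in ch] for ch, w in zip(fmap, weights)]

def se_param_count(channels, reduction=16):
    """Weights in the two fully connected layers of an SE block (biases ignored)."""
    return 2 * channels * (channels // reduction)
```

For 256 channels and a reduction ratio of 16, `se_param_count` gives 8192 weights, whereas the one-dimensional convolution needs only as many weights as its kernel length. This illustrates the parameter saving attributed to ECA above.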
- (3) SAR ship detection algorithms based on deep learning
With the development of deep learning, SAR ship detection algorithms are now mostly based on deep learning. Yang et al. introduced a multi-scale ship detector for complex backgrounds. First, an attention module was introduced to mitigate the influence of the background, and then ship features at different scales were captured by pooling operations [
46]. However, the excessive use of pooling fails to preserve the characteristics of small ships and thus increases the possibility of missed detections. Guo et al. introduced CenterNet++ for the detection of small ships. This algorithm first uses dilated convolutions with different dilation rates to extract ship features at different scales and then further fuses the feature maps of different convolutional layers through a feature pyramid. Finally, the regression of ship position and class is achieved by a 1 × 1 convolution [
47]. Although the algorithm shows excellent performance for the detection of small ships, the influence of background information on ship detection is not considered in the feature extraction process. Zhang et al. developed a lightweight ship detection model with only 20 convolutional layers, using a feature reuse strategy that enhances the ship detection ability of the model by repeatedly stacking feature maps of the same scale [
48]. This algorithm shares the shortcoming of CenterNet++: neither considers the impact of background information on ship detection. Sun et al. proposed replacing regular convolutions with atrous convolutions in the backbone to mitigate the loss of ship position information. An attention module is then introduced so that the ship detection model pays more attention to ships [
49]. Ge et al. designed a Spatially Oriented Attention Module (SOAM) and added it to You Only Look Once VX to achieve accurate detection of SAR objects. Its main principle is to utilize two one-dimensional global pooling operations to decode features into one-dimensional features in vertical and horizontal directions, further highlighting the location information of the target object [
50]. Yu et al. proposed a bidirectional convolutional network in which a pooling operation is first used to suppress noise in SAR images. Then, the multi-scale mapping is used to further improve the performance of the ship detection model [
51]. However, the attention modules used in these algorithms can only suppress non-ship information during feature generation and do not directly remove noise or the coastal environment from the image. Furthermore, the attention module increases the computational cost of the algorithm.
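The two one-dimensional global pooling operations described for SOAM can be illustrated with the sketch below. It covers only the pooling step, not the full module. Average pooling is an assumption, since the text does not say whether average or max pooling is used, and the names `directional_pool` and `peak_location` are hypothetical.

```python
def directional_pool(channel):
    """Pool an H x W feature map along each axis.

    Returns (row_profile, col_profile): the mean of every row (length H)
    and the mean of every column (length W).
    """
    h, w = len(channel), len(channel[0])
    row_profile = [sum(row) / w for row in channel]
    col_profile = [sum(channel[r][c] for r in range(h)) / h for c in range(w)]
    return row_profile, col_profile

def peak_location(channel):
    """Row and column indices where the two 1-D profiles are largest."""
    rows, cols = directional_pool(channel)
    r = max(range(len(rows)), key=lambda i: rows[i])
    c = max(range(len(cols)), key=lambda j: cols[j])
    return r, c
```

Each profile retains position along one axis while summarizing the other, so a strong response from a ship shows up as a peak in both profiles. A single global pooling over the whole map would discard that location information.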