Multi-Scale Proposal Generation for Ship Detection in SAR Images

Abstract: The classic ship detection methods in synthetic aperture radar (SAR) images suffer from an extreme variance of ship scale. Generating a set of ship proposals before the detection operation can effectively alleviate the multi-scale problem. In order to construct a scale-independent proposal generator for SAR images, we suggest four characteristics of ships in SAR images and four corresponding procedures in this paper. Based on these characteristics and procedures, we put forward a framework to explore multi-scale ship proposals. The designed framework mainly contains two stages: hierarchical grouping and proposal scoring. Firstly, we extract edges, superpixels and strong scattering components from SAR images. The ship proposals are obtained at the hierarchical grouping stage by combining the strong scattering components with superpixel grouping. Considering the difference of edge density and the completeness and tightness of the contour, we obtain scores that measure the confidence that a proposal contains a ship. Finally, the ranked proposals are obtained. Extensive experiments demonstrate the effectiveness of the four procedures. Our method achieves an average best overlap (ABO) score of 0.70, an area under the curve (AUC) score of 0.59 and a best recall of 0.85 on a challenging dataset. In addition, the recall of our method on each of the three scale subsets is above 0.80. Experimental results demonstrate that our algorithm outperforms the approaches previously used for SAR images.
We evaluate the effectiveness of the four procedures on the validation dataset. The results show that hierarchical grouping outperforms the classic multi-scale proposal extraction scheme. With strong scattering components, the ABO score and the AUC increase and the average number of proposals decreases to 868 per image. The experimental results also show the effectiveness of contour and edge scoring. We compare our approach with the state-of-the-art methods on the test dataset. The proposed method obtains the highest best recall and the highest quality of proposals. We can conclude that the proposed method outperforms the other methods. The results also show that our method behaves more stably and performs better on multi-scale datasets.


Introduction
Ships are a valuable means of transportation and are used in many areas of human activity. As synthetic aperture radar (SAR) can provide images of the ocean in all weather conditions, automatic ship detection based on these data has attracted considerable attention [1].
Ship detection in SAR images has improved significantly in recent years. Previous studies on ship detection using SAR images can be categorized into four types: (1) the constant false-alarm rate-based (CFAR) detectors, (2) the generalized-likelihood ratio test-based (GLRT) detectors, (3) the saliency-based methods and (4) the deep learning-based methods. Different classes of ships have different lengths and widths, and different SAR images may have different resolutions. Therefore, ships appear at different scales in SAR images, as shown in Figure 1. However, the above-mentioned methods either lose information about small objects or require a priori knowledge of the ship scale, so they are inadequate for multi-scale ship detection.
The detectors based on CFAR are widely used ship detection methods [2,3]. In order to improve ship detection performance, many studies attempt to modify the classic CFAR detector [4][5][6][7][8][9][10][11][12][13][14]. For example, in approaches such as multilayer CFAR detectors, the idea of iterative censoring is integrated with the traditional CFAR at the pixel level [15,16]. Superpixels can retain the ship outline and reduce the influence of speckle noise. Thus, the conventional CFAR detector has also been modified to work at the superpixel level [17][18][19][20]. For example, multi-scale superpixels and three superpixel-level dissimilarity measures are unified to design a detection algorithm for polarimetric SAR images in [20]. In order to improve the performance of ship detection in SAR images, many statistical models have also been studied to fit sea clutter [21][22][23]. The pixel-level CFAR detectors generally adopt a sliding window scheme that cannot cover the various ship sizes. The superpixel-level CFAR detectors may lose some components of large-scale ships in the detection results. The second type of approach is based on GLRT [25]. Considering the electromagnetic aspects behind the interactions of SAR signals with the ship and the surrounding sea, Iervolino et al. present a closed-form expression for the Radar Cross Section (RCS) backscattering of a ship and a new detector based on GLRT [26,27]. Similar to CFAR detectors, GLRT detectors require a priori knowledge of the ship scale and cannot handle the detection of multi-scale ships.
In recent years, several methods have employed the biologically inspired saliency model for ship detection [28][29][30][31][32]. For example, Wang et al. design a random-forest-based hierarchical sparse model for ship proposal selection; the false alarms are then filtered out by a dynamic CFAR-based contour saliency model [29]. The saliency-based methods pay attention to the ships that are relatively prominent. However, as shown in Figure 1, the prominent objects are large-scale ships and complex background, so small ships would be ignored by the saliency-based methods. With the development of deep learning, many detection approaches based on convolutional neural networks (CNNs) have been proposed. Fully convolutional networks (FCN) are used to separate the sea from the land [33,34]. The Faster-RCNN, the cascade coupled CNN (3C2N) and the Single Shot MultiBox Detector (SSD) for SAR ship detection are reported in [35][36][37][38][39][40]. Furthermore, there are two SAR image datasets for marine surveillance and ship detection. OpenSARShip is a dataset dedicated to ship interpretation using Sentinel-1 images [41]. Li et al. construct a dataset named SAR Ship Detection Dataset (SSDD) to evaluate the performance of SAR ship detectors [42]. Besides the above-mentioned approaches, there are also several other ship detectors for SAR images [43][44][45][46]. Generally, deep learning-based models adopt pooling and convolution operations. These operations lose information about small objects and blur the localization of densely distributed ships.
One way to effectively alleviate the multi-scale problem is to construct a ship proposal generator before the detection operation. This generator can rapidly search for a set of candidate bounding boxes of different sizes. These multi-scale candidate bounding boxes are called object proposals, and they are intended to cover all ship bounding boxes in a SAR image. Thus, a proposal generator is a class-agnostic and scale-independent object detector that can handle multi-scale ship detection effectively. It can speed up the computation and improve the ship detection accuracy by allowing the usage of more sophisticated learning schemes.
Recently, object proposal generation has become a widely used technique in the computer vision field [47][48][49][50][51][52]. Object proposal generation methods can be divided into grouping methods and window scoring methods. For example, the selective search hierarchically merges superpixels to generate proposals [47]. The contour box uses the selective search to explore proposals and then rejects proposals without explicit contours [51]. We refer the reader to [53] for a survey of object proposal generation. SAR images differ substantially from the natural images processed in the computer vision field, and there are few proposal generation methods for SAR images. Dai et al. train a linear SVM to learn an objectness measure of ships [54]. The performance of this generator is limited since it uses several fixed-size sliding windows.
Under the assumption that all ships share common visual properties that distinguish them from the background, we can design a generator that outputs a set of ship proposals. We argue that a desirable SAR image-specific ship proposal generator should generate scale-independent proposals with high recall, and should give the highest scores to those proposals that fit a ship tightly. Therefore, the common visual properties of ships adopted by our proposal generator should be scale-independent. For example, in Figure 1c, we use sliding windows with different scales and aspect ratios to explore initial proposals. This search strategy misses two small ships and cannot fit all ships tightly, since it needs a priori knowledge of ship size and is scale-dependent. At the ranking stage, we use saliency cues to measure the ship proposals. However, these scale-dependent cues result in ranking the non-ship but prominent proposals higher.
As the above discussion shows, we should identify visual properties of ships that are scale-independent and commonly shared before constructing a good ship proposal generator for SAR images. Thus, in this paper we propose that ships in SAR images have the following characteristics:
• A ship generally has different scattering characteristics from its surroundings;
• A ship generally has some components with strong scattering;
• There are edges between a ship and its surroundings;
• A ship generally has a closed contour.
Our generator should be scale-independent at both the proposal extraction and the scoring stage. Considering that a bottom-up strategy can search for ships from small to large scales, we can utilize hierarchical grouping to extract multi-scale proposals. Scale-independent measures can be obtained using edge and contour information, which is independent of ship size. Therefore, we further propose four procedures that correspond to the four characteristics to explore multi-scale ship proposals:
• The generator obtains components of ships by a superpixel algorithm, and explores ships by hierarchical grouping;
• The generator obtains proposals from the superpixels that contain at least one strong scattering component;
• The generator measures a proposal by the difference of the edge density between the inside and near the borders of the proposal;
• The generator measures a proposal by the completeness and the tightness of the contour.
The framework of our proposed method includes a hierarchical grouping stage and a proposal scoring stage. The information of strong scattering components is injected at the hierarchical superpixel grouping stage. The hierarchical grouping stage generates ship proposals without scores. These proposals are further processed by the scoring stage, where each of them is scored using edge and contour information. Finally, the ranked ship proposals are obtained.
The remainder of this paper is organized as follows. The suggested ship proposal generation method is discussed in detail in Section 2. In Section 3, we evaluate the effectiveness of the four procedures and compare the generator with the state-of-the-art methods. The results are discussed in Section 4. Section 5 presents the conclusions.

Ship Proposal Generator
The proposed methodology is described in detail in this section. The framework that combines all four procedures into one proposal generator is presented in Section 2.1. In Section 2.2, we describe the edge and superpixel extraction, which is the foundation of our generator. The rest of this section introduces the two stages of the proposed method.

Framework
In Section 1, we propose the four characteristics of ships in SAR images and the corresponding four procedures. In this part, we further introduce the framework in detail. We first need to extract the edges and superpixels of SAR images, since the four procedures are based on them. Then, we need to combine all four procedures into one proposal generator. Our method is composed of scale-independent proposal extraction and scale-independent scoring. In order to extract multi-scale initial proposals, hierarchical superpixel grouping is adopted. This bottom-up strategy can search for proposals from small scale to large scale. Using the strong scattering components information, we further reject non-ship proposals. This top-down strategy can reduce the number of initial proposals and the computational burden of proposal scoring. We use the edge information to rank the initial proposals at the proposal scoring stage. Specifically, an edge score and a contour score are used to measure ship proposals. These two scores only depend on the edge density, the contour completeness and the contour tightness, which are independent of proposal scale. Therefore, the ranking results are scale-independent.
The framework of our proposed method is shown in Figure 2. Our framework includes hierarchical grouping and proposal scoring. Firstly, we obtain initial proposals at the hierarchical grouping stage: the input SAR image is processed by the superpixel and CFAR algorithms, and the results are then passed to Algorithm 1 to acquire initial proposals. Secondly, we obtain a score for each proposal at the proposal scoring stage: the input SAR image is processed by our edge extraction scheme, which will be introduced in Section 2.2, and for each proposal we compute its score using contour scoring and edge scoring, which will be introduced in Section 2.4. Finally, the ship proposals with scores are output.

Algorithm 1: Hierarchical Grouping using Superpixels and Strong Scattering Components.
Input: An input SAR image
Output: A list of ship proposals OutBox
Extract strong scattering components map Label using local CFAR;
Obtain initial ship proposals P = {p_1, ..., p_n} using the method proposed in [55];
Obtain neighbouring superpixels of each proposal in P;
Set histogram similarity set S = ∅;
foreach p_i ∈ P do
    foreach neighbouring superpixel p_j of p_i do
        Calculate histogram similarity s(p_i, p_j);
...
foreach neighbouring superpixel p_m of p_i, p_n of p_j, p_q of p_t do
...
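The first step of Algorithm 1 extracts strong scattering components with a local CFAR detector, whose exact form is not specified in this section. As a rough illustration only, the sketch below uses a two-parameter (mean and standard deviation) test in a square window without guard cells. The function names `window_mean` and `local_cfar`, the window half-size `half` and the threshold factor `t` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def window_mean(img, half):
    """Mean over a (2*half+1) x (2*half+1) window, edge-padded, via an integral image."""
    k = 2 * half + 1
    p = np.pad(img.astype(float), half, mode="edge")
    ii = np.zeros((p.shape[0] + 1, p.shape[1] + 1))
    ii[1:, 1:] = p.cumsum(axis=0).cumsum(axis=1)
    # Four-corner lookup gives the sum of every k x k window at once.
    s = ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]
    return s / (k * k)

def local_cfar(img, half=7, t=3.0):
    """ASSUMPTION: flag pixels brighter than local mean + t * local std.

    This is a simplified stand-in for the local CFAR used in Algorithm 1.
    """
    img = img.astype(float)
    mu = window_mean(img, half)
    var = np.maximum(window_mean(img ** 2, half) - mu ** 2, 0.0)
    return img > mu + t * np.sqrt(var)
```

The resulting boolean map plays the role of the map Label: a merged proposal would be kept only if it overlaps at least one flagged pixel.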

Edges and Superpixels
In order to explore ship proposals using the four procedures mentioned above, we first extract edges and superpixels. We adopt the superpixel method proposed in [55] since it is robust for SAR images. The authors of [55] construct an energy function with two terms: a data term defined by the statistical characteristics of SAR images and a regularization term defined by the ratio of mean intensities. The superpixels are obtained by a graph cut-based energy minimization method.
There are few edge extraction algorithms with good generalization performance for SAR images. We construct an edge extraction scheme using an existing algorithm, as shown in Figure 3. The ratio of exponentially weighted averages (ROEWA) detector is a classical edge detector for SAR images [56]. ROEWA is optimal under a stochastic multi-edge model. It computes a normalized ratio of exponentially weighted averages on opposite sides of the central pixel. The magnitudes in the horizontal and vertical directions form the edge response of the central pixel. Finally, the authors of [56] apply a modified watershed algorithm to eliminate false edge responses. Non-maximal suppression (NMS) can find edge peaks [52]. As shown in Figure 3, we first filter the input SAR image with the non-local approach proposed in [57]. In order to detect multi-scale edges, the filtered SAR image is used to construct an image pyramid: we resample the spatial resolution by factors of 0.5 and 2. These two resampled images and the input image form an image pyramid with three octaves; each octave contains one interval. For each interval in each octave, we apply ROEWA to extract edges, which yields three edge maps. We resample these maps to the original resolution of the input image and add them at pixel level. After normalization, the dense edge responses are obtained. We perform NMS on the edge responses and obtain the final edge map.
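The pyramid fusion described above (resample by 0.5 and 2, detect edges at each scale, resample back, sum and normalize) can be sketched as follows. This is a minimal illustration under stated assumptions: `gradient_edges` is a plain gradient-magnitude stand-in and is not ROEWA, nearest-neighbour resampling is an arbitrary choice, and the non-local filtering and final NMS steps are omitted. All function names are hypothetical.

```python
import numpy as np

def resize_nearest(img, out_shape):
    """Nearest-neighbour resampling to out_shape = (rows, cols)."""
    rows = (np.arange(out_shape[0]) * img.shape[0] / out_shape[0]).astype(int)
    cols = (np.arange(out_shape[1]) * img.shape[1] / out_shape[1]).astype(int)
    return img[np.ix_(rows, cols)]

def gradient_edges(img):
    """Stand-in edge detector (NOT ROEWA): gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def multiscale_edges(img, detector=gradient_edges, factors=(0.5, 1.0, 2.0)):
    """Sum edge responses from a three-level pyramid and normalize to [0, 1]."""
    h, w = img.shape
    total = np.zeros((h, w), dtype=float)
    for f in factors:
        shape = (max(1, int(round(h * f))), max(1, int(round(w * f))))
        scaled = resize_nearest(img, shape)          # one pyramid level
        edges = detector(scaled)                     # edge map at this scale
        total += resize_nearest(edges, (h, w))       # back to input resolution
    peak = total.max()
    return total / peak if peak > 0 else total
```

Passing a ROEWA implementation as `detector` would bring the sketch closer to the scheme of Figure 3; the fusion logic itself does not depend on the detector.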

Hierarchical Grouping
Compared with natural images of complex targets, single-frequency and single linear polarization SAR images contain limited information and cover a large area. Therefore, multi-scale ship proposal generation using SAR images is a challenging problem. The traditional way to handle this problem is to use sliding windows with different aspect ratios and scales. However, due to the orientation and scale variations of ships in SAR images, the sliding window strategy performs poorly. According to the four characteristics mentioned in Section 1, we can obtain ship components by superpixels and search for ships using a hierarchical grouping strategy similar to the selective search proposed in [47]. A strong scattering component does not always correspond to a semantic ship component. On the other hand, if a proposal does not contain a strong scattering component, it is highly probable that it is a non-ship proposal. Therefore, different from the selective search, we take advantage of strong scattering components to censor useless proposals.
Our algorithm is shown in Algorithm 1. The algorithm produces a list of ship proposals OutBox from an input SAR image. We adopt local CFAR to extract strong scattering components, which are labeled in the map Label. The superpixels obtained by the method proposed in [55] are set as initial proposals P = {p_1, p_2, ..., p_n}. Then, for each initial proposal in P, we find its neighbouring superpixels and compute the histogram similarity between the initial proposal and each neighbouring superpixel. Next, we find the pair of superpixels with maximal similarity in P and merge these two superpixels into one new proposal. If the new proposal does not contain any strong scattering components, the new proposal is deleted. Otherwise, we retain this new proposal and delete the pair of superpixels from P. This procedure is repeated until there is no superpixel left in P. The histogram similarity of two superpixels p_i and p_j is defined as follows: where h_i and h_j are the histograms of superpixels p_i and p_j, respectively. In our method, we set the number of bins N = 32.
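The grouping loop described above can be sketched in a few lines. This is an illustration under several assumptions that are not confirmed by the text: the similarity is taken to be histogram intersection as in the selective search [47], a merged region always stays in the hierarchy but is reported only when it contains a strong scattering component, and initial superpixels that contain a strong component are also reported. The names `hierarchical_grouping`, `hist_intersection` and `bbox_union` are hypothetical.

```python
import numpy as np

def bbox_union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def hist_intersection(h1, h2):
    # ASSUMPTION: histogram intersection, as in selective search [47].
    return float(np.minimum(h1, h2).sum())

def hierarchical_grouping(labels, image, strong, n_bins=32):
    """labels: int superpixel map; image: intensities in [0, 1];
    strong: boolean map of strong scattering components (map Label)."""
    regions, neighbours = {}, {}
    for r in np.unique(labels):
        mask = labels == r
        ys, xs = np.nonzero(mask)
        hist, _ = np.histogram(image[mask], bins=n_bins, range=(0.0, 1.0))
        regions[int(r)] = {
            "hist": hist / hist.sum(), "size": int(mask.sum()),
            "bbox": (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())),
            "strong": bool(strong[mask].any()),
        }
        neighbours[int(r)] = set()
    # 4-connected adjacency between different superpixels.
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        for u, v in zip(a[a != b].tolist(), b[a != b].tolist()):
            neighbours[u].add(v)
            neighbours[v].add(u)
    sims = {(u, v): hist_intersection(regions[u]["hist"], regions[v]["hist"])
            for u in neighbours for v in neighbours[u] if u < v}
    # ASSUMPTION: superpixels holding a strong scatterer are proposals too.
    out_boxes = [reg["bbox"] for reg in regions.values() if reg["strong"]]
    next_id = max(regions) + 1
    while sims:
        i, j = max(sims, key=sims.get)              # most similar pair
        ri, rj = regions[i], regions[j]
        size = ri["size"] + rj["size"]
        merged = {
            "hist": (ri["hist"] * ri["size"] + rj["hist"] * rj["size"]) / size,
            "size": size, "bbox": bbox_union(ri["bbox"], rj["bbox"]),
            "strong": ri["strong"] or rj["strong"],
        }
        t, next_id = next_id, next_id + 1
        new_nb = (neighbours[i] | neighbours[j]) - {i, j}
        # Drop every similarity that involves one of the merged regions.
        sims = {k: s for k, s in sims.items() if i not in k and j not in k}
        for r in (i, j):
            for n in neighbours.pop(r):
                neighbours.get(n, set()).discard(r)
            del regions[r]
        regions[t], neighbours[t] = merged, new_nb
        for n in new_nb:
            neighbours[n].add(t)
            sims[(n, t)] = hist_intersection(regions[n]["hist"], merged["hist"])
        if merged["strong"]:                        # censor non-ship proposals
            out_boxes.append(merged["bbox"])
    return out_boxes
```

Boxes are returned as inclusive pixel coordinates (x_min, y_min, x_max, y_max). Running the function once per initial superpixel size k and concatenating the outputs corresponds to combining several settings of k.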

Proposal Scoring
There are hundreds of ship proposals after hierarchical grouping. Only a few proposals contain real ships. In this section, we utilize the edge and contour scores to measure the confidence that a proposal contains a ship.

Edge Scoring
A ship has different scattering characteristics from its surroundings while its components have similar scattering characteristics. Therefore, there are dense edges near the border of ship proposals, and only a few weak edges inside the ship proposals. In this paper, we take advantage of edge density and edge density ratio to measure proposals.
As shown in Figure 4, the red boxes are ship proposals, and the yellow boxes are the inner boxes of the proposals. Edge density measures the density of edges near the borders of a ship proposal [50]. Given a ship proposal P and its inner box P_in, the edge density is calculated as the density of edges in the inner ring: where E is the edge map obtained in Section 2.2. According to the ship characteristics in SAR images stated in Section 1, we suggest a ship proposal measure based on the edge density difference between the inner ring and the inner box. The edge density ratio is calculated as follows: Finally, we obtain the edge score of the ship proposal P: For fast computation, we use the integral image of E to calculate the sum of edges in a proposal.
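The integral-image trick mentioned above makes the edge sum of any box a four-lookup operation. The sketch below shows that computation and derives the edge densities of the inner ring and the inner box. It is illustrative only: normalizing by area, the shrink factor `ratio` and the helper names are assumptions, and the paper's own definitions of the edge density ratio and the edge score are not reproduced here.

```python
import numpy as np

def integral_image(edge_map):
    """Summed-area table with a zero first row and column."""
    ii = np.zeros((edge_map.shape[0] + 1, edge_map.shape[1] + 1))
    ii[1:, 1:] = edge_map.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, box):
    """Sum of the edge map inside box = (x0, y0, x1, y1), x1 and y1 exclusive."""
    x0, y0, x1, y1 = box
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

def shrink(box, ratio):
    """Inner box: the proposal shrunk symmetrically (hypothetical choice)."""
    x0, y0, x1, y1 = box
    dx = int(round((x1 - x0) * (1 - ratio) / 2))
    dy = int(round((y1 - y0) * (1 - ratio) / 2))
    return (x0 + dx, y0 + dy, x1 - dx, y1 - dy)

def box_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])

def ring_and_inner_density(ii, box, ratio=0.5):
    """Edge density in the inner ring (box minus inner box) and in the inner box.

    ASSUMPTION: densities are edge sums divided by the region area.
    """
    inner = shrink(box, ratio)
    inner_sum = box_sum(ii, inner)
    ring_sum = box_sum(ii, box) - inner_sum
    ring_area = box_area(box) - box_area(inner)
    ring_density = ring_sum / ring_area if ring_area > 0 else 0.0
    inner_density = inner_sum / box_area(inner) if box_area(inner) > 0 else 0.0
    return ring_density, inner_density
```

The integral image is built once per edge map, so scoring hundreds of proposals costs only a constant number of lookups each.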

Contour Scoring
Ships in SAR images generally have closed contours. A closed contour is defined in [51] using the completeness and tightness. The completeness of a contour is defined as follows: where c is a closed curve, and the numerator is the length of c. We can find the most closed contour using Equation (5). Here, φ(·) is defined as: This function is adopted to adjust the edge map. According to [51], we set τ = 0.001 and γ = 1. A ship proposal should enclose a ship tightly. Therefore, the contour of a ship is near the proposal border. We look for a closed contour whose pixels are far from the center of the proposal. The tightness of a contour is defined as follows: where the distance to the center x_0 of the proposal is defined as: where w and h are the width and height of the ship proposal, respectively. ϕ(·) is adopted to adjust the distance: Finally, the contour score of a proposal is as follows: We rank the ship proposals using both the edge and contour scores:

Results
In this section, we evaluate the effectiveness of the procedures proposed in Section 2, and compare our method with the state-of-the-art methods. We use the SSDD dataset to test our method. The SSDD dataset includes 1160 SAR images with different resolutions, scales, sea conditions, sensors and polarizations. It is a challenging dataset. In order to evaluate the performance of our ship proposal generator, the dataset is divided into two parts (validation set and test set) with the proportion 7:3. Three measures are adopted here: the average best overlap (ABO), the area under the curve (AUC) and the recall. The AUC is the area under the curve of recall versus number of ship proposals. The ABO between a ground truth set G and a proposal set P is calculated as follows:
$$\mathrm{ABO}(G, P) = \frac{1}{|G|} \sum_{g \in G} \max_{p \in P} \mathrm{IoU}(g, p)$$
where the intersection over union (IoU) can be obtained from the following equation:
$$\mathrm{IoU}(g, p) = \frac{\mathrm{area}(g \cap p)}{\mathrm{area}(g \cup p)}$$
In the experiments here, we use the threshold IoU = 0.5 for evaluation.
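The evaluation measures above can be computed with a few helper functions. The sketch below follows the standard definitions of IoU, ABO and recall at an IoU threshold; the box format (x0, y0, x1, y1) with exclusive upper bounds, the use of `>=` at the threshold and the function names are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes (x0, y0, x1, y1), x1 and y1 exclusive."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def best_overlaps(gts, props):
    """Best IoU achieved by any proposal, for every ground truth box."""
    return [max((iou(g, p) for p in props), default=0.0) for g in gts]

def abo(gts, props):
    """Average best overlap between ground truth set and proposal set."""
    bo = best_overlaps(gts, props)
    return sum(bo) / len(bo)

def recall(gts, props, thr=0.5):
    """Fraction of ground truth boxes matched with IoU >= thr (assumed convention)."""
    bo = best_overlaps(gts, props)
    return sum(o >= thr for o in bo) / len(bo)
```

Evaluating `recall` on the top-n ranked proposals for increasing n gives the recall versus number of proposals curve whose area defines the AUC.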

Evaluation of Four Procedures
In order to generate multi-scale ship proposals, we suggest four procedures in Section 1. In this part, we evaluate the four procedures on the validation set by comparing our method with several of its variants.

Evaluation of Hierarchical Superpixels Grouping
The first procedure is hierarchical superpixels grouping. The generation of superpixels is the foundation of our method. Therefore, we first evaluate its performance under different initial superpixel sizes. We also evaluate the performance when hierarchical superpixels grouping is replaced with other procedures.

Variation of Initial Superpixels Size
For our method, the initial size of superpixels mainly influences the recall of multi-scale ship proposals. In [55], the initial size of superpixels is defined as k; it is the main parameter used to generate superpixels. We explore variants of our approach by changing k. Specifically, we vary the initial size from k = 5 to k = 100 in steps of 5. The results are shown in Figure 5a-c. Figure 5a shows the curves of recall versus number of ship proposals under different initial superpixel sizes k. Figure 5b shows the curves of recall versus IoU under different initial superpixel sizes k. Figure 5c shows the ABO scores and AUC versus the initial superpixel size k. In this paper, we combine the results obtained with k = 15 and k = 20. Figure 5d shows the results of this combination. The orange curve is the recall versus number of ship proposals when k = 15 and k = 20. The purple curve is the recall versus IoU when k = 15 and k = 20.

Hierarchical Superpixels Grouping versus Multi-scale Superpixels Segmentation
We utilize hierarchical grouping for multi-scale ship proposal generation. In order to show that this strategy leads to better ship proposals, we compare hierarchical superpixels grouping with multi-scale superpixels segmentation. In this experiment, we set four variants of multi-scale superpixels segmentation: (1) k is varied from k = 5 to k = 50 in steps of 5; (2) k is varied from k = 10 to k = 100 in steps of 10; (3) k is varied from k = 10 to k = 50 in steps of 10; and (4) k is varied from k = 5 to k = 45 in steps of 10. The four settings of k capture both small and large scale ships. For our method, we use the results when k = 15 and 20. The results are shown in Figure 6a,b. Figure 6a,b show the curves of the proposed method and multi-scale superpixels segmentation. Figure 6c,d show the curves of the proposed method and multi-scale sliding windows. Figure 6a,c show the curves of recall versus number of ship proposals. Figure 6b,d show the curves of recall versus IoU. Table 1 compares the ABO scores, AUC and average number of ship proposals of the proposed approach and its variants with different parameters.

Hierarchical Superpixels Grouping versus Sliding Windows
As a classic scheme for extracting multi-scale target proposals, multi-scale sliding windows with different aspect ratios are compared with our method in this paper. For multi-scale sliding windows, the width w of the window is set as follows: (1) w is varied from w = 5 to w = 50 in steps of 5; (2) w is varied from w = 10 to w = 100 in steps of 10; (3) w is varied from w = 10 to w = 50 in steps of 10; (4) w is varied from w = 5 to w = 45 in steps of 10. The aspect ratios are set to 0.5, 1 and 2. Figure 6c,d show the recall rates when varying the number of ship proposals and the IoU, respectively. The ABO scores, AUC and average number of ship proposals are shown in Table 1.
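For reference, the sliding window baseline above can be generated as in the following sketch. The stride (a fraction of the window size, `stride_frac`) and the convention that the aspect ratio is height divided by width are assumptions, since neither is stated in the text; the function name is hypothetical.

```python
def sliding_windows(img_w, img_h, widths, aspect_ratios=(0.5, 1, 2), stride_frac=0.5):
    """Enumerate boxes (x0, y0, x1, y1) for every width and aspect ratio.

    ASSUMPTIONS: aspect ratio = height / width; stride = stride_frac * window size.
    """
    boxes = []
    for w in widths:
        for ar in aspect_ratios:
            h = max(1, int(round(w * ar)))
            sx = max(1, int(w * stride_frac))
            sy = max(1, int(h * stride_frac))
            # Only windows that lie fully inside the image are kept.
            for y in range(0, img_h - h + 1, sy):
                for x in range(0, img_w - w + 1, sx):
                    boxes.append((x, y, x + w, y + h))
    return boxes
```

Under these assumptions, setting (1) corresponds to `widths=range(5, 51, 5)` and setting (2) to `widths=range(10, 101, 10)`. The number of windows grows quickly with image size, which is consistent with the large search space of this baseline.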

Evaluation of Strong Scattering Components Information
Strong scattering components information is useful since ships in SAR images generally contain strong scatterers. Therefore, we utilize this information in hierarchical superpixels grouping. In this section, we evaluate the benefit of using this information. Specifically, we consider two settings, k = 15 and k = 15, 20. The results of our method and the variants that do not use strong scattering components information are shown in Figure 7 and Table 1. Figure 7a shows the curves of recall versus number of ship proposals. Figure 7b shows the curves of recall versus IoU.

Evaluation of Edges Scoring and Contours Scoring
Edge scoring is adopted to rank the ship proposals in our method. We set k = 15 and k = 15, 20, and evaluate the performance of the variants without edge scoring. The results are shown in Figure 8 and Table 1. We also evaluate the performance of two variants without contour scoring. The results are shown in Figure 9 and Table 1. Figure 8 shows the recall of our method and the variants without edge scoring. Figure 8a shows the curves of recall versus number of ship proposals. Figure 8b shows the curves of recall versus IoU. Figure 9 shows the recall of our method and the variants without contour scoring. Figure 9a shows the curves of recall versus number of ship proposals. Figure 9b shows the curves of recall versus IoU.

Evaluation of Multi-Scale Ship Proposal Generation
In order to evaluate the performance of multi-scale ship proposal generation, the ground truth is clustered into three subsets. These subsets contain small, middle and large scale ships, respectively. The small, middle, and large scale ship subsets contain 1181, 407 and 226 ground truth ships, respectively. We evaluate the multi-scale proposal generation of our method and the variants on these subsets. The results are shown in Table 2 and Figure 10. Table 2 compares the ABO scores, AUC and best recall of the proposed method and its variants on the validation datasets that contain multi-scale ships. Figure 10 shows the recall of our method and the variants on the multi-scale validation datasets. Figure 10a-c show the curves of recall versus number of ship proposals on large, middle and small scale ships. Figure 10d-f show the curves of recall versus IoU on large, middle and small scale ships.

Comparison with the State-of-the-Art Methods
In this section, we extensively compare our method with state-of-the-art methods on the test set, including saliency filtering [29], local contrast variance weighted information entropy (LCVIWE) [1], information theory-based target detection (ITBTD) [3] and objectness learning [54]. For our method, we set k = 15, 20. For the other methods, we adopt the outputs of ship candidate extraction as the final ship proposals. The results are shown in Figure 11 and Table 3. Figure 11 shows the recall of our method and the state-of-the-art methods. Figure 11a shows the curves of recall versus number of ship proposals. Figure 11b shows the curves of recall versus IoU. Table 3 shows the ABO scores, AUC and best recall of the proposed method and the state-of-the-art methods on the test datasets that contain multi-scale ships. We further evaluate the performance of multi-scale ship proposal generation. Specifically, the ground truth of the test data is clustered into small, middle and large ship subsets. These subsets contain 485, 179 and 62 ships, respectively. The results of all methods on the three subsets are shown in Figure 12 and Table 3. Figure 12 shows the recall of our method and the state-of-the-art methods on the multi-scale datasets. Figure 12a-c show the curves of recall versus number of ship proposals on the test datasets that contain large, middle and small scale ships. Figure 12d-f show the curves of recall versus IoU on the test datasets that contain large, middle and small scale ships. The computational time of the different methods is shown in Table 4. All experiments were run on an Intel(R) Core(TM) i5-6500 CPU at 3.19 GHz with 8 GB of RAM. Finally, some results of the proposed method are shown in Figure 13. Blue bounding boxes are the produced object proposals closest to each ground truth. Green and red bounding boxes are ground truth bounding boxes: green indicates that the ship was found and red indicates that the ship was not found. The yellow numbers are the scores of the bounding boxes.

Discussion
In this section, we discuss the effect of the parameter k and of the four procedures. In addition, the comparison with the state-of-the-art methods is also discussed.
(1) The effect of parameter k. In the proposed method, the parameter k is very important. As shown in Figure 5a-c, the performance is limited when k is set too small or too large, since such a setting cannot capture both small and large scale ships. When k is set to 15 or 20, our method achieves a relatively good performance. The ABO scores are both higher than 0.6, and the AUC values are both higher than 0.5. The best recall is 0.74642 when k = 15. In order to obtain ship proposals of good quality, we vary the initial size of superpixels and combine the results obtained with k = 15 and k = 20. The performance of this combination is shown in Figure 5d. The ABO score and the AUC are 0.70334 and 0.58, respectively.
(2) The effect of hierarchical superpixels grouping. The results of the comparison with multi-scale superpixels segmentation and sliding windows are shown in Table 1 and Figure 6. As Figure 6b shows, the first and second variants of multi-scale superpixels segmentation both provide results comparable with our approach. However, our method outperforms the four variants in terms of AUC and ABO scores. Note that, according to Figure 6a and Table 1, our method generates a smaller number of proposals than the other variants. This indicates that the ships can be efficiently detected within a smaller search space using our method. We conclude that using all ship proposals generated by hierarchical superpixel grouping is much better than using multi-scale superpixel segmentation. As for sliding windows, the performance of the four sliding window variants degrades quickly when the IoU threshold is increased. Our method outperforms all four variants, and is much better in the high IoU range. Moreover, our method obtains ship proposals of much better quality, since its ABO score is the highest. We can conclude that sliding windows cannot handle multi-scale ship proposal generation in SAR images. We attribute this to the high variation of ship size and orientation in SAR images.
(3) The effect of strong scattering component information. When comparing results over a variety of IoU thresholds (Figure 7b), the recall of the variants is slightly higher than ours in the low-IoU region. As shown in Figure 7a, our method achieves competitive or higher recall as the number of proposals increases. Our method also generates higher-quality proposals, with higher ABO and AUC and fewer ship proposals. These results indicate that injecting strong scattering component information is effective.
(4) The effect of edge scoring. As shown in Figure 8b, the two variants have curves similar to our method over a variety of IoU thresholds. This is expected, since edge scoring only influences the proposal scoring stage. Figure 8a shows that our method achieves competitive or higher recall with the same number of proposals. Table 1 shows a small drop in AUC when the edge scoring stage is removed. Therefore, edge scoring helps assign higher scores to true ship proposals.
(5) The effect of contour scoring. Like edge scoring, contour scoring is used only at the proposal scoring stage. Thus, our method has curves similar to those of the two variants over a variety of IoU thresholds. Compared with our method, the two variants show a significant drop in AUC in Table 1. Figure 9a shows the recall as the number of ship proposals changes. Compared with the two variants, our method shows higher recall, especially when the number of ship proposals is less than 100. In conclusion, contour scoring significantly boosts the performance of our method.
(6) Multi-scale ship proposal generation. Multi-scale sliding windows perform the worst, especially on the large-scale ship subset. We found multi-scale superpixel segmentation to be unstable when k = 5, 10, · · · , 50. Although this variant achieves a higher best recall than ours on the large-scale ship subset, its performance degrades dramatically on the small-scale ship subset, where it is even worse than multi-scale sliding windows. When k = 10, 20, · · · , 100, the performance of multi-scale superpixel segmentation is relatively stable. Our method achieves comparable or better performance than these variants in terms of ABO, AUC and best recall, and it behaves more stably and performs better on all three subsets than the other variants. Furthermore, as shown in Table 1, our method requires 868 ship proposals to achieve good results on the three subsets, whereas multi-scale superpixel segmentation generates 12,603 ship proposals.
(7) Comparison with the state-of-the-art methods. The results in Figure 11a clearly illustrate that our approach achieves the highest best recall and AUC. The recall of LCVIWE and ITBTD increases slowly or remains unchanged as the number of proposals increases, because these methods generally generate only a few proposals per image. The recall of objectness learning is significantly boosted only with a large number of proposals, probably because the objectness measure does not always assign the highest scores to true ship proposals. Saliency filtering performs slightly better than ours with a small number of proposals, but suffers from poor localization accuracy, as can be seen from Figure 11b. Table 3 shows that our method produces the proposals of the highest quality. In addition, our approach outperforms the other methods and behaves more stably on all three subsets.
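The quantities compared in items (1)-(7), namely ABO, recall over IoU thresholds, AUC and recall for a given number of ranked proposals, can be illustrated with a minimal Python sketch. This is not the evaluation code used in the experiments. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max), takes ABO to be the mean over ground-truth ships of the best IoU achieved by any proposal, and takes AUC to be the normalized area under the recall-versus-IoU curve on an assumed threshold range of [0.5, 1.0]. All function names and default values are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: axis-aligned boxes as (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def best_overlaps(gt: Sequence[Box], proposals: Sequence[Box]) -> List[float]:
    """For each ground-truth ship, the highest IoU reached by any proposal."""
    return [max((iou(g, p) for p in proposals), default=0.0) for g in gt]


def average_best_overlap(gt: Sequence[Box], proposals: Sequence[Box]) -> float:
    """ABO: mean of the best overlaps over all ground-truth ships."""
    best = best_overlaps(gt, proposals)
    return sum(best) / len(best) if best else 0.0


def recall_at_iou(gt: Sequence[Box], proposals: Sequence[Box],
                  threshold: float) -> float:
    """Fraction of ground-truth ships covered with IoU >= threshold."""
    best = best_overlaps(gt, proposals)
    return sum(b >= threshold for b in best) / len(best) if best else 0.0


def recall_iou_auc(gt: Sequence[Box], proposals: Sequence[Box],
                   lo: float = 0.5, hi: float = 1.0, steps: int = 50) -> float:
    """Normalized area under the recall-vs-IoU curve (trapezoidal rule).

    The range [lo, hi] and the number of steps are assumptions.
    """
    ts = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    rs = [recall_at_iou(gt, proposals, t) for t in ts]
    area = sum((rs[i] + rs[i + 1]) / 2 * (ts[i + 1] - ts[i])
               for i in range(steps))
    return area / (hi - lo)


def recall_at_top_n(gt: Sequence[Box], ranked_proposals: Sequence[Box],
                    n: int, threshold: float = 0.5) -> float:
    """Recall when only the n highest-scored proposals are kept.

    `ranked_proposals` is assumed to be sorted by descending score.
    """
    return recall_at_iou(gt, ranked_proposals[:n], threshold)


if __name__ == "__main__":
    gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
    ranked = [(0, 0, 10, 10), (20, 20, 30, 25), (100, 100, 110, 110)]
    print("ABO:", average_best_overlap(gt_boxes, ranked))
    print("Recall@IoU=0.7:", recall_at_iou(gt_boxes, ranked, 0.7))
    print("Recall with top-1 proposal:", recall_at_top_n(gt_boxes, ranked, 1))
```

These functions score a single image. A dataset-level figure would pool the best overlaps of all ground-truth ships across images before averaging. Under these definitions, a scoring stage that ranks true ship proposals higher raises `recall_at_top_n` for small `n` without changing ABO, which matches the observation that edge and contour scoring mainly affect the recall-versus-number-of-proposals curves.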

Conclusions
Currently, ship detection methods for SAR images suffer from an extreme variance of ship sizes. In this paper, we propose a ship proposal generator to explore multi-scale ship proposals. The proposed method exploits four scale-independent characteristics of ships and puts forward four corresponding procedures. Initial proposals are obtained by hierarchical superpixel grouping, and non-ship proposals are rejected using strong scattering components. At the scoring stage, we use contour and edge information to rank the proposals. Because all four procedures are scale-independent, the final ship proposals are scale-independent as well. In the experiments, we evaluate the effectiveness of the four procedures on a validation dataset. The results show that hierarchical superpixel grouping outperforms the classic multi-scale proposal extraction scheme. With strong scattering components, the AUC increases to 0.58 and the average number of proposals decreases to 868 per image. The experimental results also show the effectiveness of contour and edge scoring. We compare our approach with the state-of-the-art methods on the test dataset, where the proposed method obtains the highest best recall and the highest proposal quality. We conclude that the proposed method outperforms the other methods, and that it behaves more stably and performs better on multi-scale datasets.