Detection of Moving Ships in Sequences of Remote Sensing Images

High-speed agile remote sensing satellites have the ability to capture multiple sequences of images. However, the frame rate is lower and the baseline between images is much longer than in normal image sequences. As a result, the edges and shadows in each image in the sequence vary considerably, which places higher demands on the target detection algorithm. Aiming at the characteristics of multi-view image sequences, we propose an approach to detect moving ships on the water surface. Based on marker-controlled watershed segmentation, we use the extracted foreground and background images to segment moving ships, and we obtain the complete shape and texture information of the ships. The inter-frame difference algorithm is applied to extract the foreground object information, while Otsu's algorithm is used to extract the image background. The foreground and background information is fused to solve the problem of interference with object detection caused by the long imaging baseline. The experimental results show that the proposed method is effective for moving ship detection.


Introduction
With the development of economic globalization, maritime safety issues and economic conflicts between countries at sea are becoming more and more prominent. Highly mobile remote sensing satellites can carry out continuous observations over a given area. They can provide real-time, fast and accurate access to dynamic marine information, supporting timely and effective decisions as well as the rapid handling of marine emergencies. Meanwhile, agile satellites are capable of rapid maneuvering and multi-view imaging over the same place. Therefore, using multi-image data, we can observe and track ships as targets to obtain their orientation, speed and other dynamic information. This not only provides a basis for decision-making and guidance, but also reflects an important aspect of remote sensing satellite applications [1].
However, multi-view sequence images captured by agile satellites have long imaging intervals and large base-height ratios. Therefore, the appearance of the same surface feature changes greatly between the images of a sequence. Figure 1 shows how agile satellites capture sequence images. Displaced building shadows, different viewing angles of buildings and changes in illumination can all occur within one group of sequence images, and they strongly affect target detection methods.
At present, mainstream target detection methods are divided into two classes: background difference methods and inter-frame difference methods [2]. The background difference method uses various algorithms to obtain the background image, and then subtracts the background image from the current frame to locate the moving target in the current image. There are many methods for building the background model. For example, considering that the background is always observed in image sequences, the background can be extracted based on partial differential equations [3,4]; using the non-parametric kernel of the Gaussian distribution to estimate the probability of tie-points, an estimation model can be established with a non-parametric density kernel, which can detect objects against a slightly shaking background [5,6]; and based on a mixture of k Gaussian distributions, the characteristics of the background distribution can be obtained [7][8][9]. However, the background model established by these algorithms is highly correlated with the background of the images, which means that the background model cannot be established accurately when the scene differs greatly from image to image.
The inter-frame difference method is another method of target detection that uses the difference between two or more frames to obtain the shape, position and other information about the moving object [9][10][11]. By using the continuous difference between two or more frames to update the background information, an entire background model is obtained to extract moving objects [10]. Moreover, moving objects in image sequences obey high-order statistical distributions, so they can be obtained by filtering the difference images in certain areas with a high-order statistical operator [11]. In addition, Canny feature points can be extracted from image sequences and the feature images differenced, so that the moving targets are extracted from the characteristics of the feature points [12]. Although these algorithms are more adaptable to real environments than the background difference algorithms, they are influenced by the displacement of the moving target. Sometimes they cannot detect the complete target, and only part of the information about the moving target can be obtained.
Therefore, this paper proposes a method that combines the background difference algorithm and the inter-frame difference algorithm, to effectively solve the problem that large base-height ratios and large ground feature changes interfere with target detection in multi-view image sequences. The accuracy of target detection is greatly improved, which provides technical support for the rapid discovery and tracking of key targets using agile satellites.

Methods
In this paper, the target detection algorithm for moving ships in multi-view image sequences is divided into three parts: foreground extraction, background extraction and target segmentation. The flow chart is shown in Figure 2. First, histogram matching is performed on three frames of a multi-view image sequence, so that the gray level difference between the images is eliminated. Then, by computing the difference between these three frames, we obtain partial information about the moving targets. Next, the difference results are filtered using a multi-structuring element morphological operator and binarized, which eliminates the interference caused by movement between frames. After that step, we generate the mask images, which provide partial information on the moving targets as foreground images. The original images are thresholded using Otsu's method and binarized, so that we can extract the background information from the image sequences. Finally, combining the foreground and background images, we use marker-based watershed segmentation to segment the multi-view image sequences and to quickly and completely detect the moving targets.
ISPRS Int. J. Geo-Inf. 2017, 6, 334

Inter-Frame Difference Algorithm
The inter-frame difference algorithm can quickly obtain dynamic targets in the foreground [13]. The basic principle of the inter-frame difference is that two or more frames in an image sequence are subtracted, so that difference images containing part of the moving target area are obtained. We then use a threshold to binarize these images. Assume that three consecutive frames of multi-view images are $f_{(k-1)}(x, y)$, $f_{(k)}(x, y)$ and $f_{(k+1)}(x, y)$. The first two frames are subtracted, and $d_{(k-1,k)}(x, y)$ denotes the result of this subtraction. Because the high mobility of agile satellites causes large changes in the appearance of ground features, the result of this single subtraction contains a large amount of noise. This interferes greatly with the target information in the difference images, so the third frame is then differenced with this result, and $d_{(k-1,k,k+1)}(x, y)$ denotes the resulting image after the second difference. Then, we transform $d_{(k-1,k,k+1)}(x, y)$ into a binary image $T(i, j)$ using the binarization threshold $Th$, where 0 and 1 represent the non-target area and the target area, respectively. Because a filtering step follows, we need to retain as much information from the difference images as possible at this stage, so we simply set $Th$ to zero.
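The double differencing described above can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: images are plain nested lists of gray values, each difference is taken as a pixel-wise absolute difference, and the second step differences the third frame against the first result, which is one literal reading of the text rather than a confirmed form of the authors' equations. The names `abs_diff`, `binarize` and `three_frame_mask` are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of integer gray values


def abs_diff(a: Image, b: Image) -> Image:
    """Pixel-wise absolute difference of two equally sized images."""
    return [[abs(pa - pb) for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def binarize(img: Image, th: int = 0) -> Image:
    """Return 1 where the value exceeds th (target area), 0 elsewhere."""
    return [[1 if v > th else 0 for v in row] for row in img]


def three_frame_mask(f_prev: Image, f_cur: Image, f_next: Image,
                     th: int = 0) -> Image:
    """Binary mask from two successive differences over three frames."""
    d1 = abs_diff(f_cur, f_prev)   # first difference: frames k-1 and k
    d2 = abs_diff(f_next, d1)      # second difference (assumed form): frame k+1 vs d1
    return binarize(d2, th)        # th = 0 keeps every non-zero response


if __name__ == "__main__":
    f0 = [[0, 0], [0, 9]]
    f1 = [[0, 0], [9, 0]]
    f2 = [[0, 0], [0, 0]]
    print(three_frame_mask(f0, f1, f2))
```

With a threshold of zero the mask keeps every non-zero response, which matches the stated intent of deferring noise removal to the later filtering step.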

Multi-Structuring Element Morphological Filtering
Because of the noise generated by the dynamic change of ground features, the binary images obtained from the inter-frame difference cannot directly provide information on the moving targets. It is necessary to filter the difference images using morphological filtering operators. Meanwhile, in order to avoid the pitfalls of morphological filtering with a single structuring element [13], we perform morphological filtering using several structuring elements. Knowing that building corners, shadows and other features occur mostly in linear and angular combinations, we successively apply opening and closing operations with linear structuring elements oriented at 0°, 45°, 90° and 135° and of different sizes, and combine the outputs of these morphological filters, which enhances the target while suppressing the background. The structuring elements are shown in Figure 3. Using different scales and different structuring elements, positive and negative noise signals can be suppressed simultaneously [14]. This maximizes the filtering out of signals not related to target information. The formula of the algorithm is as follows:

$$g = \sum_{i}\sum_{j} \omega_{ij}\,\big((f \circ b_{ij}) \bullet b_{ij}\big)$$

where $g$ is the filtered image, $i$ and $j$ index the different structuring elements and the different scales, $\omega_{ij}$ represents the weight, $f$ represents the original image, $b_{ij}$ is the structuring element, the symbol "$\circ$" represents the opening operation, and "$\bullet$" represents the closing operation.
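A minimal pure-Python sketch of filtering with several linear structuring elements is given here for illustration. It assumes equal weights, a single element length of 3 by default, and an opening followed by a closing for each element, with the outputs combined as a weighted sum; the weights and scales actually used by the authors are not stated in the text. The helper names (`line_element`, `opening`, `closing`, `multi_element_filter`) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]
Offsets = List[Tuple[int, int]]

# Unit (row, col) steps for the four line directions named in the text.
STEPS = {0: (0, 1), 45: (-1, 1), 90: (1, 0), 135: (1, 1)}


def line_element(length: int, angle: int) -> Offsets:
    """Offsets of a centered linear structuring element (odd length)."""
    half = length // 2
    dr, dc = STEPS[angle]
    return [(t * dr, t * dc) for t in range(-half, half + 1)]


def _scan(img: Image, se: Offsets, pick: Callable) -> Image:
    """Erosion (pick=min) or dilation (pick=max); off-image pixels are ignored."""
    rows, cols = len(img), len(img[0])
    return [[pick(img[r + dr][c + dc] for dr, dc in se
                  if 0 <= r + dr < rows and 0 <= c + dc < cols)
             for c in range(cols)] for r in range(rows)]


def opening(img: Image, se: Offsets) -> Image:
    return _scan(_scan(img, se, min), se, max)


def closing(img: Image, se: Offsets) -> Image:
    return _scan(_scan(img, se, max), se, min)


def multi_element_filter(img: Image, lengths: Sequence[int] = (3,),
                         angles: Sequence[int] = (0, 45, 90, 135)) -> Image:
    """Equal-weight sum of opening-then-closing over all line elements."""
    elements = [line_element(n, a) for n in lengths for a in angles]
    w = 1.0 / len(elements)  # assumed equal weights
    out = [[0.0] * len(img[0]) for _ in img]
    for se in elements:
        filtered = closing(opening(img, se), se)
        for r, row in enumerate(filtered):
            for c, v in enumerate(row):
                out[r][c] += w * v
    return out
```

In this sketch a compact blob survives the opening in every direction and keeps a high response, whereas a thin line survives only along its own direction and is attenuated, so a later binarization can separate the two.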

Otsu's Method
Otsu's method is simple, efficient and adaptable for finding the optimal threshold between the background and the targets in the foreground [15]. The larger the grayscale difference (the range between the minimum and maximum gray values) between the background and the targets, the more accurate the threshold [15,16]. In optical images, the grayscale value of the water surface is always less than the grayscale value of the target ships. Therefore, Otsu's method can effectively segment the target ship from the background.
Otsu's method assumes that the image can statistically be divided into two parts, background and foreground; that is, the histogram of the image is bimodal. The main purpose of Otsu's method is to find the threshold that minimizes the intra-class variance of the background and the foreground, defined as the weighted sum of the variances of the two classes:

$$\sigma_w^2(t) = \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t)$$

where the weights $\omega_{0,1}$ represent the probabilities of the two classes (background and foreground) separated by the threshold $t$, and $\sigma_{0,1}^2$ are the variances of the two classes. The class probabilities $\omega_{0,1}$ are calculated as follows:

$$\omega_0(t) = \sum_{i=0}^{t-1} p(i), \qquad \omega_1(t) = \sum_{i=t}^{L-1} p(i)$$

where $L$ is the number of gray levels of the image, and $p(i)$ is the probability of gray level $i$. Otsu has shown that minimizing the intra-class variance is equivalent to maximizing the inter-class variance [15], that is:

$$\sigma_b^2(t) = \sigma^2 - \sigma_w^2(t) = \omega_0(t)\,[\mu_0(t) - \mu_T]^2 + \omega_1(t)\,[\mu_1(t) - \mu_T]^2 = \omega_0(t)\,\omega_1(t)\,[\mu_0(t) - \mu_1(t)]^2$$

where $\sigma^2$ is the total variance of the image and $\mu_{0,1,T}$ are the class means and the total mean, calculated as follows:

$$\mu_0(t) = \frac{\sum_{i=0}^{t-1} i\,p(i)}{\omega_0(t)}, \qquad \mu_1(t) = \frac{\sum_{i=t}^{L-1} i\,p(i)}{\omega_1(t)}, \qquad \mu_T = \sum_{i=0}^{L-1} i\,p(i)$$

It can be seen that the probability $\omega$ and the mean value $\mu$ can be calculated iteratively for each threshold $t$, as can the inter-class variance $\sigma_b^2(t)$. The threshold $t$ at which $\sigma_b^2(t)$ reaches its maximum is the optimal threshold for image segmentation.
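The iterative search over thresholds can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: it assumes integer gray values in `[0, levels)`, assigns values below `t` to class 0 and the rest to class 1, and works on pixel counts rather than probabilities to avoid rounding issues. The function name `otsu_threshold` is hypothetical.

```python
from typing import Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximizes the inter-class variance.

    Class 0 holds gray values < t, class 1 holds gray values >= t.
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    n0 = 0   # number of pixels in class 0
    s0 = 0   # sum of gray values in class 0
    for t in range(1, levels):
        n0 += hist[t - 1]
        s0 += (t - 1) * hist[t - 1]
        n1 = total - n0
        if n0 == 0 or n1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = s0 / n0
        mu1 = (total_sum - s0) / n1
        # omega_0 * omega_1 * (mu_0 - mu_1)^2
        var_b = (n0 / total) * (n1 / total) * (mu0 - mu1) ** 2
        if var_b > best_var:
            best_var, best_t = var_b, t
    return best_t


if __name__ == "__main__":
    dark_water = [10, 11, 12, 13] * 20   # many dark pixels
    bright_ship = [200, 205, 210]        # few bright pixels
    t = otsu_threshold(dark_water + bright_ship)
    print(t, [1 if v >= t else 0 for v in bright_ship])
```

When several thresholds give the same variance, this sketch returns the smallest of them; other implementations may break ties differently.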

Marker-Based Watershed Segmentation Algorithm
The watershed algorithm is a region-based image segmentation method founded on mathematical morphology, which was first proposed and applied by Beucher and Lantuejoul in the late 1970s [17][18][19][20]. The watershed algorithm based on the immersion principle [18] is one of its formulations. The algorithm treats the image as a topographic surface, inserts a small hole at each local minimum of that surface, and then submerges the whole topographic map in water. As the water level rises, a "dam" is built wherever water from different basins would merge, and ultimately these "dams" form the final segmentation boundary, that is, the "watershed".
However, the traditional watershed algorithm also has several defects [19]: (1) when the noise or the texture of the image is very noticeable, it easily produces over-segmentation; (2) because of its weak response to low-contrast images, the traditional algorithm cannot obtain a good segmentation result for them. Because remote sensing images have high resolution and clarity, as well as relatively complex texture, the final segmentation results of the traditional watershed algorithm exhibit significant over-segmentation.
The marker-based watershed algorithm can resolve the defects of the traditional watershed algorithm [20,21]. It uses the known segmented areas as local minima, that is, as the lowest points in the topographic map. The immersion then begins from these known lowest points and ultimately yields the final segment boundary [20]. Because a priori knowledge is involved in the division, this method can effectively avoid the over-segmentation caused by texture or noise, greatly improving segmentation accuracy. The process of the marker-based watershed algorithm is as follows:
(1) Different markers are given different labels, and the pixels of the markers are the starting points of the immersion.
(2) The pixels neighboring each marker are inserted into a priority queue, with a priority level corresponding to their gradient magnitude. Here grad(i, j) denotes the gradient magnitude of the pixel located at (i, j), computed from the pixel values f(i, j).
(3) The pixel with the lowest priority level is extracted from the priority queue. If the neighbors of the extracted pixel that have already been labeled all have the same label, then the pixel is labeled with their label. All non-marked neighbors that are not yet in the priority queue are put into the priority queue.
(4) Repeat step 3 until the priority queue is empty.
In this paper, the foreground information of the moving targets obtained using the inter-frame difference algorithm and the background information obtained using Otsu's method are the regions of interest and provide the a priori knowledge, that is, the markers for the watershed algorithm. We then divide the whole image using the marker-based watershed algorithm to capture the shape information of the targets. By treating the known regions as a priori knowledge, over-segmentation can be greatly reduced in the final segmentation result.
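The four flooding steps listed above can be sketched with a priority queue from Python's standard library. The sketch assumes a precomputed gradient-magnitude image, a marker image in which 0 means "unlabeled", and 4-connected neighbors; the choice of labels (for example 1 for the water background and 2 for the ship foreground) and the function name `marker_watershed` are hypothetical.

```python
import heapq
from typing import Iterator, List, Tuple

Grid = List[List[int]]


def marker_watershed(grad: Grid, markers: Grid) -> Grid:
    """Flood from labeled markers; pixels left at 0 lie on watershed lines."""
    rows, cols = len(grad), len(grad[0])
    labels = [row[:] for row in markers]
    queued = [[False] * cols for _ in range(rows)]
    heap: List[Tuple[int, int, int, int]] = []
    counter = 0  # insertion order, used to break ties between equal gradients

    def neighbors(r: int, c: int) -> Iterator[Tuple[int, int]]:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                yield nr, nc

    def push_unlabeled_neighbors(r: int, c: int) -> None:
        nonlocal counter
        for nr, nc in neighbors(r, c):
            if labels[nr][nc] == 0 and not queued[nr][nc]:
                queued[nr][nc] = True
                heapq.heappush(heap, (grad[nr][nc], counter, nr, nc))
                counter += 1

    # Steps 1 and 2: markers start the immersion; queue their neighbors.
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != 0:
                push_unlabeled_neighbors(r, c)

    # Steps 3 and 4: pop the lowest-priority pixel until the queue is empty.
    while heap:
        _, _, r, c = heapq.heappop(heap)
        seen = {labels[nr][nc] for nr, nc in neighbors(r, c)
                if labels[nr][nc] != 0}
        if len(seen) == 1:          # all labeled neighbors agree
            labels[r][c] = seen.pop()
        push_unlabeled_neighbors(r, c)
    return labels


if __name__ == "__main__":
    grad = [[0, 1, 2, 9, 2, 1, 0]]
    markers = [[1, 0, 0, 0, 0, 0, 2]]
    print(marker_watershed(grad, markers))
```

In the one-row example the two labels grow toward each other and the high-gradient pixel in the middle, whose labeled neighbors disagree, stays unlabeled as the boundary.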

Data
In this paper, we used three consecutive frames of multi-view images captured by a Chinese agile remote sensing satellite. The images are 3440 × 2492 pixels in size, and they are panchromatic images, which cover the whole visible spectrum and have a higher resolution than the multispectral images, as shown in Figure 4. From the figure, we can see that there are three moving ships. For convenience of explanation, we zoomed in on the three areas containing the targets and numbered the three targets. The movement of these ships is also evident. Ship 1 and Ship 3 are relatively slow, so each of these ships overlaps itself across the images. Ship 2 is faster than the other two ships, so it shows almost no overlap across the three frames. Meanwhile, all three ships have very clear wakes. Table 1 shows the interval time and the base-height ratio between two adjacent images. Because of the time that the agile satellite needs to maneuver between capturing two frames, the interval time and base-height ratio between two adjacent frames are large. As a result, there are large changes in background features, which also makes post-processing more difficult.
To compensate for changes in the overall radiance of the images, which are caused by changes in the angle between the sensor and the sun when the satellite maneuvers, we first adjust the histograms of the last two images to match that of the first frame. The result is shown in Figure 5. We can see intuitively that after histogram adjustment, the brightness of the second and third frames is basically the same as that of the first. Meanwhile, comparing the histograms before and after adjustment, we can see that the gray level distributions after adjustment are more consistent with the gray level distribution of the first frame than before. Therefore, this step addresses the problem that the subsequent difference processing cannot obtain a good result when the overall radiance is offset between frames.
At the same time, in order to measure the improvement brought by histogram adjustment, we evaluate the correlation coefficient, the intersection test coefficient, the chi-square test coefficient and the Bhattacharyya distance between two histograms. It can be seen from Table 2 that after histogram matching, the correlation coefficient and chi-square coefficient between Frame 1 and Frame 2 and between Frame 1 and Frame 3 are obviously closer to 1, and the intersection test coefficient is obviously increased. The Bhattacharyya distance is also reduced. These changes show that after histogram matching, the second and third frames are more strongly correlated with the first frame. The gray level distribution among these frames in Figure 6 is also improved, which lays a solid foundation for improving the precision of the subsequent difference step.
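For illustration, histogram matching and one of the similarity measures mentioned above can be sketched as follows. The sketch assumes the common CDF-based formulation of histogram matching, in which each source gray level is mapped to the smallest reference level whose cumulative probability is at least as large, and it computes only the histogram intersection on normalized histograms. Which exact variants the authors used is not stated, and the function names are hypothetical.

```python
from typing import List, Sequence


def cdf(pixels: Sequence[int], levels: int) -> List[float]:
    """Cumulative distribution of gray levels for a flat list of pixels."""
    counts = [0] * levels
    for v in pixels:
        counts[v] += 1
    out, acc = [], 0
    for c in counts:
        acc += c
        out.append(acc / len(pixels))
    return out


def match_histogram(source: Sequence[int], reference: Sequence[int],
                    levels: int = 256) -> List[int]:
    """Remap source gray levels so their distribution follows the reference."""
    src_cdf = cdf(source, levels)
    ref_cdf = cdf(reference, levels)
    lut, j = [], 0
    for s in src_cdf:  # both CDFs are non-decreasing, so j only moves forward
        while j < levels - 1 and ref_cdf[j] < s:
            j += 1
        lut.append(j)
    return [lut[v] for v in source]


def histogram_intersection(a: Sequence[int], b: Sequence[int],
                           levels: int = 256) -> float:
    """Sum of bin-wise minima of the two normalized histograms (1.0 = identical)."""
    ha = [0] * levels
    hb = [0] * levels
    for v in a:
        ha[v] += 1
    for v in b:
        hb[v] += 1
    return sum(min(x / len(a), y / len(b)) for x, y in zip(ha, hb))


if __name__ == "__main__":
    frame1 = [2, 2, 3, 3]   # reference frame (toy data)
    frame2 = [0, 0, 1, 1]   # darker frame to be adjusted
    matched = match_histogram(frame2, frame1, levels=4)
    print(matched, histogram_intersection(matched, frame1, levels=4))
```

On the toy data the intersection between the darker frame and the reference rises from 0 to 1 after matching, which mirrors the direction of change reported for the intersection coefficient.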

Inter-Frame Difference
Through the inter-frame difference algorithm, we can quickly detect information on the moving targets. The results are shown in Figure 7. From Figure 7a, we see that because of the large changes in attitude and the long imaging interval when the agile satellite captures the multi-view sequence images, the ground features change significantly across the sequence of images. The result obtained by the traditional difference method contains more error and noise than information about the moving targets we want to extract. Therefore, building on the traditional difference method, we include the third frame in a second difference step, and the result is shown in Figure 7b. We can see that after computing the second difference, the information on the moving targets is preserved, together with a small amount of erroneous information caused by changes in building contours, shadows or light. The result is greatly improved over the traditional method.

Multi-Structuring Element Morphological Filter
From the results shown above, although applying the difference procedure twice removes most of the noise caused by the dynamic changes of background features, a small part of the noise still remains in the results. We need a morphological filtering operator to filter the images and remove the remaining errors.
However, as shown in Figure 8a, the noise in the difference results is largely due to changes in feature edges or shadows caused by the different viewing angles of agile satellites when they capture the images. Therefore, the noise in these images is generally linear in shape and cannot be eliminated with a filter using a single structuring element. Figure 8b shows that after processing with a top-hat morphological filter using a single structuring element, a large amount of noise still remains. To solve this problem, we perform morphological filtering using multiple structuring elements. The results are shown in Figure 8c. We can see that after multi-structuring element morphological filtering, the linear noise that remained in the original image is mostly eliminated. At the same time, the partial information on the moving targets that we want to extract is preserved.

Otsu's Method
Generally, in remote sensing images, far fewer pixels belong to moving ship targets than to the water surface, and the overall grayscale values of the water surface are lower than those of the ship targets. Therefore, Otsu's method is used to extract the background of the moving ship targets, that is, the water surface. To suppress noise interference, we also apply multi-structuring element morphological filtering and binarize the results. Figure 9b shows the result of background extraction, where the highlighted regions are the extracted background. Because of the strong contrast between the water surface and the moving ship targets, the background is completely extracted, which provides a good foundation for the subsequent segmentation step.
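As a hedged illustration of the thresholding step, the sketch below implements the standard Otsu criterion (maximizing the between-class variance of the gray-level histogram) in plain Python. The function name `otsu_threshold`, the 8-bit gray-level range, and the convention that pixels at or below the threshold are treated as water are assumptions for this example, not details taken from the experiments.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    `pixels` is a flat sequence of integer gray values in [0, levels).
    Class 0 holds values <= t, class 1 holds values > t.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(g * n for g, n in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0     # number of pixels in class 0
    sum0 = 0   # sum of gray values in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break             # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def background_mask(pixels, t):
    # Assumption: water is darker than ships, so background = gray <= t.
    return [p <= t for p in pixels]
```

For example, with 90 dark water pixels and 10 bright ship pixels, the threshold falls between the two groups and `background_mask` marks exactly the 90 water pixels.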

Marker-Based Watershed Segmentation
The texture of high-resolution remote sensing images is very rich. In these three images, the ground features on the riverbank and the trails the ships leave behind are very clear, which causes over-segmentation when the traditional watershed algorithm is used and greatly reduces the detection accuracy for the moving ship targets [22], as shown in Figure 10a. Therefore, the marker-based watershed algorithm is used to solve this problem. First, the foreground and background images obtained in the preceding steps are superimposed. Next, using the foreground and background regions as the known local minima to rectify the original images, we carry out marker-based watershed segmentation to obtain the final result, which is shown in Figure 10b. The marker-based watershed algorithm effectively separates the moving ship targets, avoids over-segmentation, and greatly improves the accuracy of segmentation.
Figure 11 shows the detection results overlaid on the original image. The shapes of Ship 1 and Ship 2 are detected relatively completely, and each ship's body is effectively separated from its trail. Because of the special shape of Ship 3, there is a "fault" in the gray values between the rear half and the front half of the ship, where the gray values are very close to those of the water surface. As a result, watershed segmentation cannot detect the rear part of the ship and ultimately separates it from the front half of Ship 3.
Overall, the target detection algorithm proposed in this paper can effectively detect moving ship targets in multi-view image sequences with high accuracy, and the shape information of the moving ship targets is obtained more completely.
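The marker-based idea can be sketched as a priority flood that grows regions only from the given markers, so that textured areas without a marker cannot create extra regions. The code below is a simplified illustration in plain Python, not the segmentation code used for Figure 10b: the function name, the 4-connectivity, the use of a precomputed relief (for example a gradient magnitude image), and the label convention (1 for background, 2 for foreground, 0 for unknown) are assumptions, and no explicit watershed lines are drawn.

```python
import heapq


def marker_watershed(relief, markers):
    """Flood `relief` from labeled seeds in `markers` (0 means unlabeled).

    Pixels are visited in order of increasing relief value; each unlabeled
    pixel takes the label of the first labeled neighbor that reaches it.
    """
    rows, cols = len(relief), len(relief[0])
    labels = [row[:] for row in markers]
    heap = []
    order = 0  # tie-breaker so that equal relief values pop first-in, first-out

    # Seed the queue with every marker pixel.
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != 0:
                heapq.heappush(heap, (relief[r][c], order, r, c))
                order += 1

    while heap:
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]
                heapq.heappush(heap, (relief[nr][nc], order, nr, nc))
                order += 1
    return labels
```

In a toy 5 × 5 relief with a ring of high values around the center, one background marker in a corner and one foreground marker in the center, the outer ring is labeled as background and the center as foreground, while the high-relief pixels are shared between the two regions depending on which flood reaches them first.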

Conclusions
To solve the problems caused by the long baseline and the large optical parallax of agile satellites when they capture a sequence of images, we propose a moving target detection algorithm based on marker-based watershed segmentation, with foreground extraction using an inter-frame difference algorithm and background extraction using Otsu's method. With these methods, we take advantage of the relevance and continuity of the moving ship targets in a multi-view sequence of images and overcome the disadvantages of traditional object detection methods, which are sensitive to changes in background features and have low reliability when extracting target information. As a result, our method improves moving target detection. The method shows good results for the detection of moving ships on the water surface and can rapidly provide information on the positions, shapes and textures of targets, which builds the foundation for the observation and tracking of ships [23], provides timely and effective guidance for decision-making on the ground, and has broad prospects in many applications.

Figure 1. The method of capturing an image sequence by agile satellites.

We successively use linear structuring elements at 0°, 45°, 90° and 135° and with different dimensions to perform opening and closing operations, and finally obtain the results of these morphological filters, which enhances the target and suppresses the background. The structuring elements are shown in Figure 3. Using different scales and different structuring elements, positive and negative noise signals can be suppressed simultaneously [14]. This maximizes the filtering out of signals not related to target information. The formula of the algorithm is as follows:
where i and j index the different structuring elements and the different scales, ω represents the weight, f represents the original image, b is the structuring element, the symbol "∘" represents the opening operation, and "•" represents the closing operation.

Figure 5. Experimental results of histogram matching: (a) Image sequence before histogram matching; (b) Image sequence after histogram matching.

Figure 6. Histograms before and after histogram matching: (a) Before histogram matching; (b) After histogram matching.

Figure 7. Inter-frame difference results: (a) The first difference; (b) The second difference.

Figure 8. Comparison between top-hat filtering and multi-structuring element filtering: (a) Original data; (b) Top-hat filtering result; (c) Multi-structuring element filtering result.

Figure 10. Comparison between traditional and marker-based watershed segmentation: (a) Traditional watershed result; (b) Marker-based watershed result.

Figure 11. The final results: (a) Contour line of Ship 1; (b) Contour line of Ship 2; (c) Contour line of Ship 3.

Table 1. Interval time and base-height ratio of the sequence images.

Table 2. Correlation test before and after histogram matching.
