Detecting Moving Target on Ground Based on Its Shadow by Using VideoSAR

Abstract: Video synthetic aperture radar (VideoSAR) can detect and identify a moving target based on its shadow. A slowly moving target casts a shadow with distinct features, but it cannot be detected by state-of-the-art difference-based algorithms because the variations between adjacent frames are minor. Furthermore, the detection boxes generated by difference-based algorithms often suffer from such defects as misalignment and fracture. In light of these problems, this study proposes a robust moving target detection (MTD) algorithm for objects on the ground that fuses the results of background frame detection with frame differences over multiple intervals. We also discuss the defects that occur in conventional MTD algorithms. Background frame detection was introduced to overcome the shortcomings of difference-based algorithms and acquire the shadow regions of objects. This was fused with the multi-interval frame difference to simultaneously extract moving targets at different velocities while identifying false alarms. The results of experiments on empirically acquired VideoSAR data verified the performance of the proposed algorithm in terms of detecting a moving target on the ground based on its shadow.


Introduction
Synthetic aperture radar (SAR) combines pulse compression and the synthetic aperture to obtain high-resolution images along the range and azimuth directions. It is regarded as an important means of Earth observation and battlefield surveillance [1][2][3][4]. A military target rarely stays motionless; it typically keeps moving to improve its probability of survival. The conventional SAR imaging technique can provide only static observations of a given scene. Combining SAR with the ground moving target indication (GMTI) technique partially enables it to detect moving targets: parameters of the moving targets, such as the along-track velocity [5] and variations in traffic [6], are extracted from moving bright spots after SAR imaging. However, the GMTI/SAR technique incurs a significant computational load and requires a complex implementation procedure [7][8][9]. Researchers at the Sandia National Laboratories have integrated SAR imaging with video-capturing technology to develop the video synthetic aperture radar (VideoSAR) technique [10]. Owing to its high frame rate of ground imaging, this technique can represent dynamic changes in a specific region on the ground at high resolution in real time, and is regarded as an innovation in military surveillance and reconnaissance [11].
The shadow of a moving target can be used as the object of detection in VideoSAR [12][13][14][15]. This approach has the theoretical advantages of high positioning accuracy, a high rate of detection, and a low minimum detectable velocity. In [15], the authors discussed the mechanism of formation of the shadow of a moving target in a SAR image, and proposed a moving target detection (MTD) method as well as a method to assess it; related shadow-based approaches are reported in [16]. A moving target can thus be detected through its shadow in VideoSAR, where a moving shadow can in turn be detected via the difference operation. Difference-based algorithms are widely used to detect changes in videos owing to their simplicity and high efficiency. They include background difference and frame difference algorithms. This section provides a brief introduction to and analysis of difference-based algorithms.
The background difference algorithm builds a background template and compares it with a sequence of images one by one to separate the static background from the moving target [15]. This process is simple in principle and easy to implement, but it is difficult to adapt to a dynamic background; it is therefore frequently used in scenes where the background does not change or does so only slowly. Background modeling is the key step here, and is commonly implemented via mean background modeling, Gaussian background modeling, and kernel density estimation-based background modeling [22].
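As a concrete illustration of the background difference principle, the following sketch builds a mean background template and thresholds the difference between it and the current frame. All names, array sizes, gray levels, and the threshold value are illustrative assumptions, not taken from the paper; plain Python lists stand in for image matrices.

```python
# Hedged sketch of background difference with mean background modeling.
# Frames are grayscale images represented as 2D lists of numbers.

def mean_background(frames):
    """Average a list of frames pixel by pixel to build the background template."""
    rows, cols = len(frames[0]), len(frames[0][0])
    n = len(frames)
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]

def background_difference(frame, background, threshold):
    """Binary change mask: 1 where the frame deviates from the template."""
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

# A static scene of gray level 40; the current frame contains one dark
# shadow pixel (gray level 10) that the difference operation should flag.
bg_frames = [[[40] * 5 for _ in range(5)] for _ in range(3)]
current = [[40] * 5 for _ in range(5)]
current[2][2] = 10  # the moving shadow
template = mean_background(bg_frames)
mask = background_difference(current, template, threshold=15)
```

Only the shadow pixel survives the threshold; the static background cancels out, which is exactly why this family of methods is favored for slowly changing scenes.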
Frame difference is an algorithm that removes the static region from an image and preserves the moving target by subtracting adjacent frames. It incurs a low computational load and is easy to implement. In applications, however, this algorithm exhibits such defects as misaligned detection boxes and sensitivity to the intensity of ground scattering. Moreover, targets with low velocities are difficult to detect because the variations between adjacent frames are minor. Figure 1 shows the basic principle of the background difference and frame difference algorithms. Three frames before the given one are used to form the background template. The grayscales of the background and the target shadow are assumed to be homogeneous and constant to simplify the derivation. C_t and C_b in the figure represent the grayscales of the target shadow and the background, respectively. It is clear that both the background difference algorithm and the frame difference algorithm can only partially extract the moving region. The position of the moving target obtained via background difference has a significant offset (see the center of the white rectangle in Figure 1a), and the detection box obtained using frame difference is fractured. Such problems affect the accuracy of shadow detection and are thus important to solve.

(Figure 1. Principle of the background difference and frame difference algorithms; C_b denotes the background gray value and C_t the target shadow gray value.)

Fusing Background Frame Detection and Multi-Interval Frame Difference for MTD
To address the deficiencies of difference-based algorithms in MTD by using VideoSAR, this section combines background frame detection and multi-interval frame difference. By accumulating the results of multi-interval frame difference, the proposed algorithm can drastically improve the detection of a slowly moving target, and background frame detection can yield more accurate positions of the moving target. The main procedure of the proposed algorithm is discussed in this section.

Background Frame Detection
The shadow of a moving target appears as a low-grayscale area in the SAR image, and its grayscale range is related to the total time for which the shadow of the ground object is observed. The intensity of the shadow region of the moving target can be written as follows (a detailed derivation is provided in [21]): where k is the Boltzmann constant, F_n is the noise figure of the receiver, T_0 is the effective noise temperature of the receiver, B_n is the spectral density of the effective noise, SNR_area is the signal-to-noise ratio of the area target, and q is a scale factor related to the duration for which the shadow appears. It is given by: where L_a is the azimuthal length of the target, ρ_a is its azimuthal resolution, v_t represents the velocity of the radar platform, r is the nearest range between the radar and the target, and λ_c is the carrier wavelength. Considering Equations (1) and (2), it is clear that the intensity of the shadow of the moving target is closely related to the velocity of the target. Only a target moving within a certain range of velocities forms a macroscopic shadow. When the velocity of the target exceeds a certain value (q decreases to close to zero), the shadow of the moving target blends into the background instead of appearing as a distinct feature. Thus, the boundary value of the grayscale range of the shadow of the moving target is: Assuming that the given frame is I_c(x, y), where (x, y) represents the coordinates of an arbitrary pixel, and that I_se(x, y) is the binary image obtained via shadow detection processing, background frame detection can be expressed as: This operation extracts all regions whose gray values are similar to those of the shadow of the moving target, including both targets moving at different velocities and the static, low-RCS background. The process thus yields many false alarms while accurately extracting the moving target, where the number of false alarms is much higher than the number of instances of the real target.
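The exact form of Equation (4) is not reproduced in this text, so the sketch below is an assumption: it reads background frame detection as a double-threshold test that keeps pixels of the current frame I_c whose gray value falls in the shadow range (C_min, C_max), which is consistent with the experimental setting (C_min, C_max) = (30, 50) reported later.

```python
# Hedged sketch of background frame detection: the binary map I_se keeps
# pixels of the current frame I_c whose gray value lies in the assumed
# shadow range [c_min, c_max]. The double-threshold form is inferred from
# the experimental parameters, not copied from the paper's equation.

def background_frame_detection(I_c, c_min, c_max):
    return [[1 if c_min <= px <= c_max else 0 for px in row] for row in I_c]

# A bright scene with one shadow-range pixel (35) and one intermediate
# pixel (80) that should be rejected. Gray values are illustrative.
I_c = [
    [120, 118, 119],
    [121,  35, 117],   # 35 falls in the shadow gray range -> detected
    [122,  80, 116],   # 80 is outside the range -> background
]
I_se = background_frame_detection(I_c, c_min=30, c_max=50)
```

Note that any static low-RCS region (e.g., a road surface) would also pass this test, which is the source of the false alarms discussed above.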

Multi-Interval Frame Difference
Conventional frame difference involves subtracting the frame adjacent to the given one from it to highlight moving objects. It has the advantages of a low computational load and a simple implementation, but has an inherent defect when detecting a slowly moving target. A modified method called symmetry difference has been proposed to solve this problem; it fuses the results of the differences of three frames (the frames in front of and behind I_c) [23]. An efficient algorithm called multi-interval frame difference is used in this paper, and the details of its implementation are as follows. Assume that the N frames in front of and behind I_c are used and stacked into a 3D matrix F. For each frame F_i in F, we calculate the difference between it and I_c, and add all these results to form I_temp. Then, I_temp is compared with a preset threshold T_S in a pixel-by-pixel manner and converted into a binary image I_md (logical(·) represents a logical operation), which can be written as Equation (5). The interval used in the frame difference affects detection performance. Assume that v_t is the velocity of the target, f_rate is the frame rate, and L_t is the target's projected length along its velocity. The best moment to calculate the frame difference is when the target has moved exactly by the distance L_t, so the ideal interval N_ideal must satisfy N_ideal = f_rate · L_t / v_t. A slowly moving target cannot be extracted via conventional frame difference because the changes between adjacent frames are minor. Hence, it is necessary to calculate the differences between frames over a wider interval. The value of N is related to the target velocities in the observed scene. Normally, we need to ensure that the distance moved by the slowest target during (N − 1)/2 frames is longer than half of its own length, that is, V_min · (N − 1)/(2 f_rate) > L_t/2, where V_min denotes the minimum velocity of the target. It is then easy to deduce the range of values of N: N > f_rate · L_t / V_min + 1, N ∈ Z, where Z contains all integers.
The minimum value of N required for the detection of slowly moving targets is determined according to Equation (9), and the N frame-difference results are accumulated according to Equation (5). By accumulating the results of the multi-interval frame difference, we can simultaneously detect targets moving at different velocities. However, owing to the characteristics of difference-based algorithms, the detection box and the real position of the shadow of the target are misaligned, which leads to erroneous results.
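The accumulation step and the choice of N can be sketched as follows. The 1-D "frames", the threshold T_s, and the rounding of N up to an odd value (so that the window around I_c stays symmetric) are illustrative assumptions; the paper's Equations (5) and (9) are not reproduced verbatim here.

```python
# Hedged sketch of multi-interval frame difference: absolute differences
# between the current frame I_c and the N-frame window around it are
# accumulated, then the accumulated image is thresholded into a binary map.
import math

def multi_interval_frame_difference(frames, center, N, T_s):
    I_c = frames[center]
    half = N // 2
    acc = [0.0] * len(I_c)
    for k in range(center - half, center + half + 1):
        if k == center:
            continue  # skip the current frame itself
        acc = [a + abs(p - q) for a, p, q in zip(acc, frames[k], I_c)]
    # logical(.) step: pixel-wise comparison with the preset threshold T_s.
    return [1 if a > T_s else 0 for a in acc]

def min_interval(f_rate, L_t, V_min):
    """Smallest N with V_min*(N-1)/(2*f_rate) > L_t/2, rounded up to odd."""
    n = math.floor(f_rate * L_t / V_min + 1) + 1
    return n if n % 2 == 1 else n + 1

# A dark shadow (gray 10) crawling one pixel per frame across a gray-40
# background; 1-D rows stand in for image frames for brevity.
frames = []
for t in range(7):
    f = [40] * 10
    f[t] = 10
    frames.append(f)

I_md = multi_interval_frame_difference(frames, center=3, N=7, T_s=50)
```

With these toy numbers the accumulated difference at the shadow's current position is 6 × 30 = 180, well above T_s, while any single adjacent-frame difference would contribute only 30, below it; this is the mechanism by which accumulation recovers slowly moving targets.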

Fusion and Threshold Determination
Background frame detection can be used to obtain accurate positions of the moving target and intact regions of coverage, but it also leads to a large number of false alarms in a low-RCS background. Multi-interval frame difference can simultaneously detect targets at different velocities but suffers from problems including the misalignment and fracture of the detection boxes. Therefore, this paper proposes an MTD algorithm that fuses these two algorithms.
The area of overlap of the shadow, I_lap, is obtained via the logical AND operation applied to I_se and I_md. I_lap is then added to the result of background frame detection, I_se, to extract the weighted shadow image: pixels that are detected by both algorithms are assigned the value "2", and other detected pixels are assigned "1".
Finally, we extract the region of interest (ROI) from I_fusion. The ROI of a moving target is normally detected by both algorithms, and thus contains many pixels with the value 2. The ROI of a low-RCS, static background is detected only by background frame detection, and thus contains pixels with the value 1. In practice, a threshold T_ROI can be set to distinguish between the real target and a false alarm: a candidate ROI is retained as a real target only if S_weight/S_unweight > T_ROI, where S_weight = sum(ROI(k)) is the weighted area of the k-th ROI and S_unweight = numel(ROI(k)) is its unweighted area. This is called threshold determination. It is easy to see that T_ROI takes values in the interval [1, 2]. In general, T_ROI is set between 1.3 and 1.5 to obtain a satisfactory probability of detection of moving targets at all speeds while suppressing false alarms as much as possible.
Figure 2 shows the effect of the proposed algorithm. It is clear that single-frame shadow extraction yields an accurate and intact shadow region but leads to many false alarms. Multi-interval frame difference can be used to acquire information on the moving region but yields misalignment and fractures in the detection box. After fusion, we can obtain an accurate position of the target with a low false alarm rate.
A flowchart of the proposed VideoSAR-based MTD algorithm is shown in Figure 3. The VideoSAR preprocessing procedure is first applied to the input VideoSAR image sequence, and includes the inter-frame registration algorithm and the speckle suppression algorithm. After registration and noise suppression, one obtains an image sequence with clear targets and a smooth background. Following this, the background frame detection step (shown in the green box) and the multi-interval frame difference step (shown in the orange box) are applied to the results of preprocessing, respectively. Finally, the fusion and threshold determination step (shown in the blue box) is used to increase the accuracy of detection of the shadow of a moving target on the ground and to reduce false alarms.
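The fusion and threshold determination steps of this section can be sketched as follows, under the assumption that the omitted equations implement the logical AND, the weighted sum, and the S_weight/S_unweight > T_ROI test exactly as the prose describes; the names and the toy binary masks are illustrative.

```python
# Hedged sketch of fusion and threshold determination. I_lap is the logical
# AND of the two binary detection maps; adding it to I_se gives a weighted
# image in which pixels confirmed by both algorithms carry the value 2 and
# background-frame-only pixels carry the value 1.

def fuse(I_se, I_md):
    I_lap = [[a & b for a, b in zip(r1, r2)] for r1, r2 in zip(I_se, I_md)]
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(I_se, I_lap)]

def is_real_target(roi_pixels, T_roi):
    """Keep an ROI if its weighted-to-unweighted area ratio exceeds T_roi."""
    s_weight = sum(roi_pixels)      # weighted area of the ROI
    s_unweight = len(roi_pixels)    # unweighted (pixel-count) area
    return s_weight / s_unweight > T_roi

I_se = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 1]]   # background frame detection (includes a false alarm)
I_md = [[1, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]   # multi-interval frame difference
I_fusion = fuse(I_se, I_md)

moving_roi = [2, 2, 2, 1]   # mostly confirmed by both algorithms
false_alarm = [1, 1, 1, 1]  # static low-RCS region: one algorithm only
```

With T_ROI = 1.3, the moving ROI (ratio 1.75) is kept while the static region (ratio 1.0) is rejected, matching the role the threshold plays in the text.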

Experimental Results and Discussion
Empirical VideoSAR data were used to verify the performance of the proposed method. The data were captured at Kirtland Air Force Base and published by Sandia National Laboratories (SNL) [10]. The video contained 900 frames captured at a rate of 29.97 Hz. The height and width of each frame after preprocessing were 720 pixels and 650 pixels, respectively. The methods proposed in [24] were used to register the image sequence and suppress speckle noise. The method to reduce the number of false alarms proposed in [21] was applied after fusion.

Results of Shadow Marking
To quantitatively analyze the results of shadow detection, the shadows of the moving targets in the SNL VideoSAR data were manually marked frame by frame. A total of 50 moving targets (vehicles) were marked in the 900 frames of the video, including 33 vehicles on the left road and 17 on the right road. The results of the marking of the moving vehicles in the SNL VideoSAR data are shown in Figure 4: the number of moving vehicles per frame, from three to 11, is plotted in Figure 4a, and the motion of the 50 vehicles into and out of the frame is shown in Figure 4b. Owing to limitations of space, only the results for the 23rd and the 551st frames are shown here. Figure 5 shows the results of the detection of the shadows of moving targets on the ground after VideoSAR registration preprocessing. The shadows of six moving cars are marked in the 23rd frame in Figure 5a and in the 551st frame in Figure 5b.

Results of Shadow Detection
The processing parameters were as follows: the grayscale range used for background frame detection, (C_min, C_max), was (30, 50); T_ROI was set to 1.3; and N was set to 7 based on the analysis in Section 3. The area of each ROI was limited to the range (80, 500). The morphological opening operation used a 3 × 3 circular structural element, and the closing operation used a 5 × 5 circular structural element. Adaptive histogram equalization was used to improve visibility (the original video was extremely dark, and objects in it were difficult to recognize). Figures 6 and 7 show that the background frame detection algorithm could extract the regions that satisfied the grayscale range-related criteria of the moving target, while generating a large number of false alarms (see Figures 6b and 7b). The multi-interval frame difference algorithm could detect the regions of variation (see Figures 6c and 7c). A comparison of the results of these two algorithms yielded a precise distinction between false alarms and the shadows of the real targets (note the grayscale of each pixel; a false alarm can meet the grayscale criteria without any variation), as shown in Figures 6d and 7d.

The probability of detection P_d and the false alarm rate F_ar were used to evaluate the detection performance of the proposed method. They are defined as P_d = N_c/N_r and F_ar = N_f/N_all, where N_c denotes the number of correct instances of detection, N_r denotes the number of real moving targets, N_f represents the number of false alarms, and N_all represents the number of all detected targets. Figure 8 shows the indices of the probability of detection and the false alarm rate. P_d and F_ar had suitable values of 77.65% and 11.21%, respectively. Table 1 compares the detection-related performance of the background frame detection algorithm, the multi-interval frame difference algorithm, and the proposed algorithm. It is clear that the background frame detection algorithm recorded the highest rate of detection and the highest false alarm rate. The multi-interval frame difference algorithm yielded the lowest false alarm rate but also a low rate of detection. The proposed algorithm, by combining the advantages of these two algorithms, yielded good results on both indices.
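The two evaluation indices follow directly from the definitions above; the sketch below computes them on illustrative counts (these are not the paper's data).

```python
# Hedged sketch of the evaluation indices: detection probability
# P_d = N_c / N_r and false alarm rate F_ar = N_f / N_all, as defined
# in the text. The counts used here are illustrative only.

def detection_probability(n_correct, n_real):
    return n_correct / n_real

def false_alarm_rate(n_false, n_all_detected):
    return n_false / n_all_detected

# Example: 40 correct detections out of 50 real targets, with 5 false
# alarms among 50 detections in total.
p_d = detection_probability(40, 50)    # 0.8, i.e., 80%
f_ar = false_alarm_rate(5, 50)         # 0.1, i.e., 10%
```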

Comparison and Discussion
To visually show the effects of the background frame detection algorithm and the multi-interval frame difference algorithm, part of the target region is illustrated in Figure 9. The background frame detection algorithm could not deal with false alarms in a low-RCS background, whereas the multi-interval frame difference algorithm could not handle problems such as the misalignment and fracture of the detection box. The fusion algorithm, however, obtained satisfactory results in terms of both measures.
To verify the advantage of the multi-interval frame difference algorithm, it was compared with the conventional frame difference method (the adjacent frame difference mentioned in Section 2). Three targets were marked in descending order of velocity from 1 to 3. The multi-interval frame difference algorithm outperformed the conventional method, as shown in Figure 10. In Figure 10a, only small areas of targets 1 and 2 are detected, and these deviate from the targets' real positions. By contrast, the result of detection in Figure 10b is superior.

Algorithm Time Complexity Analysis
In this subsection, we theoretically analyze the computational burden of the proposed algorithm by counting the floating-point operations (FLOPs) of its main steps. Assume that the size of a frame is P × Q pixels and that the image contains real-valued data. The computational cost of an addition, subtraction, multiplication, or logical operation over a frame is PQ FLOPs. The computational complexity of Equations (4)-(6) and (10)-(11) can then be expressed as: C_p = PQ + PQ · 3N + PQ + PQ + PQ + PQ = (3N + 5)PQ. (15) Since N is much smaller than P and Q, the computational complexity is O(PQ). In addition, given that the frame rate of VideoSAR is f_rate, the computational burden per second of video is: C_p1 = f_rate · C_p = f_rate(3N + 5)PQ. (16) In this paper, f_rate is 29.97 Hz, P is 720 pixels, Q is 650 pixels, and N was set to 7; the computational load of the VideoSAR detection part is thus about 3.6 × 10^8 FLOPs per second.
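The estimate above can be checked numerically; the following sketch reproduces Equations (15) and (16) with the parameter values stated in the text.

```python
# Reproducing the complexity estimate: C_p = (3N + 5)PQ FLOPs per frame
# (Eq. (15)) and C_p1 = f_rate * C_p FLOPs per second (Eq. (16)), using
# the paper's values f_rate = 29.97 Hz, P = 720, Q = 650, N = 7.

def flops_per_frame(P, Q, N):
    return (3 * N + 5) * P * Q

def flops_per_second(f_rate, P, Q, N):
    return f_rate * flops_per_frame(P, Q, N)

C_p1 = flops_per_second(29.97, 720, 650, 7)
print(f"{C_p1:.3e} FLOPs/s")  # prints 3.647e+08 FLOPs/s
```

The result, about 3.65 × 10^8 FLOPs per second, agrees with the figure of 3.6 × 10^8 quoted in the text.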
The proposed algorithm was implemented in the C language and tested on a platform with an Intel i7-7700K CPU with a clock frequency of 4.2 GHz and 32 GB of RAM. The running time of the detection part of the algorithm was 24.81 s, which is less than the 30 s acquisition time of the whole video. The algorithm thus has high computational efficiency and meets the needs of practical applications.

Conclusions
In light of the problems of the misalignment of the detection box and the mismatch between the difference intervals and the target velocities, this paper combined the background frame detection algorithm and the multi-interval frame difference algorithm to improve performance in terms of MTD, especially in the case of a slowly moving target. To solve the problem whereby the conventional frame difference algorithm cannot detect targets at different velocities, the proposed algorithm accumulates the results of the differences among multiple frames to improve the MTD of a slowly moving target. To solve the problem whereby the detection boxes obtained via difference-based algorithms are misaligned and fractured, the proposed algorithm introduces background frame detection to obtain accurate positions of the shadow, and uses the result of the multi-interval frame difference to reduce false alarms while preserving the correct target. The results of experiments on empirically acquired VideoSAR data verified the effectiveness of the proposed algorithm. The algorithm can also be applied to the detection of point-like moving targets in data from optical video satellites (such as the Jilin-1 video satellite).
Author Contributions: All the authors made significant contributions to the work. Z.H. and Z.L. designed the research and analyzed the results. X.C. and T.Y. performed the experiments. Z.H. and Z.L. wrote the paper. A.Y. and Z.D. provided suggestions for the preparation and revision of the paper. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.