
Sensors 2019, 19(13), 2912;

An Efficient Extended Targets Detection Framework Based on Sampling and Spatio-Temporal Detection
by Bo Yan 1, Na Xu 2, Wenbo Zhao 1, Muqing Li 1 and Luping Xu 1,*
1 School of Aerospace Science and Technology, XIDIAN University, 266 Xinglong Section of Xifeng Road, Xi’an 710126, China
2 School of Life Sciences and Technology, XIDIAN University, 266 Xinglong Section of Xifeng Road, Xi’an 710126, China
* Author to whom correspondence should be addressed.
Received: 16 April 2019 / Accepted: 21 June 2019 / Published: 1 July 2019


Excellent performance, real-time processing and low memory requirement are three vital requirements for target detection in high-resolution marine radar systems. Unfortunately, many current state-of-the-art methods achieve only excellent detection performance when coping with highly complex scenes. In fact, a common problem is that real-time processing, low memory requirement and remarkable detection ability are difficult to reconcile. To address this issue, we propose a novel detection framework based on sampling and spatiotemporal detection. The framework consists of two stages, coarse detection and fine detection. Sampling-based coarse detection guarantees real-time processing and low memory requirements by locating in advance the areas where targets may exist. Unlike former detection methods, multi-scan video data are used. In the fine detection stage, the candidate areas are grouped into three categories: single target, dense targets and sea clutter. A different approach is applied to each category to achieve excellent performance. The superiority of the proposed framework over state-of-the-art baselines is substantiated in this work. The low memory requirement of the proposed framework was verified by theoretical analysis. The real-time processing capability was verified with the video data of two real scenarios. Synthetic data were tested to show the improvement in tracking performance obtained with the proposed detection framework.
marine radar system; target detection; extended target; clutter suppression

1. Introduction

Moving target detection plays a pivotal role in marine radar systems; it aims to detect moving objects completely and accurately from video data. Because of non-Gaussian sea clutter and complex backgrounds, using sequential radar images to extract targets of interest such as vessels and low-flying aircraft is a challenging task. The moving target detection problem involves two main issues, target detection and target tracking. Target detection finds the candidate positions of targets from the video data produced by the radar front-end. Target tracking associates these positions into the trajectories of the targets. With the increased resolution of modern radar, a target occupies several resolution cells rather than a single one. A high-resolution radar therefore receives more than one point per time step from different corner reflectors of a single target, and the target can no longer be treated as a point. Consequently, extended target detection and tracking has recently become a hot research topic. The aim of this work is to develop a novel target detection framework that improves extended target tracking performance by providing more accurate target points and fewer false alarm points.
Various algorithms have been developed for multiple extended target tracking (METT). ET-PHD-based algorithms [1,2,3,4,5] are capable of estimating the target extent and measurement rates as well as the kinematic state of the target. For weak extended targets, the track-before-detect (TBD) methodology, which makes full use of multiple scans, has been employed to develop Hough transformation-based TBD [6], dynamic programming-based TBD [7], Grey Wolf optimization-based (GWO) TBD [8], and particle filter-based methods [9]. The existing methods [1,2,3,4,5,6,7,8,9] have achieved excellent tracking performance, and it is hard to improve tracking performance significantly further by refining the tracking algorithms. Therefore, providing more accurate target points and fewer false alarm points through an improved extended target detection method is a promising way to improve the performance of radar systems. The PHD-based filters [1,2,3,4,5] are fed with the points provided by the original ordered-statistic constant false alarm rate (OS-CFAR) detector [10]. Following the work in [10], several improved CFAR detectors [11,12,13] have been developed. Wang C. et al. [11] present an intensity-space (IS) domain CFAR ship detector. In [12], clutter modeling is identified as a viable solution to suppress false detections. Gao G. et al. [13] develop a statistical model of the filter in nonhomogeneous sea clutter to achieve CFAR detection. However, due to the presence of non-intentional interference (sea clutter, thermal clutter and ground clutter) and background echoes (mountains, shores, buildings, islands and motor vehicles), many false alarms remain. The methods in [10,11,12,13] are insufficient for suppressing the fixed clutter. To address this problem, clutter map-based CFAR (CM-CFAR) methods [14,15], which take full advantage of multiple scans, have been proposed. Conte et al. [14] develop a CM-CFAR relying on a combination of space and time processing. 
In [15], the background noise/clutter power is estimated by processing the returns in the map cell from all scans up to the current one. Using temporal and spatial information simultaneously is another powerful technique that benefits from multiple scans for clutter suppression [16,17,18]. The key step of these methods is identifying the background priors of a video [16]. The spatial saliency map and the temporal saliency map are calculated in [17], and a spatiotemporal local contrast filter is developed in [18]. However, drawbacks still exist. The number of pixels in radar video increases dramatically as resolution and coverage range increase, so the detection methods [10,11,12,13,14,15,16,17,18] require far more computation and memory. Meanwhile, one frame of video must be processed within a radar scanning cycle using limited memory. The methods in [10,11,12,13,14,15,16,17,18] cannot process the video in real time, and the memory requirements of the methods in [14,15,16,17,18] are enormous. These shortcomings drastically limit the use of the methods in [10,11,12,13,14,15,16,17,18] in engineering practice.
A series of algorithms has been developed in our former works [19,20,21,22] to address excellent performance, real-time processing, and low memory requirement simultaneously. A contour tracking algorithm is used in [19] to meet the real-time processing requirement. The detection approach in [20], which uses a region growing algorithm, was developed to improve location precision. The methods in [21,22] are designed to detect targets in dense-target scenarios such as fleets and targets in lanes. In this work, an efficient spatiotemporal detection method based on sampling is designed to suppress the fixed clutter, and this method and the methods in [20,21,22] are integrated into the novel detection framework. The methods in [19,20,21] detect targets using only the current frame. Here, both the current frame and several past frames are used to estimate the intensity of the clutter, so that, compared with the former work [19,20,21], more video data are used to improve the detection performance. Our former works [19,20,21,22] are thus components of the proposed framework. However, we do not simply piece these components together; the framework is designed so that each component works well with the others. Past frames have not been used in extended target detection before, and the detection methods in [19,20,21,22] were not combined with target tracking methods. Their performance can be improved by fine detection, while computation and memory requirements can be reduced by coarse detection. In the first stage, a sampled map evaluating the clutter intensity of the surveillance area is built to suppress the fixed clutter. Unlike spatiotemporal-based filters [18], little memory is required because of the sampling on the range, azimuth and time axes. The coarse detection is designed to roughly locate in advance the areas where targets may exist by uniformly selecting seeds over the whole surveillance area. 
Only the selected seeds are used, which guarantees real-time processing and a low memory requirement. In the fine detection stage, only the areas where targets may exist are processed. The candidate areas are classified into three categories, namely single target, dense targets and sea clutter, by the contours of the areas [21]. The areas of dense targets are further separated into subareas using the Rain algorithm [22]. Each subarea is regarded as an individual target. Excellent performance can be achieved by the fine detection. As presented in Figure 1, the input of the target detection is the radar video sequence. The results of target detection are three-dimensional points, i.e., two-dimensional positions and their measuring times. The measuring time in target tracking algorithms can simply be represented by the frame number (see, e.g., [1,2,3,4]). The detection framework provides correct points that further improve the final tracking performance. Figure 1 describes the relationship between the radar data processing and the existing methods mentioned above.
The remainder of the work is organized as follows. Section 2 defines the models and problems. Section 3 presents the implementation of the sampling-based spatiotemporal detection method. In this section, the proposed detection framework is also presented. The superiority of the proposed framework beyond state-of-the-art baselines is substantiated in Section 4 using real high-resolution marine radar data as well as synthetic data. Section 5 draws conclusions.

2. Models and Notations

2.1. Target Model

Assume that the extended targets are randomly distributed on an x–y plane. We use Mk to denote the number of targets at the kth scan. The size and quantity of targets are unknown. A general approach based on support functions to model smooth object shapes, presented in [23], is used here. The state of an individual extended target can be modeled in the state space Rs, s = 6. The state of the mth target is Stm = (xm, ym, lm, wm, αm, pm), 1 ≤ m ≤ Mk. xm and ym denote the centroid of the mth target on the x- and y-axes, respectively. lm and wm are the lengths of the major axis and the minor axis. αm is the angle between the major axis and the line of sight. pm denotes the intensity present in a single pulse return. The comparison between reflection models and real data [20] suggests that the Swerling type 1 model is more appropriate for expressing the magnitude of the target. The magnitude yt of a pulse return follows the Rayleigh distribution:
$$f_t(y_t) = \frac{2 y_t}{p} \exp\left(-\frac{y_t^2}{p}\right), \quad y_t > 0 \tag{1}$$
where yt means the intensity of the echo originated from a target. The intensity of a target in an azimuth bin M(r,a) is presented in Figure 2.
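As a minimal sketch of the Rayleigh target model, the snippet below draws pulse magnitudes whose density is (2y/p)·exp(−y²/p) by inverting the cumulative distribution, and checks that the mean square of the samples is close to p. The function name `sample_target_magnitude`, the value of `p` and the seed are illustrative assumptions, not taken from the paper.

```python
import math
import random


def sample_target_magnitude(p, rng):
    """Draw one pulse magnitude y_t with density (2*y/p) * exp(-y**2 / p), y > 0."""
    # Inverse-CDF sampling: F(y) = 1 - exp(-y**2 / p)  =>  y = sqrt(-p * ln(1 - u)).
    u = rng.random()  # uniform in [0, 1)
    return math.sqrt(-p * math.log(1.0 - u))


rng = random.Random(0)  # fixed seed, used only for repeatability
p = 4.0                 # hypothetical intensity of a single pulse return
samples = [sample_target_magnitude(p, rng) for _ in range(100_000)]

# For this density, the expected value of y_t**2 equals p.
mean_power = sum(y * y for y in samples) / len(samples)
print(f"empirical mean of y^2: {mean_power:.3f} (p = {p})")
```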

2.2. Noise Model

The clutter consists of two parts, sea clutter and measurement noise. The mean and variance of sea clutter are closely related to the sea state. The sea clutter distribution model of major theoretical and practical interest is the so-called K-distributed clutter model [24,25,26], whose probability density function (PDF) is given by Equation (2).
$$f_c(p_c) = \frac{2b}{\Gamma(v+1)} \left(\frac{b p_c}{2}\right)^{v+1} K_v(b p_c), \tag{2}$$
where pc denotes the power of the sea clutter in this cell, Γ(·) is the gamma function, v is referred to as the shape parameter, b is a scale parameter, and Kv(u) denotes the modified Bessel function of the second kind of order v. Measurement noise is a zero-mean, white and uncorrelated Gaussian noise sequence.
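One common way to generate K-distributed samples is the compound (texture times speckle) representation. The sketch below assumes that form: a gamma-distributed texture with shape v + 1 and scale 4/b², followed by a Rayleigh draw whose mean square equals the texture. Integrating out the texture gives a density of the form (2b/Γ(v+1))·(bx/2)^(v+1)·K_v(bx). Whether the authors generated clutter this way is not stated; the function name, parameter values and seed are hypothetical.

```python
import math
import random


def sample_k_clutter(v, b, rng):
    """Draw one K-distributed sample using a gamma texture and Rayleigh speckle."""
    # Texture (assumption): gamma with shape v + 1 and scale 4 / b**2.
    tau = rng.gammavariate(v + 1.0, 4.0 / b ** 2)
    # Speckle: Rayleigh magnitude whose mean square equals the texture tau.
    return math.sqrt(-tau * math.log(1.0 - rng.random()))


rng = random.Random(1)  # fixed seed, used only for repeatability
v, b = 2.0, 1.0         # hypothetical shape and scale parameters
samples = [sample_k_clutter(v, b, rng) for _ in range(200_000)]

# The mean square of the samples should approach the mean texture (v + 1) * 4 / b**2.
mean_square = sum(x * x for x in samples) / len(samples)
expected_mean_square = (v + 1.0) * 4.0 / b ** 2
print(f"empirical: {mean_square:.2f}, expected: {expected_mean_square:.2f}")
```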

2.3. Measurement Model

The surveillance area is divided into NA × NR grid cells in polar coordinates, where NA and NR are the numbers of cells on the azimuth and range axes, respectively. Each cell corresponds to a pixel in the radar video. Figure 2 shows that the video data can be modeled by Equation (3) in polar coordinates. Parameters r and a denote the location on the range and azimuth axes. Z(r,a) is the amplitude in the range–azimuth resolution cell (r,a).
$$Z(r,a) = N(r,a) + \int_{-\pi}^{\pi} \omega(\theta) \left( C(r, a + d\theta) + M(r, a + d\theta) \right) d\theta, \tag{3}$$
where ω(θ) in Figure 2 is the antenna pattern function. The non-noise measurement of cell Z(r, a) is related to the RCS of target M(r, a) and the clutter C(r, a). The distribution of M(r, a) is related to the shape and material of the target in cell (r, a). N(r, a) denotes the additive measurement noise.
In Figure 2, an aircraft is illuminated by radar beams. The measurement model of marine radar [20,21] implies that, once parts of the aircraft are illuminated by the beam, the radar echoes in this azimuth bin are affected by the aircraft. The target is illuminated by the main lobe of the beam when the direction of the beam equals φ. The range of φ, φ1 ≤ φ ≤ φ2, can be estimated by Equation (4). The extent of the area in which the echo is affected by the extended target on the azimuth and range axes is denoted by A and R, respectively.
$$A = \varphi_2 - \varphi_1 = 2\psi + \theta_0 \le 2 \arccos\left( \frac{(x^k)^2 + (y^k)^2 - x^k l \cos\alpha^k - y^k l \sin\alpha^k}{\sqrt{(x^k)^2 + (y^k)^2}\,\sqrt{(x^k - l\cos\alpha^k)^2 + (y^k - l\sin\alpha^k)^2}} \right) + \theta_0 \tag{4}$$
The proof of Equation (4) is presented in the Appendix of [7]. θ0 denotes the 3 dB azimuth beam width of the radar. The bounds on A and R are given by
$$\begin{cases} \theta_0 \le A \le 2 \arctan\left( \dfrac{l_{\max}}{\sqrt{(x^k)^2 + (y^k)^2}} \right) + \theta_0 \\ 0 \le R \le l_{\max} \end{cases} \tag{5}$$
lmax and lmin are the upper and lower limits of l. Equation (4) implies that, because of the beam width, the image of the target is larger than its real size. The numbers of azimuth bins and range bins whose amplitudes might be affected by the object, am and rm, can be estimated using Equation (6).
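$$\begin{cases} \left\lceil \left( 2 \arctan\left( \dfrac{l_{\min}}{\sqrt{(x^k)^2 + (y^k)^2}} \right) + \theta_0 \right) \times \dfrac{N_A}{2\pi} \right\rceil \le a_m \le \left\lceil \left( 2 \arctan\left( \dfrac{l_{\max}}{\sqrt{(x^k)^2 + (y^k)^2}} \right) + \theta_0 \right) \times \dfrac{N_A}{2\pi} \right\rceil \\ \left\lceil \dfrac{l_{\min} \times N_R}{C_R} \right\rceil \le r_m \le \left\lceil \dfrac{l_{\max} \times N_R}{C_R} \right\rceil \end{cases} \tag{6}$$
where ⌈·⌉ denotes rounding up a value and CR denotes the coverage range of the radar. The measurements obtained from the radar front-end are a time series of images, each of which has NA × NR pixels.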
$$Z_k = \{ Z(r,a,t) \mid 0 \le r \le N_R;\ 1 \le a \le N_A;\ 1 \le t \le K \}, \tag{7}$$
where K is the quantity of images.
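The bin-count estimate in Equation (6) can be sketched as a small helper that, for one axis length, returns the number of azimuth and range bins affected by a target. All numeric values in the usage lines (beam width, grid size, coverage range) are hypothetical placeholders rather than the radar parameters of Table 1, and the function name `affected_bins` is an illustrative choice.

```python
import math


def affected_bins(length, x, y, theta0, n_a, n_r, coverage):
    """Return (azimuth_bins, range_bins) affected by a target whose axis is `length`."""
    rho = math.hypot(x, y)                         # distance from the radar to the centroid
    span = 2.0 * math.atan(length / rho) + theta0  # azimuth extent in radians, beam width included
    azimuth_bins = math.ceil(span * n_a / (2.0 * math.pi))
    range_bins = math.ceil(length * n_r / coverage)
    return azimuth_bins, range_bins


# Hypothetical example: a 60 m target about 12 km away, 1 degree beam width,
# 4096 azimuth bins, 8192 range bins and a 24 km coverage range.
bins = affected_bins(60.0, 12225.0, 370.0, math.radians(1.0), 4096, 8192, 24000.0)
print(bins)
```

Calling the helper once with lmin and once with lmax gives the lower and upper bounds on am and rm.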

2.4. Problem Statement

The aim of target detection is to extract the target states Stm from the video Zk. The volume of video data in a high-resolution radar system is enormous. The parameters of the high-resolution marine radar used in this work are presented in Table 1.
Target detection must be completed within one radar scanning cycle (10 s). The images of the radar in two different scenarios are presented in Figure 3. Putting detection performance aside, CFAR-based methods [10,11,12,13] and spatiotemporal-based methods [16,17,18] take more than 200 s to complete the detection in the two real scenarios presented in Figure 3a,b. The video data for the two scenarios are also quite different, mainly because of the locations of the two radars. Scenarios 1 and 2 correspond to Radars 1 and 2 in Figure 3c. Radar 1 is located on the hillside of a peninsula. Most of the area near Radar 1 is sea or forest, whose echo intensity is far lower than that of objects in urban areas. In addition, the beams of Radar 1 are obscured by the peak in some azimuth bins. For these two reasons, relatively few clutter regions emerge. Radar 2, however, is located at the peak of a mountain facing the sea. Therefore, two urban regions around Radar 2 are illuminated by the beams and more clutter regions emerge. The video data of the two scenarios were processed by the existing detection methods. However, the methods in [8,9,10,11,12,15,16,17] are far from meeting the real-time processing requirement.
Meanwhile, an image of the whole surveillance area is about 200 Mb. It is impossible to store the past images directly, as the methods in [16,17,18] require, so these methods are difficult to run on the current hardware (MPC8640D PowerPC). The efficient methods in our former work [19,20,21] were developed to meet the requirements of real-time processing and low memory. However, we found that the methods in [19,20,21] are insufficient to cope with complex environments. Building on these methods [16,17,18,19,20,21,22], we propose a novel detection framework that promises to meet the three requirements simultaneously.

3. Proposed Methods

3.1. Sampling-Based Spatiotemporal Thresholding Method

Some clutter regions originate from huge fixed objects such as buildings and islands. Areas with high sea states are also responsible for clutter regions. Clutter regions are much larger than a resolution cell. Meanwhile, as can be seen directly in Figure 3a, the measurements are spatially correlated. A sampling-based spatiotemporal thresholding algorithm that exploits this spatial context is proposed. The input of the method is K successive images, each of which has NR × NA pixels. The result is a sampled thresholding map. The method consists of the following steps.
Step 1. The sample intervals in range and azimuth, dR and dA, are estimated according to the parameters of the radar and the size of clutter sources. The values of the two sample intervals can be set to dR and dA when the clutter region in the image is no larger than 2dR × 2dA. The sample interval in time dt is related to the variation rate of clutter regions. A larger dt can be set when the area and intensity of the clutter regions are changing slowly.
Step 2. To efficiently monitor the variation of clutter regions, only some of the pixels are uniformly selected from the images. The selected pixels used to evaluate the clutter Zm are called marked cells here.
$$Z_m = \left\{ Z(i d_R, j d_A, n d_t) \mid 1 \le i \le \frac{N_R}{d_R};\ 1 \le j \le \frac{N_A}{d_A};\ 1 \le n \le \frac{K}{d_t} \right\} \tag{8}$$
Step 3. The sampled spatiotemporal thresholding map M has (NR/dR) × (NA/dA) pixels. The intensity of the pixels can be estimated by the marked cells Zm. A (2w + 1) × (2w + 1) × (K/dt) local patch is defined in the marked cells. The set of marked cells in the local patch can be regarded as Equation (9) when evaluating the threshold of a marked cell (r, a):
$$\left\{ Z(r + i d_R, a + j d_A, K - u d_t) \mid -w \le i \le w;\ -w \le j \le w;\ 0 \le u \le K / d_t \right\} \tag{9}$$
Then, the mean intensity of the local image patch at cell (r, a) is represented by m(r, a):
$$m(r,a) = \frac{1}{\frac{K-1}{d_t}\left((2w+1)^2 - 1\right)} \sum_{u=0}^{\frac{K-1}{d_t}} \sum_{i=-w}^{w} \sum_{j=-w}^{w} Z(r + i d_R, a + j d_A, K - u d_t) \tag{10}$$
Step 4. After obtaining the sampled spatiotemporal thresholding map M, the intensity of non-marked cell can be estimated by the two-dimensional linear interpolation in Equation (11).
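$$m(r,a) = \left[ \left(\frac{r_2 - r}{r_2 - r_1}\right)\left(\frac{a_2 - a}{a_2 - a_1}\right) m(r_1, a_1) + \left(\frac{r_2 - r}{r_2 - r_1}\right)\left(\frac{a - a_1}{a_2 - a_1}\right) m(r_1, a_2) + \left(\frac{r - r_1}{r_2 - r_1}\right)\left(\frac{a_2 - a}{a_2 - a_1}\right) m(r_2, a_1) + \left(\frac{r - r_1}{r_2 - r_1}\right)\left(\frac{a - a_1}{a_2 - a_1}\right) m(r_2, a_2) \right], \tag{11}$$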
where (r1,a1), (r1,a2), (r2,a1) and (r2,a2) are the four nearest marked cells.
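The two-dimensional linear interpolation of Step 4 can be sketched with the standard bilinear weights, in which each marked cell is weighted by the normalized distance to the opposite corner. The function and argument names below are illustrative; the corner values in the usage lines are arbitrary.

```python
def bilinear(r, a, r1, r2, a1, a2, m11, m12, m21, m22):
    """Interpolate m(r, a) from the four nearest marked cells.

    m11 = m(r1, a1), m12 = m(r1, a2), m21 = m(r2, a1), m22 = m(r2, a2).
    """
    wr = (r - r1) / (r2 - r1)  # 0 at r1, 1 at r2
    wa = (a - a1) / (a2 - a1)  # 0 at a1, 1 at a2
    return ((1 - wr) * (1 - wa) * m11 + (1 - wr) * wa * m12
            + wr * (1 - wa) * m21 + wr * wa * m22)


# Arbitrary corner values on a patch with range spacing 5 and azimuth spacing 10.
centre = bilinear(7.5, 15.0, 5, 10, 10, 20, 1.0, 2.0, 3.0, 4.0)
corner = bilinear(5, 10, 5, 10, 10, 20, 1.0, 2.0, 3.0, 4.0)
print(centre, corner)
```

At a marked cell the interpolation returns that cell's own value, and at the centre of the patch it returns the average of the four corners.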
The result of the sampling-based spatiotemporal thresholding method is the thresholding map M. It is worth noting that not all of the intensities of non-marked cells are necessary for fine detection. The intensities of non-marked cells are calculated only when an extended target potentially exists in the area. Only the (NR/dR) × (NA/dA) × (K/dt) cells involved in evaluating the sampled map are stored in the processor. Compared with existing spatiotemporal-based methods [16,17,18], far fewer cells must be stored, and the smaller number of involved cells also means a drastic decrease in computation.
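Steps 2 and 3 can be sketched as follows. The snippet keeps only the marked cells of a stack of frames and averages the marked cells in a local spatiotemporal patch around each marked cell. Several simplifications are assumptions of this sketch and not the authors' implementation: indices start at 0, patches are clipped at the map borders (no azimuth wrap-around), and the mean is taken over every in-bounds patch cell rather than with the exact normalization of the mean-intensity formula.

```python
def sampled_threshold_map(frames, d_r, d_a, d_t, w):
    """Mean intensity of the marked cells around each marked cell.

    frames[t][r][a] holds K images; only every d_t-th frame, d_r-th range bin
    and d_a-th azimuth bin (the marked cells) is read.
    """
    marked = [[row[::d_a] for row in frame[::d_r]] for frame in frames[::d_t]]
    n_t, n_r, n_a = len(marked), len(marked[0]), len(marked[0][0])
    sampled = [[0.0] * n_a for _ in range(n_r)]
    for i in range(n_r):
        for j in range(n_a):
            total, count = 0.0, 0
            for t in range(n_t):
                for di in range(-w, w + 1):
                    for dj in range(-w, w + 1):
                        ii, jj = i + di, j + dj
                        if 0 <= ii < n_r and 0 <= jj < n_a:  # clip the patch at the borders
                            total += marked[t][ii][jj]
                            count += 1
            sampled[i][j] = total / count
    return sampled


# Hypothetical toy input: 10 frames of 10 range bins x 20 azimuth bins, constant level 2.0.
frames = [[[2.0] * 20 for _ in range(10)] for _ in range(10)]
frames[0][1][1] = 1000.0  # a non-marked cell; it is never read
toy_map = sampled_threshold_map(frames, d_r=5, d_a=10, d_t=5, w=1)
print(toy_map)
```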

3.2. The Proposed Detection Framework

After applying the approach above, the full spatiotemporal thresholding map with NR × NA cells, M = {m(r,a)|1<r<NR,1<a<NA }, is in principle available for detecting the targets. A contrast map C, which has NR × NA pixels, is defined first, and the intensity of cell (r, a) in the contrast map is denoted by C(r, a):
$$C(r,a) = Z(r,a,K) - m(r,a) \tag{12}$$
The input of the proposed detection framework is the contrast map C. Similar to the thresholding map M, the intensities of cells are not calculated if unnecessary. The proposed detection framework consists of two stages, coarse detection and fine detection.
In the coarse detection stage, some cells are uniformly selected from the contrast map for efficiency. The input of the coarse detection is the contrast map.
Step 1. The sample intervals in range and azimuth, dr and da, are estimated according to the parameters of the radar and the size of the targets.
Step 2. The approximate locations of targets are found efficiently by uniformly selecting some of the cells from the contrast map C. The selected cells, Cs, are called “seed cells”.
$$C_s = \left\{ C(i d_r, j d_a) \mid 1 \le i \le \frac{N_R}{d_r};\ 1 \le j \le \frac{N_A}{d_a} \right\} \tag{13}$$
Step 3. The candidate areas where targets may exist are found by setting a threshold Td to the seed cells.
$$\begin{cases} C(i d_r, j d_a) \ge T_d; & \text{target may exist in } (i d_r, j d_a) \\ C(i d_r, j d_a) < T_d; & \text{no targets exist in } (i d_r, j d_a) \end{cases} \tag{14}$$
The function of false alarm rate PFA(Td) and the function of target detection rate PD(Td) can be derived when the parameters of radar and targets are given. The optimal threshold can be obtained by Equation (15).
$$T_d = \arg\max \left( P_D(T_d) - P_{FA}(T_d) \right), \tag{15}$$
The derivation and simulation of the expressions for PFA(Td) and PD(Td) can be found in our previous work [20].
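The threshold selection can be sketched as a grid search that picks the candidate maximizing the difference between the detection rate and the false alarm rate. The actual expressions for PFA(Td) and PD(Td) are derived in [20] and are not reproduced here; the two exponential functions below are a toy model assumed only to make the example runnable, and the power values are hypothetical.

```python
import math


def best_threshold(p_d, p_fa, candidates):
    """Return the candidate T_d that maximizes P_D(T_d) - P_FA(T_d)."""
    return max(candidates, key=lambda t: p_d(t) - p_fa(t))


# Toy model (assumption): exponential tails for noise-only and target-plus-noise cells.
noise_power, target_power = 1.0, 3.0


def p_fa(t):
    return math.exp(-t * t / noise_power)


def p_d(t):
    return math.exp(-t * t / (noise_power + target_power))


grid = [k * 0.01 for k in range(1, 501)]  # candidate thresholds 0.01 ... 5.00
t_opt = best_threshold(p_d, p_fa, grid)
print(f"selected threshold: {t_opt:.2f}")
```

Passing the rate functions as arguments keeps the search independent of the clutter and target models, so the expressions from [20] could be substituted without changing `best_threshold`.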
The results of the coarse detection are the seed cells whose intensities are no smaller than the threshold. The set of these seed cells is assumed to have Ns elements, i.e., CT = {Csi, 1 ≤ i ≤ Ns}.
The second stage is fine detection. The accurate state of the targets is estimated from the seed set CT. The fine detection consists of the following steps.
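The three coarse detection steps can be sketched as a loop over uniformly spaced seed cells of the contrast map. For brevity the sketch takes a fully computed contrast map as input, whereas the framework evaluates contrast cells only when needed; zero-based indexing, the function name and the toy map are assumptions of this example.

```python
def coarse_detection(contrast, d_r, d_a, threshold):
    """Return (range, azimuth) indices of seed cells whose contrast reaches the threshold."""
    seeds = []
    for r in range(0, len(contrast), d_r):
        for a in range(0, len(contrast[0]), d_a):
            if contrast[r][a] >= threshold:  # a target may exist around this seed
                seeds.append((r, a))
    return seeds


# Hypothetical 10 x 10 contrast map with one 3 x 3 bright area.
contrast = [[0.0] * 10 for _ in range(10)]
for r in range(3, 6):
    for a in range(3, 6):
        contrast[r][a] = 5.0

seeds = coarse_detection(contrast, d_r=2, d_a=2, threshold=1.0)
print(seeds)
```

Only 25 of the 100 cells are inspected here, which illustrates why the seed spacing must stay below the image size of the smallest target: a larger spacing could step over the bright area entirely.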
Step 1. A seed cell in CT is taken to find the contour of the candidate target in contrast map C. The multiple contour tracking method in [19] is utilized to obtain the contours of the area under different thresholds.
Step 2. The area can be grouped into four categories by its contours. If the area is a huge plain without outstanding peaks, it is very likely unresolved clutter. If the area is larger than a normal target and has several outstanding peaks, its image usually originates from several nearby targets; go to Step 3 to process the area further. If the area is moderate in size and has an outstanding peak, its image should originate from a single target; go to Step 4. If the area is very small, it is a false alarm; go back to Step 1.
Step 3. The image of multiple targets is partitioned into smaller subareas, each of which can be regarded as a single target. The multilevel thresholding method using the Rain algorithm in [20] has been developed for this purpose. After obtaining the subareas, go to Step 4.
Step 4. The state of a single target can be estimated by the image of the area. The state includes not only location, size, and posture of the target, but also the texture of the subarea. The texture is promising for improving the association in multi-target tracking [27]. Then, go to Step 1 to process the next seed cell in Cs.
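The decision logic of Step 2 can be sketched as a small classifier on the size of a candidate area and the number of outstanding peaks found from its contours. The size limits `small` and `large` are hypothetical tuning parameters, and the handling of combinations that the text does not describe (for example a moderate area without any peak) is an assumption of this sketch.

```python
def classify_area(n_cells, n_peaks, small, large):
    """Classify a candidate area from its size (in cells) and its number of outstanding peaks."""
    if n_cells < small:
        return "false alarm"       # very small area
    if n_peaks == 0:
        # Huge plain without peaks: unresolved clutter.
        # Smaller peakless areas are treated as false alarms (assumption).
        return "clutter" if n_cells > large else "false alarm"
    if n_cells > large and n_peaks > 1:
        return "dense targets"     # to be split into subareas in Step 3
    return "single target"         # state estimated directly in Step 4


# Hypothetical size limits: fewer than 10 cells is "very small", more than 200 is "large".
for size, peaks in [(5, 1), (500, 0), (500, 3), (60, 1)]:
    print(size, peaks, classify_area(size, peaks, small=10, large=200))
```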
To describe the proposed framework more clearly, two points are worth noting. The first is the relationship between the sample intervals in the spatiotemporal thresholding method and those in the coarse detection stage. Figure 4 presents an example of this relationship. The sample intervals dR and dA are used to locate the area of a clutter region, whereas dr and da are used to locate the targets. Because the clutter regions are much larger than the targets, dR and dA are larger than dr and da. The sample intervals dR, dA, dr, and da are related to the parameters of the radar and the sizes of the targets and clutter regions. Assume that the long axes of an extended target and a clutter region are lm and Lm, with lower limits lmin and Lmin. Then, according to Equation (6), there are at least dat azimuth bins and at least drt range bins whose amplitudes are affected by an extended target.
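$$\begin{cases} d_{at} = \left\lceil \left( 2 \arctan\left( \dfrac{l_{\min}}{\sqrt{(x^k)^2 + (y^k)^2}} \right) + \theta_0 \right) \times \dfrac{N_A}{2\pi} \right\rceil \\ d_{rt} = \left\lceil \dfrac{l_{\min} \times N_R}{C_R} \right\rceil \end{cases} \tag{16}$$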
Similarly, there are at least dAC azimuth bins and at least dRC range bins whose amplitudes are affected by a clutter region.
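$$\begin{cases} d_{AC} = \left\lceil \left( 2 \arctan\left( \dfrac{L_{\min}}{\sqrt{(x^k)^2 + (y^k)^2}} \right) + \theta_0 \right) \times \dfrac{N_A}{2\pi} \right\rceil \\ d_{RC} = \left\lceil \dfrac{L_{\min} \times N_R}{C_R} \right\rceil \end{cases} \tag{17}$$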
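The sample intervals dR, dA, dr, and da should be smaller than these lower limits, so that at least one sampled cell falls inside every clutter region and every target, i.e.,
$$\begin{cases} d_{AC} > d_A \\ d_{RC} > d_R \end{cases}; \quad \begin{cases} d_{at} > d_a \\ d_{rt} > d_r \end{cases} \tag{18}$$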
The second point is the multiple contours of the candidate area. As presented in Figure 5, the contours of a single target, nearby targets and false alarm are represented by the black lines of different intensities. The contour of the false alarm is small and irregular. The outstanding peaks in the area of targets can be found by the contours.
The flowchart of the proposed detection framework is presented in Figure 6. The input video and the current image are shown in the red dashed box. The video data contain an enormous number of cells to be processed by detection algorithms. The sampled video for the spatiotemporal thresholding method is shown in the blue dashed box. Only a few cells need to be stored in the processor, so much memory is saved. The (NR/dr) × (NA/da) selected cells used in the coarse detection stage are shown in the green dashed box. The cells involved in the fine detection are shown in the purple dashed box. The state of the targets is estimated using only these cells. The small quantity of involved cells brings a significant decrease in calculations. The black dashed boxes at the bottom of Figure 6 indicate that the areas are clustered into three categories and that the points describing the locations of targets are obtained. These points are the results of the target detection.

4. Experiment and Results

4.1. Real Data

Suitable memory requirement and real-time performance are two basic requirements in target detection. However, it is hard to balance good detection ability and these two requirements at the same time. To evaluate the superiority of the proposed framework in memory requirement and calculation, two problems of several representative methods are discussed in this section.
The calculation of CFAR-based methods is closely related to the quantity of the cells employed for estimating a threshold. Figure 7 shows the quantity of the cells used in several methods. The x-, y-, and z-axes in Cartesian coordinates represent azimuth, range and time axes, respectively. The blue and red cells, respectively, denote the pixels in the current frame and past frames. The green cell represents the cell whose threshold is being estimated.
Parameters m and n are the sizes of one target in azimuth and range. Then, the local region size is (m + 2d1) × (n + 2d2). The guard area with m × n cells exists so that the clutter pixels are collected some distance away from the test cell and target pixels are prevented from contaminating the estimation of the clutter statistics. The target size in the image can be estimated by the models in Section 2.3. The parameters of the radar in this work are listed in Table 1 and presented in Figure 7. m and n equal 21 and 3, respectively. Here, we set d1 = 5 and d2 = 4. d1 and d2 denote the widths of the protection cells on the range and azimuth axes, respectively. d1 and d2 should meet the criterion in Equation (19) to ensure that the selected cells do not belong to the target.
$$\begin{cases} d_2 > d_{at} / 2 \\ d_1 > d_{rt} / 2 \end{cases} \tag{19}$$
However, large values of d1 and d2 mean that more cells are employed to evaluate the threshold. In Cell-Averaging CFAR (CA CFAR) [28] and OS CFAR [10], (m + 2d1) × (n + 2d2) – m × n cells are employed for estimating the threshold. In CM CFAR [15], the p past cells at the same location are employed. We set p = 15 here. In spatiotemporal CFAR [16], both the (m + 2d1) × (n + 2d2) – m × n cells in the current frame and the p cells in the past frames are employed for one threshold. In spatiotemporal CA CFAR, all cells in this region are necessary, i.e., (m + 2d1) × (n + 2d2) – m × n cells in the current frame and (m + 2d1) × (n + 2d2) × p cells in the past frames. In the proposed framework, as presented in Figure 7, sampling selects a × b × c cells from the (a × dR) × (b × dA) × (c × dt)-sized cube. a, b, and c equal 11, 5 and 8 here, respectively. The sample intervals dR, dA and dt are 5, 10 and 5, respectively.
The quantity of cells employed for one threshold in these methods is listed in Table 2. A sufficient quantity of employed cells and a large distance between the test cell and the target pixels guarantee that a more suitable threshold can be obtained. Employing more cells also improves the robustness of the threshold estimate. However, it means that more calculations are spent on one threshold. In CFAR-based approaches, the thresholds of all cells, NR × NA here, are calculated. In our method, only the thresholds of the (NR/dR) × (NA/dA) marked cells are estimated. We set dR = 5 and dA = 10 here. Thus, only NR × NA/50 thresholds are estimated. The total quantity of employed cells equals the number of cells for one threshold multiplied by the number of marked cells, i.e., 440 × (NR/dR) × (NA/dA). Therefore, considerable calculations are saved. The total number of employed cells is presented in the fourth column of Table 2.
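As a worked example, the per-threshold cell counts can be evaluated directly from the expressions and parameter values given in the text (m = 21, n = 3, d1 = 5, d2 = 4, p = 15, a × b × c = 11 × 5 × 8, dR = 5, dA = 10). The numbers are computed from those expressions only, not copied from Table 2, and the dictionary labels are informal.

```python
def cfar_reference_cells(m, n, d1, d2):
    """Cells in the local region minus the m x n guard area, for one threshold."""
    return (m + 2 * d1) * (n + 2 * d2) - m * n


m, n, d1, d2, p = 21, 3, 5, 4, 15
a, b, c = 11, 5, 8
d_R, d_A = 5, 10

current = cfar_reference_cells(m, n, d1, d2)
cells = {
    "CA CFAR / OS CFAR": current,
    "CM CFAR": p,
    "spatiotemporal CFAR": current + p,
    "spatiotemporal CA CFAR": current + (m + 2 * d1) * (n + 2 * d2) * p,
    "proposed (per marked cell)": a * b * c,
}
for name, count in cells.items():
    print(f"{name}: {count}")

# Average number of employed cells per image pixel in the proposed framework,
# because only one threshold is estimated for every d_R x d_A pixels.
proposed_per_pixel = a * b * c / (d_R * d_A)
print(f"proposed, per image pixel: {proposed_per_pixel}")
```

The last figure shows where the saving comes from: the proposed framework uses more cells per threshold than CA CFAR, but it estimates a threshold for only one pixel in fifty.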
Next, the memory requirements of the methods are compared. In OS CFAR and CA CFAR, only the current frame is used, so the memory requirement of these two approaches is NR × NA cells. In spatiotemporal CFAR [17], CM CFAR [15] and spatiotemporal CA CFAR, the (p–1) past frames are also required. Therefore, the memory requirement of these three approaches is NR × NA × p cells. In the proposed framework, the current frame and (NR/dR) × (NA/dA) × (c–1) cells in past frames are necessary. The expressions and values of the memory requirements are listed in Table 3. They show that the memory requirements of the spatiotemporal CFAR [17], CM CFAR [15] and spatiotemporal CA CFAR are much larger than the others.
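The three memory expressions can be compared numerically with a short sketch. The grid size used below (NR = 1000, NA = 2000) is a hypothetical placeholder, since the values of Table 1 are not reproduced here; p, dR, dA and c follow the settings stated in the text.

```python
def memory_cells(n_r, n_a, p, d_r, d_a, c):
    """Return the stored cells for single-frame, multi-frame and proposed methods."""
    single_frame = n_r * n_a                 # OS CFAR, CA CFAR
    multi_frame = n_r * n_a * p              # methods that keep full past frames
    proposed = n_r * n_a + (n_r // d_r) * (n_a // d_a) * (c - 1)
    return single_frame, multi_frame, proposed


# Hypothetical grid size; p = 15, d_R = 5, d_A = 10 and c = 8 as in the text.
single, multi, proposed = memory_cells(1000, 2000, p=15, d_r=5, d_a=10, c=8)
print(single, multi, proposed)
print(f"proposed / single-frame: {proposed / single:.2f}")
print(f"multi-frame / proposed:  {multi / proposed:.1f}")
```

With these settings the sampled past cells add (c – 1)/(dR × dA) = 14% to the memory of a single frame, independent of the grid size.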
After the theoretical analysis of calculation and memory requirements, we conducted an experiment in which the data for the two scenarios presented in Figure 3 were processed. The methods were run on a PowerPC MPC8640D (1.0 GHz, 4 GB RAM) in the Wind River Workbench 3.2 environment. As presented in Table 4, the elapsed time of the proposed framework was, unsurprisingly, much less than that of the others. The CA CFAR had the second lowest calculation time, owing to its strategy for estimating the threshold. OS CFAR [10] required a huge elapsed time for sorting the cells and estimating the threshold iteratively. Its elapsed time in Scenario 2 was much larger than in Scenario 1 because Scenario 2 is more complex and more iterations were spent estimating each threshold. The elapsed time of the other methods, however, was stable across the scenarios. The interval between two scans is 10 s in this radar. It is apparent that the proposed method is the only approach that satisfies the real-time requirement.
However, real data cannot be used to evaluate the tracking performance, mainly because the true state of the targets in the two scenarios is unknown. Even if a trajectory were obtained, we would not know whether it originated from a target or a clutter source. Therefore, synthetic data were used in the following experiment.

4.2. Synthetic Data

Extensive experiments were conducted to verify the performance of the proposed framework in terms of robustness against noise, background suppression and target detection ability, and computation time. To fully assess the superiority of the proposed algorithm, the five approaches used in Section 4.1 were included for comparison.
In this example, a fleet of three parallel vessels is regarded as one group. The configuration of the fleet is shown in Figure 8a. The long axis and short axis of each of the three targets are 60 m and 15 m, respectively. The space between two adjacent targets is 35 m. The three targets, whose initial positions are (12,225 m, 370 m), (12,175 m, 370 m) and (12,125 m, 370 m), respectively, move at constant velocity (5 m/s along x and 0 m/s along y) from t = 1 s to t = 200 s. The scanning period T equals 10 s, and 20 scanning periods are used to generate the video data of the high-resolution radar with the model mentioned in Section 2 (cf. [20,21]). The echoes of the three targets over the 20 scans are presented in Figure 8b. The three targets can be detected well when sea clutter and measurement noise are absent. The synthetic scenario, which includes sea clutter and measurement noise, is presented in Figure 8c.
In our former work [20], we showed that synthetic data are similar to real data in both distribution and texture. Therefore, synthetic data are suitable for evaluating the detection performance. The seven groups of targets move in a surveillance area that consists of 13 subareas. The intensity of the K-distributed sea clutter differs from subarea to subarea. The parameter v represents the shape parameter of the K-distributed sea clutter [26]. A larger value of v means stronger sea clutter, under which the detection rate and tracking precision can deteriorate greatly. Figure 8c shows the synthetic images of the three targets under the various shape parameters, together with the value of the shape parameter in each subarea. It is worth noting that, in Group 0, no clutter exists and the only noise is the measurement error. This group is included to show the deterioration in performance caused by the clutter regions. The targets are hard to follow by the naked eye when v equals 10 or 12.
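The constant-velocity motion of the fleet can be written out as a short sketch. The initial positions, velocity, scanning period and number of scans are taken from the text. The exact sampling instant of each scan is not specified, so the sketch assumes that scan k is taken k scanning periods after the initial time. The function name `ground_truth` is illustrative.

```python
# Ground-truth positions of the three-vessel fleet over the 20 scans.
SCAN_PERIOD_S = 10.0        # scanning period T
N_SCANS = 20                # number of scanning periods
VELOCITY = (5.0, 0.0)       # m/s along x and y
INITIAL_POSITIONS = [       # metres
    (12225.0, 370.0),
    (12175.0, 370.0),
    (12125.0, 370.0),
]


def ground_truth(initial_positions, velocity, period_s, n_scans):
    """Return tracks[k][j] = (x, y) of target j at scan k.

    ASSUMPTION: scan k is sampled k periods after the initial time.
    """
    vx, vy = velocity
    return [
        [(x0 + vx * period_s * k, y0 + vy * period_s * k)
         for x0, y0 in initial_positions]
        for k in range(n_scans)
    ]


tracks = ground_truth(INITIAL_POSITIONS, VELOCITY, SCAN_PERIOD_S, N_SCANS)
print(tracks[0])
print(tracks[-1])
```

Each target advances 50 m per scan along x, and the 50 m spacing between adjacent targets is preserved throughout the run.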
The synthetic video data are fed to the detection approaches to obtain the points representing the positions of possible targets. Then, as presented in Figure 1, the points are associated to form trajectories by a target-tracking approach. In the experiment, the points obtained by the detection methods were fed to the PHD filter [1], which finally yielded the trajectories of the targets. A better set of trajectories can be achieved when a better detection approach is employed. The optimal sub-pattern assignment (OSPA) distance [29] was used to evaluate the correctness of the trajectories. A lower OSPA distance means a more accurate result. The OSPA distance between the ground truth of n targets T = {T1, T2,…,Tn} and the estimated positions p = {p1, p2,…, pn} in each scan can be calculated by:
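$$\mathrm{OSPA}(T,p)=\begin{cases} D_{p,c}(T,p), & m>n\\ D_{p,c}(p,T), & m\le n \end{cases}$$
$$D_{p,c}(T,p)=\left(\frac{1}{n}\left(\min_{\kappa\in\Omega}\sum_{i=1}^{m}\left(d_c\left(T_i,p_{\kappa(i)}\right)\right)^{p}+(n-m)\,c^{p}\right)\right)^{\frac{1}{p}},\quad m\le n$$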
where Ω represents the set of permutations of length m with elements taken from T. The cut-off value c and the distance order p of OSPA distance were set as c = 100 and p = 1.5. Note that the cut-off parameter c determines the relative weighting given to the cardinality error component against the localization error component. Smaller values of c tend to emphasize localization errors and vice versa.
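As an illustration, the OSPA distance can be computed with a brute-force search over assignments. The sketch follows the commonly used definition of the metric [29], in which the smaller set is matched against the larger one. Its indexing may differ from the notation above, and the exhaustive search is only practical for a handful of targets. The function name `ospa` and the 2-D point representation are assumptions; the defaults c = 100 and p = 1.5 are the values stated in the text.

```python
from itertools import permutations
from math import hypot


def ospa(truth, est, c=100.0, p=1.5):
    """OSPA distance between two finite sets of 2-D points.

    c is the cut-off value and p the distance order. The optimal
    assignment is found by exhaustive search over permutations.
    """
    if not truth and not est:
        return 0.0
    # Match the smaller set (m points) against the larger set (n points).
    small, large = (truth, est) if len(truth) <= len(est) else (est, truth)
    m, n = len(small), len(large)
    best = min(
        sum(
            min(c, hypot(s[0] - large[j][0], s[1] - large[j][1])) ** p
            for s, j in zip(small, idx)
        )
        for idx in permutations(range(n), m)
    )
    # Each unmatched point in the larger set is penalised with the cut-off c.
    return ((best + (n - m) * c ** p) / n) ** (1.0 / p)


truth = [(0.0, 0.0), (10.0, 0.0)]
print(ospa(truth, [(0.0, 0.0), (10.0, 0.0)]))  # identical sets
print(ospa(truth, [(0.0, 0.0)]))               # one missed target
```

With one of two targets missed, the cardinality penalty dominates the result, which illustrates how c weights cardinality errors against localization errors.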
Figure 9 shows the OSPA distance over the 20 scans and reveals that the proposed detection approach performs better than the others. A comparison between Groups 1–6 and Group 0 indicates that the tracking performance is greatly deteriorated by clutter, and that it deteriorates further when the clutter intensity is high. However, the advantage of the proposed approach is more evident in severe scenarios (e.g., Group 6) because of its strong background-suppression and target-detection capability.
The existing CFAR detectors are insufficient for detecting the fleet because of several drawbacks. Firstly, in the CA CFAR and OS CFAR detectors, the cells of Target 1 may be used to estimate the threshold of a cell that belongs to Target 2. The cells of other targets usually have a larger intensity than normal background cells, so a higher threshold is obtained for the cell under test. This decreases the detection rate in the target detection stage, and a satisfactory trajectory can hardly be obtained from only a few target points, even with an advanced target tracking algorithm. Secondly, because of sea clutter, two or three targets may be regarded as one large target when the intensity of the cells between them is high. In the example in Figure 8c, Targets 1 and 2 in the patch of Group 4 would be regarded as one large target because of the sea clutter. Instead of two individual points, one inaccurate point is obtained, which causes misdetection and a large localization error in target tracking. In the proposed approach, however, the Rain algorithm [22] partitions the image of multiple targets into separate subareas, each of which denotes one potential target, so two accurate points denoting the two targets can be obtained. Thirdly, because memory space is limited, saving many images in the cache is impossible. Therefore, compared with the cells that can be used in the current frame, fewer cells in past frames can be used in the CM CFAR detector and the spatiotemporal CFAR detectors.
Meanwhile, the performance of the CFAR-based methods depends on the parameters of the radar and targets. In this experiment, the guard cells in azimuth and range were set to match the size of the target. It is therefore hard to achieve a satisfactory result in real engineering, where targets have various sizes. Setting multiple guard areas of different sizes is a promising way to alleviate the problem, but it requires many more calculations. In contrast, the proposed approach needs no prior information about the targets when selecting the employed cells. It therefore avoids this difficulty and is superior to the others in complex environments.
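The first drawback, in which a neighbouring target raises the threshold, can be illustrated with a one-dimensional cell-averaging CFAR sketch. This is a simplified textbook form of CA CFAR and not the two-dimensional detector used in the experiments. The cell intensities, the guard and reference sizes, and the scaling factor `alpha` are hypothetical values chosen for the illustration.

```python
def ca_cfar_threshold(cells, idx, guard, ref, alpha):
    """Threshold for cells[idx] in a 1-D cell-averaging CFAR.

    The threshold is alpha times the mean of `ref` reference cells on each
    side of the cell under test, skipping `guard` guard cells on each side.
    """
    left = cells[max(0, idx - guard - ref):max(0, idx - guard)]
    right = cells[idx + guard + 1:idx + guard + 1 + ref]
    window = left + right
    return alpha * sum(window) / len(window)


# Hypothetical intensities: background level 1.0, target level 8.0.
background = [1.0] * 41

# Target 1 occupies cells 19..21, inside the guard area of cell 20.
one_target = list(background)
one_target[19:22] = [8.0] * 3

# Target 2 occupies cells 25..29, inside the reference window of cell 20.
two_targets = list(one_target)
two_targets[25:30] = [8.0] * 5

t_clean = ca_cfar_threshold(one_target, 20, guard=2, ref=8, alpha=4.0)
t_masked = ca_cfar_threshold(two_targets, 20, guard=2, ref=8, alpha=4.0)

detected_clean = one_target[20] > t_clean
detected_masked = two_targets[20] > t_masked
print(t_clean, detected_clean)
print(t_masked, detected_masked)
```

In this toy setting the cell of Target 1 exceeds the threshold when it is alone, but falls below it once the cells of Target 2 enter the reference window.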
The average OSPA distance of different groups is presented in Table 5. The lowest OSPA distance in each group is emphasized in boldface. It is obvious that, with the utilization of the proposed detection framework, a better target tracking result can be achieved.
The results show that the methods in [15,17], which consider several past frames, are superior to the other existing detectors because the clutter intensity of a cell can be estimated from the cells in past frames in addition to the cells in the current frame. Although OS-CFAR is more appropriate in a multi-target situation, it is still possible that both the employed cells and the cell under test are occupied by the same extended target. A higher threshold is then obtained, which lowers the detection rate. The problem is relieved by considering more cells in past frames. However, there is no free lunch: the methods in [15,17] need far more memory space than OS-CFAR and CA-CFAR to store the video data of past frames. The proposed framework, in turn, is designed to detect extended targets, so non-extended targets may be missed because of the sampling.
The average running time of the tested methods is presented in Table 6. The lowest elapsed time in each group is emphasized in boldface. The result matches the analysis in Section 4.1. The elapsed time of the proposed approach was roughly 1/80 or less of that of the CA CFAR detector (between about 1/77 and 1/109 across the groups in Table 6). In the proposed approach, processing one frame of the synthetic image takes 0.5 ms at most. Meanwhile, the computation time of the OS CFAR detector remains much larger than that of the others.
The experiments presented in this section verify that the proposed approach is superior to the existing detectors with regard to detection performance, computation and memory requirement simultaneously. The detection framework is therefore a promising approach to improving target tracking performance in real engineering.

5. Conclusions

In this research, we present a target detection framework based on sampling and spatiotemporal detection. The coarse detection guarantees real-time processing and a low memory requirement by locating in advance the areas where targets may exist. The fine detection improves the detection performance by identifying single targets, dense targets and sea clutter and processing them with different strategies. The extensive experiments showed that excellent performance, real-time processing and a low memory requirement can be achieved simultaneously by the proposed detection framework. The tracking performance can be improved by the proposed approach with far fewer calculations and less memory. Meanwhile, far less prior information, such as the extension of the targets, is needed to use the proposed approach, which also makes it more practical in real engineering.

Author Contributions

Conceptualization, N.X. and B.Y.; methodology, B.Y.; software, N.X.; validation, M.L.; formal analysis, M.L.; investigation, W.Z.; resources, B.Y.; data curation, M.L.; writing—original draft preparation, B.Y.; writing—review and editing, W.Z.; visualization, W.Z.; supervision, L.X.; project administration, L.X.; funding acquisition, L.X.

Funding

This work was supported by the National Natural Science Foundation of China, under grant No. 61502373.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Granström, K.; Lundquist, C.; Orguner, U. Extended Target Tracking Using a Gaussian-Mixture PHD Filter. IEEE Trans. Aerosp. Electron. Syst. 2012, 48, 3268–3286.
  2. Orguner, U.; Lundquist, C.; Granström, K. Extended target tracking with a cardinalized probability hypothesis density filter. In Proceedings of the 14th International Conference on Information Fusion, Chicago, IL, USA, 5–8 July 2011.
  3. Granström, K.; Orguner, U. On Spawning and Combination of Extended/Group Targets Modeled with Random Matrices. IEEE Trans. Signal Process. 2013, 61, 678–692.
  4. Granström, K.; Antonio, N.; Braca, P. Gamma Gaussian Inverse Wishart Probability Hypothesis Density for Extended Target Tracking Using X-Band Marine Radar Data. IEEE Trans. Geosci. Remote Sens. 2015, 12, 6617–6631.
  5. Lundquist, C.; Granström, K.; Orguner, U. An Extended Target CPHD Filter and a Gamma Gaussian Inverse Wishart Implementation. IEEE J. Sel. Top. Signal Process. 2013, 7, 472–483.
  6. Yan, B.; Xu, N.; Zhao, W.B.; Xu, L.P. A Three-Dimensional Hough Transform-Based Track-Before-Detect Technique for Detecting Extended Targets in Strong Clutter Backgrounds. Sensors 2019, 19, 881.
  7. Yan, B.; Xu, L.P.; Li, M.Q.; Yan, J.Z.H. A Track-Before-Detect Algorithm Based on Dynamic Programming for Multi-Extended-Targets Detection. IET Signal Process. 2017, 11, 674–686.
  8. Yan, B.; Zhao, X.Y.; Xu, N.; Chen, Y.; Zhao, W.B. A Grey Wolf Optimization-based Track-Before-Detect Method for Maneuvering Extended Target Detection and Tracking. Sensors 2019, 19, 1577.
  9. Wang, X.; Li, T.; Sun, S.; Corchado, J.M. A Survey of Recent Advances in Particle Filters and Remaining Challenges for Multitarget Tracking. Sensors 2017, 17, 2707.
  10. Levanon, N.; Shor, M. Order statistics CFAR for Weibull background. IEE F Radar Signal Process. 1990, 137, 157–162.
  11. Wang, C.; Bi, F.; Zhang, W.; Chen, L. An intensity-space domain CFAR method for ship detection in HR-SAR images. IEEE Geosci. Remote Sens. Lett. 2017, 99, 1–5.
  12. Dai, H.; Du, L.; Wang, Y.; Wang, Z. A modified CFAR algorithm based on object proposals for ship target detection in SAR images. IEEE Geosci. Remote Sens. Lett. 2016, 99, 1–5.
  13. Gao, G.; Shi, G. CFAR ship detection in nonhomogeneous sea clutter using polarimetric SAR data based on the notch filter. IEEE Trans. Geosci. Remote Sens. 2017, 99, 1–14.
  14. Conte, E.; Lops, M. Clutter-map CFAR detection for range-spread targets in non-gaussian clutter. I. system design. IEEE Trans. Aerosp. Electron. Syst. 1997, 33, 432–443.
  15. Zhang, R.L.; Sheng, W.X.; Ma, X.F.; Han, Y.B. Clutter map CFAR detector based on maximal resolution cell. Signal Image Video Process. 2015, 9, 1–12.
  16. Li, Y.; Zhang, Y.; Yu, J.G.; Tan, Y.; Tian, J.; Ma, J. A novel spatio-temporal saliency approach for robust dim moving target detection from airborne infrared image sequences. Inf. Sci. 2016, 369(C), 548–563.
  17. Deng, L.; Zhu, H.; Tao, C.; Wei, Y. Infrared moving point target detection based on spatial–temporal local contrast filter. Infrared Phys. Technol. 2016, 76, 168–173.
  18. Xi, T.; Zhao, W.; Wang, H.; Lin, W. Salient object detection with spatiotemporal background priors for video. IEEE Trans. Image Process. 2017, 26, 3425–3436.
  19. Yan, B.; Xu, L.P.; Zhao, K.; Yan, J.Z.H. An efficient plot fusion method for high resolution radar based on contour tracking algorithm. Int. J. Antennas Propag. 2016.
  20. Yan, B.; Xu, L.P.; Yan, J.Z.H.; Li, C. An efficient extended target detection method based on region growing and contour tracking algorithm. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 2018, 232, 825–836.
  21. Yan, B.; Xu, L.P.; Yang, Y.; Li, C. Improved plot fusion method for dynamic programming based track before detect algorithm. AEU Int. J. Electron. Commun. 2017, 74, 31–43.
  22. Yan, B.; Xu, N.; Xu, L.P.; Li, M.Q.; Cheng, P. Improved multilevel thresholding plot fusion method using the rain algorithm for the detection of closely extended targets. IET Signal Process. 2019, (in press).
  23. Sun, L.; Li, X.R.; Lan, J. Modeling of extended objects based on support functions and extended Gaussian images for target tracking. IEEE Trans. Aerosp. Electron. Syst. 2014, 50, 3021–3035.
  24. Sutour, C.; Petitjean, J.; Watts, S.; Quellec, J.M. Analysis of K-distributed sea clutter and thermal noise in high range and Doppler resolution radar data. In Proceedings of the 2013 IEEE Radar Conference (RadarCon13), Ottawa, ON, Canada, 29 April–3 May 2013; pp. 1–4.
  25. Watts, S. Radar detection prediction in K-distributed sea clutter and thermal noise. IEEE Trans. Aerosp. Electron. Syst. 1987, 23, 40–45.
  26. Roy, L.P.; Kumar, R.V.R. Accurate K-Distributed Clutter Model for Scanning Radar Application. IET Radar Sonar Navig. 2010, 4, 158–167.
  27. Vivone, G.; Braca, P.; Errasti-Alcala, B. Extended target tracking applied to X-band marine radar data. In Proceedings of the OCEANS 2015, Genoa, Italy, 18–21 May 2015; pp. 1–6.
  28. Watts, S. The Performance of Cell-Averaging CFAR Systems in Sea Clutter. In Proceedings of the Record of the IEEE 2000 International Radar Conference, Alexandria, VA, USA, 12 May 2000; pp. 398–403.
  29. Ristic, B.; Vo, B.N.; Clark, D. Performance evaluation of multi-target tracking using the OSPA metric. In Proceedings of the 13th International Conference on Information Fusion, Edinburgh, UK, 26–29 July 2010; pp. 1–7.
Figure 1. The schematic diagram of radar data processing.
Figure 2. The measurement model of marine radars.
Figure 3. The radar image of two real scenarios: (a) video data of Radar 1; (b) video data of Radar 2; and (c) location of the two radars.
Figure 4. A sample of the spatiotemporal thresholding method.
Figure 5. A sample of multiple contour tracking method.
Figure 6. The schematic diagram of the proposed detection framework.
Figure 7. The employed cells for the detection threshold in mentioned approaches.
Figure 8. The synthetic data in this work: (a) the configuration of the fleet; (b) the echoes of the three targets among 20 scans; and (c) the synthetic scenario.
Figure 9. The OSPA distance at each scan: (a–f) Groups 1–6; and (g) Group 0.
Table 1. Parameters of the radar.
| Parameter | Value |
| --- | --- |
| 3 dB azimuth beam width | 0.94° |
| Number of bins in range axis (NR) | 8192 |
| Number of bins in azimuth axis (NA) | 8192 |
| Angular precision | 0.0439° |
| Range resolution | 6 m |
| Central frequency | 1.35 GHz |
| Rotating speed of antenna | π/5 rad/s |
Table 2. The quantity of employed cells for thresholds.
| Method | The Quantity of Cells for One Threshold | The Value of the Quantity | The Total Number of Employed Cells |
| --- | --- | --- | --- |
| The proposed framework | a × b × c | 440 | 8.8 × NR × NA |
| Spatiotemporal CFAR [17] | (m + 2d1) × (n + 2d2) − m × n + p | 293 | 293 × NR × NA |
| OS CFAR [10] | (m + 2d1) × (n + 2d2) − m × n | 278 | 278 × NR × NA |
| CM CFAR [15] | p | 15 | 15 × NR × NA |
| CA CFAR [28] | (m + 2d1) × (n + 2d2) − m × n | 278 | 278 × NR × NA |
| Spatiotemporal CA CFAR | (m + 2d1) × (n + 2d2) × p − m × n | 5052 | 5052 × NR × NA |
Table 3. The memory requirement.
| Method | In Theory | In the Experiment |
| --- | --- | --- |
| The proposed framework | NR × NA + (NR/dR) × (NA/dA) × (c − 1) | 1.14 NR × NA |
| Spatiotemporal CFAR [17] | NR × NA × p | 15 NR × NA |
| CM CFAR [15] | NR × NA × p | 15 NR × NA |
| Spatiotemporal CA CFAR | NR × NA × p | 15 NR × NA |
Table 4. Elapsed time of the methods.
| Method | Scenario 1 | Scenario 2 |
| --- | --- | --- |
| The proposed framework | 4.76 | 4.97 |
| Spatiotemporal CFAR [17] | 220.01 | 219.19 |
| OS CFAR [10] | 3213.15 | 8274.43 |
| CM CFAR [15] | 256.18 | 260.58 |
| CA CFAR [28] | 61.77 | 63.26 |
| Spatiotemporal CA CFAR | 467.96 | 467.56 |
Table 5. OSPA distance of synthetic data.
| Method | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 | Group 6 | Group 0 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| The proposed framework | 14.4 | 14.67 | 14.96 | 15.07 | 15.43 | 15.8 | 3.97 |
| Spatiotemporal CFAR [17] | 14.59 | 15.35 | 15.88 | 16 | 16.65 | 17.03 | 4.08 |
| OS CFAR [10] | 15.5 | 15.97 | 16.33 | 16.83 | 17.4 | 17.76 | 5.06 |
| CM CFAR [15] | 14.84 | 15.34 | 15.94 | 16.5 | 17.14 | 17.67 | 4.54 |
| CA CFAR [28] | 14.83 | 15.45 | 15.9 | 16.51 | 17.09 | 17.37 | 4.86 |
| Spatiotemporal CA CFAR | 15.6 | 15.81 | 15.97 | 16.4 | 16.9 | 17.47 | 4.84 |
Table 6. Elapsed time of the methods.
| Method | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 | Group 6 | Group 0 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| The proposed framework | 0.49 | 0.43 | 0.39 | 0.35 | 0.37 | 0.38 | 0.35 |
| Spatiotemporal CFAR [17] | 61.36 | 61.68 | 62.06 | 62.11 | 61.84 | 62.39 | 55.6 |
| OS CFAR [10] | 5041.27 | 4993.38 | 5267.96 | 5272.79 | 5319.96 | 5337.25 | 4710.06 |
| CM CFAR [15] | 150.09 | 148.54 | 147.95 | 149.52 | 148.38 | 147.67 | 133.54 |
| CA CFAR [28] | 37.64 | 37.68 | 38.11 | 38.08 | 38.04 | 38.08 | 34.12 |
| Spatiotemporal CA CFAR | 400.43 | 397.42 | 402.07 | 400.77 | 383.4 | 384 | 355.58 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (