Extended Target Tracking and Feature Estimation for Optical Sensors Based on the Gaussian Process

A problem of tracking surface shape-shifting extended target by using gray scale pixels on optical image is considered. The measurement with amplitude information (AI) is available to the proposed method. The target is regarded as a convex hemispheric object, and the amplitude distribution of the extended target is represented by a solid radial function. The Gaussian process (GP) is applied and the covariance function of GP is modified to fit the convex hemispheric shape. The points to be estimated on the target surface are selected reasonably in the hemispheric space at the azimuth and pitch directions. Analytical representation of the estimated target extent is provided and the recursive process is implemented by the extended Kalman filter (EKF). In order to demonstrate the algorithm’s ability of tracking complex shaped targets, a trailing target characterized by two feature parameters is simulated and the two feature parameters are extracted with the estimated points. The simulations verify the validity of the proposed method with compared to classical algorithms.


Introduction
The extended target tracking (ETT) problem has been widely studied to estimate target motions and shape. With the development of the sensors, more information such as amplitude, color, etc. can be collected. This additional information should be fully utilized since it helps to improve the tracking precision and contains more abundant target characteristics. Therefore, new demand for an effective algorithm that can take advantage of additional information has emerged.
Many scholars have carried out research on ETT problems from different aspects such as shape modeling, measurement modeling and dynamic modeling. In this letter, we focus on the first aspect, i.e., the shape model method for an extended object. The current research achievements can be divided into the following three categories, corresponding to three levels of shape complexity. No extent: the simplest approach is to ignore the target extent. Under the assumptions that the measurements are distributed in the neighbor of target with a Poisson distributed number, Gaussian mixture probability hypothesis density (GM-PHD) filter is proposed [1,2]. Simple extent: a better modeling approach is to assume the target as a basic geometry, such as an ellipse, a circle or a rectangle. The most widely adopted method is based on a random matrix (RM), which represents the target extent as an ellipse [3][4][5][6][7]. For the target with non-elliptic shapes, a combination of multiple elliptical subobjects can be employed to approximate its real appearances [8][9][10]. Rectangle shape is usually adopted when tracking cars and other box-shaped objects [11][12][13]. Arbitrary extent: a more accurate and general approach is called random hypersurface models (RHMs) [14][15][16], which models the extent as a star-convex shape [17]. The RHMs can adapt to more flexible shapes in contrast with RM. The Gaussian process (GP) method can also be employed to estimate a target with highly nonlinear extent [18,19]. A more complete review for an ETT problem about shape model methods is presented in [20]. As mentioned in part Simple extent, the classical RM, RHMs and other variations only output the two-dimensional contour of the target, they do not have the ability to utilize extra information while the GP has the potential to realize that. The GP based method can theoretically estimate target of any dimension due to the fitting ability for highly nonlinear function.
GP has been adopted to learn the shape of a stationary three-dimensional object in [21,22], but none of them provide a recursive solution in a tracking application. A recursive solution for a cube-shaped object is provided in [23] by using point measurements collected from the target surface, but the target shape is relatively simple and there are no comparisons to the classical methods. In this letter, we extend the GP method to the problem of ETT on an image plane for optical sensors. Since the traditional RM, RHMs and GP are unable to utilize the amplitude information, we try to present a novel approach to overcome this problem. The surface of an extended target is described with solid radial function, rather than a two-dimensional counter. In order to adapt to the three-dimensional target tracking, the spherical difference was adopted rather than the angle difference of two two-dimensional angles. The RM and RHMs need to parameterize the target extent and estimate these parameters, while this letter estimates some selected points on the target surface. These selected points can represent the target distribution feature. Thus, the proposed method does not require any parametric model. The measurements with amplitude are converted into polar coordinates, and the measurements are used to estimate these points by the GP method with modified covariance function. In addition, we use the estimated points to extract the feature parameters of the target distribution, which are beneficial to the identification and classification of the target in the future work.

GP Theory
The GP is a kind of stochastic process in probability theory and mathematical statistics, which combines a series of random variables that follow normal distribution within an index set. Unknown complicated functions can be learned with training data. In this section, GP theory is directly applied to tracking application. More detail about the theory can be referred to [18].

Basic GP
GP is a typical stochastic process. it is always used to model spatially correlated measurements [24]. With t being input, the distribution of a GP f (t) can be uniquely determined by its mean function µ(t) and covariance function k(t, t ) as Then, the GP can be marked as For the function value of the finite input t 1 · · · t N , the Gaussian process is a generalization of the multivariate Gaussian probability distribution, i.e., . . . where

GP Regression
The distribution of a highly nonlinear function can be fitted with a Gaussian process regression (GPR) method. In this subsection, the GPR is adopted to model the target extent. In tracking application, f N ] T is learned by using the measurement z = [z 1 , · · · z M ] T and the corresponding input t = [t 1 , · · · t N ] T . For a target with amplitude information shown in Figure 1, the above measurement z is defined by {r z , c z , I z }, where r z , c z and I z represent the row, column, and amplitude of the measurement in grayscale image, respectively. In our application, the input t = ψ, where ψ ∆ =(θ, ϕ) is the azimuth and pitch angle of a solid angle. The radius value z k and the function value f = f (ψ k ) satisfy the joint Gaussian distribution According to the GP theory, the likelihood and initial priori probabilities are given as follows: where  Figure 1. Sketch of the target surface described by a solid radial function.

Recursive GPR
In order to realize the real-time recursive processing, the approximate method is usually used to solve the recursive problem. Bayes formula is applied to the posterior probability p (f |z 1:N ), which is Therefore, it can be approximated that f is independent from the previous measurement z 1:k−1 , which means that f is completely statistically independent from the previous measurement, i.e.,

Covariance Function Modification
In the covariance function used in Ref. [18], the squared Euler distance between the two plane angles is used to measure the correlation, i.e., where σ 2 f denotes the variance of the measurement amplitude and l denotes the scale of the function; · 2 denotes the squared Euler distance. Since the target amplitude surface is considered as a convex hemisphere, it is obvious that the spherical distance is more reasonable to measure the correlation of the two solid angles, which is expressed as

EKF Model
Kalman filters such as EKF and Unscented Kalman filter (UKF) [25,26] are the most commonly used filters for nonlinear filtering. Since the EKF exerts lower computing complexity than the UKF, we employed EKF as the Bayesian approximation in this section.

Measurement Amplitude Modification
Since the measurement amplitude and target size are different in units as well as unequal in the values, in order to obtain a better extent estimation, we hope that the input x f can be distributed as uniformly as possible on the target surface. Therefore, the measurement amplitude and target size should be close in order of magnitude, which means that the amplitude information needs to be reasonably modified. Let S k represent the number of pixels covered by the target, and let I z k denote the amplitude value of all the target measurements collected from the optical image plane. Then, we define a modification factor β as With the amplitude value of the measurements multiplied by the modification factor β, the maximum value of the modified measurement max( It can be seen that, after modification, the amplitude value I z k and the target scale √ S k are at the same order of magnitude. Thus, the modified measurement is re-defined as where r z k , c z k and I z k represent the row, column, and amplitude of the measurement z k . We now replace z k with the modified measurementz k for the rest of this letter.

Measurement Model with Amplitude
With the amplitude information considered, the target dimension is augmented to three, and the target extent is regarded as a hemispherical convex surface. Thus, the extent of the target can be represented by a function of the solid angle ψ = (θ, ϕ) and radius r = f (ψ) in polar coordinates. Two coordinate systems are adopted. The sensor coordinate system is fixed to the sensor and local coordinate system is fixed to the body of the extended target, denoted by S and L, respectively. The origin of the local system o L is combined with the mass center of the target and fixed on the image plane; the x L axis lies in the image plane and points to the target moving direction, the z L axis is perpendicular to the plane. x L , y L and z L follow the right-hand rule. The target extent is represented in the local coordinate system and the target center is presented in the sensor coordinate system.
As is shown in Figure 1, the target measurementz k,l can be expressed in terms of the direction vector p(x c k ) and radius f (ψ L k,l (x c k , ζ k )) relative to the centroid x c k , which is whereē k,l ∼ N (0, R), and p(x c k ) = p(ψ S k,l (x c k )) =z k,l −x c k ||z k,l −x c k || is the unit vector that points to the measurementz k,l from centroid x c k . Reviewing the likelihood given in Equation (7), H f k f is the mean of the random variable z k and R f k is the variance of it. Thus, the measurement z k can be expressed as In tracking application, the target extent x f k is the instantiation of function value f, i.e., f = x f k . Similar to Equation (20), the radius value on the ψ L k,l (x c k , ζ k ) direction can be expressed as where H f (·) denotes the observation model and e f k,l denotes the noise of it. Substitute Equation (21) into Equation (19), the measurement equation can be rewritten as whereH According to Equation (22), it is easy to obtain the new noise covariance as follows: where R f k,l is the covariance of noise e f k,l , andR k,l is the covariance of noiseē k,l .

Motion Model
We model the target state as x k = (x k , x f k ) T , wherex k =(x c k , ζ k ) is the centroid, ζ k denotes the orientation of the extended target and x f k is the extent. Because the target centroid is different from that of the extended, the centroid and the extent state are modeled separately. The dynamics of the target centroidx k is consistent with the constant velocity model: where the transition matrixF and covariance matrixQ for target centroidx k are given as where T denotes the sampling period, ⊗ denotes the Kronecker product, I 3 denotes the third order identity matrix, σ 2 q and σ 2 q ζ represents the process variance of the centroid and orientation. The dynamics of target extent x f k is shown as follows: where the extent transition matrix and covariance matrix are presented as where α is the forgetting factor. Finally, transition model of the augmented state, i.e., the combination of dynamics of target statē x k and extent x f k , is given as where The initial state x 0 is given as whereμ 0 andP 0 denote the mean and covariance ofx k , P f 0 denotes the covariance of x f k , which can be calculated by Equation (22).
So far, the state space description is provided in Equations (22) and (33). Now, the estimation can be performed using a nonlinear filter and EKF is employed in this letter. The calculation of Jacobian matrix is deeply discussed in the appendix of Ref [18].

Simulation Setting
An extended target with grayscale information moves in the [300 × 300] surveillance area. The total number of scans is set to 100 s and the scan cycle is set to T s = 1 s. The grayscale amplitude along the velocity direction of the target follows the type I extreme value distribution with the scale parameter σ ex . The grayscale amplitude on a direction perpendicular to velocity obeys the Gaussian distribution with the standard deviation σ gau . A view of the amplitude distribution model of the target is shown in Figure 2. The standard deviations of the process noise on position and angle are set to be σ q = 1 and σ q ζ = 0.1, respectively, forgetting factor α = 0.0001. The standard deviations of the measurement noise on position and amplitude are set as σ r = 1 and σ I = 0.1. The hyper-parameter of the GP are set to σ f = 2, σ r = 2 and l = π/4. Those pixels with amplitude value greater than Th gray = 0.01 are retained as target measurements z and the others are removed. Each simulation are tested within 50 Monte Carlos.

Feature Estimation Steps
In addition to the estimation of the target centroid x c k , another important goal is to extract the target feature parameters based on the estimated values x f k at all directions. In this study, the steps shown in Figure 3 are designed to extract the target distribution feature parameters σ ex and σ gau . In this feature estimation method, the estimated points along velocity direction and vertical velocity direction are extracted. These points are fitted with function f 1 and f 2 , which is given below. After the fitting step, the center estimation (x L c ,ŷ L c ) and feature parameters σ ex and σ gau are easily attained.  For the estimation step, too many ψ f will increase the calculation cost and cause the irreversibility of the covariance function K(ψ f , ψ f ). However, if the number of input is small, the estimation accuracy of target extent cannot be guaranteed. As a compromise, we set the input number as follows. For the proposed GP with AI, eight points are uniformly sampled within the interval of [0 ∼ 2π] as azimuth θ f and five points are uniformly sampled within the interval of [0 ∼ π/2] as ϕ f . For the traditional GP, eight points are uniformly sampled within the interval of [0 ∼ 2π]. Since the number of Fourier coefficients in RHMs has to be odd, and to ensure that the number of parameters to be evaluated is consistent with that of GP as much as possible, we set the coefficients number to 9.
As for the extraction and fitting step, the feature parameters σ ex and σ gau can be subsequently obtained by fitting the estimated points on velocity direction and vertical velocity direction. Since the distribution of the trailing target on velocity and vertical velocity direction follows the type I extreme value distribution and Gaussian distribution, the functions f 1 and f 2 are given as Subsequently,x L c andŷ L c , corresponding to the maximum value on each fitting curve in Figure 1, denotes the target local coordinate position, respectively.

Shape Deformation
The target moves at a constant speed and the initial state of the target centroid is [31, 23, 1.2, 1.1] T . The scale parameter σ ex = 3.5 and becomes 2.5 at the 50th second. The standard deviation σ gau = 1.5 and becomes 2 at the 50th second. The sum of the target grayscale is 200. The performance is measured with the tracking precision and target feature parameter estimation accuracy. Figure 4 shows a snapshot of the tracking process. For display convenience, we added a bias of 20 pixels on the y-axis of the estimation. It can be observed that the shape estimate changes with the measurement progressively.  Figure 5 compares the tracking precision of the RHMs, GP, and the proposed GP with AI. It can be seen from Figure 5 that the proposed method outperforms the RHMs and GP method. Because the RHMs and the traditional GP method can only use the position information of the measurements, the contribution of each pixel to the target centroid is the same, so traditional methods can only estimate the center of form. On the other hand, the proposed GP with AI can estimate the center of mass with the amplitude information introduced. The estimation of target feature σ ex and σ gau are given in Figure 6, the red dotted lines denote the ground truth of target features and the blue lines denote the estimation of target features. It can be seen that the proposed method estimates the ground truth quite well during the first 50 s. Even if the target extent mutates in the 50th second, the proposed method is able to approximate the ground truth gradually with the new measurements' accumulation. The result demonstrates that the proposed method has the ability to adapt to the shape-shifting extended target. There exists deviation between the estimation and the ground truth value since the systematic error is brought in by the truncation of measurement amplitude by Th gray and the fitting step.

Motion Change
The scale parameter σ ex = 3.5 and standard deviation σ gau = 1.5. The sum of target grayscale is 200. The target motion model follows Equation (38) during the 20th to 70th second and constant speed model during the rest of the simulation time where ω is the turn rate. Figure 7 shows a snapshot of a target with turn rate ω = π/80. Figure 8 shows the feature parameter estimation results of the targets moving with ω = π/80, π/100 and π/140, respectively. It can be seen that the feature parameter estimation performance goes worse with the increase of the target maneuverability.

Intensity Change
The scale parameter σ ex = 3.5 and standard deviation σ gau = 1.5. The sum of target grayscale changes from 200 to 500 at the 50th second. Figure 9 shows a snapshot of intensity change and Figure 10 shows the feature estimation results.  It can be seen that, even if the target parameters' estimation has a large deviation as the intensity abrupt changes at the 50th second, it tends to converge to the ground truth as long as the intensity of the target holds steady.

Conclusions
In this letter, a GP based method for the tracking of an extended target by using amplitude information is proposed. Surface shape and kinematic state are simultaneously estimated. The superiority of utilizing the amplitude information is demonstrated compared with traditional RHMs and GP methods. As a result, the tracking error implemented by the proposed method reduces about 0.5 pixels compared with the RHMs and GP. Furthermore, by using the estimated points on the target surface, we also extract the target feature parameters, which can be used for target classification in further study. The estimation error of feature parameters σ ex and σ gau are able to converge to 0.17 and 0.1, respectively. Through challenging scenes in Sections 4.4 and 4.5, the estimation performance can be limited, influenced by the mutations on target maneuvering and intensity changing, which proves the robustness of our proposed method. In fact, the tracking performance improvement is mainly due to the introduction of amplitude information. If the amplitude information is ignored, this method degrades to the traditional GP method, and its performance will be no different from that of the GP and RHM methods. Since the proposed method does not require any parametric model, it is flexible enough to estimate any convex hemisphere-shaped target.