A Robust SMC-PHD Filter for Multi-Target Tracking with Unknown Heavy-Tailed Measurement Noise

In multi-target tracking, the sequential Monte Carlo probability hypothesis density (SMC-PHD) filter is a practical algorithm. Influenced by outliers under unknown heavy-tailed measurement noise, the SMC-PHD filter suffers severe performance degradation. In this paper, a robust SMC-PHD (RSMC-PHD) filter is proposed. In the proposed filter, Student-t distribution is introduced to describe the unknown heavy-tailed measurement noise where the degrees of freedom (DOF) and the scale matrix of the Student-t distribution are respectively modeled as a Gamma distribution and an inverse Wishart distribution. Furthermore, the variational Bayesian (VB) technique is employed to infer the unknown DOF and scale matrix parameters while the recursion estimation framework of the RSMC-PHD filter is derived. In addition, considering that the introduced Student- t distribution might lead to an overestimation of the target number, a strategy is applied to modify the updated weight of each particle. Simulation results demonstrate that the proposed filter is effective with unknown heavy-tailed measurement noise.


Introduction
Multi-target tracking (MTT) involves the estimate of target number and states from noisy measurements [1]. Influenced by clutter and detection uncertainty, the traditional MTT algorithms based on data association, such as joint probabilistic data association (JPDA) [2] and multiple hypotheses tracking (MHT) [3], are burdened with high computation complexity with the increase of target number and clutter number. Under the framework of finite set statistics (FISST), multi-target states and measurements are modeled as random finite sets (RFS) to avoid data association. Unfortunately, the MTT approaches based on RFS are still computationally intractable because of the multiple integrals. To obtain a computational tractable solution, the probability hypothesis density (PHD) filter [4] which propagates the first order moment approximation of multi-target posterior density is proposed and realizes the estimate of multi-target number and states. The two main implementations of the PHD filter are the Gaussian mixture PHD (GM-PHD) filter [5] and the sequential Monte Carle PHD (SMC-PHD) [6]. In recent years, the PHD filter has generated substantial interest in visual tracking [7], radar tracking [8], etc.
However, the above algorithms only achieve good performance with the known Gaussian measurement noise variance. When the measurement noise variance is unknown and inconsistent with the true variance, the performance of these algorithms decreases. In [9], Sarkka considers the unknown measurement noise variance problem under the linear Gaussian system and models noise variance as an inverse Gamma (IG) distribution. Then, the joint estimate of target state and noise variance is obtained by variational Bayesian (VB) technique. Since the inverse Gamma distribution is only suitable for the situation where the noise variance is diagonal, the inverse Wishart (IW) distribution is further used to model the full noise variance [10,11]. Following this, the VB technique was introduced into the PHD filter [12][13][14] for MTT.
In some applications, the measurement noise might be non-Gaussian. For example, due to the electromagnetic interference and the sensor's unreliability, the measurement model is usually accompanied with outliers and the measurement noise can be treated as non-Gaussian noise with a heavy-tailed probability density function [15]. The outliers will result in the performance degradation of conventional tracking algorithms. In [15], the combination of a Laplacian distribution and a Gaussian distribution is used to describe the heavy-tailed measurement noise. In [16,17], the heavy-tailed measurement noise is characterized as a mixture of two Gaussian distributions. Due to the advantage of the heavy-tailed characteristic, Student-t distribution was widely used to deal with outliers in [18][19][20]. However, the relevant parameters of Student-t distribution need to be known in these applications. Similar to [9][10][11][12][13][14], the VB technique was employed in the situation where the Student-t distribution parameters are unknown in [21][22][23][24]. The joint estimate of target and unknown parameters is obtained in a linear system. However, the above approaches work worse in a nonlinear system. The nonlinear transformation algorithms, such as extended Kalman filter (EKF) and unscented Kalman filter (UKF), can be used to cope with the nonlinear system to some extent. Unfortunately, the EKF depends on the nature of nonlinearities and the UKF relies on the Gaussian characteristic of the system. They often give an unreliable estimate if the nonlinearities are severe or the system is non-Gaussian, which leads to target missing.
With heavy-tailed measurement noise, the robust PHD filter in a nonlinear system has been much less studied than that in a linear system. Compared with nonlinear transformations, the particle filter is more suitable for practical applications because it can deal with nonlinear and non-Gaussian dynamics. Motivated by [23], we extend the Student-t distribution into SMC-PHD filter and propose a robust SMC-PHD (RSMC-PHD) filter with unknown heavy-tailed measurement noise. In the proposed filter, the Student-t distribution where the degrees of freedom (DOF) and the scale matrix are both unknown is used to model the measurement noise. The Gamma distribution and IW distribution are respectively used to model the DOF and the scale matrix. Then, the augmented state of the target involves the target state, DOF, and the scale matrix. Further, the target state is estimated by particle approximations and the DOF and the scale matrix are updated by the VB technique within the SMC-PHD filter framework. In addition, considering the problem that the introduced Student-t distribution might lead to an overestimation of the target number, we apply a strategy to modify the updated weight of each particle. Simulation results show that the proposed RSMC-PHD filter can achieve higher estimate accuracy than the compared filter under unknown heavy-tailed measurement noise. This paper is organized as follows. Section 2 gives a brief overview of the tracking model, measurement noise model, PHD filter, and SMC-PHD filter. Section 3 presents the detailed implementation of the proposed filter. Section 4 evaluates the performance of the proposed filter under two scenarios. Finally, Section 5 displays the conclusions.

Tracking Model and Measurement Noise Model
Consider a state space system given by where x k is referred to as the target state and z k is referred to as the measurement at time k. w k is the process noise assumed to be Gaussian white noise with covariance matrix Q k . f (·) and h(·) are the dynamic motion model function and measurement model function, respectively. ε k is the measurement noise.
The target dynamic motion model function is given by where N (x;x, P) denotes a Gaussian probability density function with meanx and covariance matrix P. Assume the measurement noise has a heavy tail and can be described by a Student-t distribution as [18] p(ε k ) where St(·) denotes the Student-t distribution, 0 d denotes d dimension zeros vector, d is the dimension of z k , ν k denotes DOF, R k is a symmetric, positive defined d × d covariance matrix and Γ(·) denotes the Gamma function. Different from the Gaussian distribution, the Student-t distribution has a thicker tail away from the mean value. When ν k → ∞ , the Student-t distribution is equivalent to a Gaussian distribution. The probability density function in Equation (4) can be also formulated as a mixture of Gamma distribution and Gaussian distribution denoted as [25] where G(·) denotes the gamma distribution, u k is an auxiliary variable. The likelihood of the measurement can be written as According to Equation (5), we can obtain where u k is distributed as In this paper, the DOF ν k and scale matrix R k are both unknown. Considering the conjugate prior to Gaussian distribution with unknown variance is the IW distribution, we model R k as the IW distribution. Similarly, the Gamma distribution is assumed to be the conjugate prior to the DOF ν k . Then the joint probability density of R k , u k , and ν k can be expressed as where IW(·) denotes the IW distribution, υ k , V k are the sufficient statistics of R k , α k and β k are the sufficient statistics of ν k .

PHD Filter
Assume that is the multi-target states set and Z k = z 1 k , · · · , z M k k is the measurements set at time k. x n k is referred to as the nth target state and z m k is referred to as the mth measurement. N k and M k are the number of targets and measurements, respectively. Then the Bayesian filter recursion can be expressed as where g k (·) is the likelihood function. p k|k−1 (·) and p k (·) denote multi-target predicted density and multi-target posterior density, respectively. µ s is the reference measure. The recursion Equations (10) and (11) involve multiple integrals, which are computationally intractable. To alleviate the computational intractability, the PHD filter provides a sub-optimal solution by propagating the first-order statistical moment of the multiple posterior densities. Without the spawning targets, the predicted and updated equations of the PHD filter are given by [5] where D k|k−1 (x k ) and D k|k (x k ) are the predicted intensity and posterior intensity, respectively. γ k (x k ) is the birth targets intensity, p S,k is the target survival probability, p D,k is the detection probability. κ k (z) = λc(z) is the clutter intensity with average clutter number λ and spatial distribution c(z).

SMC-PHD Filter
Assume that the multi-target posterior intensity can be denoted by weighted particles k−1 is the ith particle state and ω (i) k−1 is the corresponding weight. Then, the SMC-PHD filter can be summarized by the following steps.
Step 1 Prediction where q k · x (i) k−1 , Z k is the proposal density of survival targets and p k (·|Z k ) is the proposal density of birth targets. J k is the particles number of birth targets.
Step 2 Update Step 3 Resampling Target number is estimated byN k = ceil ∑ ceil(x) denotes the smallest integer more than x, L k = N particle ·N k , N particle is the number of particles per expected target.
Step 4 State extraction Extract the target states X k = x 1 k , · · · ,xN k k from the resampling particles set, wherê x 1 k , · · · ,xN k k denote the estimated target states.

VB Approximation
The aforementioned SMC-PHD filter performs well under Gaussian measurement noise. However, the outliers in measurements will have a negative effect on the SMC-PHD filter. To solve the problem, the Student-t distribution is introduced to model the heavy-tailed measurement noise. Since the parameters of the Student-t distribution are unknown, the joint posterior probability density of target state x k , scale matrix R k , auxiliary variable u k , and DOF ν k need to be estimated. Therefore, the augmented target state can be expressed as ..
Since there is no explicit analytical solution existing for the joint posterior density p( .. x k |Z 1:k ), we utilize the VB inference [9] to obtain the approximate solution. As for an approximate method, VB inference is widely used in searching for a factored approximate posterior density as follows where q( ..
x k |Z 1:k ) is the approximate joint posterior density, q x (·), q R (·), q u (·),and q ν (·) are the approximate posterior densities. Equation (21) has the implicit assumption that the target state and the Student-t distribution parameters are independent. By minimizing the Kullback-Leibler (KL) divergence between p( .. The optimal solution of Equation (22) can be obtained by [9,21,26] where φ denotes an arbitrary element in ..
x (−φ) k denotes the elements set in ..
x k except for φ, and c φ is an irrelevant constant for element φ.

The Update of Unknown Student-t Distribution Parameters
Based on the assumption that the target state is independent of the Student-t distribution parameters, the joint posterior density of the augmented target state can be factorized as p( .. where the unknown Student-t distribution parameters are integrated out. Based on (24), the target state can be estimated by particle samples and the Student-t distribution parameters can be updated by the VB approximation method.
In this subsection, we derive the update of Student-t distribution parameters for each particle. At time k, let .. ..
where z k is the measurement with respect to the target represented by particle x The posterior distributionq(R with parameters Letting φ = ν (i) k and using Equation (27) in Equation (23), we can obtain Using Stirling's approximation in [27] ln Γ( ν The posterior distributionq(ν with parameters Letting φ = u (i) k and using Equation (27) in Equation (23), we can obtain The posterior distributionq(u Then we can obtain . Since the parameters in Equations (30), (31), (36), (37), (40) and (41) are coupled, it can be solved iteratively by the fixed point method in [21].

The Implementation of RSMC-PHD Filter
In this subsection, we introduce the Student-t distribution into the SMC-PHD filter and present an implementation of the robust SMC-PHD filter with unknown heavy-tailed measurement noise. Assume that the multi-target posterior intensity is represented by a weighted particles set ω Assume R k , u k , and ν k are independent and the predicted intensity at time k can be expressed as k are given in Equations (14) and (15). There are L k−1 particles for survival targets and J k particles for birth targets. Generally, the dynamic of measurement noise is unknown and we use the similar heuristic dynamic in [9] for the predicted sufficient statistics. For the particles of survival targets i = 1, 2, . . . , L k−1 , the predicted sufficient statistics are given by where ρ ∈ (0, 1] is the decreasing factor. For the particles of birth targets i = L k−1 + 1, L k−1 + 2, . . . , L k−1 + J k , the predicted sufficient statistics are given by γ,k are sufficient statistics with respect to the birth targets. The posterior intensity can be expressed as where ω (i) k is given in (17). It is worth mentioning that the likelihood function is g k (z .. . Then, the sufficient statistics υ where (·) (i)(j) denotes the update of parameter (·) (i) at the jth iteration. The initial values are given by V k is needed in each iteration. Since there is no explicit association between measurements and targets, the measurement origin is ambiguous. Considering the fact that the measurement with larger likelihood value is more likely generated by a potential target, we choose the measurement with the largest likelihood value to calculate the innovation covariance matrix as follows where l = argmax ..

The Modification of Particles Weight
The estimate of target number relies on the updated weights of particles in the SMC-PHD filter. In our paper, the weight is updated by Equation (17) for each particle. The likelihood function is g k (z ..
, and the updated weight of the ith particle is given by Figure 1 gives the illustration of the measurement likelihood function. We can see that the Student-t distribution has a heavy-tailed characteristic and it can deal with outliers well by increasing the weights of outliers. However, the weight of clutter will increase at the same time. Hence, some clutter may be regarded as target-generated measurements, resulting in the overestimate of the target number at the current time.
To solve this problem, a strategy is adopted to modify the updated weight in this subsection. In the PHD filter, one target can produce at most one measurement and only one measurement can be attributed to one target per time step [4], named as the 'one-to-one' assumption. For each particle, the normalized measurement likelihood value g ..
where q = argmin p∈I g p k ( .. x (i) k ), & is a logic and operation and or is a logic or operation. In our paper, the measurement with the largest normalized likelihood value is regarded as the corresponding measurement generated from the target corresponding to particle maximum likelihood value and penalizes the other larger likelihood value to hold the 'one-to-one' assumption. Then the modified weight can be expressed as The robust SMC-PHD filter is summarized in Algorithm 1.

Remark 1:
Although the heavy-tailed characteristic of the Student-t distribution can deal with outliers, the target might still miss when the weights of outliers are lower than the extracted threshold and the target number would be underestimated. In our paper, the weight modification of the particles is mainly aimed to deal with the target number overestimate problem caused by clutter at the current time.
Step 1 Prediction for i = 1, 2, . . . , L k−1 Step 2 Update For i = 1, 2, . . . , Perform iteration initialization, set the iteration number τ, and update the parameters of measurement noise by Equations (49)-(60). end for Step 3 Resampling Step 4 State extraction Step 3 and step 4 are the same as that in the standard SMC-PHD filter.  Figure 1 gives the illustration of the measurement likelihood function. We can see that the Student-t distribution has a heavy-tailed characteristic and it can deal with outliers well by increasing the weights of outliers. However, the weight of clutter will increase at the same time. Hence, some clutter may be regarded as target-generated measurements, resulting in the overestimate of the target number at the current time. To solve this problem, a strategy is adopted to modify the updated weight in this subsection. In the PHD filter, one target can produce at most one measurement and only one measurement can be attributed to one target per time step [4], named as the 'one-to-one' assumption. For each particle, the normalized measurement likelihood value can be obtained by Equations (59) and (60). Let σ is a preset threshold. If

Simulation Results
To reveal the performance of the proposed robust SMC-PHD (RSMC-PHD) filter, we compare the Gaussian (Normal) Gamma inverse Wishart Gamma PHD (NGIWG-PHD) filter [23] and the SMC-PHD filter [6] with the proposed filter. For each simulation, 100 Monte Carlo trials are performed. The target number and optimal sub-pattern assignment (OSPA) [28] distance are chosen as the metric, where the parameters of OSPA are p = 2 and c = 100.

The Choice of Threshold σ
In Section 3.4, threshold σ is utilized to distinguish the target-generated measurements from clutter. In the SMC-PHD filter, the underlying target relies on the updated weight of each particle, and the likelihood value in Equation (59) can directly reflect the contribution of a given measurement or clutter to the underlying target. In most cases, the targetgenerated measurements usually have the largest likelihood value for each particle. If σ is larger, some clutter might be still regarded as target-generated measurements, and the overestimate of the target number cannot be solved. By contrast, if σ is smaller, the likelihood values of some clutter which contribute a small proportion to the updated weight will be modified. However, this modification will have no obvious effect on the updated weight of the particle except for unnecessary complexity. As we all know, N particle is the number of particles for each expected target in the SMC-PHD filter and it is used to extract the weighted particles to approximate the posterior intensity. To some extent, 1/N particle can approximately reflect the average contribution of each particle for an underlying target. Therefore, we choose σ = 1/N particle as the threshold.

Linear Scenario
The target state of each target is x k = [x k , .
x k , y k , y k ] T . The target motion model is given by T ), σ w = 5 m/s 2 is the standard deviation of the process noise, ∆ = 1 s is the sampling period. The observation model is formulated by where The heavy-tailed measurement noise corrupted by outliers is represented as [21,23] where η denotes the probability of the measurement noise outlier.
is the nominal noise variance, and R n = 20R 0 . The model of the birth target is a Poisson RFS with intensity where m To illustrate the performance of the proposed filter under the Gaussian measurement noise, η is set to 0. Figures 3 and 4 show the estimate of target number and OSPA distance of the three filters, respectively. From Figure 3, we can see that all the filters achieve unbiased estimate of the target number and have a comparable estimate accuracy under Gaussian measurement noise. This is because the SMC-PHD filter utilizes the true measurement noise covariance. For the NGIWG-PHD filter and the RSMC-PHD filter, the To illustrate the performance of the proposed filter under the Gaussian measurement noise, η is set to 0. Figures 3 and 4 show the estimate of target number and OSPA distance of the three filters, respectively. From Figure 3, we can see that all the filters achieve unbiased estimate of the target number and have a comparable estimate accuracy under Gaussian measurement noise. This is because the SMC-PHD filter utilizes the true measurement noise covariance. For the NGIWG-PHD filter and the RSMC-PHD filter, the DOF and scale matrix of the Student-t distribution can be adjusted automatically and the target tracking under Gaussian measurement noise realized. Figure 4 shows the OSPA distance of the three filters. Obviously, all the filters have an approximate OSPA distance. To illustrate the performance of the proposed filter under the Gaussian measurement noise, η is set to 0. Figures 3 and 4 show the estimate of target number and OSPA distance of the three filters, respectively. From Figure 3, we can see that all the filters achieve unbiased estimate of the target number and have a comparable estimate accuracy under Gaussian measurement noise. This is because the SMC-PHD filter utilizes the true measurement noise covariance. For the NGIWG-PHD filter and the RSMC-PHD filter, the DOF and scale matrix of the Student-t distribution can be adjusted automatically and the target tracking under Gaussian measurement noise realized. Figure 4 shows the OSPA distance of the three filters. Obviously, all the filters have an approximate OSPA distance.
To verify the multi-tracking performance under heavy-tailed measurement noise, η is set to 0.2. The remaining parameters are the same as above.    Figures 5 and 6 show the estimate of target number and OSPA distance of the three filters, respectively. Under heavy-tailed measurement noise, the SMC-PHD filter will diverge because the assumed measurement noise is not matched with the true measurement noise. For the NGIWG-PHD filter and the RSMC-PHD filter, the two filters will produce a weight which is non-negligible for filter update when there are outliers in measurements. Thus, the estimate accuracy of the target number of the NGIWG-PHD filter and the RSMC-PHD filter is higher than that of the SMC-PHD filter. From Figure 6, it can be observed that the RSMC-PHD filter has the best performance on OSPA distance.
er To verify the multi-tracking performance under heavy-tailed measurement noise, η is set to 0.2. The remaining parameters are the same as above. Figures 5 and 6 show the estimate of target number and OSPA distance of the three filters, respectively. Under heavy-tailed measurement noise, the SMC-PHD filter will diverge because the assumed measurement noise is not matched with the true measurement noise. For the NGIWG-PHD filter and the RSMC-PHD filter, the two filters will produce a weight which is non-negligible for filter update when there are outliers in measurements. Thus, the estimate accuracy of the target number of the NGIWG-PHD filter and the RSMC-PHD filter is higher than that of the SMC-PHD filter. From Figure 6, it can be observed that the RSMC-PHD filter has the best performance on OSPA distance. urement noise. For the NGIWG-PHD filter and the RSMC-PHD filter, the two filters will produce a weight which is non-negligible for filter update when there are outliers in measurements. Thus, the estimate accuracy of the target number of the NGIWG-PHD filter and the RSMC-PHD filter is higher than that of the SMC-PHD filter. From Figure 6, it can be observed that the RSMC-PHD filter has the best performance on OSPA distance.  Figure 7 illustrates that the MAE estimate of target number for the RSMC-PHD filter is lower than that of the RSMC-PHD(M1) filter. The more the clutter number, the higher is the probability that the clutter is regarded as the target-generated measurement, resulting in an overestimate of the target number at the current time. From Figure 8, it can be seen that the RSMC-PHD filter also outperforms the RSMC-PHD(M1) filter on AOSPA distance.  trial, OSPA k is the corresponding OSPA distance. Figure 7 illustrates that the MAE estimate of target number for the RSMC-PHD filter is lower than that of the RSMC-PHD(M1) filter. The more the clutter number, the higher is the probability that the clutter is regarded as the target-generated measurement, resulting in an overestimate of the target number at the current time. From Figure 8, it can be seen that the RSMC-PHD filter also outperforms the RSMC-PHD(M1) filter on AOSPA distance.

Nonlinear Scenario
To further evaluate the performance of the proposed filter, we consider a nonlinear scenario. For the NGIWG-PHD filter, EKF is used to handle nonlinearity.
The target state transition function is given by The target state is The observation model is formulated by 2 2 arctan( ) Assume the nominal measurement noise variance is ( )   Figure 7 illustrates that the MAE estimate of target number for the RSMC-PHD filter is lower than that of the RSMC-PHD(M1) filter. The more the clutter number, the higher is the probability that the clutter is regarded as the target-generated measurement, resulting in an overestimate of the target number at the current time. From Figure 8, it can be seen that the RSMC-PHD filter also outperforms the RSMC-PHD(M1) filter on AOSPA distance.

Nonlinear Scenario
To further evaluate the performance of the proposed filter, we consider a nonlinear scenario. For the NGIWG-PHD filter, EKF is used to handle nonlinearity.
The target state transition function is given by where The target state is x k , y k , . y k , ω] T , ω is the turn rate, w k ∼ N (·; 0 2 , Q k ), , σ x = σ y = 5 m/s 2 , σ ω = 5(π /180) rad/s, ∆ = 1 s. The observation model is formulated by Assume the nominal measurement noise variance is R 0 = diag([10 2 , (π/180 ) 2 ] T ), and the true measurement noise is represented as (69) with R n = 20R 0 and η = 0.2. The model of birth target is a Poisson RFS with intensity where P γ = diag([900, 900, 900, 900, The trajectories of the targets are shown in Figure 9, where circles are the beginning positions and triangles are the ending positions. Figures 10 and 11 show the estimate of target number and OSPA distance for the two filters, respectively. It indicates that the NGIWG-PHD filter performs worse in the nonlinear system. This is because the NGIWG-PHD filter uses EKF to linearize the nonlinear model, and the estimate error of the target state will become large. Under the influence of heavy-tailed measurement noise, the target will miss and the target number will be underestimated. On the contrary, the RSMC-PHD filter is not subject to nonlinear constraint, and has a better performance in the nonlinear system.    Figures 10 and 11 show the estimate of target number and OSPA distance for the two filters, respectively. It indicates that the NGIWG-PHD filter performs worse in the nonlinear system. This is because the NGIWG-PHD filter uses EKF to linearize the nonlinear model, and the estimate error of the target state will become large. Under the influence of heavy-tailed measurement noise, the target will miss and the target number will be underestimated. On the contrary, the RSMC-PHD filter is not subject to nonlinear constraint, and has a better performance in the nonlinear system.

Conclusions
In this paper, a robust SMC-PHD filter with unknown heavy-tailed measurement noise is proposed. The proposed filter uses the Student-t distribution to model the unknown measurement noise. The IW distribution is used to model the unknown scale matrix parameter. Meanwhile, the Gamma distribution is used to model the unknown DOF parameter. Then, the unknown noise parameters are updated by the VB technique within the SMC-PHD filter framework. In addition, a strategy is applied to solve the overestimate of the target number induced by the Student-t distribution. Simulation results under the linear system and nonlinear system show that the proposed filter has a better performance than the compared filters with the unknown heavy-tailed measurement noise, especially in a nonlinear system.

Conclusions
In this paper, a robust SMC-PHD filter with unknown heavy-tailed measurement noise is proposed. The proposed filter uses the Student-t distribution to model the unknown measurement noise. The IW distribution is used to model the unknown scale matrix parameter. Meanwhile, the Gamma distribution is used to model the unknown DOF parameter. Then, the unknown noise parameters are updated by the VB technique within the SMC-PHD filter framework. In addition, a strategy is applied to solve the overestimate of the target number induced by the Student-t distribution. Simulation results under the linear system and nonlinear system show that the proposed filter has a better performance than the compared filters with the unknown heavy-tailed measurement noise, especially in a nonlinear system.