Multi-Vehicle Cooperative Target Tracking with Time-Varying Localization Uncertainty via Recursive Variational Bayesian Inference

Cooperative target tracking by multiple vehicles connected through inter-vehicle communication is a promising way to improve the estimation of the target state. The effectiveness of cooperative tracking depends closely on the accuracy of relative localization between the host and cooperative vehicles. However, the localization signal, usually provided by a satellite-based navigation system, is susceptible to the dynamic driving environment, which degrades the effectiveness of cooperative tracking. To achieve reliable cooperative tracking, especially when the statistical characteristics of the relative localization noise are time-varying and uncertain, this paper presents a recursive Bayesian framework that jointly estimates the state of the target and the cooperative vehicle as well as the localization noise parameter. An online variational Bayesian inference algorithm is further developed to achieve efficient recursive estimation. The simulation results verify that the proposed algorithm can effectively improve the accuracy of target tracking when the localization noise changes dynamically over time.


Introduction
Intelligent vehicles (IVs) [1] equipped with different types of on-board sensors, such as Lidar, Radar, and Camera, can collect information about the surrounding environment. By analyzing this information, the IVs can achieve reliable situational awareness [2] and thus make correct decisions [3,4]. Among various environment perception tasks, target tracking [5,6] plays a crucial role in many autonomous driving functions, such as forward collision assistance or adaptive cruise control. Traditionally, the state of the target is estimated based on the data collected by on-board sensors. Consequently, the tracking performance is restricted by the limitations of the sensors. For example, mutual occlusion between targets prevents the IVs from recognizing the occluded targets [7]. Recently, due to the fast development of inter-vehicle and cellular communication, cooperative perception through vehicle-to-vehicle (V2V) or vehicle-to-infrastructure (V2I) communication has attracted much attention [4,7]. IVs equipped with dedicated short range communication (DSRC) radios [8] or 4G/5G technology [9] are able to exchange information, including position, speed, and heading, with neighboring IVs or roadside units (RSUs). The vehicle receiving such information can then fuse the transmitted data with its local information to improve the state estimate of the target and extend the perception range. In the foreseeable future, a standardized 5G network can enable vehicular communication with remarkably low power consumption, high peak data rate, and low latency [10].
During the last few years, many studies have been proposed to fuse the information from multiple sources in cooperative intelligent transportation systems (C-ITS) [11]. In this study, we focus on cooperative target tracking via multiple vehicles, although the developed algorithm is also applicable to other cooperative situations. For convenience, we call the vehicle transmitting information the cooperative vehicle (CV) and the vehicle receiving information the host vehicle (HV). Depending on the information shared among vehicles, the architecture of cooperative perception can be categorized into three types: data-level, feature-level, and track-level fusion.
For data-level fusion [12], CV directly forwards the raw measurements, such as point clouds or images, to HV, where most processing operations, including spatial registration and state estimation, are carried out. For feature-level fusion [13], on the other hand, the raw data are partially processed independently by each vehicle so as to extract low-dimensional features, which are then exchanged among vehicles. Different from the above two types of fusion, track-level fusion supposes that the raw measurements are completely processed by each vehicle to obtain a local state estimate of the target. Then, the local estimate is transmitted to other vehicles for further fusion. Generally, data-level fusion can yield more accurate estimation because it makes full use of the available information. However, it may cause a large computation and communication burden. In contrast, track-level fusion [14,15] reduces the communication bandwidth but may lose useful information. Feature-level fusion balances the computation and communication load and, hence, can be viewed as a compromise between data-level and track-level fusion.
In [16], a cooperative perception approach exchanging raw LiDAR data among vehicles was proposed. In [13], an algorithm which can share the feature extracted from point cloud data among multiple vehicles was developed. In [17], a collaborative tracking algorithm was proposed for the tracking of multiple vehicles. PHD filter [18] was independently executed by HV and CV to obtain the state of vehicles. The PHD density was then transformed from the coordinate frame of CV to that of HV and then merged with local estimate via covariance intersection (CI) fusion [19]. The resulting cooperative perception was further applied for the overtaking decision [20].
In order to achieve reliable cooperative tracking, the relative localization between HV and CV is an important factor, since it is used to convert the local measurement or track estimate of the target from the frame of one vehicle to that of another. In the literature, most works assume such information is perfectly known when conducting cooperative tracking [21]. Currently, satellite-based navigation systems, such as the global positioning system (GPS) and Beidou, have been widely applied in the localization of vehicles. However, the localization signal is subject to noise from different sources. The signal can be degraded or even blocked when traveling in dense urban areas close to buildings or vegetation, thus leading to large localization errors [22,23]. As a result, the relative location between vehicles calculated from the localization signal is expected to be uncertain and time-varying. Few works have investigated the cooperative tracking problem with imperfect localization information. In [24], the high-precision and low-precision positioning situations are characterized by two separate observation likelihood models. In [25], an approach for jointly estimating the state of vehicle and targets based on Poisson multi-Bernoulli filter (PMBM) [26] was developed to consider the uncertain localization of the vehicle. However, the localization noise covariance was assumed to be exactly known. In [27], a cooperative pedestrian tracking algorithm was presented for GPS-denied environments. In [28], a robust cooperative tracking algorithm was proposed for the situation where the localization information is not available.
In summary, achieving reliable cooperative target tracking, especially when the relative localization noise is uncertain and time-varying, is a nontrivial task. Therefore, this paper presents a recursive Bayesian framework to address the above problem. Specifically, both HV and CV collect measurements of the same target, while the information collected by CV is transmitted to HV. Under the assumption that the uncertainty of the relative position and orientation between the two vehicles may vary over time, HV attempts to estimate the joint state of the target and CV as well as the noise parameter based on the measurements. To guarantee the conjugate relationship, a solution algorithm based on variational Bayesian (VB) inference is then developed, such that the system state and the noise parameter can be alternately corrected. VB inference is widely applied in the machine learning community [29,30] and has been successfully introduced into target tracking and sensor fusion [31][32][33] in recent years. Overall, this paper is interesting from the following aspects:
1. Our algorithm can be used in dynamic environments, in the sense that the statistical characteristic of the localization noise is uncertain and time-varying.
2. The cooperative tracking problem is formulated in a recursive Bayesian framework and an efficient VB inference algorithm is proposed.
3. Simulation shows that the tracking error of cooperative tracking can be reduced by 9.5% to 18.15% compared with non-cooperative tracking, depending on the uncertainty level of the localization noise.
The remainder of this paper is organized as follows. In Section 2, we give the system description and model definition based on recursive Bayesian framework. In Section 3, the principle of VB inference is briefly explained. Then, the specific solution algorithm for the estimation of system state and the localization noise parameter is presented in Section 4. In Section 5, the proposed approach is evaluated through computer simulation experiments. Finally, Section 6 gives the conclusions of the paper.

System Description and Model Definition
The scenario considered in this work is shown in Figure 1. The host and cooperative vehicles collect information about the same target using their individual on-board sensors. Then, CV transmits its measurement of the target along with its own localization information to HV. Finally, HV fuses the received measurement with its own measurement of the same target based on the relative localization between HV and CV. It should be emphasized that the scenario considered in this work is quite general and can be extended to more complex situations by integrating other technologies. For example, by introducing data association, such as the global nearest-neighbor method, our model can be readily used for multiple target tracking.
Sensors 2020, 20, 6487
As shown in Figure 1, taking the coordinate frame of HV as the reference frame, we assume that the motion states of both the target and CV evolve according to a linear state-space model. Therefore, we can combine their individual motion states to yield an augmented dynamic model as

$$x_k = F_k x_{k-1} + w_k, \tag{1}$$

where $k = 1, 2, 3, \cdots$ is the time step and $x_k \in \mathbb{R}^n$ is the system state. Specifically, $x_k = [x_k^t, x_k^c]$ combines the state $x_k^t$ of the target and the state $x_k^c$ of CV, [...] besides the position and velocity as the target. $F_k \in \mathbb{R}^{n \times n}$ is the state transition matrix, and $w_k \in \mathbb{R}^n$ is the process noise following the Gaussian distribution $\mathcal{N}(w_k; 0, Q)$ with zero mean and covariance matrix $Q$.
We also assume that the system state $x_1$ at the first time step follows a Gaussian distribution with mean vector $\hat{x}_{1|1}$ and covariance matrix $P_{1|1}$, i.e., $\mathcal{N}(x_1; \hat{x}_{1|1}, P_{1|1})$. For the cooperative tracking scenario considered in this work, the statistical property of the relative localization noise of the cooperative vehicle is time-varying and uncertain, while the statistical property of the observation noise of the target by the on-board sensors of HV and CV is known and stable. The measurement equation of the system state is given by (2), with the measurement function defined in (3). Specifically, $y_k^t$ and $y_k^{tc}$ represent the measurements of the same target in the coordinate frames of HV and CV, respectively, and $v_k^1$ is the corresponding measurement noise. $y_k^c$ and $\theta_k^c$ are the measurements of CV in the coordinate frame of HV, including the observed position in the x and y directions and the heading angle. In this work, we assume that the corresponding measurement noise $v_k^2$ also follows a Gaussian distribution $\mathcal{N}(v_k^2; 0, \Sigma_k^2)$, however, with an uncertain and time-varying covariance matrix $\Sigma_k^2 = \mathrm{diag}(\sigma_{k,1}^2, \ldots, \sigma_{k,m_2}^2)$, with $\mathrm{diag}(\cdot)$ denoting a diagonal matrix. The measurement function follows from the relationship between the coordinate frames of HV and CV shown in Figure 1, and the observation $y_k^2$ enters (2) through a measurement matrix.

In our online estimation model, the measurement set $y_{1:k} = \{y_1, \ldots, y_k\}$ is observable, while the system state $x_k$ and the localization noise covariance $\Sigma_k^2$ are viewed as hidden variables that need to be estimated from $y_{1:k}$. The aim of optimal recursive Bayesian filtering is to estimate the posterior probability distribution of $x_k$ and $\Sigma_k^2$ based on the observed data $y_{1:k}$, so as to realize the joint estimation of the unknown variables. It generally consists of two steps.

First, according to the Chapman-Kolmogorov (CK) equation, the prediction of the unknown variables is given by

$$p(x_k, \Sigma_k^2 \mid y_{1:k-1}) = \iint p(x_k, \Sigma_k^2 \mid x_{k-1}, \Sigma_{k-1}^2)\, p(x_{k-1}, \Sigma_{k-1}^2 \mid y_{1:k-1})\, dx_{k-1}\, d\Sigma_{k-1}^2. \tag{5}$$

Then, the observed information is incorporated by the well-known Bayes' theorem, thus yielding the following correction equation

$$p(x_k, \Sigma_k^2 \mid y_{1:k}) \propto p(y_k \mid x_k, \Sigma_k^2)\, p(x_k, \Sigma_k^2 \mid y_{1:k-1}). \tag{6}$$
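The relationship between the coordinate frames of HV and CV that underlies the measurement of the target in the CV frame can be illustrated with a planar rigid transform. The following is a minimal sketch, not the paper's measurement function: the function name `to_cv_frame`, the argument layout, and the choice to express everything in the HV frame are assumptions made for illustration. Because the result depends on the sine and cosine of the CV heading multiplied by position differences, such a mapping is nonlinear in the joint state whenever the CV pose is part of that state.

```python
import numpy as np

def to_cv_frame(target_xy, cv_xy, cv_heading):
    """Express a target position given in the HV frame in the CV frame.

    Assumption: a planar rigid transform, i.e. translate by the CV position
    and rotate by the negative CV heading (all inputs given in the HV frame).
    """
    c, s = np.cos(cv_heading), np.sin(cv_heading)
    # Rotation that maps CV-frame vectors into the HV frame.
    rot_hv_from_cv = np.array([[c, -s], [s, c]])
    offset = np.asarray(target_xy, dtype=float) - np.asarray(cv_xy, dtype=float)
    # The transpose is the inverse rotation (HV frame -> CV frame).
    return rot_hv_from_cv.T @ offset

target = [30.0, 15.0]  # illustrative target position in the HV frame
cv = [20.0, 20.0]      # illustrative CV position in the HV frame
aligned = to_cv_frame(target, cv, 0.0)        # CV axes parallel to HV axes
rotated = to_cv_frame(target, cv, np.pi / 2)  # CV heading along the HV y axis
```

With parallel axes the result is simply the position difference; with a heading of 90 degrees the target appears behind and to the right of CV in this convention.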
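Given the measurements $y_{1:k-1}$, we assume the joint posterior distribution of $x_{k-1}$ and $\Sigma_{k-1}^2$ at time step $k-1$ can be approximated as the product of a Gaussian distribution and an inverse-Gamma distribution, that is

$$p(x_{k-1}, \Sigma_{k-1}^2 \mid y_{1:k-1}) \approx p(x_{k-1} \mid y_{1:k-1})\, p(\Sigma_{k-1}^2 \mid y_{1:k-1}), \tag{7}$$

where we have

$$p(x_{k-1} \mid y_{1:k-1}) = \mathcal{N}(x_{k-1}; \hat{x}_{k-1|k-1}, P_{k-1|k-1}), \qquad p(\Sigma_{k-1}^2 \mid y_{1:k-1}) = \prod_{t=1}^{m_2} \mathrm{IG}(\sigma_{k-1,t}^2; \alpha_{k-1|k-1,t}, \beta_{k-1|k-1,t}). \tag{8}$$

Here, $\mathrm{IG}(\sigma_{k-1,t}^2; \alpha_{k-1|k-1,t}, \beta_{k-1|k-1,t})$ denotes the inverse-Gamma distribution with the shape parameter $\alpha_{k-1|k-1,t}$ and the scale parameter $\beta_{k-1|k-1,t}$. In summary, the proposed cooperative tracking model under uncertain localization noise can be represented as the probabilistic graphical model (PGM) [34] shown in Figure 2.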

Principle of Variational Bayesian Inference
Suppose that we want to infer the posterior distribution $p(\Phi \mid Z)$ of the hidden variable $\Phi$ given the observed data $Z$ under the Bayesian framework. Based on Bayes' theorem, we have

$$p(\Phi \mid Z) = \frac{p(Z \mid \Phi)\, p(\Phi)}{p(Z)},$$

where $p(Z \mid \Phi)$ is the likelihood of the observed data $Z$, $p(\Phi)$ is the prior distribution, and $p(Z)$ denotes the marginal likelihood of the data. For many problems, it is infeasible to evaluate the posterior $p(\Phi \mid Z)$ analytically or numerically. To address this problem, VB inference was proposed to find a variational distribution of the hidden variables that approximates the true posterior distribution as closely as possible. Specifically, we always have the following equation for the logarithm of $p(Z)$ [35]

$$\log p(Z) = \mathcal{F}(q(\Phi)) + \mathrm{KL}(q(\Phi) \,\|\, p(\Phi \mid Z)), \tag{10}$$

where $q(\Phi)$ is the introduced variational distribution aiming to approximate the intractable posterior distribution $p(\Phi \mid Z)$; $\mathcal{F}(q(\Phi))$ is the free energy of the following form

$$\mathcal{F}(q(\Phi)) = \int q(\Phi) \log \frac{p(Z, \Phi)}{q(\Phi)}\, d\Phi, \tag{11}$$

and $\mathrm{KL}(q(\Phi) \,\|\, p(\Phi \mid Z))$ denotes the Kullback-Leibler (KL) divergence between $q(\Phi)$ and $p(\Phi \mid Z)$, which measures the discrepancy between two probability distributions. That is,

$$\mathrm{KL}(q(\Phi) \,\|\, p(\Phi \mid Z)) = \int q(\Phi) \log \frac{q(\Phi)}{p(\Phi \mid Z)}\, d\Phi. \tag{12}$$

Equation (10) indicates that the sum of the free energy and the KL divergence is always equal to the logarithm of the marginal likelihood. In the VB framework, we seek the variational posterior $q(\Phi)$ by minimizing the KL divergence (12), so that the variational distribution approximates the true posterior. However, it is infeasible to minimize (12) directly because $p(\Phi \mid Z)$ is unknown. To avoid this issue, we can instead maximize the free energy $\mathcal{F}(q(\Phi))$ in (11), which is equivalent because $\log p(Z)$ in (10) does not depend on $q(\Phi)$. Then, based on the mean field theory, we suppose that $q(\Phi)$ has a factorized form as

$$q(\Phi) = \prod_{m} q(\Phi_m). \tag{13}$$

It indicates that $\Phi_m$ and $\Phi_l$ ($l \neq m$) are independent of each other. In order to maximize $\mathcal{F}(q(\Phi))$ with the factorized form (13), an iterative approach is adopted where each $q(\Phi_m)$ is alternately optimized while keeping the other $q(\Phi_l)$, $l \neq m$, fixed. In such a case, the optimal $q(\Phi_m)$ is given by [35]

$$\log q(\Phi_m) = \langle \log p(Z, \Phi) \rangle_{q(\Phi_l),\, l \neq m} + \mathrm{const}, \tag{14}$$

where $\langle \cdot \rangle_{q(\Phi_l),\, l \neq m}$ is the expectation with respect to all $\Phi_l$ except $\Phi_m$.

The above iteration continues until some convergence criterion is satisfied.
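The decomposition of the log marginal likelihood into free energy plus KL divergence can be checked numerically on a small discrete example. The sketch below is purely illustrative: the three-state hidden variable, the prior, the likelihood values, and the helper names `free_energy` and `kl_divergence` are all assumptions chosen for the demonstration, with integrals replaced by sums.

```python
import math

def free_energy(q, joint):
    """F(q) = sum over states of q * log(p(Z, phi) / q) for a discrete hidden variable."""
    return sum(qi * math.log(ji / qi) for qi, ji in zip(q, joint) if qi > 0)

def kl_divergence(q, p):
    """KL(q || p) = sum over states of q * log(q / p)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Illustrative model: hidden variable with three states and one fixed observation Z.
prior = [0.5, 0.3, 0.2]
likelihood = [0.10, 0.40, 0.70]                      # p(Z | phi) for each state
joint = [p * l for p, l in zip(prior, likelihood)]   # p(Z, phi)
evidence = sum(joint)                                # p(Z)
posterior = [j / evidence for j in joint]            # p(phi | Z)

q = [0.2, 0.3, 0.5]                                  # an arbitrary variational distribution
log_evidence = math.log(evidence)
decomposition = free_energy(q, joint) + kl_divergence(q, posterior)
```

For any valid `q`, `decomposition` matches `log_evidence`, and the free energy never exceeds `log_evidence`, which is why maximizing the free energy drives `q` toward the true posterior.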

Online Variational Bayesian Inference of Parameters
In this section, we concentrate on the solution of the proposed cooperative tracking model. Following the recursive Bayesian framework, the involved prediction step (5) and correction step (6) are derived to achieve joint estimation of the system state and the noise parameter.

Prediction
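We assume the system state $x_k$ and the noise parameter $\Sigma_k^2$ are independent of each other given their previous estimates. Therefore, we have

$$p(x_k, \Sigma_k^2 \mid x_{k-1}, \Sigma_{k-1}^2) = p(x_k \mid x_{k-1})\, p(\Sigma_k^2 \mid \Sigma_{k-1}^2). \tag{15}$$

Substituting (7) and (15) into (5) yields

$$p(x_k, \Sigma_k^2 \mid y_{1:k-1}) = p(x_k \mid y_{1:k-1})\, p(\Sigma_k^2 \mid y_{1:k-1}), \tag{16}$$

where we have

$$p(x_k \mid y_{1:k-1}) = \int p(x_k \mid x_{k-1})\, p(x_{k-1} \mid y_{1:k-1})\, dx_{k-1}, \qquad p(\Sigma_k^2 \mid y_{1:k-1}) = \int p(\Sigma_k^2 \mid \Sigma_{k-1}^2)\, p(\Sigma_{k-1}^2 \mid y_{1:k-1})\, d\Sigma_{k-1}^2.$$

Considering the state evolution model (1) and the posterior distribution of the system state at time step $k-1$ in (8), we have

$$p(x_k \mid y_{1:k-1}) = \mathcal{N}(x_k; \hat{x}_{k|k-1}, P_{k|k-1}), \tag{18}$$

where $\hat{x}_{k|k-1}$ and $P_{k|k-1}$ are given by the Kalman filter prediction equations

$$\hat{x}_{k|k-1} = F_k \hat{x}_{k-1|k-1}, \qquad P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q.$$

For the prediction of the localization noise covariance, it is difficult to specify an evolution model that has the desirable conjugacy property. In order to implement recursive estimation and account for the time-varying characteristics of the unknown localization noise covariance, a heuristic model [31] is assumed such that

$$p(\Sigma_k^2 \mid y_{1:k-1}) = \prod_{t=1}^{m_2} \mathrm{IG}(\sigma_{k,t}^2; \alpha_{k|k-1,t}, \beta_{k|k-1,t}), \tag{20}$$

where the involved shape and scale parameters are given by

$$\alpha_{k|k-1,t} = \rho\, \alpha_{k-1|k-1,t}, \qquad \beta_{k|k-1,t} = \rho\, \beta_{k-1|k-1,t},$$

for $t = 1, 2, \ldots, m_2$. Here, $\rho$ ($0 < \rho \leq 1$) is called the forgetting factor, reflecting the fluctuation characteristics of the noise statistics. Finally, substituting (18) and (20) into (16), we obtain the prediction distribution of the system state and the localization noise covariance as follows

$$p(x_k, \Sigma_k^2 \mid y_{1:k-1}) = \mathcal{N}(x_k; \hat{x}_{k|k-1}, P_{k|k-1}) \prod_{t=1}^{m_2} \mathrm{IG}(\sigma_{k,t}^2; \alpha_{k|k-1,t}, \beta_{k|k-1,t}).$$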
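The prediction step can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: it assumes the standard Kalman prediction for the linear model (1) and the common forgetting-factor heuristic in which both inverse-Gamma parameters are scaled by $\rho$. The function name `predict`, the sampling interval `dt = 1.0`, the state ordering (x, y, vx, vy), and all numeric values are hypothetical.

```python
import numpy as np

def predict(x_est, P_est, alpha, beta, F, Q, rho):
    """One prediction step: Kalman prediction for the state, forgetting for the noise.

    Assumption: the inverse-Gamma parameters are both scaled by the forgetting
    factor rho, which keeps beta / alpha fixed while widening the prior.
    """
    x_pred = F @ x_est
    P_pred = F @ P_est @ F.T + Q
    alpha_pred = rho * alpha
    beta_pred = rho * beta
    return x_pred, P_pred, alpha_pred, beta_pred

dt = 1.0  # hypothetical sampling interval
# Constant-velocity transition for a state ordered as (x, y, vx, vy); the ordering is assumed.
F = np.array([[1.0, 0.0, dt, 0.0],
              [0.0, 1.0, 0.0, dt],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
Q = 1e-4 * np.eye(4)                   # placeholder process noise covariance
x_est = np.array([30.0, 15.0, 1.0, 1.0])
P_est = np.eye(4)
alpha = np.ones(3)                     # one pair (alpha, beta) per localization noise component
beta = np.array([0.2, 0.2, 0.02])

x_pred, P_pred, alpha_pred, beta_pred = predict(x_est, P_est, alpha, beta, F, Q, rho=0.7)
```

Scaling both parameters by the same factor leaves the ratio `beta / alpha` unchanged, so the variance estimate carried over from the previous step is preserved while confidence in it is reduced.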

Correction
It can be seen from the correction equation (6) that solving the joint posterior distribution $p(x_k, \Sigma_k^2 \mid y_{1:k})$ of the system state and the localization noise covariance involves multiple integrals and thus is difficult to calculate directly. Therefore, following the VB inference principle described in Section 3, we construct the variational posterior distribution $q(x_k, \Sigma_k^2)$, which approximates the true posterior distribution $p(x_k, \Sigma_k^2 \mid y_{1:k})$. Therefore, we have

$$p(x_k, \Sigma_k^2 \mid y_{1:k}) \approx q(x_k, \Sigma_k^2) = Q_x(x_k)\, Q_\Sigma(\Sigma_k^2),$$

where $Q_x(x_k)$ and $Q_\Sigma(\Sigma_k^2)$ are the approximate probability densities of the unknown $x_k$ and $\Sigma_k^2$. The KL divergence between the separable approximate distribution and the true posterior distribution is

$$\mathrm{KL}\left(Q_x(x_k) Q_\Sigma(\Sigma_k^2) \,\|\, p(x_k, \Sigma_k^2 \mid y_{1:k})\right) = \iint Q_x(x_k)\, Q_\Sigma(\Sigma_k^2) \log \frac{Q_x(x_k)\, Q_\Sigma(\Sigma_k^2)}{p(x_k, \Sigma_k^2 \mid y_{1:k})}\, dx_k\, d\Sigma_k^2.$$

According to Equation (14), the logarithmic expressions of the approximate distributions of $x_k$ and $\Sigma_k^2$ are given by

$$\log Q_x(x_k) = \langle \log p(y_k^1, y_k^2, x_k, \Sigma_k^2 \mid y_{1:k-1}) \rangle_{Q_\Sigma} + \mathrm{const}, \qquad \log Q_\Sigma(\Sigma_k^2) = \langle \log p(y_k^1, y_k^2, x_k, \Sigma_k^2 \mid y_{1:k-1}) \rangle_{Q_x} + \mathrm{const}. \tag{25}$$

Furthermore, according to the graphical model shown in Figure 2, the log joint posterior distribution of $y_k^1$, $y_k^2$, $x_k$ and $\Sigma_k^2$ is expressed in (26).

Now, we focus on the derivation of the variational posterior distribution of $x_k$. Firstly, as can be seen from (3), the measurement function $h(x_k)$ is nonlinear in the system state $x_k$, which prevents closed-form recursive estimation. To deal with this problem, we follow the linearization of the extended Kalman filter (EKF) and expand the nonlinear function into a Taylor series around the predicted system state $\hat{x}_{k|k-1}$. By omitting the second- and higher-order terms, we obtain the linearized model of the nonlinear system, which is characterized by the Jacobian matrix of $h(x_k)$ evaluated at $\hat{x}_{k|k-1}$. For convenience, the observation and the measurement equation are combined as in (28). Then, by substituting (26) and (28) into (25), we obtain the expression (29) for $\log Q_x(x_k)$, which involves the expectation of $\Sigma_k^2$ with respect to the variational posterior distribution $Q_\Sigma(\Sigma_k^2)$.

According to (29), we immediately find that the variational posterior distribution of $x_k$ is Gaussian, $Q_x(x_k) = \mathcal{N}(x_k; \hat{x}_{k|k}, P_{k|k})$, whose parameters $\hat{x}_{k|k}$ and $P_{k|k}$ follow from (29). In order to derive the variational posterior distribution of $\Sigma_k^2$, we substitute (26) into (25). As a result, we find that $\Sigma_k^2$ follows an inverse-Gamma distribution $Q_\Sigma(\Sigma_k^2) = \prod_{t=1}^{m_2} \mathrm{IG}(\sigma_{k,t}^2; \alpha_{k|k,t}, \beta_{k|k,t})$, with the parameters $\alpha_{k|k,t}$, $\beta_{k|k,t}$ ($t = 1, 2, \ldots, m_2$) determined by the resulting expression. Because the correction equations of $Q_x(x_k)$ and $Q_\Sigma(\Sigma_k^2)$ depend on each other, the posterior distribution parameters need to be calculated alternately so that a more accurate approximation can be achieved. To limit the calculation cost, we use two termination conditions.

Condition 1: If the state change between two adjacent iterations is less than the threshold $\varepsilon$, the VB iteration is considered to have converged and is terminated. The specific formula is $\| \hat{x}_{k|k}^{(i+1)} - \hat{x}_{k|k}^{(i)} \|_2 < \varepsilon$, where $(i)$ denotes the iteration index and $\| \cdot \|_2$ is the Euclidean norm.

Condition 2: If the number of VB iterations reaches the predetermined maximum number of iterations $c$, the VB iteration is also terminated.
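The alternating correction with its two termination conditions can be sketched as follows. This is a minimal sketch and not the paper's Algorithm 1: the update formulas are an assumption, taken from the standard variational Bayesian adaptive Kalman filter for diagonal inverse-Gamma noise (shape increased by 1/2, scale increased by half the squared residual plus the projected state variance), applied to a measurement model linearized at the predicted state. The function name `vb_correct`, its argument layout, and the toy numbers are hypothetical.

```python
import numpy as np

def vb_correct(x_pred, P_pred, alpha_pred, beta_pred, y, h, H, R1,
               eps=5e-6, max_iter=10):
    """Alternate between the Gaussian state update and the inverse-Gamma noise update.

    Assumptions: y stacks a block with known covariance R1 followed by the
    localization block with unknown diagonal covariance; h is the measurement
    function and H its Jacobian at x_pred (EKF-style linearization).
    """
    m1 = R1.shape[0]
    alpha = alpha_pred + 0.5              # the shape update is identical in every iteration
    beta = beta_pred.copy()
    x, P = x_pred.copy(), P_pred.copy()
    innovation = y - h(x_pred)
    for _ in range(max_iter):             # Condition 2: stop after max_iter iterations
        R = np.zeros((y.size, y.size))
        R[:m1, :m1] = R1
        R[m1:, m1:] = np.diag(beta / alpha)   # current estimate of the unknown variances
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        x_new = x_pred + K @ innovation
        P = P_pred - K @ S @ K.T
        # Residual of the localization block under the linearized model.
        resid = (innovation - H @ (x_new - x_pred))[m1:]
        spread = np.diag(H @ P @ H.T)[m1:]
        beta = beta_pred + 0.5 * (resid ** 2 + spread)
        step = np.linalg.norm(x_new - x)
        x = x_new
        if step < eps:                    # Condition 1: state change below the threshold
            break
    return x, P, alpha, beta

# Toy linear example: two states, one known-noise measurement, two localization measurements.
H = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x_pred, P_pred = np.zeros(2), np.eye(2)
alpha_pred, beta_pred = np.ones(2), np.ones(2)
y = np.array([0.1, 0.2, -0.1])
x, P, alpha, beta = vb_correct(x_pred, P_pred, alpha_pred, beta_pred, y,
                               lambda s: H @ s, H, np.array([[0.5]]))
```

The defaults `eps=5e-6` and `max_iter=10` mirror the threshold and iteration cap used in the simulation; the variance estimate for each localization component is then `beta / alpha`.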

Algorithm
In summary, the proposed cooperative tracking algorithm under time-varying localization noise can be described as follows (Algorithm 1).

Simulation Scene Configuration
Assume that the target and CV move in a 2-D plane with the constant velocity model. The position of the target can be observed simultaneously by HV and CV. In addition, the position of CV can be obtained by HV through inter-vehicle communication. Taking the coordinate frame of HV as the reference frame, the initial states of the target and the cooperative vehicle are $[30, 15, 1, 1]^T$ and $[20, 20, 2, 2]^T$, respectively. The state transition matrix $F_k$ in (1) is that of the constant velocity model. The process noise $w_k$ in (1) is assumed to be zero-mean white Gaussian noise with covariance matrix $Q = \mathrm{diag}(Q_1, Q_2)$, where $Q_1 = I_2 \otimes 0.01^2 GG^T$, $I_2$ is the 2-D identity matrix, and $\otimes$ is the Kronecker product; the expressions of $G$ and $Q_2$ follow [36].

Suppose that the observation noises $v_k^1$ and $v_k^2$ in (2) obey zero-mean Gaussian distributions. Specifically, for the on-board sensor measurement noise $v_k^1$, we let the corresponding covariance be $\Sigma_k^1 = \mathrm{diag}(0.5, 0.5, 0.5, 0.5)$. For the localization noise $v_k^2$, we simulate the time-varying covariance $\Sigma_k^2$ at time step $k$ as a function of $\Sigma = \mathrm{diag}(\sigma^2, \sigma^2, 0.1 \times \sigma^2)$. In the simulation, we let $\sigma^2$ vary in the set $\{0.1, 0.2, 0.3, 0.4, 0.5\}$, so as to investigate the tracking performance under different levels of localization uncertainty. Take $\sigma^2 = 0.2$ as an instance. Figure 3 shows the observed position of the target and CV in the coordinate frame of HV. Figure 4 shows the observed heading angle of CV in the coordinate frame of HV. Figure 5 shows the observed position of the target in the coordinate frame of CV.

With the above simulated scene, computer experiments are carried out to evaluate the performance of our proposed algorithm, termed variational Bayesian inference-cooperative tracking (VBI-CT). For comparison, we include three related methods: Kalman filter (KF), EKF-cooperative tracking/static (EKF-CT/S), and EKF-cooperative tracking/dynamic (EKF-CT/D). For KF, we estimate the state of the target based solely on HV without cooperation. For EKF-CT/S, we assume the underlying localization noise covariance $\Sigma_k^2$ is static and set to $\Sigma$, and thus the state of the target and CV (i.e., (1) and (2)) can be recursively estimated by EKF. EKF-CT/D is similar to EKF-CT/S, except that perfect knowledge of the dynamic localization noise is assumed (i.e., the exact value of $\Sigma_k^2$ at each time step is known). For all the algorithms, the initial position of the target and CV is set to the corresponding position observation, while the velocity is simply set to zero since no prior information is available. For VBI-CT, some parameters need to be set before online estimation. Specifically, the convergence threshold of the VB iteration is $\varepsilon = 5 \times 10^{-6}$, the forgetting factor is $\rho = 0.7$, the maximum number of iterations is $c = 10$, and the initial values of the localization noise parameters $\alpha$ and $\beta$ are simply set to 1.0.
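To eliminate the influence of randomness, we perform a total of 100 Monte Carlo (MC) runs to compare KF, EKF-CT/S, EKF-CT/D, and VBI-CT. To evaluate the tracking performance of different algorithms, we adopt the root mean squared error (RMSE) and averaged root mean squared error (ARMSE), defined as

$$\mathrm{RMSE}_k = \sqrt{\frac{1}{M} \sum_{m=1}^{M} \left[ \left(\hat{p}_{x,k}^m - p_{x,k}\right)^2 + \left(\hat{p}_{y,k}^m - p_{y,k}\right)^2 \right]}, \qquad \mathrm{ARMSE} = \frac{1}{K} \sum_{k=1}^{K} \mathrm{RMSE}_k,$$

where $M$ is the number of MC runs, $K$ is the number of time steps, $(p_{x,k}, p_{y,k})$ denotes the true target position at time step $k$, and $(\hat{p}_{x,k}^m, \hat{p}_{y,k}^m)$ is the corresponding estimate in the $m$th MC run. As can be seen, ARMSE is the average RMSE across the whole simulation time.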
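The two error metrics can be computed directly from the stored trajectories. The sketch below assumes the usual Monte Carlo form, in which RMSE at each time step averages the squared position error over runs and ARMSE averages RMSE over time steps; the array layout, the function names `rmse_per_step` and `armse`, and the toy data are hypothetical.

```python
import numpy as np

def rmse_per_step(truth, estimates):
    """RMSE at each time step.

    truth:     array of shape (K, 2), true (x, y) position per time step
    estimates: array of shape (M, K, 2), estimated position per MC run and time step
    """
    sq_err = np.sum((estimates - truth[None, :, :]) ** 2, axis=2)  # shape (M, K)
    return np.sqrt(np.mean(sq_err, axis=0))                        # shape (K,)

def armse(truth, estimates):
    """Average of the per-step RMSE over the whole simulation time."""
    return float(np.mean(rmse_per_step(truth, estimates)))

# Toy check: every estimate is offset from the truth by (3, 4), i.e. a 5-unit position error.
truth = np.zeros((3, 2))
estimates = np.tile(np.array([3.0, 4.0]), (4, 3, 1))  # 4 runs, 3 time steps
per_step = rmse_per_step(truth, estimates)
overall = armse(truth, estimates)
```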

Tracking Performance under Time-Varying Localization Noise
As mentioned above, we have performed a total of 100 MC runs to calculate the tracking error of four algorithms. The experimental results obtained by KF, EKF-CT/S, EKF-CT/D, and VBI-CT, when the localization noise covariance σ 2 increases from 0.1 to 0.5, are shown in Tables 1-5, respectively. Figure 6 shows the RMSE curve of different algorithms when σ 2 = 0.2.

By analyzing the above experimental results, we find that the overall performance of KF is the worst among the four algorithms. The reason for the poor performance is that the target state is estimated exclusively from the observations of HV without considering the information from CV. EKF-CT/S outperforms KF when the localization noise covariance σ 2 is small, indicating that the information from CV has been successfully integrated with the local observation of HV. However, with the increase of σ 2 , the performance of EKF-CT/S degrades. This is because EKF-CT/S cannot adjust the localization noise covariance automatically and thus lacks adaptation to the dynamic environment. EKF-CT/D and VBI-CT work better than the other two methods, since they not only integrate the information from CV but also take into account the dynamic variation of the localization noise uncertainty. We observe from the results that the performance of VBI-CT is very close to that of EKF-CT/D, which assumes the exact time-varying localization noise is known. Because of this assumption, the use of EKF-CT/D is restricted in many real applications where the uncertainty of the localization error is difficult or even impossible to obtain. Our proposed VBI-CT, on the other hand, can estimate such uncertainty automatically from the observed measurements, which verifies the effectiveness of VBI-CT in the dynamic environment. In addition, we notice that, with the increase of the localization noise variance, the tracking error of VBI-CT increases correspondingly because of the position uncertainty of CV. Nevertheless, in comparison with KF, VBI-CT reduces the tracking error by about 9.5% to 18.15%, depending on the uncertainty level of the localization noise.

Run Time Comparison
The averaged execution time of all algorithms across 100 MC runs is shown in Table 6. Note that the execution time covers the calculation over 400 time steps. KF consumes the least time because it only needs to estimate the state of the target, which leads to a very small state space. EKF-CT/S and EKF-CT/D are slower than KF because they both have to estimate the joint state of the target and CV, which leads to a larger state space. Finally, our proposed VBI-CT algorithm takes the longest time, because it needs to estimate more parameters and iterate the VB inference until convergence. Nevertheless, the time consumed in each time step is still very small (~5 ms).

Influence of Parameters
For the proposed VBI-CT, the forgetting factor 0 < ρ ≤ 1 is used to initialize the prior distribution of the localization noise at each time step, and thus affects the subsequent VB inference procedure. A large ρ should be chosen when the localization noise varies slowly over time, whereas a small ρ is preferred if the localization noise changes fast. Hence, we analyze how the tracking error varies with ρ, increasing its value from 0.5 to 1.0 in steps of 0.1. The resulting ARMSE curve is shown in Figure 7. The tracking error first decreases gradually as ρ increases. However, when ρ is too large, the tracking error increases due to the excessive self-pruning of the parameters by the VB algorithm [32]. Therefore, we set ρ to 0.7 in the simulation.
The number of iterations in VB inference is another important factor affecting the tracking behavior. Generally, with more iterations, the variational distribution approximates the true posterior distribution more precisely, which results in better tracking performance. However, too many iterations increase the computational burden and reduce the tracking speed. Therefore, we investigate how the tracking error varies with the number of VB iterations. The result is shown in Figure 8, where the number of VB iterations increases from 1 to 10 and separate curves are shown for different values of the forgetting factor. The tracking error decreases rapidly during the first several iterations and converges after about 3 iterations. The proposed cooperative tracking method therefore converges quickly with respect to the number of VB iterations.
In the above experiments, we assume that the information from CV can be transmitted to HV immediately, without delay. In such a case, the transmitted information and the local perception of HV can be temporally aligned. However, in real applications, due to the limited communication bandwidth or the large number of vehicles involved in cooperation, communication delay is an important factor influencing the performance of cooperative tracking. Therefore, we study the tracking error under different communication delays. We increase the delay from 0.00 to 0.09 s in steps of 0.01 s and show the results in Figure 9. As the communication delay increases, the error of cooperative tracking gradually increases, and the impact of delay on cooperative tracking is larger when the localization uncertainty is small.
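The roles of the forgetting factor and of the number of VB iterations discussed in this subsection can be illustrated with a small example. The following is a minimal sketch, not the VBI-CT algorithm itself: it assumes a scalar state observed directly with Gaussian noise whose unknown variance has an inverse-Gamma prior, and it assumes the common heuristic of scaling both inverse-Gamma parameters by ρ to form the prior of the next time step. The names `predict_noise_prior` and `vb_update` and all numerical values are illustrative.

```python
def predict_noise_prior(alpha, beta, rho):
    """Spread the inverse-Gamma posterior of the previous step into the
    prior of the current step (assumed heuristic: scale both parameters).
    The ratio beta/alpha is kept, while a smaller rho makes the prior
    less informative, so the noise estimate can change faster."""
    if not 0.0 < rho <= 1.0:
        raise ValueError("forgetting factor must satisfy 0 < rho <= 1")
    return rho * alpha, rho * beta


def vb_update(y, m_pred, p_pred, alpha_pred, beta_pred, n_iter=3):
    """Fixed-point VB update for y = x + r, r ~ N(0, sigma2), where
    x ~ N(m_pred, p_pred) and sigma2 ~ Inv-Gamma(alpha_pred, beta_pred)."""
    alpha = alpha_pred + 0.5  # shape parameter is updated once per step
    beta = beta_pred
    m, p = m_pred, p_pred
    for _ in range(n_iter):
        sigma2 = beta / alpha  # current point estimate of the noise variance
        gain = p_pred / (p_pred + sigma2)  # Kalman gain under that estimate
        m = m_pred + gain * (y - m_pred)
        p = (1.0 - gain) * p_pred
        # Refit the noise parameter using the refreshed state estimate.
        beta = beta_pred + 0.5 * (y - m) ** 2 + 0.5 * p
    return m, p, alpha, beta


# Toy usage: one time step with rho = 0.7 and 3 VB iterations.
alpha_prior, beta_prior = predict_noise_prior(2.0, 2.0, 0.7)
m, p, alpha, beta = vb_update(2.0, 0.0, 1.0, alpha_prior, beta_prior, n_iter=3)
```

In this toy model, each pass of the loop alternates between a Kalman-type state update and a refit of the noise parameter, and the estimates change less with every pass, so a small fixed number of iterations is usually a reasonable trade-off between accuracy and run time.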

Tracking Performance under Stationary Localization Noise
In this experiment, we assess the tracking performance of the different approaches when the localization noise is stationary, so as to evaluate the applicability of our proposed algorithm. In this case, EKF-CT/D reduces to EKF-CT/S, since the covariance of the localization noise is constant. We gradually increase the value of σ² from 0.1 to 0.5 and show the corresponding results in Figure 10. VBI-CT and EKF-CT/S still outperform KF in terms of state estimation, which further verifies the advantage of cooperative tracking over non-cooperative tracking. We also observe that the error of the cooperative tracking algorithms increases gradually with the relative localization noise, which emphasizes the importance of relative localization in the cooperative scenario.

Conclusions
In this paper, a cooperative target tracking algorithm is developed to integrate the information from multiple vehicles. The method jointly estimates the state of the target and CV as well as the localization noise parameter, which is modeled by an inverse-Gamma distribution. It is formulated in a recursive Bayesian framework, in which the posterior distribution of the unknown variables is approximated by variational Bayesian inference. The performance of the proposed method is verified by computer simulation. The results of 100 Monte Carlo runs show that cooperative tracking can effectively reduce the tracking error even with time-varying localization uncertainty.