Multiple Target Tracking Based on Multiple Hypotheses Tracking and Modified Ensemble Kalman Filter in Multi-Sensor Fusion

In multi-sensor fusion (MSF), integrating multi-sensor observation data with different observation errors to position a target more accurately has long been a research focus. In this study, a modified ensemble Kalman filter (EnKF) is presented to replace the traditional Kalman filter (KF) in multiple hypotheses tracking (MHT) in order to handle the strong nonlinearity that arises in multiple target tracking (MTT) problems. In addition, multi-source observation data fusion is realized with the modified EnKF, which enables low-precision observation data to be corrected by high-precision observation data, and the accuracy of the corrected data can be calibrated with the statistical information provided by the EnKF. Numerical studies demonstrate the effectiveness of the proposed method: the MHT-EnKF method achieves a clear improvement in handling nonlinear motion and in positioning accuracy for MTT problems in MSF scenarios.


Introduction
In multi-sensor surveillance systems like radar-based tracking, sonar-based tracking, and video-based tracking, multiple target tracking (MTT) is a vital problem that always arises. At the same time, the problems brought by multi-sensor monitoring (different positioning measurement methods lead to differences in the accuracy of various positioning information) have also arisen. Therefore, it is necessary to present a method to make use of the high-precision information in multi-source observation data to correct the low-accuracy information, that is, multi-sensor fusion (MSF).
Unlike some applications of the evidence theories for MSF in target recognition [1,2], the purpose of MTT in the MSF scenario is to simultaneously maintain the confirmed tracks for multiple targets and get more accurate estimates of the target trajectories. Once the tracks are generated and confirmed, the number of targets can be estimated and the parameters (velocity, acceleration, future predicted position, etc.) can be computed for each track. Therefore, there are two essential problems in MTT: (1) data/track association; and (2) state estimation/fusion.
The basic step of MTT is measurement-to-track association. When a new group of measurements is generated by a sensor, each observation point is handled in one of three ways: (1) it is assigned to an existing track, (2) it is considered a new track, or (3) it is considered a false alarm. The global nearest neighbor algorithm (GNN) [3] is one of the simplest and most widely spread [...] However, these studies have not extended the EnKF approach to MTT scenarios. Generally speaking, the application of the EnKF method in target trajectory analysis is comparatively preliminary.
In this study, we propose a modified EnKF to replace the KF in MHT in order to deal with the nonlinearity in MTT and MSF scenarios. This allows us to consider the coupling of acceleration and velocity through nonlinear kinematic equations in each time interval. In addition, multi-source observation data fusion is realized with the EnKF, which enables low-precision observation data to be corrected by high-precision observation data, and the accuracy of the corrected data can be calibrated directly with the statistical information provided by the EnKF results. In the remainder of this paper, we first introduce the MHT in general terms (Section 2). We then present the modification of the EnKF for the MHT and their combination for MTT in the MSF scenario (Section 3). The experimental results and discussion are given in Section 4. Finally, Section 5 concludes the paper.

Traditional MHT with Kalman Filter
In this study, the track-oriented MHT (TOMHT) is chosen for track hypothesis generation. This section presents the basic concepts of the TOMHT.

Basic Concepts
In the TOMHT, the observation set at scan $k$ is defined as:
$$O_k = \{o_k^0, o_k^1, \ldots, o_k^{N_k}\} \qquad (1)$$
where $N_k$ is the number of observations, $o_k^i$ ($i > 0$) is the $i$-th measurement in scan $k$ and $o_k^0$ represents the missing detection or false alarm.
The track hypothesis matrix is defined as an ensemble of measurements:
$$Z_k^i = \{z_1^{i_1}, z_2^{i_2}, \ldots, z_k^{i_k}\} \qquad (2)$$
where $z_j^{i_j}$ is the $i_j$-th ($i_j \in \{1, \ldots, N_j\}$) measurement that is assumed to be associated with the $i$-th track at scan $j$ ($j \in \{1, \ldots, k\}$). The TOMHT uses the state estimate from the most probable global hypothesis for track maintenance and keeps a track tree over a certain number of sequential scans to find the global hypothesis. Here, we assume that no two tracks originate from a common beginning and that no two tracks in one global hypothesis share an observation. A track tree in the TOMHT contains multiple track branches, also known as track hypotheses. Each track maintained from previous scans may trigger several new tracks. In addition, a new track tree is built for each measurement to account for the probability that the measurement is a new target (more details of the TOMHT can be found in [23,24]).
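The constraint that tracks in one global hypothesis share no observation can be illustrated with a minimal sketch. The representation used here (a track hypothesis as a tuple of per-scan measurement indices aligned on the same scans, with 0 standing for a missed detection) and the function names `compatible` and `is_valid_global_hypothesis` are assumptions made for illustration; they are not the data structures of [23,24].

```python
from itertools import combinations


def compatible(track_a, track_b):
    """Return True if two track hypotheses share no real observation.

    Each track is a tuple of measurement indices, one per scan.
    Index 0 is the dummy "missed detection" and may be shared.
    """
    return all(a != b or a == 0 for a, b in zip(track_a, track_b))


def is_valid_global_hypothesis(tracks):
    """A global hypothesis is valid if its tracks are pairwise compatible."""
    return all(compatible(a, b) for a, b in combinations(tracks, 2))


# Three scans; each track lists (index at scan 1, scan 2, scan 3).
t1 = (1, 2, 1)
t2 = (2, 1, 0)  # missed detection at scan 3
t3 = (3, 2, 2)  # shares observation 2 at scan 2 with t1

print(is_valid_global_hypothesis([t1, t2]))  # True
print(is_valid_global_hypothesis([t1, t3]))  # False
```

A full TOMHT would search over all such compatible sets for the one with the highest total score; the sketch only shows the compatibility test itself.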
In the TOMHT, each hypothesis is associated with a track score defined as a log likelihood ratio (LLR). The LLR score of $Z_k^i$ can be calculated as:
$$L(Z_k^i) = L(Z_{k-1}^i) + \Delta L(Z_k^i) \qquad (3)$$
and the term $\Delta L(Z_k^i)$ is defined as:
$$\Delta L(Z_k^i) = \begin{cases} \ln(1 - P_D), & \text{no update on scan } k \\ \Delta L_u(k), & \text{track update on scan } k \end{cases} \qquad (4)$$
The loss in track score when there is no update observation on scan $k$ is a function of the expected detection probability $P_D$, where $P_D$ and $\Delta L_u(k)$ can be calculated using the original work in [24]. In addition, if a signal intensity (such as the signal-to-noise ratio (SNR)) is measured, it may also be added to the track score function.
The track score of a new hypothesis can be initialized as:
$$L(Z_1^i) = \ln\frac{\lambda_N}{\lambda_F} \qquad (5)$$
where $\lambda_F$ denotes the spatial density of clutter and $\lambda_N$ denotes the spatial density of new targets, both of which are assumed to be Poisson distributed. After the individual track scores are calculated, the score of a hypothesis is the sum of all track scores contained in that hypothesis. Given the hypothesis scores, the most probable global hypotheses can then be found. Because a track may appear in several hypotheses, its probability is the sum of the probabilities of all hypotheses that contain it. The classical sequential probability ratio test (SPRT), which compares the current score of a track with the lower and upper thresholds ($T_{low}$, $T_{up}$), characterizes the status of the track. Generally, a track is confirmed as soon as its score exceeds the upper threshold $T_{up}$ and is deleted once its score falls below the lower threshold $T_{low}$. Both thresholds are determined by $\alpha$ and $\beta$, the false and true track confirmation probabilities, respectively. If the track score falls between the two thresholds, the corresponding track remains tentative and requires further evidence.
After a measurement has been associated with a track in the current scan, the state update is performed using the Kalman filter (KF):
$$K_k = P_{k|k-1} H_k^{T} \left(H_k P_{k|k-1} H_k^{T} + R_k\right)^{-1} \qquad (7)$$
$$x_{k|k} = x_{k|k-1} + K_k \left(z_k - H_k x_{k|k-1}\right) \qquad (8)$$
$$P_{k|k} = \left(I - K_k H_k\right) P_{k|k-1} \qquad (9)$$
where $K_k$ is the Kalman gain, $P_{k|k}$ is the posterior error covariance, $P_{k|k-1}$ is the predicted error covariance, $x_{k|k}$ is the a posteriori state estimate at time $k$ given observations up to and including time $k$, $x_{k|k-1}$ is the predicted state estimate, $z_k$ is the measurement associated with the track, $H_k$ is the observation model and $R_k$ is the error covariance of the observation noise.
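The track scoring and confirmation logic described above can be sketched as follows. This is a minimal illustration under stated assumptions: the increment for an associated measurement is supplied by the caller rather than computed from [24], the missed-detection penalty is taken as ln(1 − P_D), and the threshold values in the usage lines are hypothetical.

```python
import math


def update_score(score, delta_update=None, p_detect=0.9):
    """Add one scan's increment to a track's LLR score.

    delta_update is the increment for an associated measurement
    (supplied by the caller). None means no update on this scan,
    in which case the score drops by ln(1 - P_D).
    """
    if delta_update is None:
        return score + math.log(1.0 - p_detect)
    return score + delta_update


def track_status(score, t_low, t_up):
    """Sequential probability ratio test on the current track score."""
    if score > t_up:
        return "confirmed"
    if score < t_low:
        return "deleted"
    return "tentative"


# Hypothetical increments over four scans; None marks a missed detection.
score = 0.0
for increment in [2.0, None, 3.0, 2.5]:
    score = update_score(score, increment)
print(track_status(score, t_low=-4.0, t_up=5.0))  # confirmed
```

The score of a global hypothesis would then be the sum of the scores of the tracks it contains.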
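Then, the predict phase of the KF uses the state estimate from the current time step to produce an estimate of the state at the next time step:
$$x_{k+1} = F_k x_{k|k} + B_k u_k + w_k \qquad (10)$$
where $F_k$ is the state transition model applied to the current state estimate, $B_k$ is the input-control model applied to the control vector $u_k$ and $w_k$ is the system noise, which is drawn from a zero-mean multivariate normal distribution with covariance $Q_k$. Since $w_k$ has zero mean, the predicted estimate is $x_{k+1|k} = F_k x_{k|k} + B_k u_k$ with predicted error covariance $P_{k+1|k} = F_k P_{k|k} F_k^{T} + Q_k$.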
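As a concrete instance of the KF predict and update phases, the sketch below uses a two-component state (position and velocity) with a position-only measurement, so that the innovation covariance is a scalar and no matrix inverse is needed. The constant-velocity model, the absence of a control input, and the diagonal process noise `q` are simplifying assumptions of this example, not settings taken from the study.

```python
def kf_predict(x, P, dt, q):
    """Predict step for a constant-velocity model, state x = [pos, vel].

    F = [[1, dt], [0, 1]]; q is added to the diagonal of P as a
    simplified process-noise covariance Q (an assumption of this sketch).
    """
    pos, vel = x
    x_pred = [pos + dt * vel, vel]
    # P_pred = F P F^T + Q, written out for the 2x2 case
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q
    return x_pred, [[p00, p01], [p10, p11]]


def kf_update(x, P, z, r):
    """Update step with a position-only measurement z, i.e. H = [1, 0]."""
    s = P[0][0] + r                   # innovation covariance H P H^T + R
    k = [P[0][0] / s, P[1][0] / s]    # Kalman gain P H^T S^-1
    resid = z - x[0]                  # innovation z - H x
    x_new = [x[0] + k[0] * resid, x[1] + k[1] * resid]
    # P_new = (I - K H) P
    P_new = [
        [(1 - k[0]) * P[0][0], (1 - k[0]) * P[0][1]],
        [P[1][0] - k[1] * P[0][0], P[1][1] - k[1] * P[0][1]],
    ]
    return x_new, P_new


x, P = [0.0, 1.0], [[4.0, 0.0], [0.0, 4.0]]
x, P = kf_predict(x, P, dt=1.0, q=0.0)
x, P = kf_update(x, P, z=2.0, r=8.0)
print(x)  # [1.5, 1.25]
print(P)  # [[4.0, 2.0], [2.0, 3.0]]
```

Because the gain is built from a linear transition and a linear observation model, this form cannot represent the nonlinear coupling of velocity and acceleration that motivates the EnKF in the next section.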

TOMHT with Modified EnKF
The EnKF method is mainly divided into two steps: a forecast step and an update step. In the forecast step, a set of sample parameters is first generated from a priori information and used to predict the future evolution of the model. Each of these samples takes the form of a model state vector that contains dynamic parameters, static parameters and observations. In the update step, the state vectors in the sample set are corrected according to the difference between the predicted values obtained in the forecast step and the actual observation data. In this section, the EnKF method is modified to handle multi-source data fusion problems for the TOMHT.

Ensemble Matrix
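An ensemble matrix $Y$ is first introduced in the EnKF, in which each column represents one element of the sample set. Each column is therefore a state vector $y_j$, where $j$ labels both the state vector and the corresponding column of the ensemble matrix. The state vector consists of dynamic and static parameters, which can be expressed as:
$$Y(t_k) = [y_1(t_k), y_2(t_k), \ldots, y_{N_e}(t_k)], \qquad y_j(t_k) = \begin{bmatrix} \eta_j(t_k) \\ \gamma_j \end{bmatrix} \qquad (11)$$
where $N_e$ is the number of samples in the ensemble, $\eta_j$ represents the dynamic parameters, $\gamma_j$ represents the static parameters and $t_k$ is the time of scan $k$. $\bar{Y}(t_k)$ is the mean of the ensemble matrix:
$$\bar{Y}(t_k) = Y(t_k)\,1_{N_e} \qquad (12)$$
where $1_{N_e}$ is an $N_e \times N_e$ matrix with each element being $1/N_e$. Then, the disturbance in the ensemble matrix $Y$ can be expressed as:
$$Y'(t_k) = Y(t_k) - \bar{Y}(t_k) = Y(t_k)\left(I - 1_{N_e}\right) \qquad (13)$$
Then, the covariance matrix $C_Y$ for the sample ensemble is:
$$C_Y(t_k) = \frac{Y'(t_k)\,Y'(t_k)^{T}}{N_e - 1} \qquad (14)$$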
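The ensemble statistics above can be computed directly from the samples. The following is a minimal sketch in plain Python; as an assumption of this example, the ensemble is stored as a list of members (one list of state components per member) rather than as matrix columns, and the function names are illustrative.

```python
def ensemble_mean(members):
    """Component-wise mean over all ensemble members."""
    n = len(members)
    return [sum(m[i] for m in members) / n for i in range(len(members[0]))]


def ensemble_covariance(members):
    """Sample covariance of the ensemble: Y' Y'^T / (Ne - 1)."""
    n = len(members)
    mean = ensemble_mean(members)
    # Perturbations: each member minus the ensemble mean
    perts = [[v - mu for v, mu in zip(m, mean)] for m in members]
    dim = len(mean)
    return [
        [sum(p[a] * p[b] for p in perts) / (n - 1) for b in range(dim)]
        for a in range(dim)
    ]


members = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
print(ensemble_mean(members))        # [3.0, 6.0]
print(ensemble_covariance(members))  # [[4.0, 8.0], [8.0, 16.0]]
```

The off-diagonal entries are what later allow an observation of one component (for example, position) to correct the others (velocity and acceleration).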

Observation Matrix
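Meanwhile, the measurement vector $d \in \mathbb{R}^{N_d}$ is introduced, where $N_d$ is the total number of observation values. We add a perturbation $\{\varepsilon_j\}$ with zero mean to $d$ to obtain the measurement matrix $D$:
$$D(t_k) = [d_1(t_k), d_2(t_k), \ldots, d_{N_e}(t_k)] \qquad (15)$$
where $d_j$ is defined as:
$$d_j(t_k) = d(t_k) + \varepsilon_j(t_k) \qquad (16)$$
and $d(t_k)$ is the observation vector at time $t_k$ of scan $k$. In the multi-sensor data fusion scenario, the perturbation or observation error $\{\varepsilon_j\}$ is different for each sensor. Thus, Equation (16) in the EnKF should be modified accordingly:
$$d_j(t_k) = d(t_k) + \varepsilon_j^{n}(t_k) \qquad (17)$$
where $\varepsilon_j^{n}(t_k)$ represents the perturbation for sensor $n$ at time $t_k$. The perturbation matrix $E_n(t_k)$ of sensor $n$ can be expressed as:
$$E_n(t_k) = [\varepsilon_1^{n}(t_k), \varepsilon_2^{n}(t_k), \ldots, \varepsilon_{N_e}^{n}(t_k)] \qquad (18)$$
The observation covariance matrix of sensor $n$ at time $t_k$ is:
$$C_D^{n}(t_k) = \frac{E_n(t_k)\,E_n(t_k)^{T}}{N_e - 1} \qquad (19)$$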
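The sensor-dependent perturbation of the observations can be sketched as follows. The text only requires the perturbation to have zero mean; drawing it from a Gaussian with the sensor's standard deviation is an assumption of this example, as are the sensor labels and the function name `perturbed_observations`.

```python
import random


def perturbed_observations(d, sigma, n_members, rng):
    """Return Ne perturbed copies d_j = d + eps_j of the observation d.

    eps_j is drawn from N(0, sigma^2) independently per component,
    where sigma is the standard deviation of the reporting sensor.
    """
    return [[v + rng.gauss(0.0, sigma) for v in d] for _ in range(n_members)]


# Hypothetical sensors with different accuracies (standard deviations in km).
sensor_sigma = {"A": 2.0, "B": 15.0}
rng = random.Random(0)

d = [10.0, 20.0]  # observed (x, y) position at one scan
D_a = perturbed_observations(d, sensor_sigma["A"], 100, rng)
D_b = perturbed_observations(d, sensor_sigma["B"], 100, rng)
print(len(D_a), len(D_a[0]))  # 100 2
```

A low-precision sensor therefore produces a widely spread measurement ensemble, which reduces its weight in the update step relative to a high-precision sensor.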

Update Step of the EnKF
The update for the ensemble member $y_j^{a}$ is represented as follows:
$$y_j^{a} = y_j + C_Y H^{T} \left(H C_Y H^{T} + C_D\right)^{-1} \left(d_j - H y_j\right) \qquad (20)$$
where $C_Y$ and $C_D$ are the covariance matrices of $Y$ and $D$, $y_j^{a}$ is the updated state vector, and $y_j$ is the predicted state vector obtained from the previous time step. $d_j$ is defined as:
$$d_j = d + \varepsilon_j \qquad (21)$$
In the multi-sensor data fusion scenario, Equation (20) and Equation (21) should be modified as:
$$y_j^{a} = y_j + C_Y H^{T} \left(H C_Y H^{T} + C_D^{n}\right)^{-1} \left(d_j - H y_j\right) \qquad (22)$$
$$d_j = d + \varepsilon_j^{n} \qquad (23)$$
where the matrix $H$ is a mapping matrix between the model parameter state vector $y_j$ and the observation vector $d_j$. Usually, there is no linear mapping between the model parameter state vector $y_j$ and the observation vector $d_j$ because of the strong nonlinearity of the model. This is the reason why the KF is unable to deal with nonlinear problems. However, when the predicted values of the observation data are contained in the elements of the state vector $y_j$, the form of the mapping matrix becomes very simple: the elements of $H$ are only 0 and 1, with a single 1 in each row selecting the state component that corresponds to the observed quantity.
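The update step can be illustrated for a single coordinate axis, where each member holds position, velocity and acceleration and only the position is observed. With the mapping H = [1, 0, 0] the term H C_Y H^T reduces to the position variance, so the gain needs no matrix inverse. This is a sketch under stated assumptions: one scalar observation per scan, Gaussian perturbations, and the prescribed sensor variance used in place of the sample covariance of the perturbations.

```python
import random


def enkf_update(members, d, sigma, rng):
    """Update each member y_j with a perturbed position observation.

    members: list of [position, velocity, acceleration] states.
    d: observed position; sigma: std of the sensor reporting at this scan.
    """
    n = len(members)
    dim = len(members[0])
    mean = [sum(m[i] for m in members) / n for i in range(dim)]
    # First column of C_Y: covariance of every component with position
    cov_pos = [
        sum((m[i] - mean[i]) * (m[0] - mean[0]) for m in members) / (n - 1)
        for i in range(dim)
    ]
    # Gain C_Y H^T (H C_Y H^T + C_D)^-1 with C_D approximated by sigma^2
    gain = [c / (cov_pos[0] + sigma ** 2) for c in cov_pos]
    updated = []
    for m in members:
        d_j = d + rng.gauss(0.0, sigma)  # perturbed observation for member j
        innov = d_j - m[0]               # d_j - H y_j
        updated.append([m[i] + gain[i] * innov for i in range(dim)])
    return updated


rng = random.Random(0)
prior = [[rng.gauss(0.0, 5.0), rng.gauss(1.0, 1.0), 0.0] for _ in range(100)]
posterior = enkf_update(prior, d=3.0, sigma=1.0, rng=rng)
print(len(posterior))  # 100
```

Velocity and acceleration are never observed directly; they are corrected only through their sample covariance with the position, which is how the history matching of the motion parameters proceeds.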

Forecast Step of the EnKF
In this study, we no longer assume that the target motion is uniform during each time interval, so the state parameters are the target position ($x_s$, $y_s$), velocity ($x_v$, $y_v$), and acceleration ($x_a$, $y_a$). The state vector $y_j$ at time $t_{k-1}$ is expressed as:
$$y_j(t_{k-1}) = [x_{s,k-1},\; y_{s,k-1},\; x_{v,k-1},\; y_{v,k-1},\; x_{a,k-1},\; y_{a,k-1}]^{T} \qquad (25)$$
The predicted state vector $y_j$ at time $t_k$ is expressed as:
$$y_j(t_k) = [x_{s,k},\; y_{s,k},\; x_{v,k},\; y_{v,k},\; x_{a,k},\; y_{a,k}]^{T} \qquad (26)$$
Let $\Delta t$ represent the time interval between time $t_k$ and $t_{k-1}$. The nonlinear dynamic governing equation used in this study is expressed as:
$$\begin{aligned} x_{s,k} &= x_{s,k-1} + x_{v,k-1}\,\Delta t + \tfrac{1}{2}\,x_{a,k-1}\,\Delta t^{2}, & y_{s,k} &= y_{s,k-1} + y_{v,k-1}\,\Delta t + \tfrac{1}{2}\,y_{a,k-1}\,\Delta t^{2} \\ x_{v,k} &= x_{v,k-1} + x_{a,k-1}\,\Delta t, & y_{v,k} &= y_{v,k-1} + y_{a,k-1}\,\Delta t \\ x_{a,k} &= x_{a,k-1}, & y_{a,k} &= y_{a,k-1} \end{aligned} \qquad (27)$$
Although the predicted acceleration in Equation (27) is tentatively set to the acceleration at the previous time, this acceleration is continuously corrected in the EnKF update step. After the predicted state vector $y_j$ is obtained, Equation (20) can be used to update $y_j$.
As the observation data of each time step continuously enter the EnKF system (the history fitting process), the parameters (speed and acceleration) in the state vector $y_j$ are gradually corrected and approach their true values.
Finally, by substituting Equations (20) to (27) for Equations (7) to (10) in the TOMHT, the model system of the TOMHT based on the EnKF (MHT-EnKF) is established and can be applied to nonlinear movements in multi-sensor data fusion scenarios.
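The forecast step for one ensemble member can be sketched as follows. The sketch assumes the standard kinematic relations for motion with acceleration held constant over one interval, and the state ordering [xs, ys, xv, yv, xa, ya]; both are assumptions of this example. The numerical values in the usage lines are those of the first simulation case.

```python
def forecast(member, dt):
    """Advance one member [xs, ys, xv, yv, xa, ya] over a time interval dt.

    Position and velocity are propagated with the acceleration held at
    its previous value; the acceleration itself is carried over unchanged
    and is only corrected later by the update step.
    """
    xs, ys, xv, yv, xa, ya = member
    return [
        xs + xv * dt + 0.5 * xa * dt ** 2,
        ys + yv * dt + 0.5 * ya * dt ** 2,
        xv + xa * dt,
        yv + ya * dt,
        xa,
        ya,
    ]


# v_x0 = 5 km/h, v_y0 = 4 km/h, a_x = 0.2 km/h^2, a_y = -0.2 km/h^2, dt = 1 h
state = forecast([0.0, 0.0, 5.0, 4.0, 0.2, -0.2], dt=1.0)
print([round(v, 3) for v in state])  # [5.1, 3.9, 5.2, 3.8, 0.2, -0.2]
```

Applying `forecast` to every member of the ensemble yields the predicted ensemble that enters the update step.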

Experimental Settings
To evaluate the performance of the proposed MHT-EnKF model, we design the following simulation scenarios. The first case is based on multi-source observation data of one moving target with acceleration. During the actual movement, the target undergoes uniformly accelerated motion with $v_{x,0}$ = 5 km/h, $a_x$ = 0.2 km/h², $v_{y,0}$ = 4 km/h, $a_y$ = −0.2 km/h². The observation data has four different sources with corresponding standard deviations of 2 km, 5 km, 8 km, and 15 km. Through the first case, we intend to verify the data fusion effectiveness of the MHT-EnKF algorithm in dealing with the MSF problem. The second case uses a set of multi-source observation data of three moving targets. Each target is moving at a different uniform speed (Track 1: [...]). The observation data has six different sources with corresponding standard deviations of 0.5 km, 1 km, 1.5 km, 2 km, 3 km, and 5 km. To verify the capability of the MHT-EnKF to find the correct tracks among multiple track hypotheses, a set of randomly distributed false alarms is generated together with the true observation data.
Figure 1 depicts the trajectory and the error bar of each observation; the target trajectory has 15 data points. The longer the error bar, the larger the standard deviation of the observation, which corresponds to 1 km, 2 km, 4 km and 8 km. The mean value of the observation error at each point is set to zero. In the MHT-EnKF, the initial guesses of the velocity and acceleration are set as follows: the initial mean velocity in the X direction is 6 km/h with a standard deviation of 5 km/h; the initial mean velocity in the Y direction is 3 km/h, also with a standard deviation of 5 km/h; and the initial mean accelerations in the X and Y directions are both 0 km/h² with a standard deviation of 0.25 km/h². The number of EnKF ensemble members is set to 100.
Figure 2 shows the target trajectories predicted from the initial velocity and acceleration guesses alone. It can be seen that, without the EnKF update, the predicted trajectories are widely dispersed and cannot characterize the actual motion state.
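For readers who wish to reproduce a scenario of the first kind, the sketch below generates a uniformly accelerated truth trajectory and noisy position observations with the stated motion parameters and sensor standard deviations. Several details are assumptions of this example rather than settings reported in the study: the target starts at the origin, scans are 1 h apart, one sensor reports per scan in round-robin order, and the observation errors are Gaussian.

```python
import random


def simulate_case(n_steps=15, dt=1.0, sigmas=(2.0, 5.0, 8.0, 15.0), seed=0):
    """Generate truth and noisy observations of one accelerating target.

    Truth: v_x0 = 5, a_x = 0.2, v_y0 = 4, a_y = -0.2 (km, h units).
    Each observation is (x_obs, y_obs, sigma of the reporting sensor).
    """
    rng = random.Random(seed)
    truth, observations = [], []
    for k in range(n_steps):
        t = k * dt
        x = 5.0 * t + 0.5 * 0.2 * t * t
        y = 4.0 * t - 0.5 * 0.2 * t * t
        sigma = sigmas[k % len(sigmas)]  # round-robin sensor assignment
        truth.append((x, y))
        observations.append(
            (x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma), sigma)
        )
    return truth, observations


truth, observations = simulate_case()
print(len(truth), len(observations))  # 15 15
```

Keeping the sensor standard deviation alongside each observation lets the update step apply the matching perturbation for the sensor that produced it.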

Performance Evaluation of Data Fusion Effectiveness of the MHT-EnKF
It can be seen from Figure 4 that after the EnKF update step, the EnKF results are not only closer to the actual target trajectory, but their standard deviations also decrease significantly, especially at the observation points with large observation errors (counting from the left, points 4, 5, 8, 11 and 12), which means that the EnKF results have a higher positioning accuracy. This is because the updated value at each time step has two sources: one is the prediction for the current time obtained by the EnKF forecast step from the updated samples of the previous time step, and the other is the observation data at the current time. When the accuracy of the observation data at the two times differs, the EnKF calculation makes it possible to correct the low-precision observation data with high-precision observation data and to realize information fusion between different observation data sources. In addition, the last three data points in Figures 3 and 4 are trajectory predictions based on the matched target speed and acceleration. It can be seen that the MHT-EnKF predictions still remain close to the real track.
Figures 5 and 6 show the history matching results of the velocities in the X-direction and Y-direction through the EnKF, in which the black curves represent the 100 samples in the EnKF. The 100 curves gradually converge from their dispersed distribution at the initial step to the last step. At the twelfth time step, the average speed in the X direction is 11.56 km/h and the average speed in the Y direction is −2.93 km/h (the actual speeds are 11.80 km/h in the X direction and −2.80 km/h in the Y direction).
Figure 9 compares the trajectory fitting results of the improved EnKF method with those of the KF, EKF and PF. It can be seen from Figure 9 and Table 1 that the trajectory fitted by the MHT-EnKF method constructed in this study is closest to the actual trajectory, whereas the other three methods show weak robustness to the perturbation in the observation series. This also verifies that the MHT-EnKF model constructed in this study is robust in dealing with nonlinear motion problems. To further verify the robustness of the algorithm, we applied it to trajectory data that we collected from a drone produced by Hover. As the GPS update frequency is 10 Hz with a positioning accuracy of 0.5 m, we selected 1 s as the time interval for obtaining the positioning information. The tracking results are depicted in Figure 10. It can be seen from the figure that, for a real target tracking problem, the fitted results of our algorithm still remain close to the observed trajectory.

Performance Evaluation of Multi-Target Tracking Effectiveness of the MHT-EnKF
In Section 4.2, we verified the superiority of the newly proposed MHT-EnKF model in multi-sensor data fusion. However, there was only one moving target in the whole monitoring and tracking process, which means that the advantages of the MHT were not really used. In this section, we design a multi-target, multi-sensor tracking scenario in which the observation data in different scans come from different sensors. This time, the initial guesses of the velocity and acceleration in the MHT-EnKF are set as follows: the initial mean velocity in the X direction is 2 km/h with a standard deviation of 2.5 km/h; the initial mean velocity in the Y direction is 6 km/h with a standard deviation of 2.5 km/h; and the initial mean accelerations in the X and Y directions are 0.5 km/h² and −0.5 km/h², respectively, each with a standard deviation of 0.5 km/h². The number of EnKF ensemble members is again set to 100. As each target actually moves at a uniform velocity, we test the effectiveness of the MHT-EnKF by checking whether the model finds the final acceleration to be 0 km/h².
Figure 11 shows the tracking results of the MHT-EnKF for three targets. The observation data are represented by the randomly distributed diamond points. The false alarms and target observations are all included in these observation points without any distinction. We can see that the MHT-EnKF method finds three tracks among these points, which are plotted as the red, yellow and blue lines. To test the correctness of the tracking results, Figure 12 compares the actual tracks, the corresponding observation points and the MHT-EnKF tracking results with the statistical errors for each track. It shows that, as in the previous example, accuracy corrections can be formed between observation sources of different precisions, and the fused positioning result is closer to the real tracks.
In contrast, as false alarms are generated in each time step, an EnKF tracking process without MHT is depicted in Figure 13. We can see that the three tracks move closer together until they reach the ambiguous area circled in red, and that the tracking results gradually deviate from the real tracks shown in Figure 12 after they leave this area. In the end, even though the initial settings of the EnKF tracking without MHT are the same as those in Figure 11, the tracking results are completely different. This means that the EnKF tracking process without MHT is unable to identify the real tracks when false alarms interfere with the observations. The comparison between Figures 11 and 13 shows the capability of the MHT-EnKF to find the real tracks when interfering observations appear.
Figures 14 and 15 show the history matching results of the velocities and accelerations in the X-direction and Y-direction through the MHT-EnKF for the three tracks. The black curves represent the 100 samples in the EnKF. In both figures, the 100 curves gradually converge from their dispersed distribution at the initial step to the last step. The final matching results are shown in Table 2. The history matching results are very close to the actual values. Moreover, even if the initial guess deviates from the actual motion, the real trajectory can still be found in the presence of false alarms. This demonstrates the robustness of the MHT-EnKF method in dealing with MTT and MSF problems.

Conclusions
Target trajectory analysis based on multi-sensor detection data has always been one of the research focuses in the field of target tracking. In this paper, the EnKF method is modified to solve the MSF problem, and a set of nonlinear kinematic equations is constructed for the forecast step of the EnKF, which makes the modified EnKF method capable of dealing with the nonlinearity arising from the coupling of velocity and acceleration. Furthermore, the MHT is introduced to help the EnKF cope with the multi-target tracking problem. Thus, unlike former research, the MHT-EnKF method established in this study is able to deal with nonlinear MTT problems in MSF scenarios.
Meanwhile, the feasibility of applying the MHT-EnKF model in MSF scenarios is verified by two simulation case studies. The following conclusions can be drawn: (1) in the MSF scenario, the motion state (velocity and acceleration) of the target can be accurately fitted by the MHT-EnKF based on the historical trajectory of the target and used to predict the future motion trajectory; (2) by adding different error disturbances to observation data with different accuracies in the calculation process, the MHT-EnKF method can directly express the uncertainty of the multi-source observation data and take these uncertainties into account in the parameter fitting process; (3) the MHT-EnKF method can realize the fusion of multi-source observation data with different accuracies even for multi-target tracking problems and use high-precision observation data to improve the accuracy of low-precision observation data.
Finally, as the MHT-EnKF model can match the motion state parameters (velocity and acceleration) accurately, abnormal changes in the target motion state parameters can be monitored. It should be mentioned that the computation in this work is not performed in real time. However, as the EnKF is a sequential method, new data can be used to update all parameters as soon as they become available, which makes the EnKF suitable for real-time computation. Some studies have addressed real-time computation for the EnKF using distributed or parallel computing techniques. In addition, the data fusion considered in this study is a multi-sensor asynchronous data fusion problem, which means that only one sensor reports at each observation step. Thus, further research can be carried out on real-time computation for the MHT-EnKF and on the multi-sensor synchronous observation data fusion problem, in which one target may be detected by several sensors simultaneously.