Dual Sensor Control Scheme for Multi-Target Tracking

Sensor control is a challenging issue in the field of multi-target tracking. It involves multi-target state estimation and the optimal control of the sensor. To maximize the overall utility of the surveillance system, we propose a dual sensor control scheme. This work is formulated in the framework of partially observed Markov decision processes (POMDPs) with Mahler’s finite set statistics (FISST). To evaluate the performance associated with each control action, a key element is to design an appropriate metric. From a task-driven perspective, we utilize a metric to minimize the posterior distance between the sensor and the target. This distance-related metric promotes the design of a dual sensor control scheme. Moreover, we introduce a metric to maximize the predicted average probability of detection, which will improve the efficiency by avoiding unnecessary update processes. Simulation results indicate that the performance of the proposed algorithm is significantly superior to the existing methods.


Introduction
Multi-target tracking (MTT) refers to the state estimation of an unknown number of moving targets from measurements that are subject to missed detections and clutter. Mahler's finite set statistics (FISST) [1] provides a unified framework for the MTT problem. With controllable sensors, MTT involves both multi-target state estimation and optimal control of the sensor [2,3], so its core is a sequential process of estimation and decision-making. The problem has attracted intense interest in modern surveillance systems [4]. Surveys of recent advances in multi-target tracking and sensor control are presented in [5,6].
The sensor control problem for multi-target tracking has been studied in the context of partially observed Markov decision processes (POMDPs) [7] with FISST. As a key element of a POMDP, a predefined metric is used to evaluate the performance associated with each control action. Note that the optimal control process is carried out before the real measurement is observed. Generally, combining an appropriate metric with an accurate filter improves the overall performance.
To solve the above problem, a series of methods has been developed. Mahler proposed using the Kullback-Leibler discrimination as a metric in [8] and later defined a metric as the posterior expected number of targets in [9]. Building on the development of multi-target filters, Ristic and Vo [10,11] developed α-divergence-based sensor control algorithms via the probability hypothesis density filter [12,13]. Hoang and Vo [14] used two objective functions via the Cardinality Balanced Multi-Target Multi-Bernoulli (CB-MeMBer) filter [15]. Gostar et al. utilized the Cauchy-Schwarz divergence for the CB-MeMBer filter [16]. From an information-theoretic viewpoint, the mechanism of such algorithms is to measure the information gain between the updated posterior densities and the predicted density [17,18]. Several of these methods can be classified as task-driven algorithms [9,14,18,19], and the others as information-driven algorithms [8,10,11,16]. Previous works have thus paid considerable attention to the choice of metrics (tasks or information divergences), while seldom focusing on the structure of the control scheme.
This paper aims to improve the efficiency of the sensor control algorithm and to maximize the overall utility of the surveillance system. The main contribution is a dual sensor control scheme for the MTT problem. In addition, two novel metrics are developed for the sensor control process. By minimizing the posterior distance between sensor and targets (PDST), the sensor can be driven to the targets directly. By maximizing the predicted average probability of detection (PAPD), more reliable measurements are observed. From a task-driven perspective, both the PDST and the PAPD are motivated by improving the performance of the surveillance system. In particular, the distance-related metric PDST contributes to the redesign of the structure. Moreover, the sensor control process using the PAPD metric remains valid while avoiding unnecessary update steps. Typically, a dual sensor control scheme contains two controllers, and the metric pair used by the two controllers distinguishes different algorithms. Furthermore, the existing evaluation functions can be applied to the proposed dual sensor control scheme directly.
The remainder of this article is structured as follows. A general formulation of the sensor control process is given in Section 2. For completeness, we present a brief review of the δ-generalized labeled multi-Bernoulli (δ-GLMB) filter in Section 3. Section 4 contains the main work on the proposed strategies, and Section 5 describes the dual sensor control algorithms for MTT with one controllable sensor via the δ-GLMB filter. Simulation results and analysis are given in Section 6. Conclusions are drawn in Section 7.

The Formulation of Sensor Control
A general nonlinear multi-target tracking system with sensor control is described by Equations (1)-(3), in which $f_k$ denotes an evolution model, $h_k$ denotes an observation model, and $q_k$ denotes a control model. $X$ represents the state of the targets, $Z$ represents the measurement, $S$ represents the state of the sensors, and $U$ represents the selected control actions. Noise terms describe the uncertainty. The problem can be roughly divided into three interacting parts: Filter, Observer, and Controller. Figure 1 illustrates a general diagram of the sensor control process. Generally, this process is carried out in the framework of POMDPs [2]. The key elements of a POMDP include: (1) a portrayal of the multi-target posterior probability density function (pdf); (2) the admissible control actions of the sensors; and (3) a predefined metric to evaluate the various control actions.
The following parts of this section illustrate the three aspects specifically.

Bayesian Multi-Target Filtering
Mahler's FISST provides a family of solutions to the MTT problem in the random finite set (RFS) framework, and the methods in this paper are derived in this setting. Following the conventional notation, we use lowercase letters for single-target states, e.g., $x$, $z$, and capital letters for multi-target states, e.g., $X$, $Z$. Blackboard bold letters denote spaces, e.g., $\mathbb{X}$, $\mathbb{Z}$, and $\mathcal{F}(\mathbb{X})$ represents the collection of all finite subsets of the space $\mathbb{X}$. At time $k$, the multi-target state $X_k \in \mathcal{F}(\mathbb{X})$ and the multi-target measurement $Z_k \in \mathcal{F}(\mathbb{Z})$ are described as RFSs in Equations (4) and (5): $X_k$ in Equation (4) encapsulates the target motions, births, and deaths, and $Z_k$ in Equation (5) encapsulates imperfect detection and false alarms. Let $\pi_{k|k-1}(X_k|Z_{1:k-1})$ denote the predicted multi-target posterior density and $\pi_k(X_k|Z_{1:k})$ denote the updated multi-target posterior density at time $k$. The predicted and updated densities are calculated by Equations (6) and (7), where $f_k(X_k|X)$ is the multi-target transition density and $g_k(Z_k|X_k)$ is the multi-target likelihood function. The multi-target posterior is thus computed sequentially via the prediction and update steps.
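For reference, the prediction and update steps described above can be written in the standard form of the multi-target Bayes recursion. The block below uses the textbook FISST form with set integrals and the symbols of this section; it is assumed, not confirmed, to coincide with Equations (6) and (7).

```latex
% Prediction (Chapman-Kolmogorov equation with a set integral)
\pi_{k|k-1}(X_k \mid Z_{1:k-1})
  = \int f_k(X_k \mid X)\, \pi_{k-1}(X \mid Z_{1:k-1})\, \delta X

% Update (Bayes rule with the multi-target likelihood)
\pi_k(X_k \mid Z_{1:k})
  = \frac{g_k(Z_k \mid X_k)\, \pi_{k|k-1}(X_k \mid Z_{1:k-1})}
         {\int g_k(Z_k \mid X)\, \pi_{k|k-1}(X \mid Z_{1:k-1})\, \delta X}
```

The set integral $\int \cdot\, \delta X$ sums over all cardinalities, which is what lets a single recursion cover target births, deaths, missed detections, and clutter.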

Ideal Control Process
For the sensor control problem with a fixed number $s$ of controllable sensors, the state of the sensors can be represented by $S_k = \{s_{k,1}, \ldots, s_{k,i}, \ldots, s_{k,s}\}$.
A general task of the sensor control problem is to determine the optimal control action for each sensor. Let $U_k \in \mathbb{U}_k$ denote the desired optimal control action, where $\mathbb{U}_k$ is the set of admissible control actions. Then $U_k = \{u_{k,1}, \ldots, u_{k,i}, \ldots, u_{k,s}\}$, where $u_{k,i}$ denotes the optimal control action for the $i$th sensor.
Most existing methods use an ideal control process (ICP) for simplification, in which each sensor can be driven to one of several positions without considering the specific dynamic process. The admissible control actions are therefore discretized. Given the previous positions of the sensors $S_{k-1}$, their one-step-ahead positions are adopted as $S_k(S_{k-1}, U_k)$, which denotes the admissible positions. In the context of the ICP, the admissible control actions and the admissible positions of the sensors are therefore equivalent. For instance, a set of admissible control actions can be defined as in [10] (Equation (11)), where $\{(x_{k-1,i}, y_{k-1,i})\}_{i=1}^{s}$ represents the previous positions of the sensors, $v_i$ can be viewed as the maximum speed of the $i$th sensor, $j = 0, 1, \ldots, n_R$ indexes the speeds, and $l = 1, \ldots, n_\theta$ indexes the directions. $n_R$ and $n_\theta$ are constants. $T$ is the step length of the control process, and we set $T = 1$ in this paper to represent the single-step control process. Figure 2 gives an example of a single-sensor situation with parameters $s = 1$, $j = n_R = 1$, and $n_\theta = 8$. The speed $v$ corresponds to the length of the arrow, and the endpoints represent the admissible positions of the sensor. Once the optimal control action $u_k$ is determined, we can drive the sensor to one of the eight positions.
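The discretized set of admissible positions can be sketched as follows. This is a minimal illustration for a single sensor, assuming that the speed levels are evenly spaced ($j v T / n_R$) and the directions evenly spaced over the full circle ($2\pi l / n_\theta$); the function name `admissible_positions` and this exact spacing are assumptions and are not taken from Equation (11).

```python
import math

def admissible_positions(x_prev, y_prev, v_max, n_r, n_theta, step=1.0):
    """Candidate one-step-ahead sensor positions under the ideal control process.

    Assumed layout: n_r evenly spaced speed levels up to v_max and
    n_theta evenly spaced headings, plus the option of staying put (j = 0).
    """
    positions = [(x_prev, y_prev)]  # j = 0 collapses to a single "stay" position
    for j in range(1, n_r + 1):
        radius = j * v_max * step / n_r
        for l in range(1, n_theta + 1):
            angle = 2.0 * math.pi * l / n_theta
            positions.append((x_prev + radius * math.cos(angle),
                              y_prev + radius * math.sin(angle)))
    return positions

# Values from the simulation setup: v = 50 m/s, N_R = 2, N_theta = 8.
candidates = admissible_positions(-2000.0, -2000.0, v_max=50.0, n_r=2, n_theta=8)
print(len(candidates))  # 17
```

With two speed levels and eight directions, the sketch yields 1 + 2 × 8 = 17 candidates, which matches the seventeen admissible positions reported for the simulations.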

Evaluation Function
Let $E_1(\cdot)$ denote a reward function. An optimal control strategy is formulated as [10] $U_k^1 = \arg\max_{U \in \mathbb{U}_k^1} \mathrm{E}[E_1(U, \pi_{k-1}(X_{k-1}|Z_{1:k-1}), Z_k^1(U, S_{k-1}))]$ (Equation (12)), where $\mathbb{U}_k^1$ is the set of admissible control actions, $\pi_{k-1}(X_{k-1}|Z_{1:k-1})$ is the previous updated multi-target posterior density calculated by Equations (6) and (7), and $Z_k^1(U, S_{k-1})$ is a virtual observation associated with a specific control action. The optimal control action is selected by maximizing the expectation $\mathrm{E}[\cdot]$ in Controller 1.
From a POMDP perspective, the reward function $E_1(U, \pi, Z)$ is a real-valued function of the control $U$, the previous posterior pdf $\pi$, and the current measurement $Z$. However, the current measurement $Z_k$ used by the Filter can only be observed after the sensor control process has been applied, so the virtual measurement $Z_k^1(U, S_{k-1})$ is used instead. Once the optimal control action is determined, we can drive the sensors to the new positions $S_k^1$. Previous studies have demonstrated that the predefined metric (reward function or cost function) plays a very important role in POMDPs. It is therefore necessary to develop an efficient task-driven strategy for the specific problem.

Predicted Ideal Measurement
Recall the virtual measurement Z 1 k (U, S k−1 ) in Equation (12). The predicted ideal measurement (PIM) is introduced to substitute the missing measurement for its virtual update step. For example, a PIM can be generated by taking the predicted state into the observation model h k in Equation (2). Generally, different control actions will generate different PIMs. The validity of using the PIMs implies that the observation model is accurate, which is a common assumption held by most existing methods.
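As an illustration of how PIMs depend on the control action, the sketch below evaluates a noise-free bearing and range observation for each predicted target from a candidate sensor position. The bearing and range observation follows the simulation setup of this paper; the function names and the `atan2(dy, dx)` bearing convention are assumptions made for the example.

```python
import math

def predicted_ideal_measurement(target_xy, sensor_xy):
    """Noise-free bearing/range observation of one predicted target.

    Assumption: bearing is measured with atan2(dy, dx); the paper's
    exact arctangent convention is not recoverable from the text.
    """
    dx = target_xy[0] - sensor_xy[0]
    dy = target_xy[1] - sensor_xy[1]
    return math.atan2(dy, dx), math.hypot(dx, dy)

def pims_for_action(predicted_targets, sensor_xy):
    """One PIM per predicted target, for a single candidate sensor position."""
    return [predicted_ideal_measurement(t, sensor_xy) for t in predicted_targets]

predicted = [(3.0, 4.0), (-100.0, 0.0)]
pims_a = pims_for_action(predicted, (0.0, 0.0))   # candidate position A
pims_b = pims_for_action(predicted, (50.0, 0.0))  # candidate position B
```

Because the sensor position enters the observation model, the two candidate positions produce different PIM sets for the same predicted targets, which is why a separate virtual update is needed for every control action.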

Delta-Generalized Labeled Multi-Bernoulli Filter
For multi-target state estimation, this part provides a brief review of the δ-generalized labeled multi-Bernoulli (δ-GLMB) filter [20,21]. Following the conventional notation in the references, a δ-GLMB is completely characterized by the parameter set $\{(\omega^{(I,\xi)}, p^{(\xi)}) : (I,\xi) \in \mathcal{F}(\mathbb{L}) \times \Xi\}$, and the δ-GLMB filter propagates a δ-GLMB density through prediction and update steps recursively. For simplicity, we omit the time index and use the symbol "+" to denote predicted quantities.

Prediction
The predicted density combines two parts: the density of the existing targets and the density of the newborn targets. Assume that the birth process is formulated as a GLMB RFS in $\mathbb{X} \times \mathbb{B}$, where $\mathbb{B}$ is the label space of newborn targets. For the existing density, the label space is $\mathbb{L}$. The two spaces must be disjoint, i.e., $\mathbb{L} \cap \mathbb{B} = \emptyset$.
The birth process follows a labeled multi-Bernoulli model, and the existing part is described by the δ-GLMB parameter set of the current posterior. Using $p_S(\cdot, l)$ and $f(x|\cdot, l)$ to denote the single-target survival probability and Markov transition density, the parameters of the surviving δ-GLMB are written in terms of the inner product $\langle f, g \rangle \triangleq \int f(x) g(x)\,dx$. Given the current δ-GLMB multi-target posterior and the birth density, the predicted multi-target posterior at the next time is a δ-GLMB with parameter set $\{(\omega_+^{(I_+,\xi)}, p_+^{(\xi)}) : (I_+,\xi) \in \mathcal{F}(\mathbb{L}_+) \times \Xi\}$, where $\mathbb{L}_+ = \mathbb{L} \cup \mathbb{B}$ denotes the new label space. The predicted δ-GLMB parameters are calculated as in [20,21].

Update
In the update step, the δ-GLMB filter takes missed detections and clutter into account. A target with state $(x, l)$ is detected with probability $p_D(x, l)$. Recalling the notation of Equation (5), we use $Z$ to denote the measurement set. Clutter is assumed to be a Poisson RFS with intensity $\kappa(z)$, and the likelihood of a measurement $z$ is denoted by $g(z|x, l)$. The updated posterior is a δ-GLMB with parameter set $\{(\omega^{(I_+,\xi,\theta)}(Z), p^{(\xi,\theta)}(\cdot,\cdot|Z)) : (I_+,\xi) \in \mathcal{F}(\mathbb{L}_+) \times \Xi,\ \theta \in \Theta(I_+)\}$, where $\theta \in \Theta(I_+)$ is the association between the label set and the measurements. The expressions for the updated parameters and their intermediate terms follow [20,21].

State Estimation
From an implementation viewpoint, multi-target state estimation is the purpose of the sensor chasing algorithm. In this part, we use a marginal multi-Bernoulli estimator to extract the states; the core idea is to extract estimates via the best cardinality. The cardinality distribution is evaluated for $n = 1, \ldots, N_{\max}$, where $\mathcal{F}_n(\mathbb{L})$ denotes the subsets of the space $\mathbb{L}$ with $n$ targets and $N_{\max}$ is the predefined maximum number of targets. In the simplified estimation process, the labels and states are taken from the highest-weighted component that has the estimated cardinality $\hat{N}$.
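The extraction via best cardinality can be sketched as follows. The data layout is hypothetical: each δ-GLMB component is represented as a `(weight, tracks)` pair, where `tracks` maps a label to a state estimate, and the function name `extract_estimate` is introduced only for this example.

```python
from collections import defaultdict

def extract_estimate(components):
    """Pick the most probable cardinality, then the best component of that size.

    components: list of (weight, {label: state}) pairs (assumed layout).
    """
    # Cardinality distribution: sum the weights of components with n tracks.
    cardinality = defaultdict(float)
    for weight, tracks in components:
        cardinality[len(tracks)] += weight
    n_hat = max(cardinality, key=cardinality.get)

    # Among components with n_hat tracks, keep the highest-weighted one.
    candidates = [c for c in components if len(c[1]) == n_hat]
    _, best_tracks = max(candidates, key=lambda c: c[0])
    return n_hat, best_tracks

components = [
    (0.40, {"a": (0.0, 0.0)}),
    (0.35, {"a": (0.0, 0.0), "b": (1.0, 1.0)}),
    (0.25, {"a": (0.1, 0.0), "c": (2.0, 2.0)}),
]
n_hat, tracks = extract_estimate(components)
```

In this toy input the single heaviest component has one track (weight 0.40), but two-track components carry more total weight (0.60), so the estimator reports two targets and returns the labels and states of the 0.35 component.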

A Novel Structure of Dual Sensor Control Scheme
For most existing works, once the control action is determined, no further correction step is involved. To maximize the overall utility of the system, an additional decision-making process is introduced. In this part, we propose a dual sensor control scheme. Figure 3 shows a diagram of the proposed structure.
Let $E_2(\cdot)$ denote the evaluation function of Controller 2. The additional control process is $U_k^2 = \arg\max_{U \in \mathbb{U}_k^2} \mathrm{E}[E_2(U, \pi_k(X_k|Z_{1:k}), S_k(U, S_k^1))]$ (Equation (27)), where $\mathbb{U}_k^2$ is the set of admissible control actions and $\pi_k(X_k|Z_{1:k})$ is the updated multi-target posterior density after applying the current measurement $Z_k$. $S_k(U^2, S_k^1)$ is a one-step-ahead position of the sensors associated with a control action $U^2$, and $S_k^1$ denotes the positions of the sensors after Controller 1. The optimal control action is selected by maximizing the expectation $\mathrm{E}[\cdot]$ in Controller 2.
In contrast to the sensor control process in Equation (12), the evaluation function $E_2(U, \pi, S)$ is a real-valued function of the control $U$, the current posterior pdf $\pi$, and the future positions of the sensors $S$. The additional sensor control process is thus carried out using the real measurement information. As in Controller 1, once the optimal control action is determined, we can drive the sensors to the new positions $S_k$.

Minimize the Posterior Distance between Sensor and Targets
For Controller 2, an intuitive idea is to choose the control action that minimizes the distance between sensors and targets after the real observation is obtained. Therefore, we define a distance-related metric, namely the posterior distance between sensor and targets (PDST). Equation (27) then reduces to minimizing the PDST $D(\hat{X}_k(\cdot), S_k(U^2, S_k^1))$ over the admissible control actions (Equation (28)), where $\hat{X}_k(\cdot)$ is the estimated state of the targets extracted from the updated multi-target posterior density $\pi_k(X_k|Z_{1:k})$, and $S_k(U^2, S_k^1)$ is a one-step-ahead position of the sensors associated with a control action $U^2$.
Generally, the PDST is calculated between two sets with different cardinalities. As an example, we recommend utilizing the optimal sub-pattern assignment (OSPA) [22] metric. The OSPA distance between two sets $X = \{x_1, \ldots, x_m\}$ and $\hat{X} = \{\hat{x}_1, \ldots, \hat{x}_n\}$ is defined by $\bar{d}_p^{(c)}(X, \hat{X}) = \left( \frac{1}{n} \left( \min_{\pi \in \Pi_n} \sum_{i=1}^{m} d^{(c)}(x_i, \hat{x}_{\pi(i)})^p + c^p (n - m) \right) \right)^{1/p}$ for $m \le n$, and $\bar{d}_p^{(c)}(X, \hat{X}) = \bar{d}_p^{(c)}(\hat{X}, X)$ for $m > n$, where $d^{(c)}(x, \hat{x}) := \min(c, \|x - \hat{x}\|)$, $\Pi_k$ is the set of permutations on $\{1, 2, \ldots, k\}$, $p \ge 1$ is the order, and $c > 0$ is the cut-off. Given the estimate $\hat{X}_k(\cdot) = \{\hat{x}_{k,1}, \ldots, \hat{x}_{k,N_k}\}$ and a realization of $S_k(U^2, S_k^1) = \{s_{k,1}, \ldots, s_{k,i}, \ldots, s_{k,s}\}$, the PDST $D(\hat{X}_k(\cdot), S_k(U^2, S_k^1))$ is adopted as $\bar{d}_p^{(c)}$ by substituting $X = \hat{X}_k(\cdot)$ and $\hat{X} = S_k(U^2, S_k^1)$. An equivalent sensor control strategy can also be designed for Controller 1 by using the PDST metric. Based on the output of the Virtual Filter, we obtain the virtual updated multi-target posterior density $\pi_k^1(X_k^1|Z_k^1(U, S_{k-1}))$. The corresponding control equation then minimizes the PDST $D(\hat{X}_k^1(\cdot), S_k^1(U^1, S_{k-1}))$ over the admissible control actions, where $\hat{X}_k^1(\cdot)$ is the estimated state of the targets extracted from the virtual updated multi-target posterior density produced by the Virtual Filter. Compared to $\hat{X}_k(\cdot)$ in Equation (28), the main difference is that the value of $\hat{X}_k^1(\cdot)$ depends on the control action $U^1$. $S_k^1(U^1, S_{k-1})$ is a one-step-ahead position of the sensors associated with a control action $U^1$. The metrics for the two controllers do not need to be the same. Next, we develop a more efficient metric for Controller 1.
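A small sketch of the OSPA distance, as it could be used for the PDST between estimated target positions and sensor positions, is given below. It follows the standard OSPA definition with cut-off `c` and order `p`. The assignment is solved by brute force over permutations, which is an implementation shortcut chosen for this example and is only practical for small sets; a real implementation would typically use an optimal assignment solver.

```python
import itertools
import math

def ospa(X, Y, c=100.0, p=1.0):
    """OSPA distance between two finite sets of points (lists of tuples)."""
    if len(X) > len(Y):
        X, Y = Y, X              # the definition is symmetric; keep m <= n
    m, n = len(X), len(Y)
    if n == 0:
        return 0.0               # both sets empty
    # Best assignment of the m points in X to distinct points in Y.
    best = min(
        sum(min(c, math.dist(x, Y[j])) ** p for x, j in zip(X, perm))
        for perm in itertools.permutations(range(n), m)
    )
    # Each unassigned point in the larger set costs the cut-off c.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)

targets = [(0.0, 0.0), (1000.0, 0.0)]
sensor = [(0.0, 0.0)]
pdst = ospa(targets, sensor, c=100.0, p=1.0)
```

In the usage above, one target coincides with the sensor and the other has no sensor to pair with, so the distance is the cut-off penalty averaged over two elements.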

Maximize the Predicted Average Probability of Detection
Since the existing evaluation functions are computed from the virtual updated multi-target posterior densities, the Virtual Filter has to be run once for every admissible control action, which makes Controller 1 very time-consuming.
To improve the efficiency, we define a novel evaluation function, namely the predicted average probability of detection (PAPD) metric. Equation (12) is simplified to maximizing the PAPD metric $P_D(\cdot)$, evaluated at $\hat{X}_{k|k-1}(\cdot)$ and $S_k^1(U^1, S_{k-1})$, over the admissible control actions, where $\hat{X}_{k|k-1}(\cdot)$ is the estimated state of the targets extracted from the predicted multi-target posterior density $\pi_{k|k-1}(X_k|Z_{1:k-1})$, which is calculated by Equation (6), and $S_k^1(U^1, S_{k-1})$ is a one-step-ahead position of the sensors associated with a control action $U^1$. Since the PAPD metric $P_D(\cdot)$ is computed from the prediction alone, the update step is avoided, and there is no need to involve the PIMs. Consequently, the Virtual Observer and the Virtual Filter in Figure 3 are reduced to a Predictor.
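The PAPD-based selection in Controller 1 can be sketched as below: for each candidate sensor position, average the detection probability over the predicted target estimates and keep the best candidate. The detection profile `example_p_d` is a hypothetical exponential decay used only to make the example runnable; it is not the detection function of this paper, and the function names are likewise assumptions.

```python
import math

def papd(predicted_targets, sensor_xy, p_d):
    """Average detection probability over the predicted target positions."""
    if not predicted_targets:
        return 0.0  # assumption: no predicted targets gives zero reward
    total = sum(p_d(math.dist(t, sensor_xy)) for t in predicted_targets)
    return total / len(predicted_targets)

def select_action_papd(predicted_targets, candidate_positions, p_d):
    """Controller 1 sketch: maximize PAPD; no virtual update step is needed."""
    return max(candidate_positions,
               key=lambda s: papd(predicted_targets, s, p_d))

def example_p_d(distance, p_max=0.98, scale=2000.0):
    """Hypothetical distance-dependent detection probability."""
    return p_max * math.exp(-distance / scale)

predicted = [(100.0, 0.0), (120.0, 0.0)]
candidates = [(0.0, 0.0), (50.0, 0.0), (-50.0, 0.0)]
best = select_action_papd(predicted, candidates, example_p_d)
```

Only the prediction and one pass over the candidates are needed, which is the source of the savings relative to metrics that require a virtual filter update per control action.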
For a comprehensive understanding, the following section describes the main steps of the proposed dual sensor control algorithms.

Dual Sensor Control Algorithms
Following the structure in Figure 3, we present the details of the dual sensor control algorithms for the MTT problem with one controllable sensor via δ-GLMB filter. We use a metric pair to distinguish different algorithms. Algorithm 1 shows the pseudo-code of the dual sensor control algorithms.

Algorithm 1 Dual sensor control algorithms
Input: sensor position $s_{k-1}$, the posterior pdf $\pi_{k-1}$, and the admissible control set $\mathbb{U}_k$
1. Prediction: compute the predicted pdf $\pi_{k|k-1}$ by Section 3.1
2. State estimation: extract the predicted estimated targets' state $\hat{X}_{k|k-1}$ by Section 3.3
3. ...
Endfor: obtain the optimal control $u_k^1$, and drive the sensor to the new position $s_k^1(u_k^1, s_{k-1})$
Elseif Metric == PDST
7. ...
Calculate the center of the virtual estimation
Calculate the admissible sensor position $s_k^1(u^1, s_{k-1})$ by Section 2.2
13. ...
Endfor: obtain the optimal control $u_k^1$, and drive the sensor to the new position $s_k^1(u_k^1, s_{k-1})$
End
Observer: get the real observation $Z_k$
15. Update: compute the posterior pdf $\pi_k$ by Section 3.2
16. State estimation: extract the estimated targets' state $\hat{X}_k$ by Section 3.3
Controller 2:
17. Calculate the center of the estimation
Calculate the admissible sensor position $s_k^2(u^2, s_k^1(u_k^1, s_{k-1}))$ by Section 2.2
20. Evaluation: minimize the PDST by Equation (35)
21. Endfor: obtain the optimal control $u_k^2$, and drive the sensor to the new position $s_k^2(u_k^2, s_k^1(u_k^1, s_{k-1}))$
Output: control pair $\{u_k^1, u_k^2\}$, sensor position $s_k$, the posterior pdf $\pi_k$, and the estimate $\hat{X}_k$

Dual Sensor Control Algorithm with PAPD and PDST
In this part, we set the PAPD as the metric in Controller 1 and the PDST as the metric in Controller 2. At time $k$, the inputs are the previous position of the sensor $s_{k-1}$, the representation of the previous posterior density $\pi_{k-1}$, and the set of admissible control actions $\mathbb{U}_k$. The outputs are the optimal control pair $\{u_k^1, u_k^2\}$, the position of the sensor $s_k$, and the representation of the resulting posterior density $\pi_k$.
Let $\hat{X}_{k|k-1} = \{\hat{x}_{k|k-1,1}, \ldots, \hat{x}_{k|k-1,N_{k|k-1}}\}$ denote the predicted multi-target state and $\hat{X}_k = \{\hat{x}_{k,1}, \ldots, \hat{x}_{k,N_k}\}$ denote the posterior multi-target state. For a given $p_D(\cdot)$ function, the metric pair for Algorithm 1 is calculated from these estimates, where $D(\cdot)$ is the Euclidean distance between the sensor and the center of the posterior multi-target states $\hat{x}_{k,j}$ in Controller 2. Each controller determines an optimal control action and drives the sensor to a new location. At time $k$, the final position of the sensor is $s_k = s_k^2(u_k^2, s_k^1(u_k^1, s_{k-1}))$.
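The Controller 2 step with the Euclidean distance to the center of the estimated targets can be sketched as follows. The function names are introduced for this example, and the fallback used when no target is extracted (returning the first candidate, intended to be the "stay" position) is an assumption, since the text does not specify that case.

```python
import math

def centroid(points):
    """Center of a non-empty list of 2-D positions."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def select_action_pdst(estimated_targets, candidate_positions):
    """Controller 2 sketch: move as close as possible to the targets' center."""
    if not estimated_targets:
        # Assumption: with no extracted target, keep the first candidate.
        return candidate_positions[0]
    center = centroid(estimated_targets)
    return min(candidate_positions, key=lambda s: math.dist(s, center))

estimates = [(100.0, 100.0), (300.0, 100.0)]        # posterior estimates at time k
candidates = [(0.0, 0.0), (50.0, 0.0), (0.0, 50.0)]  # reachable from s_k^1
s_k = select_action_pdst(estimates, candidates)
```

Here the targets' center is (200, 100), and the candidate (50, 0) is the closest of the three, so it becomes the final sensor position for the step.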

Dual Sensor Control Algorithm with PDST and PDST
In this part, we set the PDST as the metric in Controller 1. Compared to the formulas above, the main difference lies in Controller 1. Recalling Section 2.4, PIMs are involved in the Virtual Filter. Because of the way a PIM is generated, the result of the Virtual Filter changes with the control action. Hence, the metric pair for Algorithm 1 is calculated with $\hat{N}^1_{k,u^1} = |\hat{X}^1_k(u^1)|$ denoting the number of virtual estimated targets associated with control $u^1$. Note that the structure of the proposed algorithm does not depend on a specific filter. In the dual sensor control scheme, the evaluation function for Controller 2 is specified to be the PDST metric. In addition, single sensor control counterparts of the above algorithms can be easily implemented by omitting Controller 2.

Setup of the Simulations
A nonlinear multi-target scenario is studied in this section. The number of targets varies over time, and the observations are affected by imperfect detection and clutter. Figure 4 shows the ground-truth trajectories of a total of six targets in the surveillance area of 4000 m × 4000 m.
The survival time of each target during the simulations is as follows: Target1 from 1 to 100; Target2 from 10 to 100; Target3 from 20 to 100; Target4 from 40 to 100; Target5 from 40 to 100; and Target6 from 60 to 100.
The single-target state $x_k = [\tilde{x}_k^T, \omega_k]^T$ comprises the location and velocity $\tilde{x}_k = [x_k, \dot{x}_k, y_k, \dot{y}_k]^T$ and the turning rate $\omega_k$. The single-target transition density is Gaussian with mean $m(x_k) = [(F(\omega_k)\tilde{x}_k)^T, \omega_k]^T$ and covariance $Q = \mathrm{diag}(\sigma_w^2 G G^T, \sigma_u^2)$, where $T = 1$ s is the sampling time in the matrices $F(\omega_k)$ and $G$. The standard deviation of the process noise is $\sigma_w = 5$ m/s$^2$, and the standard deviation of the turn rate noise is $\sigma_u = \pi/180$ rad/s. The survival probability (for prediction) is set to the constant $p_S = 0.99$. For the birth process, we follow the model in [20].
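For illustration, the noise-free part of such a transition can be sketched with the standard coordinated turn model for the state ordering $[x, \dot{x}, y, \dot{y}, \omega]$. The matrix entries below are the textbook coordinated turn form; they are assumed, not confirmed, to match the matrix $F(\omega_k)$ used in this paper, and the function name `ct_predict` is hypothetical.

```python
import math

def ct_predict(state, T=1.0):
    """Noise-free coordinated turn prediction for (x, vx, y, vy, omega)."""
    x, vx, y, vy, w = state
    if abs(w) < 1e-9:
        # Limit of the model as omega -> 0: constant velocity motion.
        return (x + T * vx, vx, y + T * vy, vy, w)
    s, c = math.sin(w * T), math.cos(w * T)
    return (
        x + (s / w) * vx - ((1.0 - c) / w) * vy,
        c * vx - s * vy,
        y + ((1.0 - c) / w) * vx + (s / w) * vy,
        s * vx + c * vy,
        w,  # the turn rate itself is unchanged without noise
    )

turning = ct_predict((0.0, 1.0, 0.0, 0.0, math.pi / 2))
straight = ct_predict((0.0, 10.0, 0.0, 5.0, 0.0))
```

The turn rotates the velocity vector while preserving speed, so a target moving along the x-axis with a turn rate of π/2 rad/s ends the step heading along the y-axis.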
The initial position of the mobile sensor is (−2000 m, −2000 m). We assume that the sensor control process is the ideal control process of Section 2.2, and the admissible control actions are calculated by Equation (11). The number of sensors is $s = 1$. Given the previous position of the sensor $s_{k-1} = (x^s_{k-1}, y^s_{k-1})$, the admissible positions of the sensor follow Equation (11) with $j \in \{0, 1, 2\}$ and $N_R = 2$; $v = 50$ m/s represents the maximum speed of the sensor and $N_\theta = 8$ is the number of directions. Figure 5 shows the seventeen admissible positions of the mobile sensor. We use $s_k = (x^s_k, y^s_k)$ to denote the current position of the sensor. The probability of detection is location-dependent, with peak value $p_{D,\max} = 0.98$. If detected, each target produces one observation $z = [\theta, r]^T$, namely a bearing and range measurement. The likelihood is calculated from the mean measurement $\mu(x_k, s_k)$, whose bearing component is an arctangent of the target position relative to the sensor, with $\sigma_\theta = \pi/180$ rad and $\sigma_r = 5$ m. Recalling the clutter model in the update step, clutter follows a Poisson RFS with intensity $\kappa_k(z) = \lambda_c U(\mathcal{Z})$, where $U(\mathcal{Z})$ denotes a uniform density on the disc of radius 2000 m, and we set $\lambda_c = 8 \times 10^{-4}$ (rad·m)$^{-1}$ for an average of 10 clutter measurements per scan. The particle implementation of the δ-GLMB filter is employed in the simulations for its flexibility in dealing with nonlinear tracking problems. The particle filter uses 1000 particles per target and resamples when the effective number of particles falls below 500.
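The stated clutter rate can be checked with a short calculation. The sketch below assumes that the measurement space of the disc of radius 2000 m is the bearing and range region $[0, 2\pi)$ rad × $[0, 2000]$ m; this region is an assumption made for the check.

```latex
% Expected number of clutter measurements per scan
\lambda_c \cdot |\mathcal{Z}|
  = 8 \times 10^{-4}\ (\mathrm{rad\,m})^{-1} \times 2\pi\ \mathrm{rad} \times 2000\ \mathrm{m}
  \approx 10.05
```

Under that assumption the value agrees with the stated average of 10 clutter measurements per scan.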

Results and Analysis
In this part, we use "double controller" to denote an algorithm with the dual sensor control scheme. Figure 6 shows a single run of Algorithm 1 with the metric pair PAPD-PDST. Each track of the estimation results is assigned a different color in Figure 6a. Figure 6b shows the trajectory of the sensor, where each point denotes the sensor position after Controller 2. Figure 7 provides the results of Algorithm 1 with the metric pair PDST-PDST. The proposed algorithms obtain acceptable estimation results while achieving the tracking task, and both drive the sensor to the center of the targets.
To analyze the statistical characteristics of the algorithms, we performed 100 Monte Carlo simulations of each. The expected sensor trajectories with timestamps after applying the sensor control processes are given in Figure 8a,b. For comparison, we use the sensor control method based on the Cauchy-Schwarz divergence, with results provided in Figure 8c. The trajectories in Figure 8a,b are quite similar, while they differ significantly from those in Figure 8c. Based on an approximate analysis of the timestamps, the proposed algorithms with a double controller are more effective. The Cauchy-Schwarz divergence-based algorithms track more slowly, especially with a single controller, and the introduction of the double controller improves their tracking efficiency.
The average values of each metric in Controller 1 are plotted in Figure 9. Figure 9a shows that the average probability of detection gradually increases until it reaches its peak (at approximately 40 s for the double controller and 70 s for the single controller). Figure 9b shows that the average distance between sensor and targets gradually decreases until the sensor reaches the targets' center (at approximately 45 s for the double controller and 90 s for the single controller). No obvious trend is indicated in Figure 9c. Note that the missing part in Figure 9b (first 20 s) arises because no target is extracted in some Monte Carlo trials.
To evaluate the performance of the different algorithms in multi-target state estimation, the optimal sub-pattern assignment (OSPA) [22] metric is used, calculated by Equation (29). Figure 10a shows the OSPA results (with parameters $c = 100$ and $p = 1$) and indicates that the errors of the double controllers are smaller. Figure 10b shows the cardinality of the estimates compared to the truth. It indicates similar performance across the algorithms, except for the single controller with the Cauchy-Schwarz divergence (C-S divergence). Whenever the number of targets changes, the OSPA errors suddenly increase (at 20, 40, 60, 80 time steps) and eventually settle to a certain value. Table 1 shows the average computing time over 100 time steps. It indicates that the double controller with the PAPD metric uses the least computing time, approximately 8.34 s per step. For one step, the PDST-based algorithms need about 44 s and the Cauchy-Schwarz divergence-based algorithms need 50 s. This result validates the effectiveness of the proposed PAPD metric. Table 1 also indicates that the double sensor control algorithms require less computing time overall than their single controller counterparts. This is because the additional controller can increase the overall tracking efficiency by driving the sensor to a location with a higher detection probability. The simulations are implemented in MATLAB R2017b (The MathWorks, Inc., Natick, MA, USA) on a desktop computer with an Intel Core i5-4570 CPU (Santa Clara, CA, USA) and 4 GB of RAM.

Further Discussion
To investigate the adaptability of the dual sensor control algorithm with PAPD and PDST, another commonly used location-dependent detection function is introduced, with parameters $p_{D,\max} = 0.98$, $R = 300$ m, and $h = 1.25 \times 10^{-4}$ m$^{-1}$. Figure 11a indicates that the expected trajectory is similar to that of the double controller in Figure 8a. In addition, Figure 11b exhibits a clear increase in PAPD during the first 40 time steps. These simulation results support the adaptability of Algorithm 1.
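One detection profile that fits these three parameters is a piecewise-linear function that stays at its peak within range $R$ and then decreases with slope $h$. The sketch below uses that form as an assumption; the exact expression used in this section is not recoverable from the text, and the function name is hypothetical.

```python
def detection_probability(distance, p_max=0.98, r0=300.0, slope=1.25e-4):
    """Assumed piecewise-linear detection profile.

    Constant at p_max up to r0 metres, then decreasing linearly with
    the given slope (per metre) and clipped at zero.
    """
    if distance <= r0:
        return p_max
    return max(0.0, p_max - (distance - r0) * slope)

near = detection_probability(100.0)
far = detection_probability(1300.0)
```

Under this assumed form, a target 1000 m beyond the plateau loses 1000 × 1.25 × 10⁻⁴ = 0.125 in detection probability, so the profile decays slowly across the 4000 m × 4000 m surveillance area.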

Conclusions
A dual sensor control scheme for multi-target tracking was proposed in the context of POMDPs with FISST. The proposed scheme does not rely on a specific filter, and existing evaluation functions can be applied to it directly. Typically, a dual sensor control algorithm is characterized by its metric pair. From a task-driven perspective, two novel metrics were developed. The PDST metric is based on an understanding of the multi-target tracking task, and the PAPD metric is motivated by improving the efficiency. Simulation results demonstrated that the proposed dual sensor control scheme can improve both the multi-target state estimation accuracy and the overall efficiency. Moreover, the algorithm that uses the recommended metric pair (PAPD-PDST) has shown excellent performance and adaptability. In future work, we will apply these methods to multi-sensor and more complex scenarios.