A Novel Square-Root Cubature Information Weighted Consensus Filter Algorithm for Multi-Target Tracking in Distributed Camera Networks

This paper deals with the problem of multi-target tracking in a distributed camera network using the square-root cubature information filter (SCIF). The SCIF is an efficient and robust nonlinear filter for multi-sensor data fusion. In camera networks, multiple cameras are arranged in a dispersed manner to cover a large area, and a target may enter a blind area due to the cameras' limited fields of view (FOV). In addition, each camera may receive noisy measurements. To overcome these problems, this paper proposes a novel multi-target square-root cubature information weighted consensus filter (MTSCF), which reduces the effect of clutter or spurious measurements by using joint probabilistic data association (JPDA) and proper weights on the information matrix and information vector. Simulation results show that the proposed algorithm efficiently tracks multiple targets in camera networks and is clearly superior to conventional multi-target tracking algorithms in terms of accuracy and stability.


Introduction
With the rapid development of image processing, sensor and semiconductor technology, the availability of inexpensive hardware, such as CMOS cameras, that are able to ubiquitously capture video content from the environment has fostered the development of camera networks [1]. Cameras have been widely used in smart homes, wide-area surveillance, intelligent transportation, medical care, industrial control, etc.
Multiple cameras can cover a large area, communicate with each other through the network and then fuse all of their measurements to achieve robust scene understanding. However, factors such as weather, illumination and shadow make the measurements prone to noise. At the same time, the presence of multiple targets in the scene increases the difficulty of tracking. In this paper, we focus on the problem of tracking multiple targets through a camera network. In many application scenarios of camera networks, the observation is a nonlinear function of the target state. Consequently, we propose a novel algorithm for these complicated application scenarios.
A camera network is a set of resource-constrained camera-equipped sensor nodes spread over a large area. The limitations of a centralized architecture are obvious: when large volumes of data must be transmitted, processed and interpreted by resource-constrained nodes and delivered to a fusion center, the network may easily fail because of the energy consumption and communication burden.
One way of addressing this issue is through the novel paradigm of distributed algorithms. Recently, distributed algorithms have witnessed a surge in interest that has enabled a wide range of cooperation and information fusion in bandwidth-limited sensor networks. They are advantageous for target tracking in camera networks due to their scalability and high fault tolerance [2,3].
In a distributed estimation scheme, the system must adopt certain strategies to share information. In recent years, many researchers have proposed linear consensus protocols to deal with this problem through multiple iterations of communication between a local node and its neighboring nodes. For example, Olfati-Saber et al. [4] provided a theoretical framework for consensus and cooperation in multi-agent systems. In their paper, they made a detailed analysis of a consensus algorithm for multi-agent networked systems, with an emphasis on the role of directed information flow, robustness to changes in the network topology due to link/node failures, time delays and performance guarantees. Ren et al. [5] considered the problem of information consensus among multiple agents exchanging information over dynamically changing interaction topologies, and gave conditions for asymptotic consensus under dynamically changing topologies and weighting-factor update schemes.
Combining the above-mentioned consensus algorithms with a filtering algorithm, such as the Kalman filter, one can achieve target tracking. Olfati-Saber introduced a novel distributed Kalman consensus filtering (KCF) algorithm for sensor networks [6]. The KCF algorithm works under the assumption that every sensor can sense all targets. However, in a realistic camera network, a target can usually be seen by none or only a few of the cameras. In [7], Olfati-Saber et al. considered this case, but their solution is a hybrid P2P/hierarchical architecture, not fully distributed and not suitable for large-scale networks. Kamal et al. [2] proposed an information weighted consensus filter (IWCF) to deal with this problem by placing proper weights on the prior state and the measurement information. In camera networks, the measurement model does not evolve linearly. Hence, tracking algorithms that depend on linear filters, such as the traditional Kalman filter and the information filter, cannot be applied. Katragadda et al. [8] proposed two consensus-based distributed algorithms for nonlinear systems using the extended information filter (EIF). However, this filter adopts multivariate Taylor series expansions to linearize the model, and its accuracy may not meet the requirements of camera networks.
To solve these problems, this paper proposes a novel consensus filter based on the square-root cubature Kalman filter (SCKF) [9]. The SCKF adopts a third-degree spherical-radial cubature rule that provides a set of cubature points scaling linearly with the state-vector dimension. The SCKF can provide a robust and systematic solution for high-dimensional nonlinear filtering problems. Meanwhile, compared with the unscented Kalman filter (UKF) [10], the SCKF can preserve two properties of the error covariance matrix: symmetry and positive definiteness in each update cycle [9]. In the UKF, due to errors introduced by arithmetic operations performed on finite word-length digital computers, these two properties are often lost.
One advantage of the information filter over the Kalman filter arises from its natural fit for multi-agent problems. Multi-agent problems often involve the integration of sensor data collected decentrally. Such integration is commonly performed using Bayes' rule. When represented in logarithmic form, Bayes' rule becomes an addition. Information integration is achieved by summing up information from multiple sensors. Addition is commutative. Because of this, information filters often integrate the information in an arbitrary order, with arbitrary delays and in a completely decentralized manner [11]. In this paper, we use the information form of SCKF, which is called the square-root cubature information filter (SCIF) [12].
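The additive, commutative nature of information-form fusion described above can be sketched numerically. The following is a minimal illustration (not the paper's algorithm; all names and values are made up for the example): two Gaussian estimates are converted to information form, summed in either order, and yield the same fused estimate.

```python
import numpy as np

def to_information(x, P):
    """Convert (mean, covariance) to (information vector, information matrix)."""
    Y = np.linalg.inv(P)          # information matrix Y = P^-1
    y = Y @ x                     # information vector y = P^-1 x
    return y, Y

def fuse_information(pairs):
    """Bayes' rule in log/information form reduces to a commutative sum."""
    y = sum(p[0] for p in pairs)
    Y = sum(p[1] for p in pairs)
    x = np.linalg.solve(Y, y)     # fused mean
    P = np.linalg.inv(Y)          # fused covariance
    return x, P

# Two sensors observing the same 2D state; the order of fusion does not matter.
a = to_information(np.array([1.0, 2.0]), np.diag([4.0, 4.0]))
b = to_information(np.array([1.5, 1.5]), np.diag([1.0, 1.0]))
x_ab, _ = fuse_information([a, b])
x_ba, _ = fuse_information([b, a])
assert np.allclose(x_ab, x_ba)    # commutativity of information addition
```

Because addition is commutative and associative, contributions may arrive in any order and with arbitrary delays, which is exactly the property that makes the information form attractive for decentralized fusion.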
Multi-target tracking is the combination of data association and estimation. However, the above-mentioned methods do not consider the measurement-to-track association. Among many algorithms that are available for data association, the multiple hypothesis tracking (MHT) [13] and joint probabilistic data association (JPDA) [14] are two popular schemes. JPDA achieves reasonable results at a much lower computational cost than MHT and can be easily integrated into a distributed system.
The main contribution of this paper is the combination of data association with a square-root cubature information filter, taking special care of the issues of nonlinearity and finite word-length digital computers, and the use of the proposed algorithm to track multiple targets in a camera network. In Section 2, the state of the art in distributed multi-target tracking in camera networks is reviewed. Section 3 presents preliminaries for this paper: the system model, average consensus and JPDA. In Section 4, the distributed square-root cubature information weighted consensus filter (DSCIWCF) is proposed. We describe JPDA with DSCIWCF for multi-target tracking, called the multi-target square-root cubature information weighted consensus filter (MTSCF), in Section 5. In Section 6, the proposed method is compared against others experimentally; the simulation results show that the proposed algorithm can efficiently track multiple targets in camera networks. Finally, Section 7 concludes the paper.

Related Work
This section discusses consensus-based distributed multi-target tracking in camera networks, focusing on the problems of nonlinearity, redundancy and robustness.
There are many research papers on multi-target tracking in sensor networks [15][16][17][18][19][20]. However, most of these methods do not consider the problem of naive nodes [2] or the numerical difficulties resulting from the finite word length of computers. Although modern computers are far more capable than before, numerical issues still arise in finite-word-length implementations of algorithms, especially in resource-constrained sensor networks.
In [15], a distributed data association scheme for multi-target tracking in sensor networks was proposed by Sandell et al. They considered that each sensor node makes noisy measurements of the target state, so data association techniques must be employed, and they used the JPDA algorithm for this purpose. Although their method is distributed, it is based on KCF, which assumes a linear system, and it does not consider naive nodes.
In [18], Roy-Chowdhury et al. extended the method proposed in [15] to deal with nonlinear problems. Although their method can be used in nonlinear camera networks, it is based on the extended Kalman consensus filter (EKCF), and naive nodes are still not considered.
Kamal et al. proposed the extended multi-target information consensus (EMTIC) algorithm to deal with the problems of nonlinearity and naive nodes [20]. Their method is based on IWCF [2], which is more robust and accurate than the KCF algorithm. However, their method still suffers from the numerical difficulties mentioned above in resource-constrained camera networks.
As described above, this paper uses JPDA with an SCIF-based tracking algorithm at each camera node to track multiple targets in a camera network. Our algorithm not only overcomes the numerical difficulties mentioned above, but also achieves more accurate results.

System Model
The general nonlinear system model for camera networks has the form:

x^j_{k,i} = f_{k-1}(x^j_{k-1,i}) + v^j_{k-1,i},
z^j_{k,i} = h_k(x^j_{k,i}) + w^j_{k,i},

where the system equation f(·) and the measurement equation h(·) are time-varying nonlinear functions. At time k, x^j_{k,i} ∈ R^{n_x} is the state vector of the j-th target. Each camera C_i obtains m_i(k) measurements, denoted {z^j_{k,i}}_{j=1}^{m_i(k)}, where z^j_{k,i} ∈ R^{n_z} is the nonlinear measurement of the j-th target taken by node C_i at time k. Cameras do not know the correspondence between measurements and targets; that is, they do not know which measurement is generated by which target. v^j_{k-1,i} ∈ R^{n_x} is the process noise of node C_i at time k − 1, and w^j_{k,i} ∈ R^{n_z} is the measurement noise of node C_i at time k. The noise sequences v^i_{k-1} and w^i_k are assumed to be independent and white, with v^i_{k-1} ∼ N(0, Q^i_{k-1}) and w^i_k ∼ N(0, R^i_k), respectively.

Given a camera network with N_C cameras, there are no specific assumptions about the overlap among the FOVs of these cameras. Within the FOVs, there are N_T moving targets. In this paper, we assume that all cameras have been calibrated, so the target position can be referred to a common reference plane.

Communication in the network can be represented by an undirected connected graph G(τ) = (C, E(τ), A(τ)) [21,22]. The set of vertices C = {C_1, C_2, ..., C_{N_C}} represents the cameras. The set E ⊆ C × C contains the edges of the graph, which represent the available communication channels between cameras; (i, j) denotes a direct communication channel between nodes C_i and C_j. A(τ) = [a_{ij}]_{N_C × N_C} is the adjacency matrix, a symmetric 0–1 matrix. Because the graph has no self-loops, the diagonal entries of A(τ) are zero (a_{ii} = 0, i = 1, ..., N_C). Ω_i = {j ∈ C | (i, j) ∈ E} is the neighbor set of node C_i, and the degree of node C_i is the number of its neighbors, |Ω_i|.

In this paper, we use the "+" superscript to denote the a posteriori estimate and the "−" superscript to denote the a priori estimate.
For example, x̂^{j−}_{k,i} (and its covariance P^{j−}_{k,i}) represents the prior/predicted state estimate (and covariance) of x^j_{k,i}.

Average Consensus
To compute a network-wide average, average consensus [23,24] is a popular distributed algorithm. Suppose each node i holds an initial scalar value a_i(0) ∈ R, and a(0) = {a_i(0)}_{i=1}^{N_C} denotes the vector of initial node values. We are interested in computing the average of the initial values,

ā = (1/N_C) Σ_{i=1}^{N_C} a_i(0).

At the beginning of iteration τ of the average consensus algorithm, node C_i sends its previous state a_i(τ − 1) to its direct network neighbors C_j ∈ Ω_i and receives the neighbors' previous states a_j(τ − 1). The iterative form of the average consensus algorithm can then be stated in discrete time as:

a_i(τ) = a_i(τ − 1) + ε Σ_{j∈Ω_i} (a_j(τ − 1) − a_i(τ − 1)).

After several iterations, consensus is asymptotically reached for all initial states. The rate parameter ε should be chosen in (0, 1/Δ_max), where Δ_max is the maximum degree of the network graph G. Choosing a larger value of ε results in faster convergence, but values greater than or equal to 1/Δ_max render the algorithm unstable. The paper [24] provided a good choice of ε using Metropolis weights. The Metropolis weight matrix is defined as:

Ψ_{ij}(τ) = 1/(1 + max(Δ_i(τ), Δ_j(τ))) if (i, j) ∈ E(τ),
Ψ_{ii}(τ) = 1 − Σ_{l∈Ω_i} Ψ_{il}(τ),
Ψ_{ij}(τ) = 0 otherwise,

where Δ_i(τ) is the degree of node i at iteration τ. Arranging the local consensus states into the vector a(τ), the update Equation (3) can be written in matrix form as:

a(τ) = (Ψ[τ] ⊗ I) a(τ − 1),

where Ψ[τ] = I − εL(τ), I is an identity matrix of appropriate size and ⊗ denotes the matrix Kronecker product; Ψ[τ] is a stochastic matrix.
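The iteration above can be sketched in a few lines. The following toy example (a four-node line graph; all values are illustrative, not from the paper) builds the Metropolis weight matrix and shows every node converging to the average of the initial values:

```python
import numpy as np

# Line graph of 4 nodes: 0 - 1 - 2 - 3
edges = [(0, 1), (1, 2), (2, 3)]
N = 4
deg = [1, 2, 2, 1]

# Metropolis weight matrix: Psi[i][j] = 1/(1+max(d_i, d_j)) for neighbors,
# diagonal chosen so each row sums to 1 (doubly stochastic, symmetric).
Psi = np.zeros((N, N))
for i, j in edges:
    Psi[i, j] = Psi[j, i] = 1.0 / (1 + max(deg[i], deg[j]))
for i in range(N):
    Psi[i, i] = 1.0 - Psi[i].sum()

a = np.array([4.0, 0.0, 8.0, 2.0])      # initial node values, average = 3.5
for _ in range(200):                     # consensus iterations
    a = Psi @ a                          # a(tau) = Psi a(tau - 1)

assert np.allclose(a, 3.5, atol=1e-6)   # all nodes converge to the average
```

Each node only ever uses its own value and its neighbors' values, which is what makes the scheme fully distributed.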

Joint Probabilistic Data Association
In the real world, in addition to the data originating from the targets, some measurements are clutter, corresponding to no target. Directly assigning measurements to targets may therefore lead to poor performance, and a data association algorithm is needed. In this paper, we use JPDA for data association [14,15]. Here, we briefly review this algorithm.
The idea of JPDA is to exploit the smoothing property of expectations: the conditional mean of the state is obtained by averaging over all of the association events. Let β^t_{ij} denote the probability that measurement j at node C_i originated from target t, β^t_{i0} the probability that no measurement is associated with target t at node C_i, and χ^t_{ij} the event that measurement j at node i originated from target t. See [14] for details about computing the β^t_{ij} values. The JPDA filter (JPDAF) state estimate is:

x̂^{t+}_i = x̂^{t−}_i + K^t_i z̃^t_i,

where x̂^{t+}_i and x̂^{t−}_i denote the a posteriori and prior estimates of the state of target t by node i at time k, respectively, and z̄^t_i and K^t_i denote the mean measurement and the Kalman gain for target t. From Equation (6), the mean measurement innovation z̃^t_i for target t is defined as:

z̃^t_i = z̄^t_i − ẑ^{t−}_i,

where z̄^t_i = Σ_{j=1}^{m_i(k)} β^t_{ij} z_{ij}. The corresponding covariance estimate for the JPDAF is given in Equation (8). The JPDAF is based on the Kalman filter, which is the best linear estimator. In this paper, however, the camera model is nonlinear, so the JPDAF needs to be modified to fit the nonlinear system. Details are discussed in Section 5.
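The association-probability-weighted update can be illustrated with a small sketch. This is not the paper's filter; the gain, measurements and β values below are made up purely to show how the weighted innovation combines candidate measurements:

```python
import numpy as np

def jpda_update(x_prior, z_pred, K, measurements, beta):
    """JPDA-style state update for one target at one node.

    beta[0] is the probability that no measurement originated from the
    target; beta[j] (j >= 1) weights measurement j's innovation.
    """
    innov = sum(b * (z - z_pred) for b, z in zip(beta[1:], measurements))
    return x_prior + K @ innov

x_prior = np.array([0.0, 0.0])
z_pred = np.array([0.0, 0.0])
K = 0.5 * np.eye(2)                     # illustrative Kalman gain
zs = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
beta = [0.2, 0.5, 0.3]                  # association probabilities, sum to 1
x_post = jpda_update(x_prior, z_pred, K, zs, beta)
# weighted innovation = 0.5*[1,0] + 0.3*[0,2] = [0.5, 0.6]; update = [0.25, 0.3]
assert np.allclose(x_post, [0.25, 0.3])
```

The β weights soften hard assignment decisions: a measurement that is likely clutter contributes little, so a single spurious detection cannot drag the estimate far.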

Square-Root Cubature Information Weighted Consensus Filter
The square-root cubature Kalman filter (SCKF) algorithm [9] was proposed by Arasaratnam et al. It is an accurate nonlinear filter that can be applied to high-dimensional nonlinear filtering problems with minimal computational effort. In multi-sensor data fusion applications, because of the advantages of the information filter mentioned above, this paper uses the information form of the SCKF, the square-root cubature information filter (SCIF) [12]. We first give a brief review of the SCIF at node i; to simplify the description, the sensor index i is dropped in this review.

Square-Root Cubature Information Filter: A Brief Review
The information filter propagates the inverse of the covariance P rather than P itself. The state estimate and its corresponding covariance in the Kalman filter are replaced by the information vector and the information matrix, respectively:

ŷ_{k|k−1} = Y_{k|k−1} x̂_{k|k−1},   Y_{k|k−1} = S_{y,k|k−1} S^T_{y,k|k−1},

where S_{y,k|k−1} is the square-root information matrix. The information update at time k is given by:

ŷ_{k|k} = ŷ_{k|k−1} + i_k,   Y_{k|k} = Y_{k|k−1} + I_k.

Here, I_k and i_k are the information matrix contribution and information state contribution of the measurement, respectively [12].

In matrix theory, a covariance matrix P can be written as:

P = A A^T,

where P ∈ R^{n×n}, A ∈ R^{n×m}, m ≥ n. The factor A in Equation (16) can be considered a square root of P. For simple calculation, in this paper A is transformed into an n × m triangular matrix S using a triangularization decomposition algorithm:

S = Tria(A),

where Tria denotes a triangularization decomposition algorithm. If we use QR decomposition, A^T is decomposed into an orthogonal matrix Q ∈ R^{m×m} and an upper triangular matrix R ∈ R^{m×n}, A^T = QR; then Equation (16) can be written as:

P = A A^T = R^T Q^T Q R = R^T R,

so that S = R^T. S is a lower triangular matrix and hence sparse; this sparsity benefits the calculations and reduces storage space. The steps involved in the square-root cubature information filter algorithm are summarized as follows. In the time update at time k, assuming (ŷ_{k−1|k−1}, S_{y,k−1|k−1}) is known, we compute the square root of the predicted information matrix, S_{y,k|k−1}, and the predicted information vector ŷ_{k|k−1}.
In the measurement update, we will compute the updated information vectorŷ k|k and the square-root of the updated information matrix S y,k|k according to the results of the time update step. See [15] for details about these two steps.
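The Tria operation described above can be sketched directly with a QR decomposition. The following is a minimal illustration (function name `tria` and the random factor are ours, not the paper's): it compresses a rectangular square-root factor A (with P = A Aᵀ) into a square lower-triangular factor with the same product.

```python
import numpy as np

def tria(A):
    """Return a lower-triangular S with S S^T = A A^T, via QR of A^T."""
    n = A.shape[0]
    _, R = np.linalg.qr(A.T, mode="reduced")   # A^T = Q R, R is n x n here
    return R.T[:n, :n]                          # S = R^T (lower triangular)

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 7))                 # n = 3, m = 7
S = tria(A)
assert np.allclose(S @ S.T, A @ A.T)            # same covariance P
assert np.allclose(S, np.tril(S))               # S is lower triangular
```

Because P is only ever represented through its factor S, positive semi-definiteness and symmetry of P = S Sᵀ hold by construction, which is the core numerical advantage of the square-root formulation.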

Centralized Square-Root Cubature Information Filter
Multi-sensor fusion is the process by which information from many sensors is combined to yield an improved description of the observed system. In this section, we give a brief introduction to the centralized square-root cubature information filter (CSCIF), which is the basis of the proposed algorithm. A centralized camera network system comprises a fusion center with connections to all other cameras. In order to distinguish the information state contribution i from the node index, we use s to denote the node index in the rest of the paper. Each camera C_s, s ∈ {1, 2, ..., N_C}, obtains data about the environment, which is forwarded to the fusion center. The global estimate in the fusion center can be computed from the N_C sensor measurements at time k by simply summing the local information vectors and matrices (the "c" superscript denotes "centralized"):

ŷ^c_{k|k} = ŷ^c_{k|k−1} + Σ_{s=1}^{N_C} i^s_k,   Y^c_{k|k} = Y^c_{k|k−1} + Σ_{s=1}^{N_C} I^s_k.

In Equation (20), i^s_k can be computed using the information contribution equations of the SCIF. Then, we compute the square root of the predicted information matrix, S^c_{y,k+1|k}, and the predicted information vector ŷ^c_{k+1|k} using the standard time update step of the SCIF. Although the centralized camera network system is an improvement over a single-camera system, it has a number of disadvantages, including the severe computational load imposed on the fusion center, the possibility of catastrophic failure and high communication overheads.
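The centralized update amounts to summing local contributions onto the predicted information pair. A toy sketch (synthetic i_s and I_s values; in the actual filter they come from the SCIF contribution equations):

```python
import numpy as np

def centralized_update(y_pred, Y_pred, contributions):
    """Fusion-center update: add every camera's (i_s, I_s) to the prediction."""
    y = y_pred + sum(i_s for i_s, _ in contributions)
    Y = Y_pred + sum(I_s for _, I_s in contributions)
    return y, Y

Y_pred = np.eye(2)                      # predicted information matrix
y_pred = np.array([1.0, 1.0])           # predicted information vector
# Placeholder contributions from two cameras.
contribs = [(np.array([0.5, 0.0]), 0.5 * np.eye(2)),
            (np.array([0.0, 0.5]), 0.5 * np.eye(2))]
y_c, Y_c = centralized_update(y_pred, Y_pred, contribs)
assert np.allclose(Y_c, 2.0 * np.eye(2))
assert np.allclose(y_c, [1.5, 1.5])
```

The distributed algorithm of the next subsection reproduces exactly this sum at every node, but without any fusion center, by running average consensus on properly weighted local quantities.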

Distributed Square-Root Cubature Information Weighted Consensus Filter
Generally, there are no fusion centers in large-scale camera networks, and the capabilities of all cameras are equal in the network. In this scenario, a distributed approach is required. The average consensus algorithm, which has been introduced in Section 3 of this paper, meets the needs of this scenario. In this section, we propose a novel DSCIWCF algorithm for target tracking in camera networks.
In the average consensus algorithm, node C s only communicates with its direct neighbors C j ∈ Ω s , then the values of the states at all of the nodes converge to the average of the initial values.
In [2], a distributed state estimation framework was proposed by Kamal et al. They used the value 1/N_C as a weight on the information matrix and information vector. This algorithm can overcome the naivety issue and information redundancy in camera networks. In this paper, we use a similar strategy to deal with the square-root cubature information matrix and information vector. The DSCIWCF algorithm is summarized as follows.
(1) Compute the square-root form of the local information vector ŷ^s_{k|k} and the information matrix Y^s_{k|k}, where Y^s_{k|k−1} = S^s_{y,k|k−1}(S^s_{y,k|k−1})^T, and P_{s,xz,k|k−1} = T_{21}T^T_{11} can be computed using Equation (A5) (see Appendix A). Equation (22) is equivalent to its square-root form below.

(2) Let ν^s_0 = ŷ^s_{k|k} and V^s_0 = S^s_{y,k|k}; then perform average consensus on ν^s_0 and V^s_0 independently for K iterations. For k = 1 to K: broadcast (ν^s_{k−1}, V^s_{k−1}) to the neighbors C_j ∈ Ω_s and receive (ν^j_{k−1}, V^j_{k−1}) from them; then run one average consensus update on ν^s_{k−1} and V^s_{k−1}. Here, N_{s,E} is the number of direct neighbors of node s, N_{s,E} = Σ_j a_{sj}. If V^s_0 = (1/N_C) Y^s_{k|k−1} + I^s_k, the equivalent square-root form of Equation (26) is Equation (27).

(3) Compute the a posteriori information vector and information matrix for time k.

(4) Compute the predicted square-root information matrix S^s_{y,k+1|k} and the predicted information vector ŷ^s_{k+1|k} using the standard time update step of the SCIF.
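The effect of the 1/N_C weighting can be seen in a scalar toy example (a fully connected graph and synthetic information values, chosen only for illustration): each node initializes its consensus variable as (1/N_C) · prior information plus its own contribution, averages with the others, then scales back by N_C, recovering the centralized sum even at a naive node with no measurement.

```python
import numpy as np

N_C = 3
Y_prior = 2.0                                   # common prior information
I_local = np.array([0.0, 1.5, 0.5])             # contributions (node 0 is naive)

v = Y_prior / N_C + I_local                     # weighted consensus initialization
for _ in range(100):                            # average consensus; on a complete
    v = np.full(N_C, v.mean())                  # graph one step already averages

Y_est = N_C * v                                 # scale back after consensus
Y_centralized = Y_prior + I_local.sum()
assert np.allclose(Y_est, Y_centralized)        # = 4.0 at every node
```

Without the 1/N_C weight, the common prior would be counted N_C times in the average, which is precisely the redundancy problem the IWCF weighting removes.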
In practice, due to errors introduced by arithmetic operations (such as matrix square roots, matrix inversion, etc.) performed on finite word-length digital computers, the symmetry and positive definiteness of the error covariance matrix are often lost [9]. A square-root filter is well suited to dealing with these problems. In this paper, we use the square-root cubature information filter for target tracking in camera networks, and Equations (26) and (27) for the average consensus iterations. Therefore, the whole algorithm runs in square-root form.

Multi-Target Data Association
The JPDA algorithm has been introduced in Section 3. However, the traditional JPDA algorithm is usually applied to linear sensing models. In this section, we extend the JPDA algorithm to handle nonlinear sensing models. The main contribution of this section is the combination of JPDA with the information filter of the previous section. Since the JPDAF is a single-sensor algorithm, we first introduce the algorithm for a single node s.

Joint Probabilistic Data Association With Square-Root Cubature Information Filter
In the SCIF, in order to make the information contribution equations compatible with those of the Kalman filter, a pseudo-measurement matrix H^t_s [25] is defined (for target t, similarly hereinafter) as:

H^t_{s,k} = (P^t_{s,xz,k|k−1})^T Y^t_{s,k|k−1},

where the subscript s denotes the terms from the s-th node.
In the cubature Kalman filter (CKF), the Kalman gain K^t_{s,k} is given by [15]:

K^t_{s,k} = P^t_{s,xz,k|k−1} (P^t_{s,zz,k|k−1})^{−1},

where P^t_{s,xz,k|k−1} can be computed using Equation (A5) (see Appendix A). Substituting Equation (30) into the innovation covariance matrix, we obtain:

W^t_s = H^t_{s,k} P^t_{s,k|k−1} (H^t_{s,k})^T + R^t_s,

where Y^t_{s,k|k−1} = (P^t_{s,k|k−1})^{−1} is a symmetric positive definite matrix, and P^t_{s,xz,k|k−1}, P^t_{s,zz,k|k−1}, P^t_{s,k|k−1} and Φ^t_{k|k−1} come from Equation (A1). P^t_{s,zz,k|k−1} can be computed as T^t_{11}(T^t_{11})^T (see Appendix A). Now, we can rewrite Equation (31) as:

K^t_{s,k} = P^{t−}_s (H^t_{s,k})^T (W^t_s)^{−1},

where H^t_{s,k} and W^t_s are defined by Equations (30) and (32), and P^{t−}_s is the short form of P^t_{s,k|k−1}. In the CKF, the measurement innovation term of Equation (7) becomes:

z̃^t_s = z̄^t_s − ẑ_{s,k|k−1},

where ẑ_{s,k|k−1} = (1/m) Σ_{j=1}^m Z_{sj,k|k−1}, and Z_{sj,k|k−1} denotes one of the propagated cubature points; see [15] for details about computing Z_{sj,k|k−1}. Substituting Equation (34) into Equation (6) gives the JPDA state update, Equation (35). This update has a similar form to the standard Kalman filter update and can be converted to the information form using u^t_s and U^t_s, yielding Equations (36) and (37), where R^t_s is the measurement noise covariance of node s for target t and Y^{t−}_s = (P^{t−}_s)^{−1} = (P^t_{s,k|k−1})^{−1}. The information matrix update is given in Equation (37) (see Appendix B). Equations (36) and (37) form the JPDA-SCIF algorithm.

Joint Probabilistic Data Association With Centralized Square-Root Cubature Information Filter
As shown in Equations (12) and (13), the information filter form has the advantage that the update equations are computationally simpler than those of the Kalman filter. Here, we rewrite I^t_s and i^t_s using H^t_s from Equation (30) as follows:

I^t_s = (H^t_s)^T (R^t_s)^{−1} H^t_s,
i^t_s = (H^t_s)^T (R^t_s)^{−1} (z̃^t_s + H^t_s x̂^t_{s,k|k−1}),

where the measurement innovation term z̃^t_s = z^t_s − ẑ^{t−}_s, and x̂^t_{s,k|k−1} represents the prior/predicted state estimate. In the JPDAF, z̃^t_s can be written as Equation (34); therefore, Equation (39) can be rewritten as Equation (40). From Equations (38) and (40), we rewrite Equations (12) and (13) as Equations (41) and (42). Substituting Equation (34) into Equation (42), we can extend Equations (36) and (37) to the multi-sensor centralized estimate in information form, Equations (43) and (44), with P̃^t_s defined in Equation (45).

Multi-Target Square-Root Cubature Information Weighted Consensus Filter
In the MTSCF, if all nodes have reached consensus on the previous time step, the prior estimates x̂^{t−}_s and Y^{t−}_s are identical at every node. Thus, for the MTSCF algorithm, the consensus variables are initialized accordingly. The MTSCF algorithm is summarized in Algorithm 1.

Computing the Square-Root of the Information Matrix
The algorithms discussed above in this section are based on the information matrix. However, in order to be consistent with the square-root algorithms proposed in this paper, we need to transform the information matrix into square-root form.
Taking the MTSCF algorithm as an example, we describe how to compute the square root of the information matrix; the other algorithms use similar methods. It is easy to obtain S_{U^t_s} and S_{Y^{t−}_s}; the square-root form of Equation (49) is then given by Equation (50), where S_{Y^{t−}_s} and S_{U^t_s} are the square-root forms of Y^{t−}_s and U^t_s, respectively. They can be computed using the triangularization (Tria) operation described in Section 4.

Algorithm 1: MTSCF algorithm for target t at node C_s at time k.

Inter-Camera Association
In distributed tracking of multiple targets, every node holds information from each of its neighbors about the targets, as well as its own set of estimated tracks. Therefore, it is necessary to use an assignment algorithm to form a set of optimal matchings g_sj, where g_sj matches the tracks of node s with the tracks of node j. We can use the Hungarian algorithm [26] to find the maximum matching. The matching cost between two track estimates from different cameras can be defined as the Mahalanobis distance, as follows [15]:
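The matching step can be sketched as follows. This toy example (track positions and covariance are synthetic) uses the Mahalanobis distance as the cost; for the small track counts here, a brute-force search over permutations stands in for the Hungarian algorithm, which solves the same assignment problem in polynomial time.

```python
import numpy as np
from itertools import permutations

def mahalanobis(x_a, x_b, P):
    """Squared Mahalanobis distance between two track estimates."""
    d = x_a - x_b
    return float(d @ np.linalg.solve(P, d))

def best_matching(tracks_s, tracks_j, P):
    """Minimum-cost one-to-one matching of node s's tracks to node j's."""
    n = len(tracks_s)
    cost = [[mahalanobis(a, b, P) for b in tracks_j] for a in tracks_s]
    best = min(permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))
    return list(enumerate(best))

P = np.eye(2)                                  # illustrative common covariance
tracks_s = [np.array([0.0, 0.0]), np.array([10.0, 10.0])]
tracks_j = [np.array([9.5, 10.2]), np.array([0.3, -0.1])]
g_sj = best_matching(tracks_s, tracks_j, P)
assert g_sj == [(0, 1), (1, 0)]   # s-track 0 <-> j-track 1, and vice versa
```

In a real implementation with many tracks, the Hungarian algorithm (e.g., a library assignment solver) replaces the factorial-time permutation search.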

Experimental Evaluation
In this section, we evaluate the performance of the proposed MTSCF algorithm in a nonlinear simulated environment and compare it with other methods: JPDA-EKCF [18] and EMTIC [20]. Our experiments are performed on an Intel 3.4 GHz PC with 4 GB of memory and implemented in MATLAB.
Four simulated targets (N_T = 4) moving in a 500 m × 500 m area under the observation of nine cameras (N_C = 9) with overlapping FOVs are considered. To simplify the simulation, the FOV of each camera is assumed to be a square region of 200 m × 200 m around the camera. The target's state vector is a 5D vector comprising the target's position (x_k, y_k) at discrete time instant k, its velocity (v_x, v_y) and the time interval δ_k between two consecutive measurements.

The motion model of the targets is described by the nonlinear equation of [8], where the target acceleration (a_x, a_y) is modeled as Gaussian noise. To account for synchronization errors among cameras, we consider a time uncertainty e, which is also assumed to be Gaussian. We consider the vector v = (a_x, a_y, e) as a Gaussian noise vector with zero mean and covariance Q = diag([5 5 0.01]). The initial speed is drawn randomly from the range 10–30 units per time step, with a direction chosen uniformly from 0–2π.

The measurement model is defined by Equation (61), where (γ^s_k, φ^s_k) are the pixel coordinates of the target in the image plane of camera C_s at time k, H^s_{11}, ..., H^s_{33} are the elements of the homography, and w_k is the measurement noise, considered Gaussian with zero mean and covariance R = diag([5 5]).

The initial prior covariance P^{t−}_s(1) = diag([100, 100, 10, 10, 0.01]) is used at each node for each target. The initial prior state x̂^{t−}_s(1) is generated by adding zero-mean Gaussian noise of covariance P^{t−}_s(1) to the initial ground-truth state. The observations are generated using Equation (61). The total number of consensus iterations per measurement step, K, is set to 20. The parameters for computing the association probabilities β^t_{sj} are set as follows (see [14] for details about computing β^t_{sj}).
False measurements (clutter) are generated at each node at each measurement step using a Poisson process with λ, where λ is the average number of false measurements per sensor per measurement step. Gate probability P G is set to 0.99. The probability of detecting a target P D in each camera is set to 0.8.
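The homography-based measurement model used to generate observations can be sketched as follows. The 3 × 3 homography values below are made up for illustration; in the simulation each camera has its own calibrated homography.

```python
import numpy as np

def measure(H, x, y, noise_std=0.0, rng=None):
    """Project a ground-plane position (x, y) to pixel coordinates via a
    3x3 homography H, optionally adding Gaussian measurement noise w_k."""
    p = H @ np.array([x, y, 1.0])
    gamma, phi = p[0] / p[2], p[1] / p[2]       # perspective division
    if rng is not None:
        gamma += rng.normal(0.0, noise_std)
        phi += rng.normal(0.0, noise_std)
    return gamma, phi

H = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, -3.0],
              [0.0, 0.0, 1.0]])                 # illustrative homography
g, p = measure(H, 100.0, 200.0)
assert abs(g - 105.0) < 1e-9 and abs(p - 197.0) < 1e-9
```

The perspective division by the third homogeneous coordinate is what makes the measurement a nonlinear function of the target state, which motivates the cubature-based filter.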
In this paper, we perform the experiments on a sparsely connected network with a low average network degree equal to two (see Figure 1). Therefore, Δ_max = 2, and the consensus rate parameter ε is set to 0.65/Δ_max. In the experiment, four target trajectories are generated (see Figure 2). The simulation results are averaged over 20 Monte Carlo runs. Figure 3 shows the performance comparison under varying amounts of clutter. The average amount of clutter per sensor per measurement step, λ, is varied from 1/64 to 8 (consensus is run for a fixed number of iterations, eight). From Figure 3, it can be seen that both EMTIC and MTSCF are very robust, even to a very high amount of clutter. The amount of clutter is kept at λ = 1 for the other experiments in the rest of the paper. The tracking result from one run can be seen in Figure 4 (the result is based on the consensus algorithm, and the number of consensus iterations is eight). As can be seen from Figure 4, the MTSCF estimates are closer to the ground-truth curves than those of EMTIC. To show the convergence of the different methods, the total number of iterations per measurement step, K, is varied. It can be seen from Figures 5 and 6 (Figure 6 shows an enlarged part of Figure 5 focusing on MTSCF and EMTIC) that with an increased number of iterations, MTSCF approaches the ground-truth tracks, and that MTSCF outperforms EMTIC for any given K. Meanwhile, it can be seen that JPDA-EKCF has a large mean error, which makes it unsuitable for nonlinear multi-target tracking in distributed camera networks.
Our simulation is based on MATLAB, which by default represents floating-point numbers in double precision (64 bits, based on IEEE Standard 754). In the experiment, we convert all double precision numbers to single precision (32 bits) using the command single. Unfortunately, it may be impossible to use single precision with JPDA-EKCF and EMTIC: when single precision numbers are used to calculate the updated inverse matrix, the resulting matrix may become non-positive definite. In the simulation, we obtain the warning "Matrix is close to singular or badly scaled. Results may be inaccurate." Hence, errors may occur during the execution of the JPDA-EKCF and EMTIC algorithms on a limited word-length system. This is not a problem for the MTSCF algorithm.
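The numerical advantage claimed above can be sketched outside MATLAB as well. The following minimal illustration (synthetic matrix, our own variable names, using 32-bit floats to mimic a short word length) shows the key property of the square-root representation: a covariance rebuilt from its factor, P = S Sᵀ, remains symmetric and positive definite even in single precision.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6)).astype(np.float32)
P = A @ A.T + 6 * np.eye(6, dtype=np.float32)      # a valid covariance matrix

# Square-root path: factor once, then only ever form S S^T in float32.
S = np.linalg.cholesky(P.astype(np.float64)).astype(np.float32)
P_sqrt = S @ S.T

assert np.allclose(P_sqrt, P_sqrt.T, atol=1e-3)    # symmetric within rounding
eigs = np.linalg.eigvalsh(P_sqrt.astype(np.float64))
assert np.all(eigs > 0)                            # positive definite by construction
```

By contrast, filters that explicitly subtract matrices to update P (as in the covariance update of a conventional Kalman/information filter) have no such guarantee, and in short word lengths the accumulated rounding can push P away from symmetry or positive definiteness.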

Conclusions
In this paper, we propose a novel multi-target square-root cubature information weighted consensus filter (MTSCF) algorithm, a generalized consensus-based distributed multi-target tracking scheme applicable to a wide variety of sensor networks. MTSCF handles the issue of naivety, which makes it applicable to sensor networks where sensors may have a limited FOV (as is the case for a camera network). The algorithm accounts for the estimation errors in tracking and data association, the influence of naivety and the numerical difficulties arising from the finite word length of computers, which makes it resilient to false measurements/clutter. Experimental analysis shows the strength of the proposed method over existing ones.
In our future work, we will explore applying the MTSCF to a real camera network, which may be a limited word-length embedded system. Handling out-of-sequence measurements, the unknown number of targets and asynchronous networks are some other possible future works.
Appendix A. Applying the triangularization procedure to the square-root factor available on the RHS of Equation (A1) yields a block-triangular factor in which T_11 ∈ R^{n_z×n_z} and T_22 ∈ R^{n_x×n_x} are lower triangular matrices and T_21 ∈ R^{n_x×n_z}; Equation (A1) can then be rewritten accordingly.

Appendix B. The Kalman gain can be rewritten as follows [20]; the time index k has been dropped for simplicity. The state estimate is x̂^{t+}_s = x̂^{t−}_s + K^t_s z̃^t_s. For the covariance, Equation (8) is rewritten in terms of P̃^t_s. Let M^t_s = (1 − β^t_{s0})W^t_s − P̃^t_s; then, applying the matrix inversion lemma to Equation (B3), we obtain the information-form update.