Iterative Diffusion-Based Distributed Cubature Gaussian Mixture Filter for Multisensor Estimation

In this paper, a distributed cubature Gaussian mixture filter (DCGMF) based on an iterative diffusion strategy (DCGMF-ID) is proposed for multisensor estimation and information fusion. The uncertainties are represented as Gaussian mixtures at each sensor node. A high-degree cubature Kalman filter provides accurate estimation of each Gaussian mixture component. An iterative diffusion scheme is utilized to fuse the mean and covariance of each Gaussian component obtained from each sensor node. The DCGMF-ID extends the conventional diffusion-based fusion strategy by using multiple iterative information exchanges among neighboring sensor nodes. The convergence property of the iterative diffusion is analyzed. In addition, it is shown that the convergence of the iterative diffusion can be interpreted from the information-theoretic perspective as minimization of the Kullback–Leibler divergence. The performance of the DCGMF-ID is compared with the DCGMF based on the average consensus (DCGMF-AC) and the DCGMF based on the iterative covariance intersection (DCGMF-ICI) via a maneuvering target-tracking problem using multiple sensors. The simulation results show that the DCGMF-ID has better performance than the DCGMF based on noniterative diffusion, which validates the benefit of iterative information exchanges. In addition, the DCGMF-ID outperforms the DCGMF-ICI and DCGMF-AC when the number of iterations is limited.


Introduction
With the rapid progress of the sensing and computing technologies, multiple sensors have been widely used in estimation applications, such as target tracking, wireless sensor networks, guidance and navigation, and environmental monitoring. Effective information fusion from multiple sensors is of utmost importance. It can be done in a centralized or distributed manner. For the centralized fusion, the information obtained by all sensors is collected and processed by the central node. This approach enables the global solution but requires a large amount of power and resources in communication and computation. The failure or delay on the central node may significantly degrade the estimation performance. For the distributed estimation, the information at each sensor node is processed locally and then fused to establish the global information by well-designed distributed fusion algorithms using only the local information. In contrast to the centralized estimation, the distributed estimation offers a number of advantages, such as scalability, robustness to single point of failure, low communication load, and low operation cost.
When the estimation is processed at each local sensor, it is a regular filtering problem, which has been intensively researched for decades. In many practical estimation problems, the system dynamics and measurement equations are nonlinear and the uncertainties or noises are non-Gaussian. To address this challenging filtering problem, Gaussian mixture-based filters [1] and sequential Monte Carlo-based filters [2] are two classes of widely used approaches. The rationale behind the Gaussian mixture-based filters is that any probability density function (pdf) can be approximated by the summation of a finite number of Gaussian distributions. The Monte Carlo-based filters or particle filters use a large number of particles to represent the pdf. Although some solutions have been proposed to alleviate the curse of the dimensionality problem for application of particle filters in high-dimensional problems, the computation complexity is still prohibitive. Therefore, from the computation efficiency perspective in the sensor network setting, the Gaussian mixture filter is a better alternative and will be used in this paper for multiple sensor estimation. The mean and covariance of each Gaussian component are predicted and updated using the cubature Kalman filtering (CKF) algorithm [3,4]. The fifth-degree CKF [4] is used because it is more accurate than the conventional third-degree CKF in [3] and other well-known nonlinear Gaussian filters such as the extended Kalman filter (EKF) [5] and the unscented Kalman filter (UKF) [6], which is a third-degree Gaussian filter as well.
After the local estimation is obtained at each sensor node, information fusion of the estimates from multiple sensors is conducted using the distributed estimation algorithm. Distributed estimation has been a research subject of considerable interest in the past few years [7–17]. Olfati-Saber [7,8] first addressed the distributed estimation problem by reducing it to two average consensus filters, one for the weighted measurement and the other for the information form of the covariance matrix. Because each sensor node only communicates with its immediate neighbors, the average consensus strategy is effective to obtain the average of each node's initial value. In each iteration, each node updates its state by weighting its prior state and its neighbors' prior states. When the number of iterations approaches infinity, average consensus can be achieved. In the consensus-based distributed estimation framework, certain requirements on the network topology are usually necessary. In [9,10], information from an individual node is propagated through the entire network via a new information-weighted consensus scheme. Although each node has limited observability of the states, and some nodes may even be naive agents (having no measurement), the proposed information-weighted consensus filter for distributed maximum a posteriori parameter estimation and state estimation is capable of obtaining a final estimate comparable to that obtained from the centralized filter. However, it only considered the scenario in which all local estimates and measurement errors are independent or uncorrelated. Sun et al. [11] proposed a batch covariance intersection technique combined with average consensus algorithms to address the correlation issue. However, the Gaussian assumption is made on all estimation processes, which may be inadequate for highly nonlinear and/or non-Gaussian systems.
On the other hand, due to the constraints on energy and communication frequency, a large number of iterations in consensus algorithms are not feasible in practice, especially for the systems in which the time interval between two consecutive measurements is very small.
Diffusion strategies for distributed estimation proposed in [12] overcome the disadvantage of excessive energy and communication requirements in the average consensus-based estimation. There are two steps between consecutive filtering cycles in the diffusion algorithm: incremental and diffusion. The incremental step runs a local filter at each node with a regular time update and multiple measurement updates by incrementally incorporating measurements from every neighboring node. The diffusion step computes the ultimate fused estimate by a convex combination of all estimates from the present node and its neighbors. Each node only communicates with its direct neighbors twice in each filtering cycle. The first communication collects the innovation information from its neighbors. The second communication exchanges the state estimates from the incremental step among neighbors to do the diffusion update. The estimate obtained through the diffusion strategy has been proved unbiased for linear systems. The paper [12] also provides the mean, mean square, and convergence analysis and shows that the estimate is stable under the assumption that the state space model is time invariant and each local system (joint measurement model of one node and its immediate neighbors) is detectable and stabilizable. As long as each individual node satisfies the assumption, this diffusion strategy does not have any requirement for the network topology. A diffusion recursive least-squares (RLS) algorithm was developed in [13] to deal with the distributed estimation problem and achieved performance close to the global solution. It does not require transmission or inversion of matrices and, therefore, reduces computational complexity. It was shown that the distributed solution is asymptotically unbiased and stable if the regressors are zero-mean and temporally independent, and the inverses of the covariance matrices at different time indexes can be replaced by their expected values.
A diffusion least-mean-squares (LMS) algorithm was proposed in [14] with two versions: adapt-then-combine and combine-then-adapt. Mean and mean square performance were analyzed. Besides, the scheme of optimizing the diffusion LMS weights was discussed. The work of [15] extended the work in [12] by using the covariance intersection to yield a consistent estimate and relaxing the assumption made in [12]. It only requires partial local uniform observability rather than all local systems' observability assumed in [12]. The case of no local uniform observability was discussed in [15] as well but relied on the consensus filter. Hlinka et al. [16] proposed the distributed estimation scheme using the iterative covariance intersection (ICI). Like the consensus strategy, the ICI needs recursive update of each node's state and covariance until they converge. Each iteration can guarantee a consistent estimate. However, the ICI does not include the incremental update as the diffusion does.
Most of the aforementioned work assumes a linear dynamic process and measurement with Gaussian noise or initial uncertainty with Gaussian pdf. For highly nonlinear dynamic systems with non-Gaussian statistics, the performance of those distributed estimation methods may degrade. In this paper, we propose a new distributed Gaussian mixture filtering based on an iterative diffusion strategy to handle the distributed nonlinear estimation. There is limited literature on the distributed Gaussian mixture filtering. In [17], the likelihood consensus strategy was used in the design of a distributed Gaussian mixture filter in a sensor network that was not fully connected. Unlike the original consensus-based distributed estimation, the Gaussian mixture weight cannot be updated through the consensus filter directly since it needs to evaluate a product term of the likelihood function. By the natural logarithm transformation, the product term is transformed to a summation to which the consensus algorithm can be applied. The contributions of the proposed approach in this paper are: (1) a new distributed Gaussian mixture filtering framework with an embedded cubature rule can more accurately handle nonlinear and non-Gaussian distributed estimation problems; (2) the iterative diffusion strategy provides better fusion performance than the original diffusion method, the average consensus, and the ICI; (3) it does not need intensive communications as required in the consensus-based estimation; (4) the convergence analysis and information theoretic interpretation of the proposed approach are given.
The remainder of this paper is organized as follows. In Section 2, a centralized cubature Gaussian mixture filter is introduced. The distributed cubature Gaussian mixture filter using iterative diffusion is proposed in Section 3. In Section 4, the performance demonstration via a target-tracking problem is presented. Concluding remarks are given in Section 5.

Centralized Cubature Gaussian Mixture Filter
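Consider a class of nonlinear discrete-time dynamic systems described by
$$x_k = f(x_{k-1}) + v_{k-1} \qquad (1)$$
$$y_{k,j} = h_j(x_k) + n_{k,j} \qquad (2)$$
where $x_k \in \mathbb{R}^n$ is the state vector and $y_{k,j} \in \mathbb{R}^m$ is the measurement by the $j$th sensor, where the subscript "j" denotes the sensor index. $v_{k-1}$ and $n_{k,j}$ are the process noise and measurement noise, respectively, and their probability density functions (pdf) are represented by the Gaussian mixtures (GM)
$$p(v_{k-1}) = \sum_{p=1}^{N_p} \alpha^p \mathcal{N}\left(v_{k-1}; \bar{v}^p_{k-1}, Q^p_{k-1}\right), \qquad p(n_{k,j}) = \sum_{q=1}^{N_q} \alpha^q \mathcal{N}\left(n_{k,j}; \bar{n}^q_{k,j}, R^q_{k,j}\right)$$
where $\mathcal{N}(n_{k,j}; \bar{n}^q_{k,j}, R^q_{k,j})$ denotes a normal distribution with mean $\bar{n}^q_{k,j}$ and covariance $R^q_{k,j}$, and $\alpha$ is the weight of the Gaussian component. The superscripts "p" and "q" denote the $p$th and $q$th component of the GM; $N_p$ and $N_q$ denote the number of Gaussian components. Due to the non-Gaussian noise and nonlinear dynamics, the estimated state will have a non-Gaussian pdf, which can be modeled as a GM as well.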
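As a minimal sketch of the GM noise model, the snippet below evaluates a scalar Gaussian mixture density and its overall mean and variance. The function names and the two-component mixture values are illustrative assumptions, not part of the filter described in this paper.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a scalar normal distribution N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gm_pdf(x, weights, means, variances):
    """Density of a scalar Gaussian mixture sum_q alpha_q * N(x; mean_q, var_q)."""
    return sum(a * gaussian_pdf(x, m, v)
               for a, m, v in zip(weights, means, variances))

def gm_moments(weights, means, variances):
    """Overall mean and variance of the mixture (moment matching)."""
    mean = sum(a * m for a, m in zip(weights, means))
    var = sum(a * (v + (m - mean) ** 2)
              for a, m, v in zip(weights, means, variances))
    return mean, var

# Hypothetical two-component noise mixture (illustrative values only).
weights, means, variances = [0.5, 0.5], [5.0, -5.0], [100.0, 80.0]
mixture_mean, mixture_var = gm_moments(weights, means, variances)
```

The mixture is zero-mean overall even though neither component is, which is one reason a single Gaussian fitted to the same noise can be a poor description of it.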

Cubature Gaussian Mixture Kalman Filter
Assume that the initial state pdf at the beginning of each filtering cycle can be represented by the GM $p(x) = \sum_{l=1}^{N_l} \alpha^l \mathcal{N}\left(x; \hat{x}^l, P^l\right)$. In Figure 1, one cycle of the cubature Gaussian mixture filter (CGMF) is illustrated. The cubature Kalman filter (CKF) [3,4] runs on each component of the GM to predict and update the component's mean and covariance. The prediction step of the CKF is first used for each of the $N_l$ GM components. Note that after the prediction step, there are $N_l \times N_p$ Gaussian components contributed by the GM of the initial state pdf and the GM of the process noise. After that, the update step of the CKF is used for each Gaussian component and leads to $N_l \times N_p \times N_q$ Gaussian components, with the additional factor contributed by the GM of the measurement noise. It can be seen that the number of Gaussian components increases after each filtering cycle. To limit the computational complexity, the number of Gaussian components has to be reduced after the update step. In the following, the prediction step and the update step for each Gaussian component using the CKF framework [3,4] are introduced.
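The growth in the number of components can be illustrated with a short bookkeeping sketch. The function names are hypothetical; the counts follow directly from the products $N_l \times N_p$ and $N_l \times N_p \times N_q$ stated above.

```python
def components_after_cycle(n_l, n_p, n_q):
    """Component counts after the prediction step and after the update step."""
    after_prediction = n_l * n_p           # state GM x process-noise GM
    after_update = after_prediction * n_q  # x measurement-noise GM
    return after_prediction, after_update

def growth_without_reduction(n_l, n_p, n_q, cycles):
    """Component count after each cycle if no GM reduction were applied."""
    counts = []
    n = n_l
    for _ in range(cycles):
        _, n = components_after_cycle(n, n_p, n_q)
        counts.append(n)
    return counts
```

For example, with a single-component process noise and a two-component measurement noise, the count doubles every cycle, which is why a reduction step is needed in practice.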

Prediction Step
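Given the initial estimate of the mean $\hat{x}^l_{k-1|k-1}$ and covariance $P^l_{k-1|k-1}$ at time $k-1$ for the $l$th Gaussian component, the predicted mean and covariance can be computed by the quadrature approximation [3,4]
$$\hat{x}^{l,p}_{k|k-1} = \sum_{i=1}^{N_u} W_i f\left(\xi^l_{k-1,i}\right) + \bar{v}^p_{k-1} \qquad (3)$$
$$P^{l,p}_{k|k-1} = \sum_{i=1}^{N_u} W_i \left(f(\xi^l_{k-1,i}) + \bar{v}^p_{k-1} - \hat{x}^{l,p}_{k|k-1}\right)\left(f(\xi^l_{k-1,i}) + \bar{v}^p_{k-1} - \hat{x}^{l,p}_{k|k-1}\right)^T + Q^p_{k-1} \qquad (4)$$
where $N_u$ is the total number of cubature points, $l = 1, \cdots, N_l$, and $p = 1, \cdots, N_p$. The superscript "l,p" denotes the value using the $l$th Gaussian component of the GM of the initial state pdf and the $p$th component of the GM of the process noise. $\bar{v}^p_{k-1}$ and $Q^p_{k-1}$ are the mean and covariance of the $p$th Gaussian component of the GM representation of the process noise; $\xi^l_{k-1,i}$ is the transformed cubature point given by
$$\xi^l_{k-1,i} = S^l_{k-1}\gamma_i + \hat{x}^l_{k-1|k-1}, \qquad P^l_{k-1|k-1} = S^l_{k-1}\left(S^l_{k-1}\right)^T \qquad (5)$$
The cubature points $\gamma_i$ and weights $W_i$ of the third-degree cubature rule [3] are given by
$$\gamma_i = \begin{cases} \sqrt{n}\, e_i, & i = 1, \cdots, n \\ -\sqrt{n}\, e_{i-n}, & i = n+1, \cdots, 2n \end{cases} \qquad W_i = \frac{1}{2n}, \quad i = 1, \cdots, 2n$$
where $e_i$ is a unit vector with the $i$th element being 1. In this paper, the fifth-degree cubature rule [4] is also used to improve the estimation accuracy. The weights $W_i$ and points $\gamma_i$ of the fifth-degree rule are given by
$$\gamma = 0 \ \text{with} \ W = \frac{2}{n+2}; \qquad \gamma = \pm\sqrt{n+2}\, s^+_i, \ \pm\sqrt{n+2}\, s^-_i \ \text{with} \ W = \frac{1}{(n+2)^2}, \ i = 1, \cdots, \frac{n(n-1)}{2}; \qquad \gamma = \pm\sqrt{n+2}\, e_i \ \text{with} \ W = \frac{4-n}{2(n+2)^2}, \ i = 1, \cdots, n$$
where the points $s^+_i$ and $s^-_i$ are given by
$$\left\{s^+_i\right\} = \left\{\sqrt{\tfrac{1}{2}}\,(e_k + e_l) : k < l\right\}, \qquad \left\{s^-_i\right\} = \left\{\sqrt{\tfrac{1}{2}}\,(e_k - e_l) : k < l\right\}, \qquad k, l = 1, \cdots, n$$

Update Step
In the update step, the CKF measurement update [3,4] is applied to each predicted Gaussian component for each of the $N_q$ components of the measurement noise GM, which gives the updated mean $\hat{x}^{l,p,q}_{k|k}$ and covariance $P^{l,p,q}_{k|k}$. The weight of the resulting Gaussian component is $\alpha^l \cdot \alpha^p \cdot \alpha^q$. The final GM can be represented by
$$p(x_k) = \sum_{l=1}^{N_l}\sum_{p=1}^{N_p}\sum_{q=1}^{N_q} \alpha^l \alpha^p \alpha^q\, \mathcal{N}\left(x_k; \hat{x}^{l,p,q}_{k|k}, P^{l,p,q}_{k|k}\right)$$
Note that the number of Gaussian components increases significantly as time evolves. In order to avoid excessive computation load, some Gaussian components can be removed or merged. There are many GM reduction algorithms [18–20], such as pruning Gaussian components with negligible weights, merging nearby Gaussian components, and regeneration of the GM via the Kullback-Leibler approach. In this paper, nearby Gaussian components are merged to reduce the number of Gaussian components. The detailed description of this method is omitted since it is not the focus of this paper and can be found in [20]. Note that, to keep the estimation accuracy, the GM reduction procedure is not necessary if the number of Gaussian components is less than a specified threshold. For the convenience of implementing the diffusion update step in the proposed distributed estimation algorithm, the number of reduced Gaussian components at each sensor node is specified a priori to be the same.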
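The cubature rules can be checked numerically. The sketch below builds the points and weights for a standard normal weighting function, assuming the standard forms of the third-degree rule of [3] and the fifth-degree rule of [4]; the function names are illustrative. It then evaluates monomial moments, which shows that the third-degree rule misses the fourth moment $E[x_1^4] = 3$ while the fifth-degree rule matches it.

```python
import math
from itertools import combinations

def unit_vector(n, i):
    """Unit vector in R^n with a 1 at position i."""
    return [1.0 if k == i else 0.0 for k in range(n)]

def third_degree_rule(n):
    """2n points +/- sqrt(n) e_i, each with weight 1/(2n)."""
    r = math.sqrt(n)
    points = [[sign * r * c for c in unit_vector(n, i)]
              for sign in (1.0, -1.0) for i in range(n)]
    return points, [1.0 / (2 * n)] * (2 * n)

def fifth_degree_rule(n):
    """2n^2 + 1 points: origin, +/- sqrt(n+2) s^+/-, and +/- sqrt(n+2) e_i."""
    r = math.sqrt(n + 2.0)
    w_s = 1.0 / (n + 2.0) ** 2
    w_e = (4.0 - n) / (2.0 * (n + 2.0) ** 2)  # negative when n > 4
    h = math.sqrt(0.5)
    points, weights = [[0.0] * n], [2.0 / (n + 2.0)]
    for a, b in combinations(range(n), 2):
        for rel in (1.0, -1.0):  # s^+ uses e_a + e_b, s^- uses e_a - e_b
            s = [0.0] * n
            s[a], s[b] = h, rel * h
            for sign in (1.0, -1.0):
                points.append([sign * r * c for c in s])
                weights.append(w_s)
    for i in range(n):
        for sign in (1.0, -1.0):
            points.append([sign * r * c for c in unit_vector(n, i)])
            weights.append(w_e)
    return points, weights

def moment(points, weights, powers):
    """Weighted sum of the monomial prod_k x_k**powers[k] over the rule."""
    return sum(w * math.prod(x ** e for x, e in zip(pt, powers))
               for pt, w in zip(points, weights))
```

For a five-dimensional state such as the one in the tracking example, the fifth-degree rule uses 51 points instead of 10, which is the price paid for the higher accuracy.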

Centralized Cubature Gaussian Mixture Filter
The centralized cubature Gaussian mixture filter (CCGMF) can be more conveniently expressed using the information filtering form. In the information filter, the information state and the information matrix of the Gaussian component with index $l, p, q$ at time $k-1$ are defined as $\hat{y}^{l,p,q}_{k-1|k-1} = \left(P^{l,p,q}_{k-1|k-1}\right)^{-1}\hat{x}^{l,p,q}_{k-1|k-1}$ and $Y^{l,p,q}_{k-1|k-1} = \left(P^{l,p,q}_{k-1|k-1}\right)^{-1}$, respectively. The prediction of the information state and information matrix can be obtained via Equations (3) and (4). Using the information from multiple sensors, the information state and the information matrix can be updated by [4,21]
$$\hat{y}^{l,p,q}_{k|k} = \hat{y}^{l,p}_{k|k-1} + \sum_{j=1}^{N_{sn}} i^{l,p,q}_{k,j} \qquad (19)$$
$$Y^{l,p,q}_{k|k} = Y^{l,p}_{k|k-1} + \sum_{j=1}^{N_{sn}} I^{l,p,q}_{k,j} \qquad (20)$$
where $N_{sn}$ is the number of sensor nodes. $\hat{y}^{l,p}_{k|k-1}$ and $Y^{l,p}_{k|k-1}$ can be obtained from the results of Equations (3) and (4). The information state contribution $i^{l,p,q}_{k,j}$ and the information matrix contribution $I^{l,p,q}_{k,j}$ of the $j$th sensor are given by [4,21]
$$i^{l,p,q}_{k,j} = Y^{l,p}_{k|k-1} P^{l,p}_{k|k-1,xz_j}\left(R^q_{k,j}\right)^{-1}\left[\left(y_{k,j} - z^{l,p,q}_{k,j}\right) + \left(P^{l,p}_{k|k-1,xz_j}\right)^T \left(Y^{l,p}_{k|k-1}\right)^T \hat{x}^{l,p}_{k|k-1}\right] \qquad (21)$$
$$I^{l,p,q}_{k,j} = Y^{l,p}_{k|k-1} P^{l,p}_{k|k-1,xz_j}\left(R^q_{k,j}\right)^{-1}\left(P^{l,p}_{k|k-1,xz_j}\right)^T \left(Y^{l,p}_{k|k-1}\right)^T \qquad (22)$$
Note that $z^{l,p,q}_{k,j}$ and $P^{l,p}_{k|k-1,xz_j}$ can be calculated by the cubature rules Equations (15) and (16), respectively, given in Section 2.1.2.
Remark 2: From Equations (19) and (20), it can be seen that the local information contributions $i^{l,p,q}_{k,j}$ and $I^{l,p,q}_{k,j}$ are only computed at sensor $j$ and the total information contribution is simply the sum of the local contributions. Therefore, the information filter is more convenient for multiple sensor estimation than the original Kalman filter.
The CCGMF needs to know the information from all sensor nodes and thus demands a large amount of communication energy, which is prohibitive for large-scale sensor networks. In the next section, an iterative diffusion-based distributed cubature Gaussian mixture filter is proposed to provide more efficient multisensor estimation.
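The additive structure of the information update can be illustrated in the simplest setting. The sketch below assumes a scalar state and linear measurements $y_j = h x + n_j$ with independent noises, in which case the contributions reduce to $i_j = h y_j / R_j$ and $I_j = h^2 / R_j$. The function names are hypothetical, and this linear special case is only a stand-in for the cubature-based contributions used in the paper.

```python
def information_update(x_prior, p_prior, measurements, variances, h=1.0):
    """Fuse scalar measurements y_j = h*x + n_j in information form."""
    info_state = x_prior / p_prior   # information state
    info_matrix = 1.0 / p_prior      # information matrix (scalar here)
    for y_j, r_j in zip(measurements, variances):
        info_state += h * y_j / r_j  # information state contribution i_j
        info_matrix += h * h / r_j   # information matrix contribution I_j
    p_post = 1.0 / info_matrix
    return p_post * info_state, p_post

def sequential_kalman_update(x_prior, p_prior, measurements, variances, h=1.0):
    """Reference: process the same measurements one at a time."""
    x, p = x_prior, p_prior
    for y_j, r_j in zip(measurements, variances):
        gain = p * h / (h * p * h + r_j)
        x = x + gain * (y_j - h * x)
        p = (1.0 - gain * h) * p
    return x, p
```

Both routines return the same posterior, but the information form needs only a sum of per-sensor terms, so each sensor can compute its own contribution locally.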

Iterative Diffusion-Based Distributed Cubature Gaussian Mixture Filter
The distributed estimation lets each sensor node process its local estimation and then fuse the information from its neighboring nodes by distributed estimation algorithms to establish the global estimate. In this paper, a new distributed cubature Gaussian mixture filter based on iterative diffusion (DCGMF-ID) is introduced.
The diffusion strategy is more feasible in practice when the measurement needs to be processed in a timely manner without many iterations as in the consensus algorithm. The ordinary diffusion Kalman filter (DKF) [12][13][14][15] was designed for linear estimation problems. In this paper, the new DCGMF-ID integrates the cubature rule as well as the GM into the DKF framework to address the nonlinear distributed estimation problem. The prediction step of the DCGMF-ID at each sensor node uses the cubature rule given in Section 2.1.1. The update steps of the DCGMF-ID include the incremental update and the diffusion update, which are described as follows.

Incremental Update
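Each node broadcasts its prediction information to its immediate neighbors and receives the prediction information from its immediate neighbors at the same time step. For every node $j$, once the information is received, the information state and the information matrix are updated by
$$\hat{y}^{l,p,q}_{k|k,j} = \hat{y}^{l,p}_{k|k-1,j} + \sum_{j' \in N_j} i^{l,p,q}_{k,j'} \qquad (23)$$
$$Y^{l,p,q}_{k|k,j} = Y^{l,p}_{k|k-1,j} + \sum_{j' \in N_j} I^{l,p,q}_{k,j'} \qquad (24)$$
where $N_j$ denotes the set of sensor nodes containing node $j$ and its immediate neighbors.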
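A minimal sketch of the incremental update for scalar quantities follows. The dictionary-based data layout, the function name, and the three-node line network are assumptions made for illustration; each neighborhood list includes the node itself, as in the definition of $N_j$.

```python
def incremental_update(prior_info, contributions, neighbors):
    """Add the contributions of node j and its neighbors to node j's prediction.

    prior_info[j]    = (predicted information state, predicted information matrix)
    contributions[j] = (information state contribution, information matrix contribution)
    neighbors[j]     = list of nodes in N_j, including j itself
    """
    updated = {}
    for j, (info_state, info_matrix) in prior_info.items():
        info_state_new = info_state + sum(contributions[m][0] for m in neighbors[j])
        info_matrix_new = info_matrix + sum(contributions[m][1] for m in neighbors[j])
        updated[j] = (info_state_new, info_matrix_new)
    return updated

# Hypothetical three-node line network 0 - 1 - 2.
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
prior_info = {0: (0.0, 1.0), 1: (0.0, 1.0), 2: (0.0, 1.0)}
contributions = {0: (2.0, 1.0), 1: (4.0, 2.0), 2: (1.0, 0.5)}
updated = incremental_update(prior_info, contributions, neighbors)
```

The middle node accumulates all three contributions, while the end nodes see only two, so the local estimates differ after this step and a further fusion step is needed.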

Diffusion Update
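As mentioned in Section 2.1.2, the number of Gaussian components after the GM reduction at each node is specified a priori to be the same, for the convenience of implementing the diffusion update. The covariance intersection algorithm can be utilized for the diffusion update. The covariance for node $j$ can be updated by
$$\left(P^{l,p,q}_{k|k,j}\right)^{-1} = \sum_{j' \in N_j} w^{l,p,q}_{j,j'}\left(P^{l,p,q}_{k|k,j'}\right)^{-1} \qquad (25)$$
or, in information form,
$$Y^{l,p,q}_{k|k,j} = \sum_{j' \in N_j} w^{l,p,q}_{j,j'}\, Y^{l,p,q}_{k|k,j'} \qquad (26)$$
where $P^{l,p,q}_{k|k,j}$ denotes the covariance of the $j$th sensor associated with the $(l, p, q)$th Gaussian component, the quantities on the right-hand side are those obtained from the incremental update, and $w^{l,p,q}_{j,j'}$ is the covariance intersection weight.
The state estimate for node $j$ can be updated by
$$\left(P^{l,p,q}_{k|k,j}\right)^{-1}\hat{x}^{l,p,q}_{k|k,j} = \sum_{j' \in N_j} w^{l,p,q}_{j,j'}\left(P^{l,p,q}_{k|k,j'}\right)^{-1}\hat{x}^{l,p,q}_{k|k,j'} \qquad (27)$$
or, in information form,
$$\hat{y}^{l,p,q}_{k|k,j} = \sum_{j' \in N_j} w^{l,p,q}_{j,j'}\, \hat{y}^{l,p,q}_{k|k,j'} \qquad (28)$$
The weights $w^{l,p,q}_{j,j'}$ are calculated by [22]
$$w^{l,p,q}_{j,j'} = \frac{1/\mathrm{tr}\left(P^{l,p,q}_{k|k,j'}\right)}{\sum_{m \in N_j} 1/\mathrm{tr}\left(P^{l,p,q}_{k|k,m}\right)} \qquad (29)$$
where $\mathrm{tr}(\cdot)$ denotes the trace operation.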
Remark 3: Different from the conventional diffusion-based distributed estimation algorithms, the DCGMF-ID performs the diffusion update multiple times iteratively, rather than updating it only once. The advantage of the iterative diffusion update is that estimates from different sensors eventually converge.
The DCGMF-ID algorithm (Algorithm 1) can be summarized as follows:

Algorithm 1
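Step 1: Each sensor node calculates the local prediction using Equations (3) and (4), and the cubature rule, and transforms them to the information state $\hat{y}^{l,p}_{k|k-1,j}$ and the information matrix $Y^{l,p}_{k|k-1,j}$.
Step 2: When new measurements are available, each node evaluates the information state contribution $i^{l,p,q}_{k,j}$ and the information matrix contribution $I^{l,p,q}_{k,j}$ by using Equations (21) and (22).
Step 3: Each node communicates with its immediate neighbors to update its information state and information matrix through the incremental update (i.e., Equations (23) and (24)).
Step 4: Each node runs the diffusion update by Equations (26) and (28) multiple times iteratively:
$$Y^{l,p,q}_{k|k,j}(t+1) = \sum_{j' \in N_j} w^{l,p,q}_{j,j'}(t)\, Y^{l,p,q}_{k|k,j'}(t) \qquad (30a)$$
$$\hat{y}^{l,p,q}_{k|k,j}(t+1) = \sum_{j' \in N_j} w^{l,p,q}_{j,j'}(t)\, \hat{y}^{l,p,q}_{k|k,j'}(t) \qquad (30b)$$
where $t$ is the diffusion iteration index and the values at $t = 0$ are those obtained from the incremental update. The final GM at node $j$ can be represented by
$$p_j(x_k) = \sum_{l=1}^{N_l}\sum_{p=1}^{N_p}\sum_{q=1}^{N_q} \alpha^l \alpha^p \alpha^q\, \mathcal{N}\left(x_k; \hat{x}^{l,p,q}_{k|k,j}, P^{l,p,q}_{k|k,j}\right)$$
Step 5: Conduct GM reduction.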
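The iterative diffusion update of Step 4 can be sketched for a single Gaussian component with a scalar state, for which the trace of the covariance is the variance itself. The function names, the dictionary layout, and the numerical values used to exercise it are assumptions for illustration; the weights are assumed to follow the trace-based covariance intersection rule.

```python
def diffusion_weights(variances, neighbors, j):
    """Trace-based covariance intersection weights of node j over N_j."""
    inv_traces = {m: 1.0 / variances[m] for m in neighbors[j]}
    total = sum(inv_traces.values())
    return {m: v / total for m, v in inv_traces.items()}

def iterative_diffusion(estimates, variances, neighbors, iterations):
    """Repeat the diffusion update; neighbors[j] must include j itself."""
    x, p = dict(estimates), dict(variances)
    for _ in range(iterations):
        x_new, p_new = {}, {}
        for j in neighbors:
            w = diffusion_weights(p, neighbors, j)
            # Weighted sums of the information matrices and information states.
            info_matrix = sum(w[m] / p[m] for m in neighbors[j])
            info_state = sum(w[m] * x[m] / p[m] for m in neighbors[j])
            p_new[j] = 1.0 / info_matrix
            x_new[j] = info_state / info_matrix
        x, p = x_new, p_new
    return x, p
```

With `iterations=1` this reduces to a single, noniterative diffusion update. Running more iterations on a connected network pulls the local estimates and variances toward common values.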
The iterative diffusion update is identical to the iterative covariance intersection (ICI) algorithm [16]. Thus, the proposed distributed estimation has the same properties of unbiasedness and consistency as the ICI. For linear systems, if the initial estimate at each sensor node is unbiased, the estimate through the incremental update and the diffusion update in each filtering cycle is still unbiased. For nonlinear systems, however, the unbiasedness may not be preserved. The same holds for the analysis of consistency. When the covariance intersection (CI) method is used for data fusion, consistency is ensured based on the assumption that the estimate at each sensor node is consistent [23]. If it is assumed that each node's local estimate after the incremental step is consistent (i.e., $P_{k|k,j} \geq E\left[\left(\hat{x}_{k|k,j} - x_k\right)\left(\hat{x}_{k|k,j} - x_k\right)^T\right]$), then by the diffusion update, the fused estimate is still consistent because the CI is applied. Without this assumption, consistency is not guaranteed by the CI technique. For linear systems, this assumption can be easily met and consistency can be guaranteed. For nonlinear systems, the high-degree (fifth-degree) cubature rule-based filtering is utilized in this paper for the local estimation at each node. It can provide more accurate estimates of $\hat{x}_{k|k,j}$ and $P_{k|k,j}$ than the third-degree cubature Kalman filter (CKF) and the unscented Kalman filter (UKF). Therefore, although the unbiasedness and consistency cannot be guaranteed for nonlinear systems, they can be better approached by the proposed distributed estimation scheme than by other distributed nonlinear filters. It is necessary to compare the DCGMF-ID with the consensus-based distributed estimation.
For the iterative diffusion strategy in the DCGMF-ID, if the local estimate obtained at each node after the incremental update is consistent, the fused estimate by the diffusion update is also consistent, regardless of the number of iterations of the iterative diffusion update, since the CI is applied. In addition, it was shown in [16] that the covariance and estimate from each node converge to a common value (i.e., $\lim_{t \to \infty} P_{k,j}(t) = P_k$ and $\lim_{t \to \infty} \hat{x}_{k,j}(t) = \hat{x}_k$). Recall that "t" represents the $t$th diffusion iteration, not the time. However, for the consensus-based distributed estimation [24], even if the local estimate obtained at each node is consistent, if the number of iterations of consensus is not infinite, the consistency of the fused estimate cannot be preserved [24]. Because the average consensus cannot be achieved in a few iterations, a multiplication by $|N|$, the cardinality of the network, will lead to an overestimate of the information, which is not desirable. Although another approach was proposed in [24] to fuse the information from each node in order to preserve consistency, the new consensus algorithm results in more computation complexity.
In the following, we provide a more complete analysis of the convergence by the following two propositions.

Proposition 1:
The iterative diffusion update Equations (30a) and (30b) can be represented in a general form of $\eta(t+1) = A(t)\eta(t)$, where each $(j, j')$ entry of the transition matrix $A(t)$, denoted by $a_{j,j'}(t)$, corresponds to the weight $w^{l,p,q}_{j,j'}(t)$. Assume that the sensor network is connected. If there exists a positive constant $\alpha < 1$ and the following three conditions are satisfied: (a) $a_{j,j}(t) \geq \alpha$ for all $j, t$; (b) $a_{j,j'}(t) \in \{0\} \cup [\alpha, 1]$, $j \neq j'$; (c) $\sum_{j'=1}^{N_{sn}} a_{j,j'}(t) = 1$ for all $j, t$; then the estimates using the proposed DCGMF-ID reach a consensus value.

Proof:
The proof uses Theorem 2.4 in [25]. If the connected sensor network satisfies these three conditions, $\eta(t)$, using the algorithm
$$\eta(t+1) = A(t)\eta(t) \qquad (31)$$
converges to a consensus value. For the scalar case (the dimension of the state is one), $a_{j,j'}(t)$ corresponds to $w^{l,p,q}_{j,j'}(t)$. The $j$th element of $\eta(t)$ corresponds to the information state $\hat{y}^{l,p,q}_{k|k,j}(t)$. For the vector case, the transition matrix $A(t) \otimes I_n$ should be applied, where $\otimes$ denotes the Kronecker product and $n$ is the dimension of the state. For the matrix case, each column of the matrix can be treated as the vector case.
As seen from Equation (29), the weight $w^{l,p,q}_{j,j'}(t)$ only depends on the covariance matrix. Here we assume that the covariance in the first iteration is upper bounded, and for any $t$ there is no covariance matrix equal to 0 (no uncertainty). As long as node $j$ and node $j'$ are connected, $w^{l,p,q}_{j,j'}(t) \in (0, 1)$. Thus, condition (b) is satisfied. In addition, from Equation (29), $\sum_{j'=1}^{N_{sn}} w^{l,p,q}_{j,j'}(t) = 1$ always holds; that is, the transition matrix $A(t)$ is always row-stochastic. Therefore, condition (c) is satisfied.
For any arbitrarily large $t$, say $t_{max}$, the non-zero weight set $\left\{w^{l,p,q}_{j,j'}(t)\right\}$, $t = 1, \cdots, t_{max}$, for all $j, j'$ is a finite set since the number of nodes and the number of iterations are finite. There always exists a minimum value in this finite set. Thus, $\alpha$ can be chosen to be $0 < \alpha \leq \min\left\{w^{l,p,q}_{j,j'}(t)\right\}$ such that conditions (a) and (b) are satisfied.
According to the theorem 2.4 in [25] for the agreement algorithm Equation (31), the estimate η(t) reaches a consensus value.
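The agreement iteration $\eta(t+1) = A(t)\eta(t)$ can be explored numerically. In the sketch below, the time-varying weights are an arbitrary deterministic construction chosen only so that conditions (a) to (c) hold on a connected four-node line network; they are not the covariance-based weights of the filter, and all names are illustrative.

```python
def transition_matrix(neighbors, t):
    """Row-stochastic A(t) with positive, time-varying weights on each N_j."""
    n = len(neighbors)
    a = [[0.0] * n for _ in range(n)]
    for j, nbrs in neighbors.items():
        raw = {m: 1.0 + (j + m + t) % 3 for m in nbrs}  # values in {1, 2, 3}
        total = sum(raw.values())
        for m, r in raw.items():
            a[j][m] = r / total
    return a

def satisfies_conditions(a, alpha):
    """Check conditions (a), (b), and (c) of Proposition 1 for one matrix."""
    for j, row in enumerate(a):
        if row[j] < alpha:                       # (a) diagonal entries >= alpha
            return False
        if any(0.0 < v < alpha for v in row):    # (b) entries are 0 or >= alpha
            return False
        if abs(sum(row) - 1.0) > 1e-12:          # (c) rows sum to one
            return False
    return True

def agreement(eta0, neighbors, steps):
    """Iterate eta(t+1) = A(t) eta(t) for the given number of steps."""
    eta = list(eta0)
    n = len(eta)
    for t in range(steps):
        a = transition_matrix(neighbors, t)
        eta = [sum(a[j][m] * eta[m] for m in range(n)) for j in range(n)]
    return eta
```

Because every row is a convex combination, the iterates never leave the interval spanned by the initial values, and on a connected network the entries of $\eta(t)$ approach a common value.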

Proposition 2:
If the assumption and conditions in Proposition 1 are satisfied, the consensus estimate using the DCGMF-ID is unique.
Proof: Let $U_{0,t} = A(t)A(t-1) \cdots A(0)$ be the backward product of the transition matrices, with $\lim_{t \to \infty} U_{0,t} = U^*$ according to Proposition 1. On the other hand, when the consensus is achieved, the covariance matrix or the information matrix $Y^{l,p,q}_{k|k,j}$ associated with each node becomes the same. According to Equation (29), the weights $w^{l,p,q}_{j,j'}(t)$ converge to the same value. Thus, $\lim_{t \to \infty} A(t) = A^*$ and $A^*\mathbf{1} = \mathbf{1}$ since $A^*$ is a row-stochastic matrix, where $A^* = [a_1^T\ a_2^T\ \cdots\ a_n^T]^T$ with $a_j$ being the $j$th row vector of the matrix $A^*$. Furthermore, because $Y^{l,p,q}_{k|k,j}$ converges to the same value, from Equation (29), all the non-zero weights $w^{l,p,q}_{j,j'}(t)$, or all non-zero entries of the row vector $a_j$, are identical and equal to the reciprocal of the degree of the $j$th node, i.e., $1/\delta_j$ (where $\delta_j$ is the degree of the $j$th node, i.e., the cardinality of $N_j$).
Hence, $A^*$ is deterministic given the connected sensor network.
$A^*$ is irreducible since the sensor network is connected. Moreover, the diagonal elements of $A^*$ are all positive (equal to the reciprocal of the degree of each node). Hence, 1 is a unique maximum eigenvalue of $A^*$ [26] and, in fact, $A^*$ is a primitive matrix [26].
In the sense of consensus, $\lim_{t \to \infty} \eta(t) = U^*\eta(0)$, and we have $A^*U^* = U^*$ or $(A^* - I)U^* = 0$ (note that it is not possible for $U^*$ to be 0 since it is the backward product of non-negative matrices). Each column of $U^*$ belongs to the null space of $A^* - I$. Since 1 is the unique maximum eigenvalue of $A^*$, 0 is a simple eigenvalue of $A^* - I$ and the dimension of the null space of $A^* - I$ is 1. Thus, $\mathbf{1}$ (or any scalar multiple of $\mathbf{1}$) is the unique vector belonging to the null space of $A^* - I$. Therefore, $U^*$ is ergodic, i.e., $U^* = \mathbf{1}[\alpha_1, \alpha_2, \cdots, \alpha_n]$, where $\alpha_i$ is a scalar constant. According to Theorem 4.20 in [27], $[\alpha_1, \alpha_2, \cdots, \alpha_n]$ and the consensus value of $\eta(t)$ are unique.
The iterative diffusion update in the DCGMF-ID can be interpreted from the information theory perspective as the process of minimizing the Kullback-Leibler divergence (KLD) [28]. In information theory, a measure of distance between different pdfs can be given by the KLD. Given the local pdf $p_i$ with the weight $\pi_i$, the fused pdf $p_f$ can be obtained by minimizing the KLD:
$$p_f = \arg\min_{p} \sum_{i=1}^{N_{sn}} \pi_i D(p \| p_i) \qquad (32)$$
with $\sum_{i=1}^{N_{sn}} \pi_i = 1$ and $\pi_i \geq 0$. $D(p \| p_i)$ is the KLD defined as:
$$D(p \| p_i) = \int p(x) \log\frac{p(x)}{p_i(x)}\, dx \qquad (33)$$
The KLD is always non-negative, and equal to zero only when $p(x) = p_i(x)$. The solution to Equation (32) turns out to be [28]
$$p_f(x) = \frac{\prod_{i=1}^{N_{sn}} \left[p_i(x)\right]^{\pi_i}}{\int \prod_{i=1}^{N_{sn}} \left[p_i(x)\right]^{\pi_i} dx} \qquad (34)$$
The above equation is also the Chernoff fusion [29]. Under the Gaussian assumption, which is true for each component of the GM model in this paper, it was shown in [29] that the Chernoff fusion yields update equations identical to the covariance intersection Equations (25)-(28).
Therefore, from the information-theoretic perspective, the iterative diffusion update Equation (30) is actually equivalent to minimizing the KLD repeatedly. For instance, the diffusion update at the $t$th iteration is equivalent to
$$p_j(t+1) = \arg\min_{p} \sum_{j' \in N_j} w_{j,j'}(t)\, D\left(p \| p_{j'}(t)\right) \qquad (35)$$
When $t$ approaches $t_{max}$, from the convergence property of the iterative diffusion (i.e., Propositions 1 and 2), the cost for the minimization problem in Equation (35) approaches 0 since $p_j(t_{max}) = p$ for all $j = 1, \ldots, N_{sn}$, and $D(p \| p) = 0$, where $p$ is the final convergent pdf.
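The link between the Chernoff fusion and the covariance intersection can be verified numerically for scalar Gaussians. The sketch below, with illustrative function names and values, checks that the weighted geometric mean of Gaussian densities differs from the covariance intersection result only by a constant factor, which is removed by the normalization.

```python
import math

def log_gaussian(x, mean, var):
    """Log-density of a scalar normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var

def covariance_intersection(means, variances, weights):
    """Scalar covariance intersection: fuse information with convex weights."""
    info = sum(w / v for w, v in zip(weights, variances))
    info_state = sum(w * m / v for w, m, v in zip(weights, means, variances))
    return info_state / info, 1.0 / info

def log_ratio(x, means, variances, weights):
    """log of prod_i p_i(x)^pi_i minus the log-density of the CI result."""
    mean_f, var_f = covariance_intersection(means, variances, weights)
    geometric = sum(w * log_gaussian(x, m, v)
                    for w, m, v in zip(weights, means, variances))
    return geometric - log_gaussian(x, mean_f, var_f)
```

If `log_ratio` returns the same value for every `x`, the unnormalized product of powers is proportional to the fused Gaussian, so the two fusion rules coincide after normalization.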

Numerical Results and Analysis
In this section, the performance of DCGMF based on different fusion strategies is demonstrated via a benchmark target-tracking problem using multiple sensors, which is to track a target executing a maneuvering turn in a two-dimensional space with unknown and time-varying turn rate [3]. The target dynamics is highly nonlinear due to the unknown turn rate. It has been used as a benchmark problem to test the performance of different nonlinear filters [3,30].
The discrete-time dynamic equation of the target motion is given by: y k ] are the position and velocity at time k, respectively; ∆t is the time-interval between two consecutive measurements; ω k−1 is the unknown turn rate at the time k − 1; and v k−1 is the white Gaussian noise with mean zero and covariance Q k−1 , The measurements are the range and angle given by where atan2 is the four-quadrant inverse tangent function; n k is the measurement noise with an assumed non-Gaussian distribution n k ∼ 0.5N (n 1 , R 1 ) + 0.5N (n 2 , R 2 ), where n 1 = 5 m, −2 × 10 −6 mrad T and n 2 = [−5 m, 0 mrad] T . The variances R 1 and R 2 are assumed to be R 1 = diag 100 m 2 , 10 mrad 2 and R 2 = 80 m 2 10 −1 mmrad 10 −1 mmrad 10 mrad 2 . The sampling interval is ∆t = 1 s. The simulation results are based on 100 Monte Carlo runs. The initial estimatê x 0 is generated randomly from the normal distribution N (x 0 ; x 0 , P 0 ) with x 0 being the true initial state x 0 = [1000 m, 300 m/s, 1000 m, 0, −3 deg/s] T and P 0 being the initial covariance P 0 = diag 100 m 2 , 10 m 2 /s 2 , 100 m 2 , 10 m 2 /s 2 , 100 mrad 2 /s 2 . Sixteen sensors are used in simulation. The topology of the sensor network is shown in Figure 2. Note that the "circle" denotes the sensor node. It is assumed that the target is always in the range and field of view of all sensors. The metric used to compare the performance of different filters is the root mean square error (RMSE). The RMSEs of the position, velocity, and turn rate using different filters with the thirddegree cubature rule are shown in Figures 3-5, respectively. The cubature Gaussian mixture filter (CGMF) using a single sensor, the distributed cubature Gaussian mixture filter based on the iterative covariance intersection [16] (DCGMF-ICI), average consensus (DCGMF-AC), iterative diffusion strategies (DCGMF-ID), and the centralized cubature Gaussian mixture filter (CCGMF) are compared. 
Since DCGMF-ICI, DCGMF-AC, and DCGMF-ID all involve iterations, it is more illustrative to use the number of iterations as a parameter to compare their performance. "M" in the figures is the iteration number.
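The simulation setup described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the transition function uses the standard coordinated-turn form commonly associated with this benchmark, the sensor is placed at the origin, and the names `turn_step`, `measure`, and `sample_noise` are hypothetical. The mixture parameters are those quoted in the text, converted from mrad to rad. The process noise is omitted because the values of its covariance are not given here.

```python
import math
import random

DT = 1.0      # sampling interval in seconds, as stated in the text
MRAD = 1e-3   # one milliradian in radians

def turn_step(state, dt=DT):
    """One noise-free step of the assumed coordinated-turn model.

    state = [x, vx, y, vy, omega], with omega in rad/s.
    """
    x, vx, y, vy, w = state
    if abs(w) < 1e-9:  # straight-line limit as the turn rate goes to zero
        return [x + vx * dt, vx, y + vy * dt, vy, w]
    s, c = math.sin(w * dt), math.cos(w * dt)
    return [
        x + vx * s / w - vy * (1.0 - c) / w,
        vx * c - vy * s,
        y + vx * (1.0 - c) / w + vy * s / w,
        vx * s + vy * c,
        w,
    ]

def measure(state, sensor=(0.0, 0.0)):
    """Noise-free range (m) and angle (rad); the sensor position is an assumption."""
    dx, dy = state[0] - sensor[0], state[2] - sensor[1]
    return [math.hypot(dx, dy), math.atan2(dy, dx)]

# Mixture components n1, n2 and R1, R2 from the text, in metres and radians.
MEANS = [[5.0, -2e-6 * MRAD], [-5.0, 0.0]]
COVS = [
    [[100.0, 0.0], [0.0, 10.0 * MRAD ** 2]],
    [[80.0, 0.1 * MRAD], [0.1 * MRAD, 10.0 * MRAD ** 2]],
]

def sample_noise(rng):
    """Draw one sample of n_k from 0.5 N(n1, R1) + 0.5 N(n2, R2)."""
    i = 0 if rng.random() < 0.5 else 1
    (a, b), (_, d) = COVS[i]
    # Cholesky factor of the 2x2 covariance, written out by hand.
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [MEANS[i][0] + l11 * u1, MEANS[i][1] + l21 * u1 + l22 * u2]

# Propagate the true initial state for a few steps and form noisy measurements.
rng = random.Random(1)
state = [1000.0, 300.0, 1000.0, 0.0, math.radians(-3.0)]
for _ in range(3):
    state = turn_step(state)
    z = [h + n for h, n in zip(measure(state), sample_noise(rng))]
    print(z)
```

In a multisensor run, each of the sixteen nodes would call `measure` with its own position and an independent noise draw.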
It can be seen from the figures that (1) the DCGMFs and the CCGMF exhibit better performance than the CGMF using a single sensor since more information from multiple sensors can be exploited; (2) with the increase of iterations, the performance of all DCGMFs is improved; (3) the DCGMF-ICI is less accurate than the DCGMF-AC and the DCGMF-ID since the ICI algorithm does not do the incremental update; (4) both the DCGMF-AC (M = 10) and the DCGMF-ID (M = 10) achieve very close performance to the CCGMF. However, fewer iterations have a more negative effect on the performance of the DCGMF-AC than on that of the DCGMF-ID. The DCGMF-ID is more effective in terms of iterations since the DCGMF-ID with M = 1 has close performance to the DCGMF-AC with M = 5. Hence, when the allowable number of information exchanges is limited, the DCGMF-ID would be the best filter. It is also worth noting that the DCGMF-AC requires less computational effort at each node, but requires more communication expense than the DCGMF-ID. If the communication capability of the sensor network is not a main constraint, the DCGMF-AC would be a competitive approach.
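The sensitivity of consensus-based fusion to the iteration number M can be illustrated with a minimal sketch of scalar average consensus. It is not the DCGMF-AC itself, which applies consensus to the filter's information quantities. The 16-node ring and the Metropolis weights are assumptions made for illustration, since the topology of Figure 2 is not reproduced here, and the function names are hypothetical.

```python
def metropolis_weights(neighbors):
    """Symmetric weight matrix with unit row sums for an undirected graph."""
    n = len(neighbors)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            w[i][j] = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
        w[i][i] = 1.0 - sum(w[i])
    return w

def consensus(values, weights, iterations):
    """Run the given number of neighbor-averaging iterations."""
    x = list(values)
    for _ in range(iterations):
        x = [sum(wij * xj for wij, xj in zip(row, x)) for row in weights]
    return x

def spread(x):
    """Disagreement between nodes: largest minus smallest local value."""
    return max(x) - min(x)

# Hypothetical 16-node ring; each node talks only to its two neighbors.
N = 16
ring = [[(i - 1) % N, (i + 1) % N] for i in range(N)]
W = metropolis_weights(ring)
local = [float(i) for i in range(N)]  # arbitrary local values to be averaged

for m in (1, 5, 10, 20):
    print(m, spread(consensus(local, W, m)))
```

The disagreement between nodes shrinks as M grows, while the network average is preserved at every iteration. With few iterations, each node still holds a value far from the global average, which is consistent with the observation that a small M penalizes a consensus-based filter.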
Next, we compare the performance of the DCGMFs using the third-degree cubature rule with that of the DCGMFs using the fifth-degree cubature rule. The metric is the averaged cumulative RMSE (CRMSE). The CRMSE for the position accumulates the squared position errors over the simulation time steps and the Monte Carlo runs, where $N_{sim} = 100$ s is the simulation time and $N_{mc} = 100$ is the number of Monte Carlo runs. In the error terms, the superscript "j" denotes the jth state variable, and the subscripts "i" and "m" denote the ith simulation time step and the mth simulation, respectively. The CRMSE for the velocity and the CRMSE for the turn rate are defined similarly.
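One plausible reading of the CRMSE is sketched below: a cumulative RMSE over time is computed for each Monte Carlo run and then averaged over the runs. The ordering of the two averages and the function name `crmse` are assumptions, as is the data layout `truth[m][i][j]` (run m, time step i, state variable j). The position entries are taken to be state variables 1 and 3 (indices 0 and 2), following the ordering of the initial state.

```python
import math

def crmse(truth, estimates, idx=(0, 2)):
    """Averaged cumulative RMSE over the state variables listed in idx.

    truth[m][i][j] and estimates[m][i][j] hold state variable j at time
    step i of Monte Carlo run m.
    """
    per_run = []
    for run_truth, run_est in zip(truth, estimates):
        # Squared error accumulated over all time steps of this run.
        sq = sum(
            (t[j] - e[j]) ** 2
            for t, e in zip(run_truth, run_est)
            for j in idx
        )
        per_run.append(math.sqrt(sq / len(run_truth)))
    return sum(per_run) / len(per_run)

# Toy data: 2 runs, 4 time steps, constant position error of (3 m, 4 m).
truth = [[[0.0, 0.0, 0.0, 0.0, 0.0] for _ in range(4)] for _ in range(2)]
est = [[[3.0, 0.0, 4.0, 0.0, 0.0] for _ in range(4)] for _ in range(2)]
print(crmse(truth, est))
```

Passing `idx=(1, 3)` or `idx=(4,)` gives the corresponding velocity and turn-rate metrics under the same assumptions.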
The results of DCGMF-AC using the third-degree and the fifth-degree cubature rules are indistinguishable. Similar results are observed for CCGMF. DCGMF-ID and DCGMF-ICI using the fifth-degree cubature rule, however, show better performance than those using the third-degree cubature rule. The reason is that DCGMF-ID and DCGMF-ICI depend heavily on the local measurement to perform estimation, while DCGMF-AC and CCGMF update the estimate based on global observations from all sensors. Specifically, for DCGMF-AC, although each sensor communicates measurements only with its neighbors, after convergence of the consensus iterations each sensor actually obtains fused measurement information from all sensors. Because the high-degree numerical rule affects the accuracy of estimates extracted from the observations, the fifth-degree cubature rule can more noticeably improve the performance of DCGMF-ID and DCGMF-ICI, which are based on only local observations. However, the benefit of using the high-degree numerical rule is mitigated if information from more sensors is available, as for the DCGMF-AC and CCGMF. Hence, we only compare the results of DCGMF-ID and DCGMF-ICI using the third-degree and the fifth-degree cubature rules in Table 1. In order to isolate the effect of the accuracy degree of the cubature rule, the effect of the iteration number on the performance of the different filters should be minimized. Therefore, a sufficiently large iteration number, M = 20, is used to ensure that the different filters have already converged. It can be seen from Table 1 that DCGMF-ID and DCGMF-ICI using the fifth-degree cubature rule achieve better performance than those using the third-degree cubature rule.
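The accuracy limit of the third-degree rule can be made concrete with a small sketch. It builds the standard third-degree spherical-radial cubature points for a standard normal density (2n points at plus and minus sqrt(n) along each axis, with equal weights) and evaluates a few moments. The function names are hypothetical, and the fifth-degree rule is not implemented here.

```python
import math

def third_degree_points(n):
    """Points and weights of the third-degree cubature rule for N(0, I_n)."""
    r = math.sqrt(n)
    pts = []
    for i in range(n):
        for sign in (1.0, -1.0):
            p = [0.0] * n
            p[i] = sign * r
            pts.append(p)
    return pts, [1.0 / (2 * n)] * (2 * n)

def expect(f, n):
    """Approximate E[f(x)] for x ~ N(0, I_n) with the third-degree rule."""
    pts, wts = third_degree_points(n)
    return sum(w * f(p) for p, w in zip(pts, wts))

n = 5  # state dimension of the tracking problem
second = expect(lambda x: x[0] ** 2, n)             # exact value is 1
fourth = expect(lambda x: x[0] ** 4, n)             # exact value is 3
cross = expect(lambda x: x[0] ** 2 * x[1] ** 2, n)  # exact value is 1
print(second, fourth, cross)
```

The second moment is matched, but the fourth-order moments are not: for n = 5 the rule returns 5 for E[x_1^4] and 0 for E[x_1^2 x_2^2]. A fifth-degree rule integrates all polynomials up to degree five exactly, so it captures these terms, which matters when the nonlinearity in the local measurement update is strong.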

Conclusions
A new iterative diffusion-based distributed cubature Gaussian mixture filter (DCGMF-ID) was proposed for nonlinear non-Gaussian estimation using multiple sensors. The convergence property of the DCGMF-ID was analyzed. It has been shown via a target-tracking problem that the DCGMF-ID can closely approximate the performance of the centralized cubature Gaussian mixture filter while retaining the advantages of distributed filters. Among the iterative distributed estimation strategies, the DCGMF-ID gives more accurate results than the iterative covariance intersection-based method (i.e., DCGMF-ICI). It also shows better performance than the average consensus-based method (i.e., DCGMF-AC) given the same number of iterations. In addition, the fifth-degree cubature rule can improve the accuracy of the DCGMF-ID.