A Novel Distributed State Estimation Algorithm with Consensus Strategy

Owing to its high fault tolerance and scalability, the consensus-based paradigm has become popular for distributed state estimation. If a target is observed neither by a certain node nor by its neighbors, this node is naive about the target. Some existing algorithms account for the presence of naive nodes, but they require many consensus iterations to achieve satisfactory performance. In practical applications, constrained energy and communication resources allow only a limited number of iterations, so the performance of these algorithms deteriorates. By fusing the measurements as well as the prior estimates of each node and its neighbors, a local optimal estimate is obtained with the proposed distributed local maximum a posteriori (MAP) estimator. With some approximations of the cross-covariance matrices and a consensus protocol incorporated into the estimation framework, a novel distributed hybrid information weighted consensus filter (DHIWCF) is proposed. Then, theoretical analysis guaranteeing the stability of the proposed DHIWCF is performed. Finally, the effectiveness and superiority of the proposed DHIWCF are evaluated. Simulation results indicate that the proposed DHIWCF achieves acceptable estimation performance even with a single consensus iteration.


Introduction
Distributed state estimation has recently been a hot topic in the field of target tracking in sensor networks [1-19]. As a traditional method, the centralized scheme must simultaneously process the local measurements from all sensors in the fusion center at each time instant [3,15]. This scheme guarantees the optimality of estimates, but substantial communication and a powerful fusion center are required to maintain the operation, which may give rise to problems when the network size increases or communication resources are restricted.
Unlike the centralized scheme, the distributed mechanism tries to recover the centralized performance via local communications between neighboring nodes. Specifically, each node in the network exchanges information only with its immediate neighbors to achieve performance comparable to its centralized counterpart, which reduces communication cost and makes the network robust against possible failures of some nodes [8]. The consensus filter, which computes the average of the values of interest in a distributed manner, has been widely adopted for distributed state estimation [4-7,9-14,16-22]. Recently, in [23,24], the multiscale consensus scheme, in which the locally estimated states asymptotically achieve prescribed ratios in terms of multiple scales, has been discussed and analyzed. The well-known Kalman consensus filter (KCF) [4-6,9,14] combines the local Kalman filter with a consensus term on the neighboring estimates. When the network topology is fixed, the normalization factor of the consensus weights can be computed offline to save bandwidth. In [32], a novel distributed set-theoretic information flooding (DSIF) protocol is proposed. The DSIF protocol avoids the reuse of information and offers the highest convergence efficiency for network consensus, but it suffers from growing node-storage requirements, more communication iterations, and higher communication load.
However, the algorithms discussed above require many consensus iterations to achieve the expected estimation performance. In practical applications, only a limited number of consensus iterations is allowed, and thus their performance degrades. In addition, the estimation performance of the aforementioned algorithms depends closely on the selection of consensus weights. Inappropriate consensus weights may cause the algorithms to diverge or to require more iterations to reach consensus on the local estimates [9]. A common approach is to set the weights to a constant value, as discussed in [6], which is an intuitive choice to maintain the stability of the error dynamics. However, choosing this constant requires knowledge of the maximum degree across the entire sensor network. Even when the maximum degree is available, it remains an open problem how to determine a constant weight that achieves the best performance while preserving consistency. In addition, the initial consensus terms in ICF require knowledge of the total number of nodes in the network. Global parameters, such as the maximum degree or the total number of nodes, may vary over time when the communication topology changes, new nodes join, or existing nodes fail to communicate with others. Without accurate knowledge of these global parameters, each node would either overestimate or underestimate the state of interest.
To deal with the problems analyzed above, a novel distributed hybrid information weighted consensus filter (DHIWCF) is proposed in this paper. Firstly, unlike previous work [4-6,16,18,22], each node assigns consensus weights to its neighbors based on their local degrees, which is fully distributed and requires no knowledge of global parameters. Secondly, the prior estimates and the current measurements within the inclusive neighborhood are combined to form a local generalized prior estimate equation and a local generalized measurement equation, respectively. Then, a distributed local MAP estimator is derived with some reasonable approximations of the error covariance matrices, which achieves higher accuracy than the approaches introduced in [4-6,11,16,18,19,25-28]. Finally, the average consensus protocol with the aforementioned consensus weights is incorporated into the estimation framework to obtain the proposed DHIWCF. In addition, theoretical analysis of the consistency of the local estimates and the stability and convergence of the estimator is performed. Comparative experiments on three different target tracking scenarios validate the effectiveness and feasibility of the proposed DHIWCF. Even with a single consensus iteration, the DHIWCF is still able to achieve acceptable estimation performance.
The remainder of this paper is organized as follows. Section 2 formulates the problem of distributed state estimation in sensor networks. The distributed local MAP estimator is derived in Section 3. Section 4 presents the distributed hybrid information weighted consensus filter. Theoretical analysis of the consistency of estimates and the stability and convergence of the estimator is provided in Section 5. The experimental results and analysis are presented in Section 6. Concluding remarks are given in Section 7.
Notation: R^n denotes the n-dimensional Euclidean space, and ‖·‖ is the Euclidean norm on R^n. For an arbitrary matrix A, A^{-1} and A^T are, respectively, its inverse and transpose. A > 0 means A is positive definite, and tr{A} is shorthand for the trace of A. diag(B_1, B_2, . . . , B_n) denotes a block diagonal matrix with main diagonal blocks B_1, B_2, . . . , B_n. I_n represents the n × n identity matrix. For a set C, |C| denotes the cardinality of C. E{·} is the expectation operator.

System Model
Consider a discrete-time linear system with dynamics

x_{k+1} = F_k x_k + w_k, (1)

where x_k ∈ R^{n_x} represents the state vector at time instant k ∈ Z^+, with Z^+ the set of positive integers, F_k is the state transition matrix, and w_k is zero-mean Gaussian process noise with covariance Q_k. The state of interest is observed by a sensor network consisting of N_s nodes in the surveillance area. The measurement model of node i is

z_{i,k} = H_{i,k} x_k + v_{i,k}, i = 1, 2, . . . , N_s, (2)

where z_{i,k} ∈ R^{n_z} is the measurement of node i at time instant k, H_{i,k} is the measurement matrix of node i, and v_{i,k} is zero-mean Gaussian measurement noise with covariance R_{i,k}.
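As a quick illustration, the state and measurement models (1) and (2) can be simulated in a few lines of Python; the dimensions, matrices, and noise levels below are placeholder values, not those used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder dimensions: n_x = 4 states, n_z = 2 measurements, Ns = 3 nodes.
n_x, n_z, Ns = 4, 2, 3
F = np.eye(n_x)                       # state transition matrix F_k
Q = 0.01 * np.eye(n_x)                # process noise covariance Q_k
H = [np.hstack([np.eye(n_z), np.zeros((n_z, n_x - n_z))]) for _ in range(Ns)]
R = [np.eye(n_z) for _ in range(Ns)]  # measurement noise covariances R_{i,k}

x = rng.standard_normal(n_x)          # current state x_k
# Dynamics (1): x_{k+1} = F_k x_k + w_k
x_next = F @ x + rng.multivariate_normal(np.zeros(n_x), Q)
# Measurements (2): z_{i,k} = H_{i,k} x_k + v_{i,k}
z = [H[i] @ x + rng.multivariate_normal(np.zeros(n_z), R[i]) for i in range(Ns)]
```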
Assumption 1. The process noise w_k and the measurement noises v_{i,k}, i = 1, 2, . . . , N_s, are mutually uncorrelated.
Assumption 2 [16,18,22,26-28,30]. If node i does not directly sense the target of interest at time instant k, then R^{-1}_{i,k} = 0.

Remark 1. Assumption 2 indicates that a node with no direct sensing ability has infinite uncertainty about its local measurement, which guarantees consistency of the local measurements.

Network Topology
The communication topology of the networked sensors can be represented by an undirected graph G = (S, E), where S = {1, 2, . . . , N_s} is the set of sensor nodes and E ⊆ S × S is the set of available communication links. A communication link means that two neighboring nodes can exchange state or measurement information with each other. The network is connected if there exists a communication path between any two nodes. The set of immediate neighbors of node i is denoted by N_i = {j | (i, j) ∈ E}, and d_i = |N_i| is the degree of node i, that is, the number of nodes linked to node i. The inclusive neighborhood of node i is J_i = {i} ∪ N_i. A convenient way to describe the network topology is the adjacency matrix A, where element a_{i,j} = 1 means that node i can exchange information with node j, and a_{i,j} = 0 means there is no direct communication link between node i and node j. The immediate neighbors of node i are then N_i = {j | a_{i,j} = 1}, and the degree of node i is d_i = Σ_{j=1}^{N_s} a_{i,j}.
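The adjacency-matrix quantities above (neighbors N_i, degrees d_i, inclusive neighborhoods J_i) can be computed directly; this Python sketch uses an arbitrary 3-node example graph, not a topology from the paper.

```python
import numpy as np

# A small undirected example graph given by its adjacency matrix (0-indexed).
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])

neighbors = [set(np.flatnonzero(A[i])) for i in range(len(A))]   # N_i
degrees = A.sum(axis=1)                                          # d_i
inclusive = [nbrs | {i} for i, nbrs in enumerate(neighbors)]     # J_i
```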

Definition 1.
If the target of interest is observed neither by node i nor by its neighbors j ∈ N_i, then node i is referred to as a naive node. It should be noted that if node i is naive about the target of interest, then R^{-1}_{j,k} = 0 for all j ∈ J_i in view of Assumption 2.
For instance, consider 8 sensor nodes in the monitored area with the communication topology shown in Figure 1. Assume that only node 1 can directly observe the target of interest; then nodes {2, 3, 4} can acquire state information from node 1 by local communication. However, there are no measurements within the inclusive neighborhoods of nodes {5, 6, 7, 8}, so they are naive about the target's state.
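The naive-node test of Definition 1 can be checked programmatically. Since the exact edge set of Figure 1 is not reproduced in this excerpt, the Python sketch below assumes a topology consistent with the text: node 1 links to nodes 2-4, and a chain 2-5-6-7-8 keeps nodes 5-8 outside node 1's neighborhood.

```python
import numpy as np

# Assumed topology (the exact edge set of Figure 1 is an assumption here).
edges = [(1, 2), (1, 3), (1, 4), (2, 5), (5, 6), (6, 7), (7, 8)]
Ns = 8
A = np.zeros((Ns + 1, Ns + 1), dtype=int)   # 1-indexed for readability
for i, j in edges:
    A[i, j] = A[j, i] = 1

observing = {1}                             # only node 1 senses the target

def is_naive(i):
    """Definition 1: naive if nobody in J_i = {i} ∪ N_i observes the target."""
    J_i = {i} | set(np.flatnonzero(A[i]))
    return J_i.isdisjoint(observing)

naive_nodes = [i for i in range(1, Ns + 1) if is_naive(i)]
```

Under this assumed topology the naive set is exactly {5, 6, 7, 8}, matching the discussion above.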

From the perspective of the adjacency matrix, the communication topology shown in Figure 1 can be represented by A, where element a_{i,j} = 1 indicates an available communication link in the illustrated network. The degree and the neighbors of each node are directly evident from A.

Average Consensus
As an effective method to compute a mean value, average consensus operates in a distributed fashion, which sheds light on the problem of distributed state estimation. Suppose the initial value of each node is α_i^0. The goal is to compute the mean (1/N_s) Σ_{i=1}^{N_s} α_i^0 by local communication between neighboring nodes. At iteration l, node i sends its previous state α_i^{l-1} to its immediate neighbors j ∈ N_i and, in a similar way, receives the previous states α_j^{l-1} from nodes j ∈ N_i. It then updates its current state by the following fusion rule:

α_i^l = π_{i,i} α_i^{l-1} + Σ_{j∈N_i} π_{i,j} α_j^{l-1}. (4)
Here, π i,j is the consensus weight, which should satisfy certain conditions to ensure convergence to the mean of initial values [25,28,33]. A sufficient and necessary condition guaranteeing finite-time weighted average consensus has been provided in [34]. In the derivation of the proposed DHIWCF, the average consensus protocol is involved in the state update step, hence we only discuss the design of consensus weights ensuring average consensus on local estimates of all the nodes in the network.
If the protocol in (4) could iterate infinitely many times, the estimated states of all nodes in the network would asymptotically converge to the average value, that is, lim_{l→∞} α_i^l = (1/N_s) Σ_{j=1}^{N_s} α_j^0 for every i ∈ S. In the original KCF, the consensus rate is set to a constant value ε ∈ (0, 1/d_max), where d_max is the maximum node degree in the network [6].
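A minimal Python sketch of the protocol with a constant rate ε ∈ (0, 1/d_max), on an assumed 4-node path graph, illustrates convergence of all local values to the initial mean.

```python
import numpy as np

# 4-node path graph; d_max = 2, so any eps in (0, 1/2) is admissible.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
eps = 0.9 / d.max()                      # constant rate, eps < 1/d_max

alpha = np.array([4.0, 0.0, 2.0, 6.0])   # initial values alpha_i^0
target = alpha.mean()                    # the desired network-wide average
W = np.eye(4) + eps * (A - np.diag(d))   # per-iteration update matrix
for _ in range(200):                     # alpha_i^l -> mean as l grows
    alpha = W @ alpha
```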

Remark 2.
A larger ε accelerates the convergence of the protocol, but an ε equal to or greater than 1/d_max renders the protocol unstable [9,16]. A constant ε treats the states from different nodes with the same weight, which may slow down the convergence rate of the whole system. Moreover, the choice of ε depends on d_max, which is not always available, especially in sparse sensor networks with time-varying communication topologies. In addition, there is no theoretical guidance on how to choose such a constant consensus weight.
To avoid the requirement for global parameters and to speed up convergence, the Metropolis weights determine the consensus rate between neighboring nodes based on their local node degrees. As discussed in [30], the Metropolis weights enable the protocol in (4) to converge faster. The Metropolis weights are defined as

π_{i,j} = 1/(1 + max{d_i, d_j}) if j ∈ N_i,
π_{i,i} = 1 − Σ_{j∈N_i} π_{i,j},
π_{i,j} = 0 otherwise. (5)

Remark 3. The definition in (5) indicates that a node with a larger degree is assigned a smaller weight. All the consensus weights are computed only with knowledge of the local node degrees, which is applicable to almost any kind of sensor network. The interested reader is referred to [30] and the references therein for details. In this paper, the Metropolis weights are adopted for the proposed algorithm.
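The Metropolis weights in (5) are straightforward to compute from local degrees alone; the sketch below builds the weight matrix for a small example graph and shows that it is row stochastic and symmetric.

```python
import numpy as np

def metropolis_weights(A):
    """Metropolis weights (5): computed from local degrees only."""
    d = A.sum(axis=1)
    n = len(A)
    Pi = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if A[i, j]:
                Pi[i, j] = 1.0 / (1.0 + max(d[i], d[j]))
        Pi[i, i] = 1.0 - Pi[i].sum()   # self-weight closes each row to 1
    return Pi

# 3-node path graph: degrees 1, 2, 1.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
Pi = metropolis_weights(A)
```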
The goal of the proposed DHIWCF is to achieve consensus on the local estimates of all nodes over the entire network by consensus iterations between neighboring nodes, while at the same time approaching the estimation performance of the CKF. If the network is fully connected, a single iteration suffices to accomplish the estimation task. In practical applications, however, a general network is often only partially connected. To ensure estimation accuracy and consensus on the local estimates, several iterations are required for the relevant information to spread throughout the entire network. However, due to constrained computation and communication resources, only a limited number of consensus iterations is available. It is therefore desirable to design a more efficient distributed estimation scheme that achieves satisfactory estimation accuracy and consensus simultaneously with fewer consensus iterations.

Distributed Local MAP Estimation
This section starts with the centralized MAP estimator. Then we formulate the local generalized prior estimate equation based on prior estimates from the inclusive neighbors and the local generalized measurement equation based on the current measurements from the inclusive neighbors. By maximizing the local posterior probability, the local MAP estimator is derived. To implement the estimation steps in a distributed manner, approximation of the error cross covariance is required. Two special cases, where the prior errors from neighboring nodes are uncorrelated or completely identical, are considered here. The practical importance of such an approximation can be seen from the numerical examples in Section 6, which indicate that the proposed DHIWCF is effective even if the assumed cases are not fulfilled.

Global MAP Estimator
Let z_k = col(z_{1,k}, z_{2,k}, . . . , z_{N_s,k}) represent the collective measurements of the entire sensor network at time instant k, and let H_k = col(H_{1,k}, H_{2,k}, . . . , H_{N_s,k}) denote the stacked measurement matrix of all nodes. Then the global measurement model can be formulated as

z_k = H_k x_k + v_k,

where v_k = col(v_{1,k}, . . . , v_{N_s,k}) has covariance R_k = diag(R_{1,k}, . . . , R_{N_s,k}). Denoting by Z_k the measurements collected up to time k, the MAP estimate maximizes the posterior PDF:

x̂^{MAP}_{k|k} = arg max_{x_k} p(x_k | Z_k) = arg max_{x_k} p(z_k | x_k) p(x_k | Z_{k-1}) / p(z_k | Z_{k-1}), (7)

where p(z_k | Z_{k-1}) = ∫ p(z_k | x_k) p(x_k | Z_{k-1}) dx_k is a normalization constant. Since the process noise and measurement noise are both Gaussian, the conditional PDFs p(z_k | x_k) and p(x_k | Z_{k-1}) are also Gaussian:

p(x_k | Z_{k-1}) ∝ exp(−(1/2) ‖x_k − x̂_{k|k-1}‖²_{P^{-1}_{k|k-1}}),
p(z_k | x_k) ∝ exp(−(1/2) ‖z_k − H_k x_k‖²_{R^{-1}_k}),

where x̂_{k|k-1} and P_{k|k-1} are the predicted state and the corresponding error covariance. Based on the product of the two Gaussian terms in the numerator, the criterion in (7) can be reformulated as minimization of the following cost function:

J(x_k) = ‖x_k − x̂_{k|k-1}‖²_{P^{-1}_{k|k-1}} + ‖z_k − H_k x_k‖²_{R^{-1}_k}. (11)
Here, the cost function in (11) is strictly convex in x_k, and hence the optimal estimate x̂^{MAP}_{k|k} exists and is unique:

x̂^{MAP}_{k|k} = x̂_{k|k-1} + P_{k|k} H_k^T R_k^{-1} (z_k − H_k x̂_{k|k-1}). (12)

The corresponding posterior error covariance is

P_{k|k} = (P^{-1}_{k|k-1} + H_k^T R_k^{-1} H_k)^{-1}. (13)

The equivalent information form of the estimate in (12) and (13) can be rewritten as

Y_{k|k} = Y_{k|k-1} + H_k^T R_k^{-1} H_k, ŷ_{k|k} = ŷ_{k|k-1} + H_k^T R_k^{-1} z_k,

where Y_{k|k} = P^{-1}_{k|k} is the information matrix and ŷ_{k|k} = Y_{k|k} x̂_{k|k} is the information vector.
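The information-form update can be sketched numerically as follows; the prior, measurement matrix, and noise values below are toy numbers for illustration only.

```python
import numpy as np

# Toy prior and measurement (illustrative values only).
x_prior = np.array([1.0, 0.0])
P_prior = np.eye(2)
H = np.array([[1.0, 0.0]])       # observe the first state component
R = np.array([[0.5]])
z = np.array([1.2])

Y_prior = np.linalg.inv(P_prior)              # prior information matrix
y_prior = Y_prior @ x_prior                   # prior information vector
Y_post = Y_prior + H.T @ np.linalg.inv(R) @ H # information-form update
y_post = y_prior + H.T @ np.linalg.inv(R) @ z
x_post = np.linalg.solve(Y_post, y_post)      # MAP estimate
```

This reproduces the standard Kalman update: with the numbers above, the observed component moves from 1.0 toward the measurement 1.2.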

Local MAP Estimation
Assume that each node, say node i, is able to receive its neighbors' prior local estimates x̂_{j,k|k-1} and the corresponding covariances P_{j,k|k-1}, as well as its neighbors' local measurements z_{j,k} and the corresponding noise covariances R_{j,k}, by local communication. The local generalized prior estimate of node i, denoted by x̄_{i,k|k-1}, is defined as

x̄_{i,k|k-1} = col(x̂_{i,k|k-1}, x̂_{j_1,k|k-1}, . . . , x̂_{j_{d_i},k|k-1}), (16)

where j_h ∈ N_i (h = 1, 2, . . . , d_i) indexes node i's neighbors. Let η_{i,k|k-1} = x̂_{i,k|k-1} − x_k be the prior error of node i. The local collective prior error of node i with respect to its inclusive neighborhood can be formulated as

η̄_{i,k|k-1} = x̄_{i,k|k-1} − H_I x_k, (17)

where H_I = [I_{n_x}, I_{n_x}, . . . , I_{n_x}]^T is the matrix stacked from d_i + 1 identity matrices and x_k is the true state at time instant k. The local collective prior error covariance of node i can be written as

P̄_{i,k|k-1} = E{η̄_{i,k|k-1} η̄^T_{i,k|k-1}}, (18)

whose blocks are the covariances and cross-covariances P_{j_g j_h,k|k-1} of the prior errors of the nodes in J_i. Similarly, the local generalized measurement of node i with regard to its inclusive neighborhood can be formulated as

z̄_{i,k} = H̄_{i,k} x_k + v̄_{i,k}, (19)

where z̄_{i,k} = col(z_{i,k}, z_{j_1,k}, . . . , z_{j_{d_i},k}), H̄_{i,k} = col(H_{i,k}, H_{j_1,k}, . . . , H_{j_{d_i},k}), and v̄_{i,k} = col(v_{i,k}, v_{j_1,k}, . . . , v_{j_{d_i},k}). Combining (17) and (19) together, one has

[x̄_{i,k|k-1}; z̄_{i,k}] = [H_I; H̄_{i,k}] x_k + [η̄_{i,k|k-1}; v̄_{i,k}], (20)

where the error covariance of the stacked noise term is blkdiag(P̄_{i,k|k-1}, R̄_{i,k}), with R̄_{i,k} = blkdiag(R_{i,k}, R_{j_1,k}, . . . , R_{j_{d_i},k}). (21)

Here the operator blkdiag(·) denotes a block diagonal matrix. Following the derivation of the global maximum a posteriori estimator in Section 3.1, the updated local information matrix can be computed by

Y_{i,k|k} = H_I^T P̄^{-1}_{i,k|k-1} H_I + H̄^T_{i,k} R̄^{-1}_{i,k} H̄_{i,k}. (22)

Similarly, the updated local information vector is

ŷ_{i,k|k} = H_I^T P̄^{-1}_{i,k|k-1} x̄_{i,k|k-1} + H̄^T_{i,k} R̄^{-1}_{i,k} z̄_{i,k}. (23)

Equations (22) and (23) show that the key to acquiring the local posterior estimate is the inverse of the local collective prior error covariance, P̄^{-1}_{i,k|k-1}. However, as shown in (18), this computation requires the knowledge of the cross-covariances between the neighbors of node i. As shown in [6], computing the cross-covariance matrix P_{i j_h,k|k-1} in turn requires information about the neighbors of node j_h. Therefore, it is not practical to compute P̄^{-1}_{i,k|k-1} directly, because the large amount of communication required among neighboring nodes would impose a tremendous computation and communication burden on the networked system.
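The stacked local system (20) and the updates (22) and (23) can be illustrated for a node with a single neighbor. Here P̄ is approximated as block diagonal (the Case 1 approximation discussed below), and all numbers are toy values, not the paper's.

```python
import numpy as np

n_x = 2
x_pri = [np.array([1.0, 0.0]), np.array([0.8, 0.1])]   # priors of i and j1
P_pri = [np.eye(n_x), 2 * np.eye(n_x)]                 # prior covariances
H_loc = [np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]])]
R_loc = [np.array([[0.5]]), np.array([[0.5]])]
z_loc = [np.array([1.1]), np.array([0.2])]

# H_I stacks d_i + 1 = 2 identity matrices; Hbar stacks the local H's.
H_I = np.vstack([np.eye(n_x), np.eye(n_x)])
Hbar = np.vstack(H_loc)
xbar = np.concatenate(x_pri)
zbar = np.concatenate(z_loc)
# blkdiag of the (scalar) R's and, per the Case 1 assumption, of the priors.
Rbar_inv = np.diag([1 / R_loc[0][0, 0], 1 / R_loc[1][0, 0]])
Pbar_inv = np.block([[np.linalg.inv(P_pri[0]), np.zeros((n_x, n_x))],
                     [np.zeros((n_x, n_x)), np.linalg.inv(P_pri[1])]])
# Local information update, cf. (22)-(23).
Y_post = H_I.T @ Pbar_inv @ H_I + Hbar.T @ Rbar_inv @ Hbar
y_post = H_I.T @ Pbar_inv @ xbar + Hbar.T @ Rbar_inv @ zbar
x_post = np.linalg.solve(Y_post, y_post)
```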
Although some work has been done in [35,36] to incorporate cross-covariance information into the estimation framework, no technique for computing the required terms is offered, and predefined values are used instead [4].
Therefore, an approximation of P̄_{i,k|k-1} that can be computed in a distributed manner is necessary. In the following derivation, two special cases are discussed. The first case is that the prior estimates from different nodes are completely uncorrelated with each other, which holds at the beginning of the estimation procedure when the prior information is initialized with random quantities. The second case is that of converged priors, which is important because, with sufficient consensus iterations, the prior estimates of all nodes converge to an identical value.

Case 1: Uncorrelated Priors
In this case, the prior errors from different nodes are assumed to be mutually uncorrelated, so the local collective prior error covariance is approximated by

P̄_{i,k|k-1} ≈ blkdiag(P_{i,k|k-1}, P_{j_1,k|k-1}, . . . , P_{j_{d_i},k|k-1}). (24)

The local posterior estimate in (22) and (23) can then be approximated as

Y_{i,k|k} ≈ Σ_{j∈J_i} Y_{j,k|k-1} + Σ_{j∈J_i} H^T_{j,k} R^{-1}_{j,k} H_{j,k},
ŷ_{i,k|k} ≈ Σ_{j∈J_i} ŷ_{j,k|k-1} + Σ_{j∈J_i} H^T_{j,k} R^{-1}_{j,k} z_{j,k}. (25)

Note that after enough consensus iterations, the prior estimate of each node in the network asymptotically converges to the centralized result, i.e., Y_{i,k|k-1} = Y_{c,k|k-1} and ŷ_{i,k|k-1} = ŷ_{c,k|k-1}. In that case, the prior information term in (25) becomes (1 + d_i) Y_{c,k|k-1}, whereas the total amount of prior information in the network after convergence is only Y_{c,k|k-1}. That is, the local prior information within the inclusive neighborhood is overestimated by a factor (1 + d_i). Therefore, the approximation of P̄_{i,k|k-1} should be modified by the factor (1 + d_i), i.e., P̄_{i,k|k-1} = (1 + d_i) blkdiag(P_{i,k|k-1}, P_{j_1,k|k-1}, . . . , P_{j_{d_i},k|k-1}), to avoid underestimation of the prior covariance. Hence, the results in (24) and (25) should be modified as

Y_{i,k|k} = (1/(1 + d_i)) Σ_{j∈J_i} Y_{j,k|k-1} + Σ_{j∈J_i} H^T_{j,k} R^{-1}_{j,k} H_{j,k}, (26)
ŷ_{i,k|k} = (1/(1 + d_i)) Σ_{j∈J_i} ŷ_{j,k|k-1} + Σ_{j∈J_i} H^T_{j,k} R^{-1}_{j,k} z_{j,k}. (27)
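The modified fusion rule (26) and (27) amounts to down-weighting the neighborhood prior information by 1/(1 + d_i) while summing the measurement contributions at full weight. A toy Python sketch (all values illustrative):

```python
import numpy as np

d_i = 2                                               # node i has two neighbors
Y_pri = [np.eye(2), 2 * np.eye(2), 0.5 * np.eye(2)]   # Y_{j,k|k-1}, j in J_i
y_pri = [Y @ np.array([1.0, 0.0]) for Y in Y_pri]     # information vectors
U = [np.diag([2.0, 0.0])] * 3      # H_j^T R_j^{-1} H_j (toy, identical nodes)
u = [np.array([2.2, 0.0])] * 3     # H_j^T R_j^{-1} z_j

w = 1.0 / (1 + d_i)                # down-weights the shared prior information
Y_post = w * sum(Y_pri) + sum(U)   # (26)
y_post = w * sum(y_pri) + sum(u)   # (27)
x_post = np.linalg.solve(Y_post, y_post)
```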

Case 2: Converged Priors
When the prior estimate of each node converges to the centralized result, one has x̂_{j,k|k-1} = x̂_{c,k|k-1} and P_{j,k|k-1} = P_{c,k|k-1} for all j ∈ J_i. Note that for converged priors the stacked prior errors are completely correlated. In this case, the estimated results in (22) and (23) can again be transformed into the weighted summation of the prior information and the current measurement innovations, which takes the same form as the results shown in (26) and (27).

Remark 4.
It should be noted that the assumed cases are not always satisfied in realistic applications, but the resulting approximation is still of great significance for distributed filtering algorithms. The effectiveness and feasibility of this approximation are evaluated by the numerical examples in Section 6.

Hybrid Information Weighted Consensus Filter
In Section 3, the prior estimate of each node was assumed to be known. Here, the prediction step is given:

x̂_{i,k|k-1} = F_{k-1} x̂_{i,k-1|k-1}, P_{i,k|k-1} = F_{k-1} P_{i,k-1|k-1} F^T_{k-1} + Q_{k-1}. (31)

For simplicity, the prediction step in (31) can be rewritten in information form as

Y_{i,k|k-1} = (F_{k-1} Y^{-1}_{i,k-1|k-1} F^T_{k-1} + Q_{k-1})^{-1},

with the corresponding prior information vector

ŷ_{i,k|k-1} = Y_{i,k|k-1} x̂_{i,k|k-1} = Y_{i,k|k-1} F_{k-1} x̂_{i,k-1|k-1}.

With the above prediction step and a weighted consensus protocol incorporated into the distributed local MAP estimator, a novel state estimation algorithm is obtained. Since the prior estimates and the measurement innovations are fused with different schemes, the proposed algorithm is referred to as the distributed hybrid information weighted consensus filter (DHIWCF). The recursive form of the DHIWCF is detailed in Algorithm 1.

Algorithm 1. DHIWCF implemented by node i at time instant k.
1. Obtain the local measurement z_{i,k} with covariance matrix R_{i,k}.
2. Compute the measurement contribution vector and contribution matrix: u_{i,k} = H^T_{i,k} R^{-1}_{i,k} z_{i,k}, U_{i,k} = H^T_{i,k} R^{-1}_{i,k} H_{i,k}.
3. Exchange the prior information pair (ŷ_{i,k|k-1}, Y_{i,k|k-1}) and the contributions (u_{i,k}, U_{i,k}) with the neighbors j ∈ N_i.
4. Initialize the consensus state according to (26) and (27): ŷ^0_{i,k|k} = (1/(1 + d_i)) Σ_{j∈J_i} ŷ_{j,k|k-1} + Σ_{j∈J_i} u_{j,k}, Y^0_{i,k|k} = (1/(1 + d_i)) Σ_{j∈J_i} Y_{j,k|k-1} + Σ_{j∈J_i} U_{j,k}.
5. for l = 1 to L do
6. Exchange the consensus pair with the neighbors and update its consensus state with the Metropolis weights in (5): ŷ^l_{i,k|k} = Σ_{j∈J_i} π_{i,j} ŷ^{l-1}_{j,k|k}, Y^l_{i,k|k} = Σ_{j∈J_i} π_{i,j} Y^{l-1}_{j,k|k}.
end for
7. Compute the posterior estimate: Y_{i,k|k} = Y^L_{i,k|k}, x̂_{i,k|k} = Y^{-1}_{i,k|k} ŷ^L_{i,k|k}.
8. Perform the prediction step in (31).
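A compact Python sketch of one DHIWCF time step for all nodes, under the Case 1 approximation, is given below; the toy network and test data are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def dhiwcf_step(x_pri, P_pri, z, H, R, A, F, Q, L=1):
    """One DHIWCF time step for all nodes (Case 1 approximation sketch).
    Per-node quantities are lists indexed by node; A is the adjacency matrix.
    Returns posteriors and the next-step predicted priors."""
    n = len(A)
    d = A.sum(axis=1)
    # Metropolis weights from local degrees, cf. (5).
    Pi = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if A[i, j]:
                Pi[i, j] = 1.0 / (1.0 + max(d[i], d[j]))
        Pi[i, i] = 1.0 - Pi[i].sum()
    Y_pri = [np.linalg.inv(P) for P in P_pri]
    y_pri = [Y @ x for Y, x in zip(Y_pri, x_pri)]
    U = [Hi.T @ np.linalg.inv(Ri) @ Hi for Hi, Ri in zip(H, R)]
    u = [Hi.T @ np.linalg.inv(Ri) @ zi for Hi, Ri, zi in zip(H, R, z)]
    # Initialization within inclusive neighborhoods, cf. (26)-(27).
    Ys, ys = [], []
    for i in range(n):
        J = [i] + list(np.flatnonzero(A[i]))
        Ys.append(sum(Y_pri[j] for j in J) / (1 + d[i]) + sum(U[j] for j in J))
        ys.append(sum(y_pri[j] for j in J) / (1 + d[i]) + sum(u[j] for j in J))
    # L consensus iterations on the information pairs.
    for _ in range(L):
        Ys = [sum(Pi[i, j] * Ys[j] for j in range(n)) for i in range(n)]
        ys = [sum(Pi[i, j] * ys[j] for j in range(n)) for i in range(n)]
    x_post = [np.linalg.solve(Y, v) for Y, v in zip(Ys, ys)]
    # Prediction step, cf. (31).
    x_nxt = [F @ xp for xp in x_post]
    P_nxt = [F @ np.linalg.inv(Y) @ F.T + Q for Y in Ys]
    return x_post, x_nxt, P_nxt

# Toy run: 3-node path graph, identical priors and exact measurements, so
# every node should recover the shared state [1, 0].
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
F, Q = np.eye(2), 0.01 * np.eye(2)
x_pri = [np.array([1.0, 0.0])] * 3
P_pri = [np.eye(2)] * 3
z = [np.array([1.0, 0.0])] * 3
H = [np.eye(2)] * 3
R = [np.eye(2)] * 3
x_post, x_nxt, P_nxt = dhiwcf_step(x_pri, P_pri, z, H, R, A, F, Q, L=1)
```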

Consistency of Estimates
One of the most fundamental but important properties of a recursive filtering algorithm is that the estimated error statistics should be consistent with the true estimation errors. The approximated error covariance of an inconsistent filtering algorithm is too small or optimistic, which does not really indicate the uncertainty of the estimate and may result in divergence since subsequent measurements in this case are prone to be neglected [28].
Definition 2 [28,30,37,38]. Consider a random vector x. Let x̂ and P be, respectively, an estimate of x and an estimate of the corresponding error covariance. The pair (x̂, P) is said to be consistent if

E{(x̂ − x)(x̂ − x)^T} ≤ P. (38)

Equation (38) shows that consistency requires the true error covariance to be upper bounded (in the positive semidefinite sense) by the approximated error covariance P. In the distributed estimation paradigm, due to the unaware reuse of redundant data in the consensus iterations and the possible correlation between measurements from different nodes, a filter may suffer from inconsistency and divergence. In such a case, the preservation of consistency is even more important.

For convenience, consider the information pair (ŷ, Y), where ŷ = Y x̂ and Y = P^{-1}. The consistency condition (38) can then be rewritten as

E{(Y^{-1} ŷ − x)(Y^{-1} ŷ − x)^T} ≤ Y^{-1}. (39)

Assumption 3. The initial estimate (x̂_{i,0|0}, P_{i,0|0}) of each node i ∈ S is consistent.
Remark 5. In general, Assumption 3 is easily satisfied. Initial information on the state vector can be acquired off-line before the fusion process. In the worst case, where no prior information is available, each node can simply set the initial information matrix to P^{-1}_{i,0|0} = 0, which implies infinite estimate uncertainty at each node at the beginning, so that Assumption 3 is fulfilled.

Assumption 4. The system matrix F_k is invertible.
Lemma 1 [28]. Under Assumption 4, if two positive semidefinite matrices Y_1 and Y_2 satisfy Y_1 ≤ Y_2, then Ψ_k(Y_1) ≤ Ψ_k(Y_2), where Ψ_k(Y) = (F_k Y^{-1} F^T_k + Q_k)^{-1} denotes the information-form prediction map in (31). In other words, the function Ψ_k(·) is monotonically nondecreasing for any k ≥ 0.

Theorem 1. Let Assumptions 1, 2, and 3 hold. Then, for each time instant k and each node i ∈ S, the information pair (ŷ_{i,k|k}, Y_{i,k|k}) of the DHIWCF is consistent, in that

E{(Y^{-1}_{i,k|k} ŷ_{i,k|k} − x_k)(Y^{-1}_{i,k|k} ŷ_{i,k|k} − x_k)^T} ≤ Y^{-1}_{i,k|k}. (40)

Proof. An inductive argument is used. Suppose that at time instant k − 1 the pair (ŷ_{i,k-1|k-1}, Y_{i,k-1|k-1}) is consistent for every i ∈ S. On the basis of Lemma 1, the predicted pair (ŷ_{i,k|k-1}, Y_{i,k|k-1}) obtained from (31) is then also consistent. According to (26) and (27), the local estimation error after the initialization step is a weighted combination of the prior errors and the measurement errors within the inclusive neighborhood, and by the consistency property of covariance intersection [29,38], the initialized pair (ŷ^0_{i,k|k}, Y^0_{i,k|k}) preserves consistency. Since the information pair (ŷ^{l+1}_{i,k|k}, Y^{l+1}_{i,k|k}) is computed from the previous pair (ŷ^l_{i,k|k}, Y^l_{i,k|k}) by the consensus update, and the covariance intersection involved there preserves the consistency of estimates [29,37-39], it follows that if the estimate obtained with l consensus iterations is consistent, the estimate obtained with l + 1 consensus iterations is also consistent. Therefore, (40) holds with ŷ_{i,k|k} = ŷ^L_{i,k|k} and Y_{i,k|k} = Y^L_{i,k|k}. The proof is concluded by noting that the initial estimate (x̂_{i,0|0}, P_{i,0|0}) is consistent for all i ∈ S.

Boundedness of Error Covariances
According to the consistency of the proposed DHIWCF established in Theorem 1, to prove the boundedness of the error covariance it suffices to show that Y_{i,k|k} is lower bounded by some positive definite matrix (or, equivalently, that P_{i,k|k} = Y^{-1}_{i,k|k} is upper bounded by some constant matrix). To derive bounds for the information matrix Y_{i,k|k}, the following assumptions are required.

Assumption 5 (collective observability). The system (1) is observable through the network-wide stacked measurement model (2).
Let Π be the consensus matrix, whose elements are the consensus weights π_{i,j} for any i, j ∈ S. Further, let π^L_{i,j} be the (i, j)-th element of Π^L, the L-th power of Π.

Assumption 6. The consensus matrix Π is row stochastic and primitive.

Assumption 7.
There exist real numbers f̄ ≥ f > 0, h̄ ≥ h ≥ 0 and positive real numbers p̄ > p > 0, q̄ > q > 0 such that the following bounds are fulfilled for each k ≥ 0 and i ∈ S:

f ≤ ‖F_k‖ ≤ f̄, h ≤ ‖H_{i,k}‖ ≤ h̄, p I ≤ P_{i,0|0} ≤ p̄ I, q I ≤ Q_k ≤ q̄ I.
Lemma 2 [28]. Under Assumptions 4 and 5, and for the proposed DHIWCF algorithm, if there exists a positive semidefinite matrix Ȳ such that Y_{i,k|k} ≤ Ȳ for all k ≥ 0, i ∈ S, then there always exists a strictly positive constant β ∈ (0, 1) such that

(F_k Y^{-1}_{i,k|k} F^T_k + Q_k)^{-1} ≥ β F^{-T}_k Y_{i,k|k} F^{-1}_k.

By virtue of Lemma 2, Theorem 2, which establishes the boundedness of the error covariances, is presented below.

Theorem 2. Let Assumptions 4-7 hold. Then there exist positive definite matrices Ω and Ω̄ such that

0 < Ω ≤ Y_{i,k|k} ≤ Ω̄, ∀ k ≥ 0, i ∈ S,

where Y_{i,k|k} is the information matrix given by the proposed DHIWCF.
Proof. For simplicity, the proof is given for the case L = 1; the generalization to L > 1 can be derived in a similar way. According to the proposed DHIWCF, the information matrix of node i at time instant k can be written as

Y_{i,k|k} = Σ_{j∈S} π_{i,j} [(1 + d_j)^{-1} Σ_{r∈J_j} Y_{r,k|k-1} + Σ_{r∈J_j} H^T_{r,k} R^{-1}_{r,k} H_{r,k}]. (52)

In view of Assumptions 6 and 7 and the fact that Y_{r,k|k-1} ≤ Q^{-1}_{k-1} by (31), the upper bound is obtained. Next, the lower bound is guaranteed under Assumption 5. According to Lemma 2, Assumption 7, (31), and (52), it follows that

Y_{i,k|k} ≥ β Σ_{j∈S} π_{i,j} (1 + d_j)^{-1} Σ_{r∈J_j} F^{-T}_{k-1} Y_{r,k-1|k-1} F^{-1}_{k-1} + Σ_{j∈S} π_{i,j} Σ_{r∈J_j} H^T_{r,k} R^{-1}_{r,k} H_{r,k}, (54)

where β is a positive scalar with 0 < β < 1. By recursively exploiting (52) and (54) a certain number of times (denoted by k̄), one obtains a sum in which π^τ_{i,j} is the (i, j)-th element of Π^τ and Ξ is the matrix whose elements are the neighborhood-averaging weights (1 + d_j)^{-1}. Note that Ξ is constructed from the network topology and is naturally stochastic. According to [40,41], as long as the undirected network is connected, Ξ, like Π, is primitive. Therefore, there exist strictly positive integers m and n such that all the elements of Π^s and Ξ^t are positive for s ≥ m, t ≥ n. Under Assumption 5, the resulting matrix Ω_1 is positive definite for k̄ ≥ max(m, n + 1). Therefore, for k ≥ k̄, Y_{i,k|k} ≥ Ω_1 > 0. Since k̄ is finite, for 0 ≤ k ≤ k̄ − 1 there exists a constant positive definite matrix Ω_2 such that Y_{i,k|k} ≥ Ω_2 > 0. Hence, there exists a positive definite matrix Ω such that 0 < Ω ≤ Y_{i,k|k}. The proof is complete.

Remark 6.
The result shown in Theorem 2 depends only on collective observability. This is distinct from some algorithms that require some form of local observability or detectability condition [5,6,8,11,25], which poses a great challenge to the sensing abilities of individual sensors and restricts the scope of application.

Convergence of Estimation Errors
In line with the boundedness of Y i,k|k proven in Theorem 2, the convergence of local estimation errors obtained by the proposed DHIWCF is analyzed in this section. To facilitate the analysis, the following preliminary lemmas are required.
Lemma 3 [26,28,31]. Given an integer N ≥ 2, N positive definite matrices M_1, . . . , M_N, and N vectors v_1, . . . , v_N, the following inequality holds:

(Σ_{i=1}^{N} v_i)^T (Σ_{i=1}^{N} M_i)^{-1} (Σ_{i=1}^{N} v_i) ≤ Σ_{i=1}^{N} v_i^T M_i^{-1} v_i.

Lemma 4 [26,28]. Under Assumptions 4 and 5, and for the proposed DHIWCF algorithm, if there exists a positive semidefinite matrix Y such that Y_{i,k|k} ≥ Y for all k ≥ 0, i ∈ S, then there always exists a strictly positive scalar λ ∈ (0, 1) such that

Y_{i,k+1|k} ≥ λ F^{-T}_k Y_{i,k|k} F^{-1}_k.

For the sake of simplicity, denote the prediction and estimation errors at node i by x̃_{i,k|k-1} = x̂_{i,k|k-1} − x_k and x̃_{i,k|k} = x̂_{i,k|k} − x_k, respectively. The collective forms are, respectively, x̃_{k|k-1} = col(x̃_{1,k|k-1}, . . . , x̃_{N_s,k|k-1}) and x̃_{k|k} = col(x̃_{1,k|k}, . . . , x̃_{N_s,k|k}).
Theorem 3. Under Assumptions 4-6, the proposed DHIWCF algorithm yields an asymptotic estimate at each node of the network, in the sense that

lim_{k→∞} E{x̃_{i,k|k}} = 0, ∀ i ∈ S.

Proof. Under Assumptions 4-6, Theorem 2 holds, so Y_{i,k|k} is uniformly lower and upper bounded. Define the candidate Lyapunov function

V_{i,k} = E{x̃_{i,k|k-1}}^T Y_{i,k|k-1} E{x̃_{i,k|k-1}}. (61)

By virtue of Lemma 2, there exists a positive real number 0 < β < 1 such that the predicted information matrix satisfies Y_{i,k+1|k} ≥ β F^{-T}_k Y_{i,k|k} F^{-1}_k. Since E{v_{r,k}} = 0, the measurement noise does not contribute to the mean error dynamics, and the mean estimation error of node i is a weighted combination of the mean prediction errors within the fused neighborhoods. Applying the fact that Y_{i,k|k} ≥ Σ_{j∈S} Σ_{r∈J_j} π^L_{i,j} (1 + d_j)^{-1} Y_{r,k|k-1} and Lemma 3 to the resulting quadratic form, one obtains

V_{i,k+1} ≤ β Σ_{j∈S} Σ_{r∈J_j} π^L_{i,j} (1 + d_j)^{-1} V_{r,k}.

Writing this inequality for i = 1, 2, . . . , N_s in collective form, the error dynamics are governed by the row-stochastic matrices Π and Ξ, whose spectral radii are both 1. As a consequence, for 0 < β < 1, the Lyapunov values, and hence the mean prediction errors E{x̃_{i,k+1|k}}, vanish as k tends to infinity. Together with the uniform boundedness of Y_{i,k|k} and Assumption 4, it is straightforward to conclude that lim_{k→∞} E{x̃_{i,k|k}} = 0.

Remark 7. The Lyapunov function defined in (61) plays a crucial role in the convergence proof of the proposed algorithm, and it can easily be extended to the stability analysis of Kalman-like consensus filters in other scenarios. The nonsingularity requirement on F_k in Theorem 3 arises because the proof relies on Lemma 4, the establishment of which needs the invertibility of F_k.

Simulation Setting
A target tracking scenario is adopted to validate the effectiveness and superiority of the proposed DHIWCF. The centralized Kalman filter (CKF) is chosen as a benchmark, and the proposed DHIWCF is compared with the following algorithms: the Kalman consensus filter (KCF), the generalized Kalman consensus filter (GKCF), the information weighted consensus filter (ICF), the consensus on information algorithm (CI), the consensus on measurements algorithm (CM), and the hybrid consensus on measurements and consensus on information algorithm (HCMCI).
In the surveillance area, a target moves according to the discrete-time linear model shown in (1). The state vector at time instant k is x k = [x k , y k , ẋ k , ẏ k ] T , where (x k , y k ) and (ẋ k , ẏ k ) are, respectively, the position and velocity components of the state. The state transition matrix F k and the process noise covariance matrix Q k are set as follows.
The initial position of the target is randomly located within the 500 × 500 space. The initial speed is set to 2 units per time step, with a random direction drawn uniformly from 0 to 2π. In each simulation run, the initial prior error covariance is P 0 = diag(100, 100, 10, 10), and all nodes in the network share the same P 0 . The initial prior estimate of each node is generated by adding zero-mean Gaussian noise with covariance P 0 to the true initial state. The total number of time steps is K = 100 unless stated otherwise. The sampling interval is T = 1 s.
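The exact matrices F k and Q k did not survive extraction. As a hedged sketch, a standard constant-velocity model consistent with the state definition above can be written as follows; the white-acceleration form of Q and the intensity q are illustrative assumptions, not the paper's exact values.

```python
import numpy as np

def cv_model(T: float = 1.0, q: float = 0.1):
    """Constant-velocity model for state [x, y, x_dot, y_dot]^T.

    The white-acceleration form of Q and the intensity q are assumed
    for illustration; they are not the paper's exact settings.
    """
    # Position advances by velocity * T; velocity is constant in expectation
    F = np.array([[1.0, 0.0, T,   0.0],
                  [0.0, 1.0, 0.0, T  ],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
    # Noise gain mapping 2-D white acceleration into the state
    G = np.array([[T**2 / 2, 0.0],
                  [0.0, T**2 / 2],
                  [T,   0.0],
                  [0.0, T  ]])
    Q = q * G @ G.T
    return F, Q
```

With T = 1 s this matches the sampling interval used in the simulations.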
The target of interest is observed by a number of networked sensors with the measurement model shown in (2). The measurement matrix H i,k and the measurement noise covariance R i,k are given below.
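The specific H i,k and R i,k were likewise lost in extraction. A common position-only linear measurement model matching the state definition is sketched below; the noise level sigma is hypothetical.

```python
import numpy as np

def measurement_model(sigma: float = 5.0):
    """Linear position-only measurement y = H x + v (assumed form).

    H extracts the (x, y) position from the state [x, y, x_dot, y_dot]^T;
    the noise standard deviation sigma is illustrative, not the paper's value.
    """
    H = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0]])
    R = sigma**2 * np.eye(2)  # isotropic position noise
    return H, R
```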

Performance Metrics
For a fair comparison, M c = 200 independent Monte Carlo runs are carried out. The number of consensus iterations L is varied from 1 to 10. The consensus rate parameter is selected as ε = 0.65/d max . For the proposed DHIWCF algorithm, the Metropolis weight matrix is chosen, which can be computed with knowledge of the local node degrees only. The following metrics are chosen to evaluate the estimation performance from different aspects.
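The Metropolis weights mentioned above depend only on the degrees of a node and its neighbors. A minimal sketch, assuming the undirected topology is given as a 0/1 adjacency matrix, is:

```python
def metropolis_weights(adj):
    """Metropolis weight matrix for an undirected graph.

    adj[i][j] = 1 if nodes i and j communicate (i != j), else 0.
    Each off-diagonal weight is 1 / (1 + max(d_i, d_j)); the diagonal
    entry absorbs the remainder so that every row sums to one.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and adj[i][j]:
                W[i][j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i][i] = 1.0 - sum(W[i])
    return W
```

Because the rule is symmetric in (i, j), the resulting matrix is symmetric and doubly stochastic, which is why only local degree knowledge is required.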
(1) The position root mean squared error (PRMSE), which indicates the tracking accuracy, is defined as follows, where (x̂ m i,k , ŷ m i,k ) and (x m k , y m k ) are, respectively, the position estimated by node i and the true position in the m-th Monte Carlo run.
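The PRMSE formula itself was lost in extraction. A common form, averaging the squared position error over Monte Carlo runs and nodes at one time step, consistent with the symbols above, would be:

```python
import math

def prmse_at_k(est, truth):
    """PRMSE at one time step (assumed form, matching the text's symbols).

    est[m][i] = (x_hat, y_hat): node i's position estimate in run m.
    truth[m]  = (x, y): true target position in run m.
    """
    Mc, Ns = len(est), len(est[0])
    sq = 0.0
    for m in range(Mc):
        xt, yt = truth[m]
        for i in range(Ns):
            xe, ye = est[m][i]
            sq += (xe - xt) ** 2 + (ye - yt) ** 2
    # Root of the mean squared position error over runs and nodes
    return math.sqrt(sq / (Mc * Ns))
```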
(2) The averaged position root mean squared error (APRMSE), which reflects the overall tracking accuracy of an algorithm over all simulation runs, all time instants, and all sensors, is defined as follows.
(3) The averaged consensus estimate error (ACEE), which indicates the degree of consensus among the estimates of all nodes in the network, is defined as follows.
(4) The normalized estimation error squared (NEES), which is used to check filter consistency, is defined as (x k − x̂ k|k ) T P k|k −1 (x k − x̂ k|k ), where x k and x̂ k|k are, respectively, the true state and the estimated state, and P k|k is the posterior covariance at time instant k. If the filter is consistent, the NEES follows a Chi-squared distribution with n x degrees of freedom. A common way to check filter consistency is to test the average NEES over M c Monte Carlo runs, ε k = (1/M c ) Σ m ε m k ; under the same assumptions, M c ε k is Chi-squared distributed with M c n x degrees of freedom. Given an acceptance interval [r 1 , r 2 ], the Chi-square test is passed if ε k ∈ [r 1 , r 2 ]. The filter is optimistic if the computed ε k is much higher than r 2 , and conservative if it falls below r 1 .
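The consistency check in item (4) can be sketched as follows. To stay dependency-free, the acceptance interval uses the normal approximation to the Chi-squared quantiles, which is an assumption of this sketch; exact quantiles would be used in practice.

```python
import math
import numpy as np

def nees(x_true, x_est, P):
    """Normalized estimation error squared e^T P^{-1} e."""
    e = np.asarray(x_true, float) - np.asarray(x_est, float)
    return float(e @ np.linalg.solve(P, e))

def avg_nees_interval(n_x, M_c, z=1.96):
    """Approximate two-sided 95% acceptance interval for the average NEES.

    Uses chi2_df ~ N(df, 2*df) with df = M_c * n_x (normal approximation);
    exact chi-square quantiles would be preferred for small df.
    """
    df = M_c * n_x
    half = z * math.sqrt(2.0 * df)
    return (df - half) / M_c, (df + half) / M_c
```

For n x = 4 and M c = 200 as in this paper, the interval is tightly centered on 4, the expected NEES of a consistent filter.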
(5) Computational cost. The computational cost is defined as the averaged running time over all Monte Carlo runs.

Results and Analysis
In this subsection, three simulation scenarios are chosen to evaluate and compare the estimation performance of the proposed DHIWCF algorithm with respect to the afore-mentioned metrics.

Evaluation of the Effectiveness of the Proposed DHIWCF Algorithm
This scenario is designed to validate the effectiveness of the proposed DHIWCF algorithm. The target of interest is tracked by 8 networked sensors with the communication topology shown in Figure 1. Only node 1 can observe the target, so the node set {5, 6, 7, 8} is naive about the target state. Figure 2 shows the estimated tracks obtained by local nodes with the proposed DHIWCF and the CKF. For simplicity, only the estimated tracks of node 1 and node 8 are plotted. To illustrate the evolution of each track, checkpoints are plotted in the same color every 20 steps, and the covariance ellipses with 95% confidence at each checkpoint are plotted in dashed lines. As shown, the true position (black cross) is always enveloped by the corresponding ellipse in red (node 1) or blue (node 8), which validates the consistency of the local estimates. Compared with the CKF, the local estimates of the proposed DHIWCF are much more conservative, since the network in Figure 1 contains naive nodes that must rely on information diffused from their neighbors. In Figure 3, the PRMSE of the compared algorithms with a single consensus iteration is given.
Both KCF and CM diverge in the considered scenario, while the other algorithms can effectively track the target; KCF and CM are therefore excluded from the remaining comparisons. The proposed DHIWCF is more accurate, with a PRMSE close to that of the CKF. Due to the limited consensus iterations, GKCF, ICF, and HCMCI obtain a PRMSE higher than that of DHIWCF but much lower than that of CI.

Figure 4 compares the APRMSE of the different algorithms. GKCF and CI obtain an APRMSE much higher than that of ICF, HCMCI, and DHIWCF. Specifically, DHIWCF performs the best with limited consensus iterations (L ≤ 2). As the number of consensus iterations increases, DHIWCF asymptotically converges to the CKF. In addition, the performance of DHIWCF is slightly better than that of ICF. In Figure 5, the average NEES of the different algorithms with a single consensus iteration is compared. The NEES curve of ICF lies far above the 95% concentration region, which indicates that ICF has poor consistency in this scenario. The NEES curve of CI always lies below the concentration region, and hence its estimates are overly conservative. The NEES curves of GKCF, HCMCI, DHIWCF, and CKF lie either below or within the concentration region at all time steps, which shows enhanced consistency.

The ACEE comparison of the different algorithms with a single consensus iteration is shown in Figure 6. The proposed DHIWCF performs much better with regard to consensus, as it has a relatively lower ACEE than the other algorithms. Figure 7 shows the computational time for different numbers of consensus iterations. Although HCMCI performs slightly better than DHIWCF in terms of APRMSE as shown in Figure 4, its ACEE and computational time are much higher than those of DHIWCF. Moreover, DHIWCF is only slightly more time-consuming than the most efficient algorithm, CI, as shown in Figure 7.

Performance Comparison under Chain Topology
In this subsection, an even worse scenario is considered, in which the networked sensors are connected in a chain topology, as shown in Figure 8. As illustrated, the target is observed only by node 1, and the remaining nodes are communication nodes with no sensing abilities. Neither the nodes in the set {3, 4, 5, 6, 7, 8} nor their immediate neighbors have measurements of the target, so these nodes are naive about the target's state. It takes at least 7 consensus iterations for the information from node 1 to reach node 8. As shown in Figure 4, the APRMSE of GKCF and CI is much higher than that of ICF, HCMCI, and DHIWCF, so the estimation results of GKCF and CI are not considered here.

In Figure 9, the PRMSE averaged over all nodes and all Monte Carlo runs is given for different numbers of consensus iterations. With a single consensus iteration, DHIWCF performs much better than ICF and HCMCI. Even when the number of consensus iterations increases to L = 3, the PRMSE of DHIWCF is still smaller than that of the improved HCMCI. The averaged NEES with a single consensus iteration is provided in Figure 10, which indicates that, except for ICF, the remaining algorithms preserve good consistency.

The APRMSE of the algorithms under discussion is plotted in Figure 11. Similar to the result in Figure 4, DHIWCF has a smaller APRMSE with L ≤ 3, and its performance approaches that of the CKF as the number of consensus iterations increases. Although HCMCI obtains an APRMSE slightly smaller than that of DHIWCF, its average ACEE is relatively higher, as shown in Figure 12, especially in the case L = 1. Therefore, the proposed DHIWCF makes a good tradeoff between estimation accuracy and consensus on local estimates.

Performance Comparison in Large-Scale Sparse Sensor Networks
This experiment is designed to test the performance of DHIWCF in large-scale sparse sensor networks. Assume that the target of interest is tracked by 100 sensors randomly located within the 500 × 500 space. The communication range of each sensor is set to R c = 10√N s . As shown in Figure 13, there are 10 sensor nodes and 90 communication nodes in the surveillance area. The sensor nodes are able to observe the target, while the communication nodes act as relays of information among distant nodes and have no observation ability [24]. Consequently, most of the nodes in the network are naive about the target's state, which brings great challenges to target tracking.

Figure 13. A large-scale sparse sensor network with 100 nodes.
In Figure 14, the estimated tracks obtained by different algorithms with a single consensus iteration in a certain Monte Carlo run are plotted. It is intuitive to see that DHIWCF and CKF perform much better than ICF and HCMCI. Especially when the target suffers from relatively large process noise (for instance, when the target suddenly changes its moving direction), DHIWCF recovers its estimate toward the CKF more quickly. The results in Figure 15 further suggest that, with limited consensus iterations, DHIWCF obtains more accurate estimates than ICF and HCMCI, since it has a relatively lower PRMSE than its counterparts. With respect to estimation consistency, Figure 16 shows that the NEES curve of ICF lies above the concentration region, while the NEES curves of the remaining algorithms are always within or below it. Therefore, both HCMCI and DHIWCF show sound consistency of their local estimates.
Figure 16. The averaged NEES for different algorithms in a sparse sensor network.
To compare the overall performance of the distributed algorithms under discussion, Table 1 gives the APRMSE of the different algorithms versus the number of consensus iterations. Compared with ICF and HCMCI, the proposed DHIWCF has a lower APRMSE; the advantage is more obvious when L ≤ 3. This implies that DHIWCF is relatively more accurate. The computational time relative to that of the CKF is investigated in Table 2, where RCT denotes the relative computation time. The proposed DHIWCF runs faster than HCMCI owing to fewer information exchanges. Although ICF takes less time to operate, its lower estimation accuracy and poor consistency make it a poor choice for estimating the state of interest.

Conclusions
This paper considers the problem of distributed state estimation in the presence of naive nodes with constrained communication resources. A novel distributed hybrid information weighted consensus filter is proposed, in which each node exploits not only the measurement information but also the prior estimate information from its immediate neighbors to update its local posterior estimate. The proposed DHIWCF is able to solve the problem under consideration without any knowledge of global parameters, preserves the consistency of local estimates, and achieves relatively high estimation accuracy and satisfactory consensus. Theoretical analysis of the consistency of local estimates and of the stability and convergence of the estimator is also provided. The experimental results indicate that, with limited consensus iterations, the proposed DHIWCF is more accurate and reaches better consensus than the existing algorithms, while preserving good consistency of local estimates. Even when only a single consensus iteration is allowed, the proposed DHIWCF still performs well, and with more consensus iterations it approaches the performance of the centralized scheme. Future research will address distributed state estimation in mobile sensor networks, consensus protocols with event-triggered communication, more efficient design of consensus weights, and distributed nonlinear filtering and its stability analysis.