Sensors 2019, 19(9), 2134; https://doi.org/10.3390/s19092134
Article
A Novel Distributed State Estimation Algorithm with Consensus Strategy
^{1}
Research Institute of Information Fusion, Naval Aviation University, Yantai 264001, China
^{2}
School of Electronic and Information Engineering, Beihang University, Beijing 100191, China
^{*}
Author to whom correspondence should be addressed.
Received: 26 April 2019 / Accepted: 4 May 2019 / Published: 8 May 2019
Abstract:
Owing to its high fault tolerance and scalability, the consensus-based paradigm has attracted immense popularity for distributed state estimation. If a target is observed neither by a certain node nor by its neighbors, this node is naive about the target. Some existing algorithms have considered the presence of naive nodes, but they need many consensus iterations to achieve satisfactory performance. In practical applications, because of constrained energy and communication resources, only a limited number of iterations is allowed, and thus the performance of these algorithms deteriorates. By fusing the measurements as well as the prior estimates of each node and its neighbors, a locally optimal estimate is obtained based on the proposed distributed local maximum a posteriori (MAP) estimator. With some approximations of the cross-covariance matrices and a consensus protocol incorporated into the estimation framework, a novel distributed hybrid information weighted consensus filter (DHIWCF) is proposed. Then, a theoretical analysis of the guaranteed stability of the proposed DHIWCF is performed. Finally, the effectiveness and superiority of the proposed DHIWCF are evaluated. Simulation results indicate that the proposed DHIWCF can achieve acceptable estimation performance even with a single consensus iteration.
Keywords:
sensor networks; distributed state estimation; naive node; Kalman filter; maximum a posteriori estimator; consensus filter
1. Introduction
Recently, distributed state estimation has been a hot topic in the field of target tracking in sensor networks [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]. As a traditional method, the centralized scheme needs to simultaneously process the local measurements from all sensors in the fusion center at each time instant [3,15]. This scheme guarantees the optimality of estimates, but a lot of communication and a powerful fusion center are required to maintain the operation, which may give rise to problems when the network size is increased or the communication resources are restricted.
Unlike the centralized scheme, the distributed mechanism tries to recover the centralized performance via local communications between neighboring nodes. Specifically, each node in the network only exchanges information with its immediate neighbors to achieve performance comparable to its centralized counterpart, which reduces communication cost and makes the network robust against possible failures of some nodes [8]. The consensus filter, which computes the average of the values of interest in a distributed manner, has attracted immense popularity for distributed state estimation [4,5,6,7,9,10,11,12,13,14,16,17,18,19,20,21,22]. Recently, in [23,24], the multiscale consensus scheme, in which the local estimated states asymptotically achieve prescribed ratios in terms of multiple scales, has been discussed and analyzed. The well-known Kalman consensus filter (KCF) [4,5,6,9,14] combines the local Kalman filter with the average consensus protocol to update the posterior state. In the update stage, each node exploits the measurement innovations as well as the prior estimates from its inclusive neighbors (the node itself and its immediate neighbors) to correct its prior estimate. However, the prior estimates from its immediate neighbors are assigned the same weight. This may ensure consensus on the estimates from different nodes after a period of time, but the estimation accuracy is not guaranteed. It is quite possible that a target is observed neither by a certain node nor by any of its immediate neighbors. That is, there is no measurement in the inclusive neighborhood of the node, and it is naive about the target’s state. As in [16,22], such a node is referred to as a naive node. Since a naive node contains less information about the target, it usually produces an inaccurate estimate.
If a naive node is given the same weight as the informed nodes, the final estimate will be severely contaminated, which may even cause the final estimates to diverge [9,13]. In addition, the cross-covariances among different nodes are ignored in the derivation of KCF because of computational and bandwidth requirements, and thus the covariance of each node is updated without regard to its neighbors’ prior covariances during the consensus step. Given no naive nodes in the network, KCF is able to provide satisfactory results. However, due to limited sensing abilities or constrained communication resources, a network often contains some naive nodes, especially in sparse sensor networks. In such a case, KCF would result in poor estimates [3]. To solve this problem, before updating the posterior estimate, the generalized Kalman consensus filter (GKCF) performs consensus on the prior information vectors and information matrices within the inclusive neighborhood of each node [4,16,19]. As analyzed in [4], this procedure greatly improves the estimation performance in the presence of naive nodes. GKCF updates the current state based on consensus on prior estimates, but the current measurements are not considered for naive nodes. Each naive node only has access to the measurements of the previous time instant contained in the prior estimates. Therefore, naive nodes access current measurements with a delay. In contrast, the consensus on measurements algorithm (CM) performs consensus on measurements [5,25,26,27] and can achieve the centralized performance with infinite consensus iterations. However, stability is not guaranteed unless the number of consensus iterations is large enough [26]. The consensus on information algorithm (CI) performs consensus on both prior estimates and measurements [26,27,28], and can be viewed as a generalization of the covariance intersection fusion rule to multiple iterations [29].
CI guarantees stability for any number of consensus iterations, but its estimation confidence can be degraded because it adopts a conservative rule that assumes the correlation between estimates from different nodes is completely unknown [26,28,30].
As more consensus iterations are carried out, estimates from different nodes achieve a reasonable consensus. Therefore, each node has almost completely redundant or identical prior information, and hence the prior estimation errors of different nodes are highly correlated. In this situation, algorithms such as KCF, GKCF, or CM, which do not take the cross-covariance into account, are suboptimal [16,17]. Note that the redundant information only exists in the prior estimates, which come from the converged results of the previous time instant. Using this property, the information weighted consensus filter (ICF) [18,20,21] divides the prior information of each node by ${N}_{s}$, where ${N}_{s}$ is the total number of nodes in the network. If each node can interact with its neighbors infinitely many times, ICF achieves the same optimal estimation performance as the centralized Kalman consensus filter (CKF). In addition, ICF performs better than KCF, GKCF, CI, and CM under the same number of consensus iterations, which has been validated in [16,22,26]. As pointed out in [26,30,31], the correction step of multiplying by ${N}_{s}$ may cause an overestimation of the measurement innovation for some nodes, which is often the case in sparse sensor networks. As a consequence, the estimates of some nodes may be so optimistic that estimation consistency is lost, which should be avoided in recursive estimation. To address this problem, the HCMCI algorithm, which combines the positive features of both CM and CI, has been proposed. It should be noted that HCMCI represents a family of distributed algorithms that depend on the selection of scalar weights. Both CI and ICF are special cases of HCMCI. To preserve the consistency of local filters as well as improve the estimation performance, the so-called normalization factor is introduced. If the network topology is fixed, the normalization factor can be computed offline to save bandwidth.
In [32], a novel distributed set-theoretic information flooding (DSIF) protocol is proposed. The DSIF protocol avoids the reuse of information and offers the highest convergence efficiency for network consensus, but it suffers from growing node-storage requirements, more communication iterations, and a higher communication load.
However, the algorithms discussed above need many consensus iterations to achieve the expected estimation performance. In practical applications, only a limited number of consensus iterations is allowed, and thus the performance of the aforementioned algorithms degrades. In addition, the estimation performance of the aforementioned algorithms depends closely on the selection of consensus weights. Inappropriate consensus weights may cause the algorithms to diverge or require more iterations to achieve consensus on the local estimates [9]. A common approach is to set the weights to a constant value, as discussed in [6], which is an intuitive choice to maintain the stability of the error dynamics. However, choosing the constant value requires knowledge of the maximum degree across the entire sensor network. Even if the maximum degree is available, it remains unclear how to determine a proper constant weight that achieves the best performance while preserving consistency. In addition, the initial consensus terms determined in ICF require knowledge of the total number of nodes in the network. Global parameters, such as the maximum degree or the total number of nodes, may vary over time when the communication topology changes, new nodes join, or existing nodes fail to communicate with others. Without accurate knowledge of these global parameters, each node would either overestimate or underestimate the state of interest.
To deal with the problems analyzed above, a novel distributed hybrid information weighted consensus filter (DHIWCF) is proposed in this paper. Firstly, different from the previous work [4,5,6,16,18,22], each node assigns consensus weights to its neighbors based on their local degrees, which is fully distributed and requires no knowledge of global parameters. Secondly, the prior estimate information and the measurement information at the current time instant within the inclusive neighborhood are combined to form the local generalized prior estimate equation and the local generalized measurement equation, respectively. Then, a distributed local MAP estimator is derived with some reasonable approximations of the error covariance matrices, which achieves higher accuracy than the approaches introduced in [4,5,6,11,16,18,19,25,26,27,28]. Finally, the average consensus protocol with the aforementioned consensus weights is incorporated into the estimation framework, and the proposed DHIWCF is obtained. In addition, a theoretical analysis of the consistency of the local estimates and of the stability and convergence of the estimator is performed. Comparative experiments on three different target tracking scenarios validate the effectiveness and feasibility of the proposed DHIWCF. Even with a single consensus iteration, the DHIWCF is still able to achieve acceptable estimation performance.
The remainder of this paper is organized as follows. Section 2 formulates the problem of distributed state estimation in sensor networks. The distributed local MAP estimator is derived in Section 3. Section 4 presents the distributed hybrid information weighted consensus filter. The theoretical analysis of the consistency of the estimates and of the stability and convergence of the estimator is provided in Section 5. The experimental results and analysis are presented in Section 6. Concluding remarks are given in Section 7.
Notation: ${\mathbb{R}}^{n}$ denotes the $n$-dimensional Euclidean space. $\Vert \cdot \Vert $ is the Euclidean norm in ${\mathbb{R}}^{n}$. For an arbitrary matrix $\mathit{A}$, ${\mathit{A}}^{-1}$ and ${\mathit{A}}^{\mathrm{T}}$ are its inverse and transpose, respectively. $\mathit{A}>0$ means $\mathit{A}$ is positive definite, and $\mathrm{tr}\left\{\mathit{A}\right\}$ is shorthand for the trace of $\mathit{A}$. $\mathrm{diag}\left({\mathit{B}}_{1},{\mathit{B}}_{2},\dots ,{\mathit{B}}_{n}\right)$ denotes a block diagonal matrix with main diagonal blocks ${\mathit{B}}_{1},{\mathit{B}}_{2},\dots ,{\mathit{B}}_{n}$. ${\mathit{I}}_{n}$ represents the $n\times n$ identity matrix. For a set $\mathit{C}$, $\left|\mathit{C}\right|$ denotes the cardinality of $\mathit{C}$. $\mathbb{E}\{\cdot \}$ is the expectation operator.
2. Problem Formulation
2.1. System Model
Consider a discrete-time linear system with dynamics
$${\mathit{x}}_{k+1}={\mathit{F}}_{k}{\mathit{x}}_{k}+{\mathit{w}}_{k} \tag{1}$$
where ${\mathit{x}}_{k}\in {\mathbb{R}}^{{n}_{x}}$ represents the state vector at time instant $k\in {\mathbb{Z}}^{+}$, where ${\mathbb{Z}}^{+}$ is the set of positive integers. ${\mathit{F}}_{k}$ is the state transition matrix. ${\mathit{w}}_{k}$ is the Gaussian process noise with zero mean and covariance ${\mathit{Q}}_{k}$.
The state of interest is observed by a sensor network consisting of ${N}_{s}$ nodes in the surveillance area. The measurement model of node $i$ is
$${\mathit{z}}_{i,k}={\mathit{H}}_{i,k}{\mathit{x}}_{k}+{\mathit{v}}_{i,k} \tag{2}$$
where ${\mathit{z}}_{i,k}\in {\mathbb{R}}^{{n}_{\mathit{z}}}$ is the measurement of node $i$ at time instant $k$. $i=1,2,\dots ,{N}_{s}$ denotes the sensor label. ${\mathit{H}}_{i,k}$ is the measurement matrix of node $i$. ${\mathit{v}}_{i,k}$ is the Gaussian measurement noise with zero mean and covariance ${\mathit{R}}_{i,k}$.
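The state and measurement models above can be simulated in a few lines. The following is a minimal sketch only: the one-dimensional constant-velocity matrices, the noise levels, and the helper names `step` and `measure` are illustrative assumptions and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D constant-velocity model: state = [position, velocity].
T = 1.0                                   # sampling interval (assumed)
F = np.array([[1.0, T], [0.0, 1.0]])      # state transition matrix F_k
Q = 0.01 * np.eye(2)                      # process noise covariance Q_k (assumed)
H = np.array([[1.0, 0.0]])                # node i observes position only: H_{i,k}
R = np.array([[0.25]])                    # measurement noise covariance R_{i,k} (assumed)


def step(x):
    """Propagate the state once: x_{k+1} = F_k x_k + w_k."""
    w = rng.multivariate_normal(np.zeros(2), Q)
    return F @ x + w


def measure(x):
    """Generate one measurement of node i: z_{i,k} = H_{i,k} x_k + v_{i,k}."""
    v = rng.multivariate_normal(np.zeros(1), R)
    return H @ x + v


x = np.array([0.0, 1.0])                  # initial state (assumed)
states, measurements = [], []
for _ in range(5):
    x = step(x)
    states.append(x)
    measurements.append(measure(x))
states = np.array(states)
measurements = np.array(measurements)
```

Each row of `states` is one realization of the state, and the matching row of `measurements` is the noisy observation a single node would receive.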
Assumption 1.
The noise sequences ${\left\{{\mathit{w}}_{k}\right\}}_{k=\text{}0}^{\infty}$ and ${\left\{{\mathit{v}}_{i,k}\right\}}_{k=\text{}0}^{\infty}$ are mutually uncorrelated.
Assumption 2
Remark 1.
Assumption 2 indicates that a node with no direct sensing ability has infinite uncertainty about its local measurement, which guarantees consistency of the local measurements.
2.2. Network Topology
The communication topology of the networked sensors can be represented by an undirected graph $\mathcal{G}=\left(\mathcal{S},\mathcal{E}\right)$. Here, $\mathcal{S}=\left\{1,2,\dots ,{N}_{s}\right\}$ is the set of sensor nodes, and $\mathcal{E}\subseteq \mathcal{S}\times \mathcal{S}$ is the set of available communication links in the network. A communication link means that the two nodes it joins can exchange state or measurement information with each other. A connected network means that any two nodes in the network are joined by a path of communication links. The immediate neighbors of node $i$ are denoted by ${\mathcal{N}}_{i}=\left\{j\mid \left(i,j\right)\in \mathcal{E}\right\}$. ${d}_{i}=\left|{\mathcal{N}}_{i}\right|$ is the degree of node $i$, which is the number of neighboring nodes linked to node $i$. The inclusive neighborhood, which includes node $i$ itself, is represented by ${\mathcal{J}}_{i}=\left\{i\right\}\cup {\mathcal{N}}_{i}$. A more convenient way to describe the network topology is the adjacency matrix $\mathit{A}$, where element ${a}_{i,j}=1$ means that node $i$ can exchange information with node $j$, and ${a}_{i,j}=0$ means that there is no direct communication link between node $i$ and node $j$. The immediate neighbors of node $i$ can then be represented by ${\mathcal{N}}_{i}=\left\{j\mid {a}_{i,j}=1\right\}$, and the degree of node $i$ is ${d}_{i}={\displaystyle {\sum}_{j=1}^{{N}_{s}}{a}_{i,j}}$.
Definition 1.
If the target of interest is observed neither by node $i$ nor by its neighbors $j\in {\mathcal{N}}_{i}$, then node $i$ is referred to as a naive node. It should be noted that if node $i$ is naive about the target of interest, ${\mathit{R}}_{j,k}^{-1}=\mathbf{0}$ for $j\in {\mathcal{J}}_{i}$ in view of Assumption 2.
For instance, suppose there are 8 sensor nodes in the monitored area, with the communication topology shown in Figure 1. Assume that only node 1 can directly observe the target of interest; then nodes $\left\{2,3,4\right\}$ can acquire state information from node 1 by local communication. However, there are no measurements within the inclusive neighborhoods of nodes $\left\{5,6,7,8\right\}$, so they are naive about the target’s state.
From the perspective of the adjacency matrix, the communication topology shown in Figure 1 can be represented by the adjacency matrix $\mathit{A}$ below, where ${a}_{i,j}=1$ indicates an available communication link between nodes $i$ and $j$. The degree and the neighbors of each node can be read directly from $\mathit{A}$.
$$\mathit{A}=\left[\begin{array}{llllllll}0\hfill & 1\hfill & 1\hfill & 1\hfill & 0\hfill & 0\hfill & 0\hfill & 0\hfill \\ 1\hfill & 0\hfill & 1\hfill & 1\hfill & 0\hfill & 0\hfill & 0\hfill & 0\hfill \\ 1\hfill & 1\hfill & 0\hfill & 1\hfill & 1\hfill & 0\hfill & 0\hfill & 0\hfill \\ 1\hfill & 1\hfill & 1\hfill & 0\hfill & 0\hfill & 1\hfill & 0\hfill & 0\hfill \\ 0\hfill & 0\hfill & 1\hfill & 0\hfill & 0\hfill & 0\hfill & 1\hfill & 0\hfill \\ 0\hfill & 0\hfill & 0\hfill & 1\hfill & 0\hfill & 0\hfill & 0\hfill & 1\hfill \\ 0\hfill & 0\hfill & 0\hfill & 0\hfill & 1\hfill & 0\hfill & 0\hfill & 0\hfill \\ 0\hfill & 0\hfill & 0\hfill & 0\hfill & 0\hfill & 1\hfill & 0\hfill & 0\hfill \end{array}\right]$$
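To make the example concrete, the sketch below recovers the node degrees, the neighbor sets, and the naive nodes from the adjacency matrix of the 8-node network. The function names `degrees`, `neighbors`, and `naive_nodes` are hypothetical helpers introduced here for illustration; node labels are 1-based, as in the text.

```python
import numpy as np

# Adjacency matrix of the 8-node example network (row/column r is node r + 1).
A = np.array([
    [0, 1, 1, 1, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
])


def degrees(A):
    """Degree d_i of every node: the row sums of the adjacency matrix."""
    return A.sum(axis=1)


def neighbors(A, i):
    """Immediate neighbors N_i of node i, as 1-based labels."""
    return [int(j) + 1 for j in np.flatnonzero(A[i - 1])]


def naive_nodes(A, observers):
    """Nodes whose inclusive neighborhood J_i contains no observing node."""
    observers = set(observers)
    naive = []
    for i in range(1, A.shape[0] + 1):
        inclusive = {i, *neighbors(A, i)}
        if not inclusive & observers:
            naive.append(i)
    return naive


node_degrees = degrees(A)
naive = naive_nodes(A, observers={1})   # only node 1 observes the target
```

With only node 1 observing the target, `naive` contains nodes 5 to 8, matching the discussion above.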
2.3. Average Consensus
As an effective method to compute the mean value, average consensus operates in a distributed fashion, which sheds light on the problem of distributed state estimation. Suppose the initial value of each node is ${\alpha}_{i}^{0}$. The goal is to compute the mean ${\sum}_{i=1}^{{N}_{s}}{\alpha}_{i}^{0}/{N}_{s}$ by local communications between neighboring nodes. At consensus iteration $l$, node $i$ sends its previous state ${\alpha}_{i}^{l-1}$ to its immediate neighbors $j\in {\mathcal{N}}_{i}$, and similarly receives the previous states ${\alpha}_{j}^{l-1}$ from nodes $j\in {\mathcal{N}}_{i}$. It then updates its current state by the following fusion rule.
$${\alpha}_{i}^{l}={\alpha}_{i}^{l-1}+{\displaystyle \sum _{j\in {\mathcal{N}}_{i}}{\pi}_{i,j}\left({\alpha}_{j}^{l-1}-{\alpha}_{i}^{l-1}\right)} \tag{4}$$
Here, ${\pi}_{i,j}$ is the consensus weight, which should satisfy certain conditions to ensure convergence to the mean of the initial values [25,28,33]. A necessary and sufficient condition guaranteeing finite-time weighted average consensus has been provided in [34]. In the derivation of the proposed DHIWCF, the average consensus protocol is involved in the state update step; hence we only discuss the design of consensus weights that ensure average consensus on the local estimates of all the nodes in the network.
If the above protocol in (4) can iterate infinitely many times, the estimated states of all nodes in the network will asymptotically converge to the average value, that is, $\underset{l\to \infty}{\mathrm{lim}}{\alpha}_{i}^{l}={\displaystyle {\sum}_{i=1}^{{N}_{s}}{\alpha}_{i}^{0}}/{N}_{s}$. In the original KCF, the consensus rate is set to a constant value $\epsilon \in \left(0,1/{d}_{\mathrm{max}}\right)$, where ${d}_{\mathrm{max}}$ is the maximum node degree in the network [6].
Remark 2.
A larger $\epsilon $ will accelerate the convergence of the protocol, but an $\epsilon $ equal to or greater than $1/{d}_{\mathrm{max}}$ will render the protocol unstable [9,16]. The constant $\epsilon $ treats states from different nodes with the same weight, which may slow down the convergence rate of the whole system. The choice of $\epsilon $ depends on ${d}_{\mathrm{max}}$, which is not always available, especially in sparse sensor networks with time-varying communication topologies. In addition, there is no theoretical analysis on how to choose such a constant consensus weight.
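As an illustration of the constant-weight protocol, the sketch below runs the update in (4) with ${\pi}_{i,j}=\epsilon$ on the 8-node example network. It is a minimal sketch under assumptions: the initial values, the choice $\epsilon =0.9/{d}_{\mathrm{max}}$, the iteration counts, and the function name `constant_weight_consensus` are all illustrative and not taken from the paper.

```python
import numpy as np

# Adjacency matrix of the 8-node example network.
A = np.array([
    [0, 1, 1, 1, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
])


def constant_weight_consensus(A, alpha0, eps, iters):
    """Iterate alpha_i <- alpha_i + eps * sum_{j in N_i} (alpha_j - alpha_i)."""
    alpha = np.asarray(alpha0, dtype=float).copy()
    deg = A.sum(axis=1)
    for _ in range(iters):
        # A @ alpha sums the neighbors' values; deg * alpha removes d_i copies of alpha_i.
        alpha = alpha + eps * (A @ alpha - deg * alpha)
    return alpha


d_max = A.sum(axis=1).max()        # global parameter that every node must know
eps = 0.9 / d_max                  # constant rate inside (0, 1/d_max)
alpha0 = np.arange(1.0, 9.0)       # assumed initial values 1..8, mean 4.5

few = constant_weight_consensus(A, alpha0, eps, 5)
many = constant_weight_consensus(A, alpha0, eps, 2000)
```

After only five iterations the node values in `few` still differ from one another, although their mean is preserved; with many iterations every entry of `many` approaches the mean of the initial values. Note that `d_max` must be known network-wide to pick `eps`, which is the practical drawback discussed in this remark.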
To avoid the requirement for global parameters and speed up convergence, the Metropolis weights determine the consensus rate between neighboring nodes based on their local node degrees. As discussed in [30], Metropolis weights enable the protocol in (4) to converge faster. The Metropolis weights are defined as
$${\pi}_{ij}=\left\{\begin{array}{ll}{\left(1+\mathrm{max}\left\{{d}_{i},{d}_{j}\right\}\right)}^{-1}, & \mathrm{if}\ \left\{i,j\right\}\in \mathcal{E}\\ 1-{\displaystyle \sum _{\left\{i,j\right\}\in \mathcal{E}}{\pi}_{ij}}, & \mathrm{if}\ i=j\\ 0, & \mathrm{otherwise}\end{array}\right. \tag{5}$$
Remark 3.
The above definition in (5) indicates that a link to a node with a larger degree is assigned a smaller weight. All the consensus weights are computed only with knowledge of local node degrees, which makes the scheme applicable to almost any kind of sensor network. The interested reader is referred to [30] and the references therein for details. In this paper, the Metropolis weights are chosen for the proposed algorithm.
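The Metropolis weights in (5) and the resulting consensus iteration can be sketched as follows for the 8-node example network. The function name `metropolis_weights`, the initial values, and the iteration counts are assumptions made for illustration. Writing the weights as a matrix $\mathit{W}$ with entries ${\pi}_{ij}$, one step of the protocol in (4) is equivalent to multiplying the vector of node values by $\mathit{W}$.

```python
import numpy as np

# Adjacency matrix of the 8-node example network.
A = np.array([
    [0, 1, 1, 1, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
])


def metropolis_weights(A):
    """Weight matrix W with W[i, j] = pi_ij computed from local degrees only."""
    A = np.asarray(A)
    n = A.shape[0]
    d = A.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and A[i, j]:
                W[i, j] = 1.0 / (1.0 + max(d[i], d[j]))
        # Self-weight: one minus the weights given to the neighbors.
        W[i, i] = 1.0 - W[i].sum()
    return W


W = metropolis_weights(A)
alpha0 = np.arange(1.0, 9.0)                          # assumed initial values, mean 4.5

one_step = W @ alpha0                                 # a single consensus iteration
limit = np.linalg.matrix_power(W, 2000) @ alpha0      # many iterations
```

Because each weight uses only ${d}_{i}$ and ${d}_{j}$, a node needs nothing beyond its own degree and the degrees of its neighbors. The resulting `W` is symmetric with rows summing to one, so every iteration preserves the network-wide mean.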
The goal of the proposed DHIWCF is to achieve consensus on the local estimates of each node over the entire network by consensus iterations between neighboring nodes, while approaching the estimation performance of CKF. If the network is fully connected, a single iteration is enough to accomplish the estimation task. In practical applications, however, a general network is often only partially connected, and several iterations are required for the relevant information to spread throughout the entire network so that estimation accuracy and consensus on local estimates are ensured. Because of constrained computation and communication resources, only a limited number of consensus iterations is available. A more efficient distributed estimation scheme is therefore needed, one that achieves satisfactory estimation accuracy and consensus simultaneously with fewer consensus iterations.
3. Distributed Local MAP Estimation
This section starts with the centralized MAP estimator. Then we formulate the local generalized prior estimate equation based on prior estimates from the inclusive neighbors and the local generalized measurement equation based on the current measurements from the inclusive neighbors. By maximizing the local posterior probability, the local MAP estimator is derived. To implement the estimation steps in a distributed manner, approximation of the error cross covariance is required. Two special cases, where the prior errors from neighboring nodes are uncorrelated or completely identical, are considered here. The practical importance of such an approximation can be seen from the numerical examples in Section 6, which indicate that the proposed DHIWCF is effective even if the assumed cases are not fulfilled.
3.1. Global MAP Estimator
Assume ${\mathit{z}}_{k}={\left[{\mathit{z}}_{1,\text{}k}^{\mathrm{T}},{\mathit{z}}_{2,\text{}k}^{\mathrm{T}},\dots ,{\mathit{z}}_{{N}_{s},\text{}k}^{\mathrm{T}}\right]}^{\mathrm{T}}$ represents the collective measurements of the entire sensor network at time instant $k$. The stacked measurement matrix of all the nodes is denoted as ${\mathit{H}}_{k}={\left[{\mathit{H}}_{1,\text{}k}^{\mathrm{T}},{\mathit{H}}_{2,\text{}k}^{\mathrm{T}},\dots ,{\mathit{H}}_{{N}_{s},\text{}k}^{\mathrm{T}}\right]}^{\mathrm{T}}$. The stacked measurement noise is ${\mathit{v}}_{k}={\left[{\mathit{v}}_{1,\text{}k}^{\mathrm{T}},{\mathit{v}}_{2,\text{}k}^{\mathrm{T}},\dots ,{\mathit{v}}_{{N}_{s},\text{}k}^{\mathrm{T}}\right]}^{\mathrm{T}}$ with block diagonal covariance matrix ${\mathit{R}}_{k}=\mathrm{blkdiag}\left({\mathit{R}}_{1,k},{\mathit{R}}_{2,k},\dots ,{\mathit{R}}_{{N}_{s},k}\right)$. Then the global measurement model can be formulated as
$${\mathit{z}}_{k}={\mathit{H}}_{k}{\mathit{x}}_{k}+{\mathit{v}}_{k} \tag{6}$$
Suppose the centralized prior estimate is ${\widehat{\mathit{x}}}_{k|k-1}^{c}$. The corresponding prior estimation error is ${\mathit{e}}_{k|k-1}^{c}={\widehat{\mathit{x}}}_{k|k-1}^{c}-{\mathit{x}}_{k}$ with covariance matrix ${\mathit{P}}_{k|k-1}^{c}=\mathbb{E}\left\{{\mathit{e}}_{k|k-1}^{c}{\left({\mathit{e}}_{k|k-1}^{c}\right)}^{\mathrm{T}}\right\}$. Let ${\widehat{\mathit{x}}}_{k|k}^{\mathrm{MAP}}$ be the maximum a posteriori (MAP) estimate; then we have
$${\widehat{\mathit{x}}}_{k|k}^{\mathrm{MAP}}=\underset{{\mathit{x}}_{k}}{\mathrm{arg}\,\mathrm{max}}\;p\left({\mathit{x}}_{k}|{\mathit{Z}}_{k}\right) \tag{7}$$
$$p\left({\mathit{x}}_{k}|{\mathit{Z}}_{k}\right)=\frac{p\left({\mathit{z}}_{k}|{\mathit{x}}_{k}\right)p\left({\mathit{x}}_{k}|{\mathit{Z}}_{k-1}\right)}{p\left({\mathit{z}}_{k}|{\mathit{Z}}_{k-1}\right)} \tag{8}$$
where $p\left({\mathit{z}}_{k}|{\mathit{Z}}_{k-1}\right)={\displaystyle \int p\left({\mathit{z}}_{k}|{\mathit{x}}_{k}\right)p\left({\mathit{x}}_{k}|{\mathit{Z}}_{k-1}\right)\mathrm{d}{\mathit{x}}_{k}}$ is a normalization constant. Since the process noise and measurement noise are both Gaussian, the conditional PDFs $p\left({\mathit{z}}_{k}|{\mathit{x}}_{k}\right)$ and $p\left({\mathit{x}}_{k}|{\mathit{Z}}_{k-1}\right)$ are also Gaussian. The explicit forms of the prior PDF $p\left({\mathit{x}}_{k}|{\mathit{Z}}_{k-1}\right)$ and the likelihood PDF $p\left({\mathit{z}}_{k}|{\mathit{x}}_{k}\right)$ are
$$p\left({\mathit{x}}_{k}|{\mathit{Z}}_{k-1}\right)\propto \mathrm{exp}\left(-\frac{1}{2}{\left({\widehat{\mathit{x}}}_{k|k-1}^{c}-{\mathit{x}}_{k}\right)}^{\mathrm{T}}{\left({\mathit{P}}_{k|k-1}^{c}\right)}^{-1}\left({\widehat{\mathit{x}}}_{k|k-1}^{c}-{\mathit{x}}_{k}\right)\right) \tag{9}$$
$$p\left({\mathit{z}}_{k}|{\mathit{x}}_{k}\right)\propto \mathrm{exp}\left(-\frac{1}{2}{\left({\mathit{z}}_{k}-{\mathit{H}}_{k}{\mathit{x}}_{k}\right)}^{\mathrm{T}}{\mathit{R}}_{k}^{-1}\left({\mathit{z}}_{k}-{\mathit{H}}_{k}{\mathit{x}}_{k}\right)\right) \tag{10}$$
where ${\Vert \mathit{x}\Vert}_{\mathit{A}}^{2}={\mathit{x}}^{\mathrm{T}}\mathit{A}\mathit{x}$. Based on the Gaussian product in the numerator, the criterion in (7) can be reformulated as minimizing the following cost function.
$${\widehat{\mathit{x}}}_{k|k}^{\mathrm{MAP}}=\underset{{\mathit{x}}_{k}}{\mathrm{arg}\,\mathrm{min}}\left[{\left({\mathit{z}}_{k}-{\mathit{H}}_{k}{\mathit{x}}_{k}\right)}^{\mathrm{T}}{\mathit{R}}_{k}^{-1}\left({\mathit{z}}_{k}-{\mathit{H}}_{k}{\mathit{x}}_{k}\right)+{\left({\widehat{\mathit{x}}}_{k|k-1}^{c}-{\mathit{x}}_{k}\right)}^{\mathrm{T}}{\left({\mathit{P}}_{k|k-1}^{c}\right)}^{-1}\left({\widehat{\mathit{x}}}_{k|k-1}^{c}-{\mathit{x}}_{k}\right)\right] \tag{11}$$
Here, the cost function in (11) is strictly convex in ${\mathit{x}}_{k}$ and hence the optimal ${\widehat{\mathit{x}}}_{k|k}^{\mathrm{MAP}}$ is available.
$${\widehat{\mathit{x}}}_{k|k}^{\mathrm{MAP}}={\left({\left({\mathit{P}}_{k|k-1}^{c}\right)}^{-1}+{\mathit{H}}_{k}^{\mathrm{T}}{\mathit{R}}_{k}^{-1}{\mathit{H}}_{k}\right)}^{-1}\left({\left({\mathit{P}}_{k|k-1}^{c}\right)}^{-1}{\widehat{\mathit{x}}}_{k|k-1}^{c}+{\mathit{H}}_{k}^{\mathrm{T}}{\mathit{R}}_{k}^{-1}{\mathit{z}}_{k}\right) \tag{12}$$
The corresponding posterior error covariance is
$${\mathit{P}}_{k|k}^{\mathrm{MAP}}={\left({\left({\mathit{P}}_{k|k-1}^{c}\right)}^{-1}+{\mathit{H}}_{k}^{\mathrm{T}}{\mathit{R}}_{k}^{-1}{\mathit{H}}_{k}\right)}^{-1} \tag{13}$$
The equivalent information form of the estimate in (12) and (13) can be rewritten as
$${\widehat{\mathit{y}}}_{k|k}^{\mathrm{MAP}}={\left({\mathit{P}}_{k|k-1}^{c}\right)}^{-1}{\widehat{\mathit{x}}}_{k|k-1}^{c}+{\mathit{H}}_{k}^{\mathrm{T}}{\mathit{R}}_{k}^{-1}{\mathit{z}}_{k} \tag{14}$$
$${\mathit{Y}}_{k|k}^{\mathrm{MAP}}={\left({\mathit{P}}_{k|k-1}^{c}\right)}^{-1}+{\mathit{H}}_{k}^{\mathrm{T}}{\mathit{R}}_{k}^{-1}{\mathit{H}}_{k} \tag{15}$$
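As a numerical sanity check, the information-form update in (12) to (15) can be compared with the familiar gain-form Kalman update; the two agree by the matrix inversion lemma. The sketch below is illustrative only: the function names `map_update` and `kalman_update` and all numeric values (a two-node stacked measurement of a two-dimensional state) are assumptions, not taken from the paper.

```python
import numpy as np


def map_update(x_prior, P_prior, H, R, z):
    """Centralized MAP update in information form, following (12)-(15)."""
    P_inv = np.linalg.inv(P_prior)
    R_inv = np.linalg.inv(R)
    Y = P_inv + H.T @ R_inv @ H                 # information matrix, (15)
    y = P_inv @ x_prior + H.T @ R_inv @ z       # information vector, (14)
    P_post = np.linalg.inv(Y)                   # posterior covariance, (13)
    x_post = P_post @ y                         # posterior estimate, (12)
    return x_post, P_post


def kalman_update(x_prior, P_prior, H, R, z):
    """Standard gain-form Kalman measurement update, used only for comparison."""
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)
    x_post = x_prior + K @ (z - H @ x_prior)
    P_post = (np.eye(len(x_prior)) - K @ H) @ P_prior
    return x_post, P_post


# Assumed example: two nodes, each measuring the first state component.
x_prior = np.array([0.0, 1.0])
P_prior = np.array([[1.0, 0.2], [0.2, 0.5]])
H = np.array([[1.0, 0.0], [1.0, 0.0]])          # stacked measurement matrix H_k
R = np.diag([0.25, 1.0])                        # block diagonal noise covariance R_k
z = np.array([0.3, -0.1])                       # stacked measurement z_k

x_map, P_map = map_update(x_prior, P_prior, H, R, z)
x_kf, P_kf = kalman_update(x_prior, P_prior, H, R, z)
```

The information form is the convenient one for distributed fusion because the measurement contributions ${\mathit{H}}^{\mathrm{T}}{\mathit{R}}^{-1}\mathit{z}$ and ${\mathit{H}}^{\mathrm{T}}{\mathit{R}}^{-1}\mathit{H}$ of different nodes simply add when ${\mathit{R}}_{k}$ is block diagonal.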
3.2. Local MAP Estimation
Assume that each node, for instance, node $i$, is able to receive its neighbor’s prior local estimate ${\widehat{\mathit{x}}}_{j,k|k-1}$ and the corresponding covariance ${\mathit{P}}_{j,k|k-1}$, as well as its neighbor’s local measurement ${\mathit{z}}_{j,k}$ and the corresponding noise covariance ${\mathit{R}}_{j,k}$ by local communication. The local generalized prior estimate, denoted by ${{\widehat{\mathit{x}}}^{\prime}}_{i,k|k-1}$, is defined as
$${{\widehat{\mathit{x}}}^{\prime}}_{i,k|k-1}={\left[{\widehat{\mathit{x}}}_{i,k|k-1}^{\mathrm{T}},{\widehat{\mathit{x}}}_{{j}_{1},k|k-1}^{\mathrm{T}},\dots ,{\widehat{\mathit{x}}}_{{j}_{{d}_{i}},k|k-1}^{\mathrm{T}}\right]}^{\mathrm{T}} \tag{16}$$
where ${j}_{h}\in {\mathcal{N}}_{i}\left(h=1,2,\dots ,{d}_{i}\right)$ denotes the index of node $i$’s neighbors. Let ${\mathit{\eta}}_{i,k|k-1}={\widehat{\mathit{x}}}_{i,k|k-1}-{\mathit{x}}_{k}$ be the prior error of node $i$. The local collective prior error of node $i$ with respect to its inclusive neighbors can be formulated as ${{\mathit{\eta}}^{\prime}}_{i,k|k-1}={\left[{\mathit{\eta}}_{i,k|k-1}^{\mathrm{T}},{\mathit{\eta}}_{{j}_{1},k|k-1}^{\mathrm{T}},\dots ,{\mathit{\eta}}_{{j}_{{d}_{i}},k|k-1}^{\mathrm{T}}\right]}^{\mathrm{T}}$. The local generalized prior estimate can be expressed by
$${{\widehat{\mathit{x}}}^{\prime}}_{i,k|k-1}={\mathbf{\mathscr{H}}}_{\mathit{I}}{\mathit{x}}_{k}+{{\mathit{\eta}}^{\prime}}_{i,k|k-1} \tag{17}$$
where ${\mathbf{\mathscr{H}}}_{\mathit{I}}={[{\mathbf{I}}_{p},{\mathbf{I}}_{p},\dots ,{\mathbf{I}}_{p}]}^{\mathrm{T}}$ is the matrix stacked by ${d}_{i}+1$ identity matrices. ${\mathit{x}}_{k}$ is the true state at time instant $k$. The local collective prior error covariance of node $i$ can be written as
$${{\mathit{P}}^{\prime}}_{i,k|k-1}=\mathbb{E}\left\{{{\mathit{\eta}}^{\prime}}_{i,k|k-1}{\left({{\mathit{\eta}}^{\prime}}_{i,k|k-1}\right)}^{\mathrm{T}}\right\}=\left[\begin{array}{cccc}{\mathit{P}}_{i,k|k-1} & {\mathit{P}}_{i{j}_{1},k|k-1} & \cdots & {\mathit{P}}_{i{j}_{{d}_{i}},k|k-1}\\ {\mathit{P}}_{{j}_{1}i,k|k-1} & \ddots & & \vdots \\ \vdots & & \ddots & \vdots \\ {\mathit{P}}_{{j}_{{d}_{i}}i,k|k-1} & \cdots & \cdots & {\mathit{P}}_{{j}_{{d}_{i}},k|k-1}\end{array}\right] \tag{18}$$
Here, the block matrix ${{\mathit{P}}^{\prime}}_{i,k|k-1}\in {\mathbb{R}}^{\left(1+{d}_{i}\right){n}_{\mathit{x}}\times \left(1+{d}_{i}\right){n}_{\mathit{x}}}$.
Similarly, the local generalized measurement of node $i$ with regard to its inclusive neighbors can be formulated as
$${{\mathit{z}}^{\prime}}_{i,k}={{\mathit{H}}^{\prime}}_{i,k}{\mathit{x}}_{k}+{{\mathit{v}}^{\prime}}_{i,k} \tag{19}$$
Here, ${{\mathit{z}}^{\prime}}_{i,\text{}k}={\left[{\mathit{z}}_{i,k}^{\mathrm{T}},{\mathit{z}}_{{j}_{1},k}^{\mathrm{T}},\dots ,{\mathit{z}}_{{j}_{{d}_{i}},k}^{\mathrm{T}}\right]}^{\mathrm{T}}$ is the local generalized measurement. ${{\mathit{H}}^{\prime}}_{i,k}={\left[{\mathit{H}}_{i,\text{}k}^{\mathrm{T}},{\mathit{H}}_{{j}_{1},\text{}k}^{\mathrm{T}},\dots ,{\mathit{H}}_{{j}_{{d}_{i}},\text{}k}^{\mathrm{T}}\right]}^{\mathrm{T}}$ is the local generalized measurement matrix. ${{\mathit{v}}^{\prime}}_{i,k}={\left[{\mathit{v}}_{i,\text{}k}^{\mathrm{T}},{\mathit{v}}_{{j}_{1},\text{}k}^{\mathrm{T}},\dots ,{\mathit{v}}_{{j}_{{d}_{i}},\text{}k}^{\mathrm{T}}\right]}^{\mathrm{T}}$ denotes the local generalized measurement noise with covariance matrix ${{\mathit{R}}^{\prime}}_{i,k}=\mathrm{blkdiag}\left({\mathit{R}}_{i,k},{\mathit{R}}_{{j}_{1},k},\dots ,{\mathit{R}}_{{j}_{{d}_{i}},k}\right)$.
Combining (17) and (19) together, one has
$$\left[\begin{array}{c}{\widehat{\mathit{x}}}^{\prime}_{i,k|k-1}\\ {\mathit{z}}^{\prime}_{i,k}\end{array}\right]=\left[\begin{array}{c}{\mathbf{\mathscr{H}}}_{\mathit{I}}\\ {\mathit{H}}^{\prime}_{i,k}\end{array}\right]{\mathit{x}}_{k}+\left[\begin{array}{c}{\mathit{\eta}}^{\prime}_{i,k|k-1}\\ {\mathit{v}}^{\prime}_{i,k}\end{array}\right]$$
where the error covariance is
$$\mathbb{E}\left\{\left[\begin{array}{c}{\mathit{\eta}}^{\prime}_{i,k|k-1}\\ {\mathit{v}}^{\prime}_{i,k}\end{array}\right]{\left[\begin{array}{c}{\mathit{\eta}}^{\prime}_{i,k|k-1}\\ {\mathit{v}}^{\prime}_{i,k}\end{array}\right]}^{\mathrm{T}}\right\}=\mathrm{blkdiag}\left({\mathit{P}}^{\prime}_{i,k|k-1},{\mathit{R}}^{\prime}_{i,k}\right)$$
Here the operator $\mathrm{blkdiag}\left(\cdot \right)$ denotes the block diagonal matrix.
According to the derivation of the global maximum a posterior estimator described in Section 3.1, the updated local information matrix can be computed by
$$\begin{array}{ll}{\mathit{Y}}_{i,k|k}^{\mathrm{LMAP}}& ={\left[\begin{array}{c}{\mathbf{\mathscr{H}}}_{\mathit{I}}\\ {\mathit{H}}^{\prime}_{i,k}\end{array}\right]}^{\mathrm{T}}{\left[\begin{array}{cc}{\mathit{P}}^{\prime}_{i,k|k-1}& \mathbf{0}\\ \mathbf{0}& {\mathit{R}}^{\prime}_{i,k}\end{array}\right]}^{-1}\left[\begin{array}{c}{\mathbf{\mathscr{H}}}_{\mathit{I}}\\ {\mathit{H}}^{\prime}_{i,k}\end{array}\right]\\ & ={\mathbf{\mathscr{H}}}_{\mathit{I}}^{\mathrm{T}}{\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}{\mathbf{\mathscr{H}}}_{\mathit{I}}+{\left({\mathit{H}}^{\prime}_{i,k}\right)}^{\mathrm{T}}{\left({\mathit{R}}^{\prime}_{i,k}\right)}^{-1}{\mathit{H}}^{\prime}_{i,k}\\ & =\displaystyle \sum _{r=1}^{1+{d}_{i}}\sum _{s=1}^{1+{d}_{i}}{\left[{\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}\right]}_{r,s}+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{H}}_{j,k}\end{array}$$
Similarly, the updated local information vector is
$$\begin{array}{ll}{\widehat{\mathit{y}}}_{i,k|k}^{\mathrm{LMAP}}& ={\left[\begin{array}{c}{\mathbf{\mathscr{H}}}_{\mathit{I}}\\ {\mathit{H}}^{\prime}_{i,k}\end{array}\right]}^{\mathrm{T}}{\left[\begin{array}{cc}{\mathit{P}}^{\prime}_{i,k|k-1}& \mathbf{0}\\ \mathbf{0}& {\mathit{R}}^{\prime}_{i,k}\end{array}\right]}^{-1}\left[\begin{array}{c}{\widehat{\mathit{x}}}^{\prime}_{i,k|k-1}\\ {\mathit{z}}^{\prime}_{i,k}\end{array}\right]\\ & ={\mathbf{\mathscr{H}}}_{\mathit{I}}^{\mathrm{T}}{\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}{\widehat{\mathit{x}}}^{\prime}_{i,k|k-1}+{\left({\mathit{H}}^{\prime}_{i,k}\right)}^{\mathrm{T}}{\left({\mathit{R}}^{\prime}_{i,k}\right)}^{-1}{\mathit{z}}^{\prime}_{i,k}\\ & =\displaystyle \sum _{r=1}^{1+{d}_{i}}\sum _{s=1}^{1+{d}_{i}}{\left[{\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}\right]}_{r,s}{\left[{\widehat{\mathit{x}}}^{\prime}_{i,k|k-1}\right]}_{s}+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{z}}_{j,k}\end{array}$$
Here, ${\left[{\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}\right]}_{r,s}$ denotes the $\left(r,s\right)$th block matrix of ${\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}$. Similarly, ${\left[{\widehat{\mathit{x}}}^{\prime}_{i,k|k-1}\right]}_{s}$ denotes the $s$th block vector of ${\widehat{\mathit{x}}}^{\prime}_{i,k|k-1}$.
3.3. Approximation of ${\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}$
It is shown in (22) and (23) that the key to acquiring the local posterior estimate is computing the inverse of the local collective prior error covariance, that is, ${\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}$. However, as shown in (18), computing ${\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}$ requires knowledge of the cross-covariances among the inclusive neighbors of node $i$. As shown in [6], computing the cross-covariance matrix ${\mathit{P}}_{i{j}_{h},k|k-1}$ also requires information from the neighbors of node ${j}_{h}$. Therefore, it is not practical to compute ${\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}$ directly, since this would require a large amount of communication among neighboring nodes and impose a heavy computation and communication burden on the networked system. Although some work has been done in [35,36] to incorporate cross-covariance information into the estimation framework, no technique for computing the required terms is offered, and predefined values are used instead [4].
Therefore, an approximation of ${\mathit{P}}^{\prime}_{i,k|k-1}$ in a distributed manner is necessary. In the following derivation, two special cases are discussed. The first case is that the prior estimates from different nodes are completely uncorrelated with each other. This is true at the beginning of the estimation procedure, when the prior information is initialized with random quantities. The second case is that of converged priors, which is important because, with sufficient consensus iterations, the prior estimates of all nodes converge to the same value.
3.3.1. Case 1: Uncorrelated Priors
In this case, the prior errors from different nodes are assumed to be uncorrelated with each other, i.e., $\mathbb{E}\left\{{\mathit{\eta}}_{i,k|k-1}{\mathit{\eta}}_{{j}_{h},k|k-1}^{\mathrm{T}}\right\}=\mathbf{0}$. Hence, ${\mathit{P}}^{\prime}_{i,k|k-1}$ in (18) turns into a block diagonal matrix ${\mathit{P}}^{\prime}_{i,k|k-1}=\mathrm{blkdiag}\left({\mathit{P}}_{i,k|k-1},{\mathit{P}}_{{j}_{1},k|k-1},\dots ,{\mathit{P}}_{{j}_{{d}_{i}},k|k-1}\right)$. The local posterior estimate in (22) and (23) can be approximated as
$${\widehat{\mathit{y}}}_{i,k|k}^{\mathrm{LMAP}}=\sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}{\widehat{\mathit{x}}}_{j,k|k-1}+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{z}}_{j,k}$$
$${\mathit{Y}}_{i,k|k}^{\mathrm{LMAP}}=\sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{H}}_{j,k}$$
Note that after enough consensus iterations, the prior estimates of each node in the network asymptotically converge to the centralized result, i.e., ${\mathit{Y}}_{i,k|k-1}={\mathit{Y}}_{c,k|k-1}$ and ${\widehat{\mathit{y}}}_{i,k|k-1}={\widehat{\mathit{y}}}_{c,k|k-1}$. In such a case, the local prior information matrix in (25) turns into ${\sum}_{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}=\left(1+{d}_{i}\right){\mathit{Y}}_{c,k|k-1}$. However, after convergence, the total amount of prior information in the network is ${\mathit{Y}}_{c,k|k-1}$. That is, the local prior information matrix in the inclusive neighborhood is overestimated by a factor $\left(1+{d}_{i}\right)$. Therefore, the approximation of ${\mathit{P}}^{\prime}_{i,k|k-1}$ should be modified by multiplying by a factor $\left(1+{d}_{i}\right)$, i.e., ${\mathit{P}}^{\prime}_{i,k|k-1}=\left(1+{d}_{i}\right)\mathrm{blkdiag}\left({\mathit{P}}_{i,k|k-1},{\mathit{P}}_{{j}_{1},k|k-1},\dots ,{\mathit{P}}_{{j}_{{d}_{i}},k|k-1}\right)$, to avoid underestimation of the prior covariance. Hence, the results in (24) and (25) should be modified as
$${\mathit{Y}}_{i,k|k}=\frac{1}{1+{d}_{i}}\sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{H}}_{j,k}$$
$${\widehat{\mathit{y}}}_{i,k|k}=\frac{1}{1+{d}_{i}}\sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}{\widehat{\mathit{x}}}_{j,k|k-1}+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{z}}_{j,k}$$
3.3.2. Case 2: Converged Priors
When the prior estimate of each node converges to the centralized result, one has
$$\sum _{r=1}^{1+{d}_{i}}\sum _{s=1}^{1+{d}_{i}}{\left[{\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}\right]}_{r,s}={\mathit{Y}}_{c,k|k-1}=\frac{1}{1+{d}_{i}}\sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{c,k|k-1}$$
Note that for converged priors, ${\mathit{Y}}_{j,k|k-1}={\mathit{Y}}_{c,k|k-1},\ j\in {\mathcal{J}}_{i}$. Substituting this into (28) gives
$$\sum _{r=1}^{1+{d}_{i}}\sum _{s=1}^{1+{d}_{i}}{\left[{\left({\mathit{P}}^{\prime}_{i,k|k-1}\right)}^{-1}\right]}_{r,s}=\frac{1}{1+{d}_{i}}\sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}$$
Therefore, the estimated results in (22) and (23) can be transformed into the weighted summation of the prior information and current measurement innovations, which are the same forms as the results shown in (26) and (27).
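The weighted update in (26) and (27) can be illustrated with a minimal NumPy sketch. The function name `local_update` and the three-node example values are illustrative assumptions and are not part of the original algorithm description. The example uses converged priors and measurements that agree with the prior, so the prior information is counted once rather than $1+{d}_{i}$ times.

```python
import numpy as np

def local_update(Y_prior, y_prior, H, R, z):
    """Weighted local update in the spirit of (26)-(27).

    All arguments are lists indexed over the inclusive neighborhood J_i
    (node i first, then its d_i neighbors); names are hypothetical.
    """
    n_nodes = len(Y_prior)                 # equals 1 + d_i
    Y = sum(Y_prior) / n_nodes             # prior information, averaged
    y = sum(y_prior) / n_nodes
    for Hj, Rj, zj in zip(H, R, z):
        Rinv = np.linalg.inv(Rj)
        Y = Y + Hj.T @ Rinv @ Hj           # measurement information is summed
        y = y + Hj.T @ Rinv @ zj
    return Y, y

# Assumed example: three nodes share the same (converged) prior.
Y_c = np.diag([2.0, 4.0])
x_c = np.array([1.0, -1.0])
priors_Y = [Y_c] * 3
priors_y = [Y_c @ x_c] * 3
H = [np.array([[1.0, 0.0]])] * 3
R = [np.array([[0.5]])] * 3
z = [np.array([1.0])] * 3                  # consistent with x_c

Y_post, y_post = local_update(priors_Y, priors_y, H, R, z)
x_post = np.linalg.solve(Y_post, y_post)
```

Without the factor $1/\left(1+{d}_{i}\right)$, the prior term in this example would be $3{\mathit{Y}}_{c}$ instead of ${\mathit{Y}}_{c}$, which is the overestimation discussed above.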
Remark 4.
It should be noted that the two assumed cases are not always satisfied in realistic applications, but the resulting approximation is still of great significance for distributed filtering algorithms. The effectiveness and feasibility of this approximation are evaluated by numerical examples in Section 6.
4. Hybrid Information Weighted Consensus Filter
In Section 3, the prior estimate of each node is assumed to be known. Here, the prediction step is given.
$${\widehat{\mathit{x}}}_{i,k|k-1}={\mathit{F}}_{k-1}{\widehat{\mathit{x}}}_{i,k-1|k-1}$$
$${\mathit{Y}}_{i,k|k-1}={\left({\mathit{F}}_{k-1}{\mathit{Y}}_{i,k-1|k-1}^{-1}{\mathit{F}}_{k-1}^{\mathrm{T}}+{\mathit{Q}}_{k-1}\right)}^{-1}$$
For simplicity, the prediction step in (31) can be rewritten as
$${\mathit{Y}}_{i,k|k-1}={\Psi}_{k-1}\left({\mathit{Y}}_{i,k-1|k-1}\right)$$
with
$${\Psi}_{k-1}\left({\mathit{Y}}_{i,k-1|k-1}\right)={\left({\mathit{F}}_{k-1}{\mathit{Y}}_{i,k-1|k-1}^{-1}{\mathit{F}}_{k-1}^{\mathrm{T}}+{\mathit{Q}}_{k-1}\right)}^{-1}$$
The corresponding prior information vector is
$${\widehat{\mathit{y}}}_{i,k|k-1}={\mathit{Y}}_{i,k|k-1}{\widehat{\mathit{x}}}_{i,k|k-1}$$
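The prediction step can be written as a short NumPy sketch. The function names `psi` and `predict` are hypothetical, and the sketch assumes that the posterior information matrix is invertible, so it does not cover the uninformative initialization with a zero information matrix.

```python
import numpy as np

def psi(Y, F, Q):
    """Information-form prediction map: (F Y^{-1} F^T + Q)^{-1}."""
    P_pred = F @ np.linalg.inv(Y) @ F.T + Q
    return np.linalg.inv(P_pred)

def predict(x_post, Y_post, F, Q):
    """Return the prior state, information matrix and information vector."""
    x_prior = F @ x_post
    Y_prior = psi(Y_post, F, Q)
    y_prior = Y_prior @ x_prior
    return x_prior, Y_prior, y_prior

# Assumed toy values: identity dynamics and unit process noise.
x_prior, Y_prior, y_prior = predict(
    np.array([1.0, 2.0]), np.eye(2), np.eye(2), np.eye(2))
```

In this toy case the predicted covariance is $2\mathit{I}$, so the prior information matrix is $0.5\mathit{I}$ and the information vector is half the state estimate.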
With the above prediction steps and a weighted consensus protocol incorporated into the distributed local MAP estimator, a novel state estimation algorithm is obtained. Since the prior estimates and the measurement innovations are fused with different schemes, the proposed algorithm is referred to as the distributed hybrid information weighted consensus filter (DHIWCF). The recursive form of DHIWCF is detailed in Algorithm 1.
Algorithm 1. DHIWCF implemented by node $i$ at time instant $k$. 
1. Obtain the local measurement ${\mathit{z}}_{i,k}$ with covariance matrix ${\mathit{R}}_{i,k}$. 
2. Compute the measurement contribution vector and contribution matrix.
$$\left\{\begin{array}{l}{\mathit{u}}_{i}={\mathit{H}}_{i,k}^{\mathrm{T}}{\mathit{R}}_{i,k}^{-1}{\mathit{z}}_{i,k}\\ {\mathit{U}}_{i}={\mathit{H}}_{i,k}^{\mathrm{T}}{\mathit{R}}_{i,k}^{-1}{\mathit{H}}_{i,k}\end{array}\right.$$

3. Broadcast state message $\left\{{\widehat{\mathit{y}}}_{i,k|k-1},{\mathit{Y}}_{i,k|k-1},{\mathit{u}}_{i},{\mathit{U}}_{i}\right\}$ to its neighboring nodes $j\in {\mathcal{N}}_{i}$. 
4. Receive state message $\left\{{\widehat{\mathit{y}}}_{j,k|k-1},{\mathit{Y}}_{j,k|k-1},{\mathit{u}}_{j},{\mathit{U}}_{j}\right\}$ from its neighboring nodes $j\in {\mathcal{N}}_{i}$. 
5. Compute the initial values.
$$\left\{\begin{array}{l}{\widehat{\mathit{y}}}_{i,k|k}^{0}=\frac{1}{1+{d}_{i}}\displaystyle \sum _{j\in {\mathcal{J}}_{i}}{\widehat{\mathit{y}}}_{j,k|k-1}+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{z}}_{j,k}\\ {\mathit{Y}}_{i,k|k}^{0}=\frac{1}{1+{d}_{i}}\displaystyle \sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{H}}_{j,k}\end{array}\right.$$

6. Perform consensus. for $l=1:L$ do
end for 
7. Compute the posterior estimate. 
${\widehat{\mathit{x}}}_{i,k|k}={\left({\mathit{Y}}_{i,k|k}^{L}\right)}^{-1}{\widehat{\mathit{y}}}_{i,k|k}^{L}$, ${\mathit{Y}}_{i,k|k}={\mathit{Y}}_{i,k|k}^{L}$, ${\widehat{\mathit{y}}}_{i,k|k}={\widehat{\mathit{y}}}_{i,k|k}^{L}$ 
8. Prediction at time instant $k+1$ 
${\widehat{\mathit{x}}}_{i,k+1|k}={\mathit{F}}_{k}{\widehat{\mathit{x}}}_{i,k|k}$, ${\mathit{Y}}_{i,k+1|k}={\Psi}_{k}\left({\mathit{Y}}_{i,k|k}\right)$, ${\widehat{\mathit{y}}}_{i,k+1|k}={\mathit{Y}}_{i,k+1|k}{\widehat{\mathit{x}}}_{i,k+1|k}$ 
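The steps of Algorithm 1 can be sketched for a whole network simulated in a single process. This is only an illustration under stated assumptions: the function name `dhiwcf_step`, the three-node line network and all numerical values are hypothetical. The consensus update is assumed to be the convex combination of neighboring information pairs with a row-stochastic weight matrix `Pi`, which is the form implied by the weights ${\pi}_{i,j}$ used in the performance analysis.

```python
import numpy as np

def dhiwcf_step(x_prior, Y_prior, z, H, R, neighbors, Pi, L, F, Q):
    """One DHIWCF time step for all N nodes (illustrative sketch).

    x_prior, Y_prior, z, H, R are lists indexed by node; neighbors[i] lists
    the neighbors of node i (excluding i); Pi is an N x N row-stochastic
    matrix whose (i, j) entry is zero when j is not in J_i.
    """
    N = len(x_prior)
    # Step 2: measurement contribution vector u_i and matrix U_i.
    u = [H[i].T @ np.linalg.solve(R[i], z[i]) for i in range(N)]
    U = [H[i].T @ np.linalg.solve(R[i], H[i]) for i in range(N)]
    y_prior = [Y_prior[i] @ x_prior[i] for i in range(N)]

    # Steps 3-5: exchange messages and form the initial values over J_i.
    y, Y = [], []
    for i in range(N):
        J = [i] + list(neighbors[i])
        w = 1.0 / len(J)                       # 1 / (1 + d_i)
        y.append(w * sum(y_prior[j] for j in J) + sum(u[j] for j in J))
        Y.append(w * sum(Y_prior[j] for j in J) + sum(U[j] for j in J))

    # Step 6 (ASSUMED form): L rounds of convex-combination consensus.
    for _ in range(L):
        y = [sum(Pi[i, j] * y[j] for j in range(N)) for i in range(N)]
        Y = [sum(Pi[i, j] * Y[j] for j in range(N)) for i in range(N)]

    # Step 7: posterior estimate.
    x_post = [np.linalg.solve(Y[i], y[i]) for i in range(N)]
    # Step 8: prediction for time k + 1.
    x_next = [F @ x for x in x_post]
    Y_next = [np.linalg.inv(F @ np.linalg.inv(Yi) @ F.T + Q) for Yi in Y]
    return x_post, Y, x_next, Y_next

# Hypothetical 3-node line network 0 - 1 - 2 with a 2-dimensional state.
x_true = np.array([1.0, -1.0])
neighbors = [[1], [0, 2], [1]]
Pi = np.array([[2/3, 1/3, 0.0],
               [1/3, 1/3, 1/3],
               [0.0, 1/3, 2/3]])
H = [np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]]), np.array([[1.0, 1.0]])]
R = [np.array([[0.1]])] * 3
z = [Hi @ x_true for Hi in H]                  # noise-free measurements
x_prior = [x_true.copy() for _ in range(3)]
Y_prior = [np.eye(2) for _ in range(3)]
x_post, Y_post, x_next, Y_next = dhiwcf_step(
    x_prior, Y_prior, z, H, R, neighbors, Pi,
    L=2, F=np.eye(2), Q=0.01 * np.eye(2))
```

Because the priors and the noise-free measurements all agree with `x_true` in this toy setup, every node should recover `x_true` exactly, which gives a simple sanity check of the information-form bookkeeping.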
5. Performance Analysis
5.1. Consistency of Estimates
One of the most fundamental but important properties of a recursive filtering algorithm is that the estimated error statistics should be consistent with the true estimation errors. The approximated error covariance of an inconsistent filtering algorithm is too small or optimistic, which does not really indicate the uncertainty of the estimate and may result in divergence since subsequent measurements in this case are prone to be neglected [28].
Definition 2
[28,30,37,38]. Consider a random vector $\mathit{x}$. Let $\widehat{\mathit{x}}$ and $\mathit{P}$ be, respectively, the estimate of $\mathit{x}$ and the estimate of the corresponding error covariance. Then the pair $\left(\widehat{\mathit{x}},\mathit{P}\right)$ is said to be consistent if
$$\mathbb{E}\left\{\left(\widehat{\mathit{x}}-\mathit{x}\right){\left(\widehat{\mathit{x}}-\mathit{x}\right)}^{\mathrm{T}}\right\}\le \mathit{P}$$
It is shown in (38) that consistency requires the true error covariance to be upper bounded (in the positive definite sense) by the approximated error covariance $\mathit{P}$. In the distributed estimation paradigm, due to the unaware reuse of redundant data in the consensus iterations and the possible correlation between measurements from different nodes, the filter may suffer from inconsistency and divergence. In such a case, preserving consistency is even more important.
For convenience, consider the information pair $(\widehat{\mathit{y}},\mathit{Y})$, where $\widehat{\mathit{y}}={\mathit{P}}^{-1}\widehat{\mathit{x}}$ and $\mathit{Y}={\mathit{P}}^{-1}$. The consistency defined by (38) can be rewritten as
$$\mathit{Y}\le {\left(\mathbb{E}\left\{\left({\mathit{Y}}^{-1}\widehat{\mathit{y}}-\mathit{x}\right){\left({\mathit{Y}}^{-1}\widehat{\mathit{y}}-\mathit{x}\right)}^{\mathrm{T}}\right\}\right)}^{-1}$$
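The matrix inequality in Definition 2 can be checked numerically by testing whether the difference of the two matrices is positive semidefinite. The sketch below is illustrative only: the helper name `is_consistent`, the tolerance and the example matrices are assumptions, and the true error covariance is taken as known, whereas in practice it would have to be estimated, e.g., from Monte Carlo runs.

```python
import numpy as np

def is_consistent(P_claimed, P_true, tol=1e-9):
    """True if P_claimed - P_true is positive semidefinite."""
    diff = P_claimed - P_true
    diff = 0.5 * (diff + diff.T)           # symmetrize against round-off
    return bool(np.linalg.eigvalsh(diff).min() >= -tol)

# Assumed true error covariance and two candidate filter covariances.
P_true = np.diag([1.0, 2.0])
conservative = is_consistent(np.diag([2.0, 2.0]), P_true)   # bound holds
optimistic = is_consistent(np.diag([0.5, 3.0]), P_true)     # too small in x1
```

The second candidate is larger than the true covariance in one direction but smaller in the other, so it is not consistent in the sense of (38).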
Assumption 3.
The initialized estimate of each node, represented by $({\widehat{\mathit{x}}}_{i,0|0},{\mathit{P}}_{i,0|0}),i\in \mathcal{S}$, is consistent. Equivalently, the inequality ${\mathit{P}}_{i,0|0}\ge \mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{i,0|0}-{\mathit{x}}_{0}\right){\left({\widehat{\mathit{x}}}_{i,0|0}-{\mathit{x}}_{0}\right)}^{\mathrm{T}}\right\}$ holds for $i\in \mathcal{S}$.
Remark 5.
In general, Assumption 3 can be easily satisfied. The initial information on the state vector can be acquired in an offline fashion before the fusion process. In the worst case where no prior information is available, each node can simply set the initialized information matrix as ${\mathit{P}}_{i,0|0}^{-1}=\mathbf{0}$, which implies infinite estimate uncertainty in each node at the beginning so that Assumption 3 is fulfilled.
Assumption 4.
The system matrix ${\mathit{F}}_{k}$ is invertible.
Lemma 1
[28]. Under Assumption 4, if two positive semidefinite matrices ${\mathit{Y}}_{1}$ and ${\mathit{Y}}_{2}$ satisfy ${\mathit{Y}}_{1}\le {\mathit{Y}}_{2}$, then $0\le {\Psi}_{k}\left({\mathit{Y}}_{1}\right)\le {\Psi}_{k}\left({\mathit{Y}}_{2}\right)$. In other words, the function ${\Psi}_{k}\left(\text{}\cdot \text{}\right)$ is monotonically nondecreasing for any $k\ge 0$.
Theorem 1.
Let Assumptions 1, 2, and 3 hold. Then, for each time instant $k$ and each node $i\in \mathcal{S}$, the information pair $({\widehat{\mathit{y}}}_{i,k|k},{\mathit{Y}}_{i,k|k})$ of the DHIWCF is consistent in that
$${\mathit{Y}}_{i,k|k}\le {\left(\mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{i,k|k}-{\mathit{x}}_{k}\right){\left({\widehat{\mathit{x}}}_{i,k|k}-{\mathit{x}}_{k}\right)}^{\mathrm{T}}\right\}\right)}^{-1}$$
with ${\widehat{\mathit{x}}}_{i,k|k}={\mathit{Y}}_{i,k|k}^{-1}{\widehat{\mathit{y}}}_{i,k|k}$.
Proof.
An inductive method is utilized here to prove this theorem. It is supposed that, at time instant $k-1$,
$${\mathit{Y}}_{i,k-1|k-1}\le {\left(\mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{i,k-1|k-1}-{\mathit{x}}_{k-1}\right){\left({\widehat{\mathit{x}}}_{i,k-1|k-1}-{\mathit{x}}_{k-1}\right)}^{\mathrm{T}}\right\}\right)}^{-1}$$
for any $i\in \mathcal{S}$. For brevity, the predicted information matrix in (31) can be rewritten as
$${\mathit{Y}}_{i,k|k-1}={\Psi}_{k-1}\left({\mathit{Y}}_{i,k-1|k-1}\right)$$
On the basis of Lemma 1, it is immediate to see
$$\begin{array}{ll}{\mathit{Y}}_{i,k|k-1}& ={\Psi}_{k-1}\left({\mathit{Y}}_{i,k-1|k-1}\right)\\ & \le {\Psi}_{k-1}\left({\left(\mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{i,k-1|k-1}-{\mathit{x}}_{k-1}\right){\left({\widehat{\mathit{x}}}_{i,k-1|k-1}-{\mathit{x}}_{k-1}\right)}^{\mathrm{T}}\right\}\right)}^{-1}\right)={\left(\mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{i,k|k-1}-{\mathit{x}}_{k}\right){\left({\widehat{\mathit{x}}}_{i,k|k-1}-{\mathit{x}}_{k}\right)}^{\mathrm{T}}\right\}\right)}^{-1}\end{array}$$
According to (26) and (27), the local estimation error is
$$\begin{array}{ll}{\widehat{\mathit{x}}}_{i,k|k}^{0}-{\mathit{x}}_{k}& ={\left({\mathit{Y}}_{i,k|k}^{0}\right)}^{-1}{\widehat{\mathit{y}}}_{i,k|k}^{0}-{\left({\mathit{Y}}_{i,k|k}^{0}\right)}^{-1}{\mathit{Y}}_{i,k|k}^{0}{\mathit{x}}_{k}\\ & ={\left({\mathit{Y}}_{i,k|k}^{0}\right)}^{-1}\left[\frac{1}{1+{d}_{i}}\displaystyle \sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}\left({\widehat{\mathit{x}}}_{j,k|k-1}-{\mathit{x}}_{k}\right)+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}\left({\mathit{z}}_{j,k}-{\mathit{H}}_{j,k}{\mathit{x}}_{k}\right)\right]\\ & ={\left({\mathit{Y}}_{i,k|k}^{0}\right)}^{-1}\left[\frac{1}{1+{d}_{i}}\displaystyle \sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}\left({\widehat{\mathit{x}}}_{j,k|k-1}-{\mathit{x}}_{k}\right)+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{v}}_{j,k}\right]\end{array}$$
Then,
$${\left(\mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{i,k|k}^{0}-{\mathit{x}}_{k}\right){\left({\widehat{\mathit{x}}}_{i,k|k}^{0}-{\mathit{x}}_{k}\right)}^{\mathrm{T}}\right\}\right)}^{-1}={\left({\left({\mathit{Y}}_{i,k|k}^{0}\right)}^{-1}\left(\mathbb{E}\left\{{\Delta}_{k,i}\right\}+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{H}}_{j,k}\right){\left({\mathit{Y}}_{i,k|k}^{0}\right)}^{-1}\right)}^{-1}$$
where
$${\Delta}_{k,i}=\left(\sum _{j\in {\mathcal{J}}_{i}}\frac{1}{1+{d}_{i}}{\mathit{Y}}_{j,k|k-1}\left({\widehat{\mathit{x}}}_{j,k|k-1}-{\mathit{x}}_{k}\right)\right){\left(\sum _{j\in {\mathcal{J}}_{i}}\frac{1}{1+{d}_{i}}{\mathit{Y}}_{j,k|k-1}\left({\widehat{\mathit{x}}}_{j,k|k-1}-{\mathit{x}}_{k}\right)\right)}^{\mathrm{T}}$$
According to the consistency property of covariance intersection [29,38], it holds that
$$\mathbb{E}\left\{{\Delta}_{k,i}\right\}\le \sum _{j\in {\mathcal{J}}_{i}}\frac{1}{1+{d}_{i}}{\mathit{Y}}_{j,k|k-1}\mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{j,k|k-1}-{\mathit{x}}_{k}\right){\left({\widehat{\mathit{x}}}_{j,k|k-1}-{\mathit{x}}_{k}\right)}^{\mathrm{T}}\right\}{\mathit{Y}}_{j,k|k-1}\le \frac{1}{1+{d}_{i}}\sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}$$
Then, exploiting (47) and (43) in (45), the following result is obtained.
$${\left(\mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{i,k|k}^{0}-{\mathit{x}}_{k}\right){\left({\widehat{\mathit{x}}}_{i,k|k}^{0}-{\mathit{x}}_{k}\right)}^{\mathrm{T}}\right\}\right)}^{-1}\ge {\left({\left({\mathit{Y}}_{i,k|k}^{0}\right)}^{-1}\left(\frac{1}{1+{d}_{i}}\sum _{j\in {\mathcal{J}}_{i}}{\mathit{Y}}_{j,k|k-1}+\sum _{j\in {\mathcal{J}}_{i}}{\mathit{H}}_{j,k}^{\mathrm{T}}{\mathit{R}}_{j,k}^{-1}{\mathit{H}}_{j,k}\right){\left({\mathit{Y}}_{i,k|k}^{0}\right)}^{-1}\right)}^{-1}\ge {\mathit{Y}}_{i,k|k}^{0}$$
Since the information pair $({\widehat{\mathit{x}}}_{i,k|k}^{l+1},{\mathit{Y}}_{i,k|k}^{l+1})$ is computed based on the previous information pair $({\widehat{\mathit{x}}}_{i,k|k}^{l},{\mathit{Y}}_{i,k|k}^{l})$ by (3), and the covariance intersection involved in (3) preserves the consistency of estimates [29,37,38,39], it can be concluded that ${\left(\mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{i,k|k}^{l}-{\mathit{x}}_{k}\right){\left({\widehat{\mathit{x}}}_{i,k|k}^{l}-{\mathit{x}}_{k}\right)}^{\mathrm{T}}\right\}\right)}^{-1}\ge {\mathit{Y}}_{i,k|k}^{l}$ implies ${\left(\mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{i,k|k}^{l+1}-{\mathit{x}}_{k}\right){\left({\widehat{\mathit{x}}}_{i,k|k}^{l+1}-{\mathit{x}}_{k}\right)}^{\mathrm{T}}\right\}\right)}^{-1}\ge {\mathit{Y}}_{i,k|k}^{l+1}$ for any $l=1,\dots ,L$. In other words, if the estimate obtained with $l$ consensus iterations is consistent, the estimate obtained with $l+1$ consensus iterations is also consistent. Therefore, it is straightforward to conclude that (40) holds with ${\widehat{\mathit{x}}}_{i,k|k}={\widehat{\mathit{x}}}_{i,k|k}^{L}$ and ${\mathit{Y}}_{i,k|k}={\mathit{Y}}_{i,k|k}^{L}$. The proof is concluded since the initial estimate ${\widehat{\mathit{x}}}_{i,0|0},\forall i\in \mathcal{S}$ is consistent. □
5.2. Boundedness of Error Covariances
According to the consistency of the proposed DHIWCF established in Theorem 1, to prove the boundedness of the error covariance $\mathbb{E}\left\{\left({\widehat{\mathit{x}}}_{i,k|k}-{\mathit{x}}_{k}\right){\left({\widehat{\mathit{x}}}_{i,k|k}-{\mathit{x}}_{k}\right)}^{\mathrm{T}}\right\}$, it is sufficient to prove that ${\mathit{Y}}_{i,k|k}$ is lower bounded by a certain positive definite matrix (or equivalently, that ${\mathit{P}}_{i,k|k}={\mathit{Y}}_{i,k|k}^{-1}$ is upper bounded by some constant matrix). To derive the bounds for the information matrix ${\mathit{Y}}_{i,k|k}$, the following assumptions are required.
Assumption 5.
The system is collectively observable. That is, the pair $\left({\mathit{F}}_{k},{\mathit{H}}_{k}\right)$ is observable where ${\mathit{H}}_{k}=\mathrm{col}\left({\mathit{H}}_{i,\text{}k},\text{}i\in \mathcal{S}\right)$.
Let $\mathbf{\Pi}$ be the consensus matrix, whose elements are the consensus weights ${\pi}_{i,j}^{}$ for any $i,j\in \mathcal{S}$. Further, let ${\pi}_{i,j}^{L}$ be the $\left(i,j\right)$th element of ${\mathit{\Pi}}^{L}$, which is the $L$th power of $\mathit{\Pi}$.
Assumption 6.
The consensus matrix $\mathit{\Pi}$ is row stochastic and primitive.
Assumption 7.
There exist real numbers $\underline{f},\overline{f},\underline{h},\overline{h}\ne 0$ and positive real numbers $\overline{q}>\underline{q}>0$, $\overline{r}>\underline{r}>0$, such that the following bounds are fulfilled for each $k\ge 0,\ i\in \mathcal{S}$.
$$\{\begin{array}{l}{\underset{\_}{f}}^{2}{\mathit{I}}_{n}\le {\mathit{F}}_{k}{\mathit{F}}_{k}^{\mathrm{T}}\le {\overline{f}}^{2}{\mathit{I}}_{n},\text{\hspace{1em}}{\underset{\_}{h}}^{2}{\mathit{I}}_{m}\le {\mathit{H}}_{i,k}{\left({\mathit{H}}_{i,k}\right)}^{\mathrm{T}}\le {\overline{h}}^{2}{\mathit{I}}_{m}\hfill \\ \underset{\_}{q}{\mathit{I}}_{n}\le {\mathit{Q}}_{k}\le \overline{q}{\mathit{I}}_{n},\text{\hspace{1em}}\underset{\_}{r}{\mathit{I}}_{m}\le {\mathit{R}}_{i,k}\le \overline{r}{\mathit{I}}_{m}\hfill \end{array}$$
Lemma 2
[28]. Under Assumptions 4 and 5, and the proposed DHIWCF algorithm, if there exists a positive semidefinite matrix $\breve{\mathit{Y}}$ such that ${\mathit{Y}}_{i,k|k}\le \breve{\mathit{Y}},\ \forall k\ge 0,\ i\in \mathcal{S}$, then there always exists a strictly positive constant $0<\alpha <1$ such that
$${\mathit{Y}}_{i,k+1|k}\ge \alpha {\mathit{F}}_{k}^{-\mathrm{T}}{\mathit{Y}}_{i,k|k}{\mathit{F}}_{k}^{-1}$$
By virtue of Lemma 2, Theorem 2 which depicts the boundedness of error covariances is presented below.
Theorem 2.
Let Assumptions 4–7 hold. Then there exist positive definite matrices $\underline{\mathit{\Omega}}$ and $\overline{\mathit{\Omega}}$ such that
$$\mathbf{0}<\underline{\mathit{\Omega}}\le {\mathit{Y}}_{i,k|k}\le \overline{\mathit{\Omega}},\quad \forall k\ge 0,\ i\in \mathcal{S}$$
where ${\mathit{Y}}_{i,k|k}$ is the information matrix given by the proposed DHIWCF.
Proof.
For simplicity, the proof is concluded for the case $L=1$. The generalization for $L>1$ can be directly derived in a similar way. According to the proposed DHIWCF, the information matrix for node $i$ at time instant $k$ can be written as
$${\mathit{Y}}_{i,k|k}=\sum _{j\in \mathcal{S}}\sum _{r\in {\mathcal{J}}_{j}}{\pi}_{i,j}{\left(1+{d}_{j}\right)}^{-1}{\mathit{Y}}_{r,k|k-1}+\sum _{j\in \mathcal{S}}\sum _{r\in {\mathcal{J}}_{j}}{\pi}_{i,j}{\mathit{H}}_{r,k}^{\mathrm{T}}{\mathit{R}}_{r,k}^{-1}{\mathit{H}}_{r,k}$$
In view of Assumptions 6 and 7 and the fact that ${\mathit{Y}}_{r,k|k-1}\le {\mathit{Q}}_{k-1}^{-1}$ by (31), one can get
$$\begin{array}{ll}{\mathit{Y}}_{i,k|k}& \le {\mathit{Q}}_{k-1}^{-1}+\displaystyle \sum _{j\in \mathcal{S}}\sum _{r\in {\mathcal{J}}_{j}}{\pi}_{i,j}{\mathit{H}}_{r,k}^{\mathrm{T}}{\mathit{R}}_{r,k}^{-1}{\mathit{H}}_{r,k}\\ & \le \left(\frac{1}{\underline{q}}+\frac{\left(1+{d}_{\mathrm{max}}\right){\overline{h}}^{2}}{\underline{r}}\right){\mathit{I}}_{{n}_{\mathit{x}}}\triangleq \overline{\mathit{\Omega}}\end{array}$$
Hence, the upper bound is achieved. Next, a lower bound will be guaranteed under Assumption 5.
According to Lemma 2 and Assumption 7 and (31), (53), it follows from (52) that
$${\mathit{Y}}_{i,k|k}\ge \alpha \sum _{j\in \mathcal{S}}\sum _{r\in {\mathcal{J}}_{j}}{\pi}_{i,j}{\left(1+{d}_{j}\right)}^{-1}{\mathit{F}}_{k-1}^{-\mathrm{T}}{\mathit{Y}}_{r,k-1|k-1}{\mathit{F}}_{k-1}^{-1}+\sum _{j\in \mathcal{S}}\sum _{r\in {\mathcal{J}}_{j}}{\pi}_{i,j}{\mathit{H}}_{r,k}^{\mathrm{T}}{\mathit{R}}_{r,k}^{-1}{\mathit{H}}_{r,k}$$
where $\alpha $ is a positive scalar with $0<\alpha <1$. By recursively exploiting (52) and (54) for a certain number (denoted by $\overline{k}$) of times, there is
$$\begin{array}{ll}{\mathit{Y}}_{i,k|k}& \ge {\alpha}^{\overline{k}}\displaystyle \sum _{j\in \mathcal{S}}\sum _{r\in \mathcal{S}}{\pi}_{i,j}^{\overline{k}}{\Xi}_{j,r}^{\overline{k}}{\left(\underbrace{{\mathit{F}}_{k-1}{\mathit{F}}_{k-2}\dots {\mathit{F}}_{k-\overline{k}}}_{\mathrm{length}=\overline{k}}\right)}^{-\mathrm{T}}{\mathit{Y}}_{i,k-1|k-1}{\left({\mathit{F}}_{k-1}{\mathit{F}}_{k-2}\dots {\mathit{F}}_{k-\overline{k}}\right)}^{-1}\\ & +\displaystyle \sum _{\tau =1}^{\overline{k}}{\alpha}^{\tau -1}{\left(\underbrace{{\mathit{F}}_{k-1}{\mathit{F}}_{k-2}\dots {\mathit{F}}_{k-\tau +1}}_{\mathrm{length}=\tau -1}\right)}^{-\mathrm{T}}\left(\sum _{j\in \mathcal{S}}\sum _{r\in \mathcal{S}}{\pi}_{i,j}^{\tau}{\Xi}_{j,r}^{\tau -1}{\mathit{H}}_{r,k-\tau +1}^{\mathrm{T}}{\mathit{R}}_{r,k-\tau +1}^{-1}{\mathit{H}}_{r,k-\tau +1}\right){\left({\mathit{F}}_{k-1}{\mathit{F}}_{k-2}\dots {\mathit{F}}_{k-\tau +1}\right)}^{-1}\end{array}$$
where ${\pi}_{i,j}^{\tau}$ is the $\left(i,j\right)$th element of ${\mathit{\Pi}}^{\tau}$. $\mathit{\Xi}$ is a matrix with elements
$${\mathit{\Xi}}_{i,j}=\left\{\begin{array}{ll}1/\left(1+{d}_{i}\right),& \mathrm{if}\ j\in {\mathcal{J}}_{i}\\ 0,& \mathrm{otherwise}\end{array}\right.$$
Note that the matrix $\mathit{\Xi}$ is constructed based on the network topology and is naturally stochastic. According to [40,41], as long as the undirected network is connected, similar to the definition of $\mathit{\Pi}$, $\mathit{\Xi}$ is primitive. Therefore, there exist strictly positive integers $m$ and $n$ such that all the elements of ${\mathit{\Pi}}^{s}$ and ${\mathit{\Xi}}^{t}$ are positive for $s\ge m,\text{}t\ge n$. Let us define
$${\mathit{\Omega}}_{1}=\sum _{\tau =1}^{\overline{k}}{\alpha}^{\tau -1}{\left(\underbrace{{\mathit{F}}_{k-1}{\mathit{F}}_{k-2}\dots {\mathit{F}}_{k-\tau +1}}_{\mathrm{length}=\tau -1}\right)}^{-\mathrm{T}}\left(\sum _{j\in \mathcal{S}}\sum _{r\in \mathcal{S}}{\pi}_{i,j}^{\tau}{\Xi}_{j,r}^{\tau -1}{\mathit{H}}_{r,k-\tau +1}^{\mathrm{T}}{\mathit{R}}_{r,k-\tau +1}^{-1}{\mathit{H}}_{r,k-\tau +1}\right){\left({\mathit{F}}_{k-1}{\mathit{F}}_{k-2}\dots {\mathit{F}}_{k-\tau +1}\right)}^{-1}$$
It should be noted that, under Assumption 5, ${\mathit{\Omega}}_{1}$ is positive definite for $\overline{k}\ge \mathrm{max}\left(m,n+1\right)$. Therefore, for $k\ge \overline{k}$, ${\mathit{Y}}_{i,k|k}\ge {\mathit{\Omega}}_{1}>0$. Since $\overline{k}$ is finite, for $0\le k\le \overline{k}-1$, there exists a constant positive definite matrix ${\mathit{\Omega}}_{2}$ such that ${\mathit{Y}}_{i,k|k}\ge {\mathit{\Omega}}_{2}>0$. Hence, there exists a positive definite matrix $\underline{\mathit{\Omega}}$ such that $0<\underline{\mathit{\Omega}}\le {\mathit{Y}}_{i,k|k}$. The proof is now complete. □
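The matrix $\mathit{\Xi}$ used in the proof, and the smallest power at which all of its elements become positive, can be computed with a short sketch. The helper names `build_xi` and `primitivity_index` and the four-node line network are illustrative assumptions. The search is capped at ${\left(N-1\right)}^{2}+1$, the classical Wielandt bound on the exponent of a primitive $N\times N$ matrix.

```python
import numpy as np

def build_xi(neighbors):
    """Xi[i, j] = 1 / (1 + d_i) if j is in the inclusive neighborhood J_i."""
    N = len(neighbors)
    Xi = np.zeros((N, N))
    for i, nbrs in enumerate(neighbors):
        J = [i] + list(nbrs)
        Xi[i, J] = 1.0 / len(J)
    return Xi

def primitivity_index(M, max_power=None):
    """Smallest t with all entries of M^t positive, or None if not found."""
    N = M.shape[0]
    max_power = max_power or (N - 1) ** 2 + 1
    P = np.eye(N)
    for t in range(1, max_power + 1):
        P = P @ M
        if np.all(P > 0):
            return t
    return None

# Hypothetical undirected line network 0 - 1 - 2 - 3.
neighbors = [[1], [0, 2], [1, 3], [2]]
Xi = build_xi(neighbors)
t_min = primitivity_index(Xi)
```

Because every node belongs to its own inclusive neighborhood, $\mathit{\Xi}$ has a positive diagonal, so for a connected network the index equals the graph diameter, which is 3 for this line network.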
Remark 6.
The result shown in Theorem 2 is only dependent on collective observability. This is distinct from some algorithms that require some sort of local observability or detectability condition [5,6,8,11,25], which poses a great challenge to the sensing abilities of sensors and restricts the scope of application.
5.3. Convergence of Estimation Errors
In line with the boundedness of ${\mathit{Y}}_{i,k|k}$ proven in Theorem 2, the convergence of local estimation errors obtained by the proposed DHIWCF is analyzed in this section. To facilitate the analysis, the following preliminary lemmas are required.
Lemma 3
[26,28,31]. Given an integer $N\ge 2$, $N$ positive definite matrices ${\mathit{M}}_{1},\dots ,{\mathit{M}}_{N}$ and $N$ vectors ${\mathit{v}}_{1},\dots ,{\mathit{v}}_{N}$, the following inequality holds
$${\left(\sum _{i=1}^{N}{\mathit{M}}_{i}{\mathit{v}}_{i}\right)}^{\top}{\left(\sum _{i=1}^{N}{\mathit{M}}_{i}\right)}^{-1}\left(\sum _{i=1}^{N}{\mathit{M}}_{i}{\mathit{v}}_{i}\right)\le \sum _{i=1}^{N}{\mathit{v}}_{i}^{\top}{\mathit{M}}_{i}{\mathit{v}}_{i}$$
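The inequality of Lemma 3 can be spot-checked numerically. The sketch below draws random positive definite matrices and vectors. The dimensions, the seed and the way the matrices are generated are arbitrary assumptions, and a numerical check of this kind illustrates the lemma but does not prove it.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 3, 4

# Random symmetric positive definite matrices M_1..M_N and vectors v_1..v_N.
Ms = []
for _ in range(N):
    A = rng.standard_normal((n, n))
    Ms.append(A @ A.T + n * np.eye(n))
vs = [rng.standard_normal(n) for _ in range(N)]

s = sum(M @ v for M, v in zip(Ms, vs))          # sum_i M_i v_i
lhs = s @ np.linalg.solve(sum(Ms), s)           # left-hand side of Lemma 3
rhs = sum(v @ M @ v for M, v in zip(Ms, vs))    # right-hand side of Lemma 3
holds = bool(lhs <= rhs + 1e-12)
```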
Lemma 4
[26,28]. Under Assumptions 4 and 5, and the proposed DHIWCF algorithm, if there exists a positive semidefinite matrix $\tilde{\mathit{Y}}$ such that ${\mathit{Y}}_{i,k|k}\ge \tilde{\mathit{Y}},\ \forall k\ge 0,\ i\in \mathcal{S}$, then there always exists a strictly positive scalar $0<\beta <1$ such that
$${\mathit{Y}}_{i,k+1|k}\le \beta {\mathit{F}}_{k}^{-\mathrm{T}}{\mathit{Y}}_{i,k|k}{\mathit{F}}_{k}^{-1}$$
For the sake of simplicity, let us denote the prediction and estimation errors at node $i$ by ${\tilde{\mathit{x}}}_{i,k|k-1}={\widehat{\mathit{x}}}_{i,k|k-1}-{\mathit{x}}_{k}$ and ${\tilde{\mathit{x}}}_{i,k|k}={\widehat{\mathit{x}}}_{i,k|k}-{\mathit{x}}_{k}$, respectively. The collective forms are, respectively, ${\tilde{\mathit{x}}}_{k|k-1}=\mathrm{col}\left({\tilde{\mathit{x}}}_{1,k|k-1},\dots ,{\tilde{\mathit{x}}}_{{N}_{s},k|k-1}\right)$ and ${\tilde{\mathit{x}}}_{k|k}=\mathrm{col}\left({\tilde{\mathit{x}}}_{1,k|k},\dots ,{\tilde{\mathit{x}}}_{{N}_{s},k|k}\right)$.
Theorem 3.
Under Assumptions 4–6, the proposed DHIWCF algorithm yields an asymptotically unbiased estimate at each node of the network, in that
$$\lim_{k\to +\infty}\mathbb{E}\left\{\widehat{\mathit{x}}_{i,k|k}-\mathit{x}_{k}\right\}=\mathbf{0},\quad \forall i\in \mathcal{S}$$
Proof.
Under Assumptions 4–6, Theorem 2 holds. Therefore, $\mathit{Y}_{i,k|k}$ is uniformly lower and upper bounded. Let us define the following candidate Lyapunov function
$$V_{i,k}(\mathit{x})=\mathit{x}^{\mathrm{T}}\mathit{Y}_{i,k|k-1}\mathit{x},\quad i\in \mathcal{S} \tag{61}$$
By virtue of Lemma 4, it can be concluded that there exists a real number $0<\tilde{\beta}<1$ such that
$$\mathit{Y}_{i,k+1|k}\le \tilde{\beta}\,\mathit{F}_{k}^{-\mathrm{T}}\mathit{Y}_{i,k|k}\mathit{F}_{k}^{-1} \tag{62}$$
Then, one has
$$\begin{array}{ll}V_{i,k+1}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k+1|k}\right\}\right)& =\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k+1|k}\right\}\right)^{\mathrm{T}}\mathit{Y}_{i,k+1|k}\,\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k+1|k}\right\}\\ & \le \tilde{\beta}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k+1|k}\right\}\right)^{\mathrm{T}}\mathit{F}_{k}^{-\mathrm{T}}\mathit{Y}_{i,k|k}\mathit{F}_{k}^{-1}\,\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k+1|k}\right\}\end{array} \tag{63}$$
Since $\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k+1|k}\right\}=\mathbb{E}\left\{\mathit{F}_{k}\widehat{\mathit{x}}_{i,k|k}-\left(\mathit{F}_{k}\mathit{x}_{k}+\mathit{w}_{k}\right)\right\}=\mathit{F}_{k}\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k|k}\right\}$, one can obtain
$$V_{i,k+1}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k+1|k}\right\}\right)\le \tilde{\beta}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k|k}\right\}\right)^{\mathrm{T}}\mathit{Y}_{i,k|k}\,\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k|k}\right\} \tag{64}$$
Notice that
$$\mathit{Y}_{i,k|k}=\mathit{Y}_{i,k|k}^{L}=\sum_{j\in \mathcal{S}}\pi_{i,j}^{L}\mathit{Y}_{j,k|k}^{0} \tag{65}$$
$$\widehat{\mathit{y}}_{i,k|k}=\widehat{\mathit{y}}_{i,k|k}^{L}=\sum_{j\in \mathcal{S}}\pi_{i,j}^{L}\widehat{\mathit{y}}_{j,k|k}^{0} \tag{66}$$
Here, premultiplying (65) by $\mathit{Y}_{i,k|k}^{-1}$ and postmultiplying it by $\mathit{x}_{k}$ yields
$$\mathit{x}_{k}=\mathit{Y}_{i,k|k}^{-1}\sum_{j\in \mathcal{S}}\pi_{i,j}^{L}\mathit{Y}_{j,k|k}^{0}\mathit{x}_{k} \tag{67}$$
In a similar way, premultiplying (66) by $\mathit{Y}_{i,k|k}^{-1}$ yields
$$\widehat{\mathit{x}}_{i,k|k}=\mathit{Y}_{i,k|k}^{-1}\sum_{j\in \mathcal{S}}\pi_{i,j}^{L}\widehat{\mathit{y}}_{j,k|k}^{0} \tag{68}$$
Therefore,
$$\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k|k}\right\}=\mathbb{E}\left\{\widehat{\mathit{x}}_{i,k|k}-\mathit{x}_{k}\right\}=\mathit{Y}_{i,k|k}^{-1}\sum_{j\in \mathcal{S}}\pi_{i,j}^{L}\,\mathbb{E}\left\{\widehat{\mathit{y}}_{j,k|k}^{0}-\mathit{Y}_{j,k|k}^{0}\mathit{x}_{k}\right\} \tag{69}$$
According to (36), there is
$$\widehat{\mathit{y}}_{j,k|k}^{0}-\mathit{Y}_{j,k|k}^{0}\mathit{x}_{k}=\left(1+d_{j}\right)^{-1}\sum_{r\in \mathcal{J}_{j}}\mathit{Y}_{r,k|k-1}\tilde{\mathit{x}}_{r,k|k-1}+\sum_{r\in \mathcal{J}_{j}}\left(\mathit{H}_{r,k}\right)^{\mathrm{T}}\mathit{R}_{r,k}^{-1}\mathit{v}_{r,k} \tag{70}$$
Since $\mathbb{E}\left\{\mathit{v}_{r,k}\right\}=\mathbf{0}$, one can get
$$\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k|k}\right\}=\mathit{Y}_{i,k|k}^{-1}\sum_{j\in \mathcal{S}}\sum_{r\in \mathcal{J}_{j}}\pi_{i,j}^{L}\left(1+d_{j}\right)^{-1}\mathit{Y}_{r,k|k-1}\,\mathbb{E}\left\{\tilde{\mathit{x}}_{r,k|k-1}\right\} \tag{71}$$
Substituting (71) into (64) yields
$$\begin{array}{l}V_{i,k+1}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k+1|k}\right\}\right)\\ \le \tilde{\beta}\left[\sum_{j\in \mathcal{S}}\sum_{r\in \mathcal{J}_{j}}\pi_{i,j}^{L}\left(1+d_{j}\right)^{-1}\mathit{Y}_{r,k|k-1}\,\mathbb{E}\left\{\tilde{\mathit{x}}_{r,k|k-1}\right\}\right]^{\mathrm{T}}\mathit{Y}_{i,k|k}^{-1}\left[\sum_{j\in \mathcal{S}}\sum_{r\in \mathcal{J}_{j}}\pi_{i,j}^{L}\left(1+d_{j}\right)^{-1}\mathit{Y}_{r,k|k-1}\,\mathbb{E}\left\{\tilde{\mathit{x}}_{r,k|k-1}\right\}\right]\end{array} \tag{72}$$
Applying the fact that $\mathit{Y}_{i,k|k}\ge \sum_{j\in \mathcal{S}}\sum_{r\in \mathcal{J}_{j}}\pi_{i,j}^{L}\left(1+d_{j}\right)^{-1}\mathit{Y}_{r,k|k-1}$ and Lemma 3 to the right-hand side of (72), one can obtain that
$$\begin{array}{ll}V_{i,k+1}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k+1|k}\right\}\right)& \le \tilde{\beta}\sum_{j\in \mathcal{S}}\sum_{r\in \mathcal{J}_{j}}\pi_{i,j}^{L}\left(1+d_{j}\right)^{-1}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{r,k|k-1}\right\}\right)^{\mathrm{T}}\mathit{Y}_{r,k|k-1}\,\mathbb{E}\left\{\tilde{\mathit{x}}_{r,k|k-1}\right\}\\ & =\tilde{\beta}\sum_{j\in \mathcal{S}}\sum_{r\in \mathcal{J}_{j}}\pi_{i,j}^{L}\left(1+d_{j}\right)^{-1}V_{r,k}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{r,k|k-1}\right\}\right)\end{array} \tag{73}$$
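The step leading to the last bound combines two inequalities, which can be unpacked as follows. This is only a sketch: the shorthand $\mathit{M}_{j,r}$, $\mathit{e}_{r}$ and $\mathit{S}_{i}$ is introduced here for illustration, the prior information matrices $\mathit{Y}_{r,k|k-1}$ are assumed positive definite, and only the terms with $\pi_{i,j}^{L}>0$ are kept so that every $\mathit{M}_{j,r}$ is positive definite. Let $\mathit{M}_{j,r}=\pi_{i,j}^{L}\left(1+d_{j}\right)^{-1}\mathit{Y}_{r,k|k-1}$, $\mathit{e}_{r}=\mathbb{E}\left\{\tilde{\mathit{x}}_{r,k|k-1}\right\}$ and $\mathit{S}_{i}=\sum_{j}\sum_{r\in \mathcal{J}_{j}}\mathit{M}_{j,r}$. Since $\mathit{Y}_{i,k|k}\ge \mathit{S}_{i}>0$ implies $\mathit{Y}_{i,k|k}^{-1}\le \mathit{S}_{i}^{-1}$, and Lemma 3 applies to the pairs $\left(\mathit{M}_{j,r},\mathit{e}_{r}\right)$,

$$\left(\sum_{j}\sum_{r\in \mathcal{J}_{j}}\mathit{M}_{j,r}\mathit{e}_{r}\right)^{\mathrm{T}}\mathit{Y}_{i,k|k}^{-1}\left(\sum_{j}\sum_{r\in \mathcal{J}_{j}}\mathit{M}_{j,r}\mathit{e}_{r}\right)\le \left(\sum_{j}\sum_{r\in \mathcal{J}_{j}}\mathit{M}_{j,r}\mathit{e}_{r}\right)^{\mathrm{T}}\mathit{S}_{i}^{-1}\left(\sum_{j}\sum_{r\in \mathcal{J}_{j}}\mathit{M}_{j,r}\mathit{e}_{r}\right)\le \sum_{j}\sum_{r\in \mathcal{J}_{j}}\mathit{e}_{r}^{\mathrm{T}}\mathit{M}_{j,r}\mathit{e}_{r}$$

Multiplying by $\tilde{\beta}$ and recognizing $\mathit{e}_{r}^{\mathrm{T}}\mathit{Y}_{r,k|k-1}\mathit{e}_{r}=V_{r,k}\left(\mathit{e}_{r}\right)$ recovers the bound in terms of the Lyapunov functions of the individual nodes.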
Writing (73) for $i=1,2,\dots ,N_{s}$ in a collective form, it turns out that
$$V_{k+1}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{k+1|k}\right\}\right)\le \tilde{\beta}\,\mathit{\Pi}^{L}\mathit{\Xi}\,V_{k}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{k|k-1}\right\}\right)$$
where
$$V_{k}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{k|k-1}\right\}\right)=\mathrm{col}\left(V_{1,k}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{1,k|k-1}\right\}\right),\dots ,V_{N_{s},k}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{N_{s},k|k-1}\right\}\right)\right)$$
and $\mathit{\Xi}$ is a matrix with elements satisfying
$$\mathit{\Xi}_{i,j}=\begin{cases}1/\left(1+d_{i}\right), & \mathrm{if}\ j\in \mathcal{J}_{i}\\ 0, & \mathrm{otherwise}\end{cases}$$
Since the consensus matrix $\mathit{\Pi}$ and the constructed matrix $\mathit{\Xi}$ are both row stochastic, their spectral radii are both 1, and so is that of the product $\mathit{\Pi}^{L}\mathit{\Xi}$, which is again row stochastic. As a consequence, for $0<\tilde{\beta}<1$, the elements of the vector $V_{k+1}\left(\mathbb{E}\left\{\tilde{\mathit{x}}_{k+1|k}\right\}\right)$ vanish as $k$ tends to infinity, which implies that $\lim_{k\to +\infty}\mathbb{E}\left\{\tilde{\mathit{x}}_{k+1|k}\right\}=\mathbf{0}$. Due to the equation $\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k+1|k}\right\}=\mathit{F}_{k}\mathbb{E}\left\{\tilde{\mathit{x}}_{i,k|k}\right\}$ and Assumption 4, it is straightforward to conclude that $\lim_{k\to +\infty}\mathbb{E}\left\{\widehat{\mathit{x}}_{i,k|k}-\mathit{x}_{k}\right\}=\mathbf{0}$ for any $i\in \mathcal{S}$. □
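The contraction argument at the end of the proof can be illustrated numerically. The sketch below rests on several assumptions that are not taken from the proof itself: the graph is a hypothetical 4-node chain, $\mathit{\Pi}$ is built with the standard Metropolis rule $\pi_{i,j}=1/\left(1+\max\left(d_{i},d_{j}\right)\right)$, $\mathcal{J}_{i}$ is taken to be node $i$ together with its neighbors (which is what makes $\mathit{\Xi}$ row stochastic), and $\tilde{\beta}=0.9$ is an arbitrary value. It iterates the upper bound $V\leftarrow \tilde{\beta}\,\mathit{\Pi}^{L}\mathit{\Xi}\,V$ rather than running the filter.

```python
def metropolis_weights(neighbors):
    """Row-stochastic consensus matrix computed from local degrees only."""
    n = len(neighbors)
    deg = [len(nb) for nb in neighbors]
    pi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            pi[i][j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        # The diagonal entry takes the remaining mass so the row sums to 1.
        pi[i][i] = 1.0 - sum(pi[i])
    return pi

def xi_matrix(neighbors):
    """Xi[i][j] = 1/(1+d_i) if j is in J_i (assumed: node i plus its neighbors)."""
    n = len(neighbors)
    xi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in list(neighbors[i]) + [i]:
            xi[i][j] = 1.0 / (1.0 + len(neighbors[i]))
    return xi

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(a, v):
    """Product of a matrix and a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def lyapunov_bound(neighbors, beta, num_iters, v0, steps):
    """Iterate V <- beta * Pi^L * Xi * V and return max(V) after each step."""
    pi = metropolis_weights(neighbors)
    pi_l = pi
    for _ in range(num_iters - 1):
        pi_l = matmul(pi_l, pi)
    m = matmul(pi_l, xi_matrix(neighbors))
    v, history = list(v0), [max(v0)]
    for _ in range(steps):
        v = [beta * x for x in matvec(m, v)]
        history.append(max(v))
    return history

# Hypothetical 4-node chain (0-based indices), L = 2, beta = 0.9.
chain = [[1], [0, 2], [1, 3], [2]]
history = lyapunov_bound(chain, 0.9, 2, [4.0, 1.0, 3.0, 2.0], 20)
print(history[0], history[-1])
```

Because every row of $\mathit{\Pi}^{L}\mathit{\Xi}$ is a convex combination, the largest entry of the bound shrinks by at least the factor $\tilde{\beta}$ per step, which is the mechanism the proof relies on.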
Remark 7.
The Lyapunov function defined in (61) plays a crucial role in the convergence proof of the proposed algorithm, and the approach can be easily extended to the stability analysis of Kalman-like consensus filters in other scenarios. The nonsingularity of $\mathit{F}_{k}$ is required in Theorem 3 because the Lyapunov-based proof depends on Lemma 4, which in turn requires $\mathit{F}_{k}$ to be invertible.
6. Experimental Results and Analysis
6.1. Simulation Setting
A target tracking scenario is adopted here to validate the effectiveness and superiority of the proposed DHIWCF. The centralized Kalman filter (CKF) is chosen as a benchmark, and the proposed DHIWCF is compared with the following algorithms: the Kalman consensus filter (KCF), the generalized Kalman consensus filter (GKCF), the information weighted consensus filter (ICF), the consensus on information algorithm (CI), the consensus on measurements algorithm (CM), and the hybrid consensus on measurement + consensus on information algorithm (HCMCI).
In the surveillance area, a target moves according to the discrete-time linear model shown in (1). $\mathit{x}_{k}=\left[x_{k},y_{k},\dot{x}_{k},\dot{y}_{k}\right]^{\mathrm{T}}$ is the state vector at time instant $k$. $\left(x_{k},y_{k}\right)$ and $\left(\dot{x}_{k},\dot{y}_{k}\right)$ are, respectively, the position and velocity components of the state. The state transition matrix $\mathit{F}_{k}$ and the process noise covariance matrix $\mathit{Q}_{k}$ are set as follows.
$${\mathit{F}}_{k}=\left[\begin{array}{cccc}1& 0& 1& 0\\ 0& 1& 0& 1\\ 0& 0& 1& 0\\ 0& 0& 0& 1\end{array}\right],\text{}{\mathit{Q}}_{k}=\left[\begin{array}{cccc}10& 0& 0& 0\\ 0& 10& 0& 0\\ 0& 0& 1& 0\\ 0& 0& 0& 1\end{array}\right]$$
The initial position of the target is randomly located within the $500\times 500$ space. The initial speed is set to 2 units per time step, with a random direction uniformly chosen from 0 to $2\pi $. In each simulation run, the initial prior error covariance is $\mathit{P}_{0}=\mathrm{diag}\left(100,100,10,10\right)$, and all nodes in the network share the same $\mathit{P}_{0}$. The initial prior estimate of each node is generated by adding zero-mean Gaussian noise with covariance $\mathit{P}_{0}$ to the true initial state. The total number of time steps is $K=100$ unless stated otherwise. The sampling time interval is $T=1\ \mathrm{s}$.
The target of interest is observed by a number of networked sensors with the measurement model shown in (2). The measurement matrix $\mathit{H}_{i,k}$ and the measurement noise covariance $\mathit{R}_{i,k}$ are given below.
$${\mathit{H}}_{i,\text{}k}=\left[\begin{array}{cccc}1& 0& 0& 0\\ 0& 1& 0& 0\end{array}\right],\text{}{\mathit{R}}_{i,k}=\left[\begin{array}{cc}100& 0\\ 0& 100\end{array}\right]$$
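The simulation model above can be sketched in a few lines. The following is a minimal illustration, assuming the usual forms $\mathit{x}_{k+1}=\mathit{F}_{k}\mathit{x}_{k}+\mathit{w}_{k}$ and $\mathit{z}_{i,k}=\mathit{H}_{i,k}\mathit{x}_{k}+\mathit{v}_{i,k}$ for models (1) and (2); the function name `simulate`, its parameters and the random seed are hypothetical, and only a single sensor's measurements are generated.

```python
import math
import random

# State transition matrix F_k (constant velocity, T = 1 s).
F = [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
Q_DIAG = [10.0, 10.0, 1.0, 1.0]  # diagonal of Q_k
R_DIAG = [100.0, 100.0]          # diagonal of R_{i,k}

def simulate(num_steps=100, speed=2.0, area=500.0, noisy=True, seed=0):
    """Return lists of true states and position measurements."""
    rng = random.Random(seed)
    heading = rng.uniform(0.0, 2.0 * math.pi)
    x = [rng.uniform(0.0, area), rng.uniform(0.0, area),
         speed * math.cos(heading), speed * math.sin(heading)]
    states, measurements = [], []
    for _ in range(num_steps):
        # Propagate the state: x <- F x (+ process noise with covariance Q).
        x = [sum(F[r][c] * x[c] for c in range(4)) for r in range(4)]
        if noisy:
            x = [xi + rng.gauss(0.0, math.sqrt(q)) for xi, q in zip(x, Q_DIAG)]
        # H selects the position components; add measurement noise with covariance R.
        z = x[:2]
        if noisy:
            z = [zi + rng.gauss(0.0, math.sqrt(r)) for zi, r in zip(z, R_DIAG)]
        states.append(x)
        measurements.append(z)
    return states, measurements

states, measurements = simulate()
print(len(states), len(measurements))
```

Because $\mathit{Q}_{k}$ and $\mathit{R}_{i,k}$ are diagonal, each noise component can be drawn independently, which is why no matrix square root is needed in this sketch.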
6.2. Performance Metrics
For a fair comparison, a total of $M_{c}=200$ independent Monte Carlo runs are carried out. The number of consensus iterations $L$ is varied from 1 to 10. The consensus rate parameter is selected as $\epsilon =0.65/d_{\mathrm{max}}$. For the proposed DHIWCF algorithm, the Metropolis weight matrix is chosen, which can be computed with knowledge of the local node degrees only. The following metrics are chosen to evaluate the estimation performance from different aspects.
(1) The position root mean squared error (PRMSE), which indicates the tracking accuracy, is defined as
$$\mathrm{PRMSE}=\sqrt{\frac{1}{M_{c}}\sum_{m=1}^{M_{c}}\frac{1}{N_{s}}\sum_{i=1}^{N_{s}}\left[\left(\widehat{x}_{i,k}^{m}-x_{k}^{m}\right)^{2}+\left(\widehat{y}_{i,k}^{m}-y_{k}^{m}\right)^{2}\right]}$$
where $\left(\widehat{x}_{i,k}^{m},\ \widehat{y}_{i,k}^{m}\right)$ and $\left(x_{k}^{m},\ y_{k}^{m}\right)$ are, respectively, the estimated position and the true position in the $m$th Monte Carlo run.
(2) The averaged position root mean squared error (APRMSE), which implies the overall tracking accuracy of an algorithm over all simulation runs, all time instants and all sensors, is defined as
$$\mathrm{APRMSE}=\sqrt{\frac{1}{M_{c}}\sum_{m=1}^{M_{c}}\frac{1}{K}\sum_{k=1}^{K}\frac{1}{N_{s}}\sum_{i=1}^{N_{s}}\left[\left(\widehat{x}_{i,k}^{m}-x_{k}^{m}\right)^{2}+\left(\widehat{y}_{i,k}^{m}-y_{k}^{m}\right)^{2}\right]}$$
(3) The averaged consensus estimate error (ACEE), which indicates the degree of consensus among estimates from all nodes in the network, is defined as
$$\mathrm{ACEE}=\frac{1}{M_{c}}\sum_{m=1}^{M_{c}}\frac{1}{N_{s}\left(N_{s}-1\right)}\sum_{i=1}^{N_{s}}\sum_{j=1}^{N_{s}}\sqrt{\left(\widehat{x}_{j,k}^{m}-\widehat{x}_{i,k}^{m}\right)^{2}+\left(\widehat{y}_{j,k}^{m}-\widehat{y}_{i,k}^{m}\right)^{2}}$$
(4) The normalized estimation error squared (NEES), which is used to check filter consistency, is defined as
$$\epsilon_{k}=\left(\mathit{x}_{k}-\widehat{\mathit{x}}_{k|k}\right)^{\mathrm{T}}\mathit{P}_{k|k}^{-1}\left(\mathit{x}_{k}-\widehat{\mathit{x}}_{k|k}\right)$$
where $\mathit{x}_{k}$ and $\widehat{\mathit{x}}_{k|k}$ are, respectively, the true state and the estimated state, and $\mathit{P}_{k|k}$ is the posterior covariance at time instant $k$. If the filter is consistent, the NEES follows a chi-squared distribution with $n_{x}$ degrees of freedom. A way to check filter consistency is to test the average NEES over $M_{c}$ Monte Carlo runs, i.e.,
$$\overline{\epsilon}_{k}=\frac{1}{M_{c}}\sum_{i=1}^{M_{c}}\epsilon_{k}^{i}$$
Under similar assumptions, $M_{c}\overline{\epsilon}_{k}$ is chi-squared distributed with $M_{c}n_{x}$ degrees of freedom. Given an acceptance interval $\left[r_{1},\ r_{2}\right]$, the chi-square test is passed if $\overline{\epsilon}_{k}\in \left[r_{1},\ r_{2}\right]$. The filter is optimistic if the computed $\overline{\epsilon}_{k}$ is much higher than $r_{2}$, while it is conservative if the computed $\overline{\epsilon}_{k}$ is below $r_{1}$.
(5) Computational cost. The computational cost is defined as the averaged running time over all Monte Carlo runs.
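The accuracy and consensus metrics (1)–(3) can be computed directly from stored position estimates. The sketch below assumes a hypothetical data layout that is not prescribed by the paper: `est[m][k][i]` is the $(x, y)$ position estimate of node $i$ at step $k$ in run $m$, and `truth[m][k]` is the true position.

```python
import math

def _mean_sq_error(node_estimates, true_pos):
    """Squared position error averaged over the nodes at one time step."""
    tx, ty = true_pos
    sq = [(x - tx) ** 2 + (y - ty) ** 2 for x, y in node_estimates]
    return sum(sq) / len(sq)

def prmse(est, truth, k):
    """PRMSE at time step k, averaged over nodes and Monte Carlo runs."""
    total = sum(_mean_sq_error(run_est[k], run_truth[k])
                for run_est, run_truth in zip(est, truth))
    return math.sqrt(total / len(est))

def aprmse(est, truth):
    """APRMSE, averaged over nodes, time steps and Monte Carlo runs."""
    num_steps = len(truth[0])
    total = 0.0
    for run_est, run_truth in zip(est, truth):
        run_total = sum(_mean_sq_error(run_est[k], run_truth[k])
                        for k in range(num_steps))
        total += run_total / num_steps
    return math.sqrt(total / len(est))

def acee(est, k):
    """ACEE at time step k: mean pairwise distance between node estimates."""
    total = 0.0
    for run_est in est:
        nodes = run_est[k]
        n = len(nodes)
        # Pairs with i == j contribute zero, matching the double sum in the definition.
        pair_sum = sum(math.hypot(xj - xi, yj - yi)
                       for xi, yi in nodes for xj, yj in nodes)
        total += pair_sum / (n * (n - 1))
    return total / len(est)

# One run, one time step, two nodes; the target is at the origin.
est = [[[(3.0, 4.0), (0.0, 0.0)]]]
truth = [[(0.0, 0.0)]]
print(prmse(est, truth, 0), aprmse(est, truth), acee(est, 0))
```

Note that the PRMSE and APRMSE measure the error with respect to the truth, whereas the ACEE only measures disagreement among nodes, so a set of estimates can agree perfectly and still be inaccurate.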
6.3. Results and Analysis
In this subsection, three simulation scenarios are chosen to evaluate and compare the estimation performance of the proposed DHIWCF algorithm with respect to the aforementioned metrics.
6.3.1. Evaluation of the Effectiveness of the Proposed DHIWCF Algorithm
This scenario is designed to validate the effectiveness of the proposed DHIWCF algorithm. The target of interest is tracked by 8 networked sensors with the communication topology shown in Figure 1. Only node 1 can observe the target, so the node set $\left\{5,6,7,8\right\}$ is naive about the target state. Figure 2 shows the estimated tracks obtained by local nodes with the proposed DHIWCF and by the CKF. For simplicity, only the estimated tracks of node 1 and node 8 are plotted. To illustrate the evolution of each track, checkpoints are plotted in the same color every 20 steps. The covariance ellipses with 95% confidence at each checkpoint are plotted in dashed lines. As is shown, the true position (cross in black) is always enveloped by the corresponding ellipse in red (node 1) or blue (node 8), which validates the consistency of the local estimates. Compared with the CKF, the local estimates obtained by the proposed DHIWCF are much more conservative. This is due to the fact that the network in Figure 1 has weak connectivity and most nodes have poor joint observability.
In Figure 3, the PRMSE of the compared algorithms with a single consensus iteration is given. Both KCF and CM diverge in the considered scenario, while the others can effectively track the target. KCF and CM are therefore not considered in the remainder of this section because of their poor performance. The proposed DHIWCF is more accurate, with a lower PRMSE that is close to that of the CKF. Due to the limited consensus iterations, GKCF, ICF, and HCMCI obtain PRMSE values higher than that of DHIWCF, but much lower than that of CI.
Figure 4 compares the APRMSE of different algorithms. It shows that GKCF and CI obtain APRMSE values much higher than those of ICF, HCMCI and DHIWCF. Specifically, DHIWCF performs the best with limited consensus iterations $L\le 2$. As the number of consensus iterations increases, DHIWCF asymptotically converges to the CKF. In addition, the performance of DHIWCF is somewhat better than that of ICF. In Figure 5, the average NEES of different algorithms with a single consensus iteration is compared. The NEES curve of ICF lies well above the 95% concentration region, which indicates that ICF has poor consistency in such a scenario. The NEES curve of CI always lies below the concentration region, and hence its estimates are overly conservative. The NEES curves of GKCF, HCMCI, DHIWCF, and CKF lie either below or within the concentration region at all time steps, which shows an enhanced consistency.
The ACEE comparison of different algorithms with a single consensus iteration is shown in Figure 6. The proposed DHIWCF performs much better with regard to consensus in that it has a relatively lower ACEE than the other algorithms. Figure 7 shows the computational time for different numbers of consensus iterations. Although HCMCI performs a little better than DHIWCF in terms of APRMSE, as shown in Figure 4, its ACEE and computational time are much higher than those of DHIWCF. Moreover, DHIWCF is only a little more time-consuming than CI, the most efficient algorithm, as shown in Figure 7.
6.3.2. Performance Comparison under Chain Topology
In this subsection, an even worse scenario is considered, where the networked sensors are connected in a chain topology as shown in Figure 8. As is illustrated, the target is observed only by node 1, and the remaining nodes are communication nodes with no sensing abilities. The nodes in the set $\left\{3,4,5,6,7,8\right\}$ and their immediate neighbors do not have measurements of the target, so they are naive about the target’s state information. It takes at least 7 consensus iterations for node 8 to be affected by node 1. As is shown in Figure 4, the APRMSE of GKCF and CI is much higher than that of ICF, HCMCI, and DHIWCF. Therefore, the estimation results of GKCF and CI are not considered here.
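The naive set and the 7-iteration delay quoted above follow directly from the chain structure. The sketch below assumes that the 8 nodes are linked in index order (1–2–…–8) and that each consensus iteration spreads information by one hop; the helper names are hypothetical.

```python
from collections import deque

def hop_distances(neighbors, source):
    """Breadth-first search: number of hops from `source` to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in neighbors[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def naive_nodes(neighbors, sensing):
    """Nodes for which neither the node itself nor any neighbor observes the target."""
    return sorted(n for n in neighbors
                  if n not in sensing
                  and not any(nb in sensing for nb in neighbors[n]))

# Chain 1-2-...-8 with node 1 as the only sensing node.
chain = {n: [m for m in (n - 1, n + 1) if 1 <= m <= 8] for n in range(1, 9)}
print(naive_nodes(chain, {1}))      # [3, 4, 5, 6, 7, 8]
print(hop_distances(chain, 1)[8])   # 7
```

Node 2 is not naive because it is an immediate neighbor of the sensing node, even though it has no measurement of its own.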
In Figure 9, the PRMSE averaged over all nodes and all Monte Carlo runs for different numbers of consensus iterations is given. With a single consensus iteration, DHIWCF performs much better than ICF and HCMCI. When the number of consensus iterations increases to $L=3$, the PRMSE of DHIWCF is still smaller than that of the improved HCMCI. The averaged NEES with a single consensus iteration is provided in Figure 10, which indicates that, except for ICF, the remaining algorithms preserve good consistency.
The APRMSE for the algorithms under discussion is plotted in Figure 11. Similar to the result in Figure 4, DHIWCF has a smaller APRMSE for $L\le 3$, and its performance approaches that of the CKF as the consensus iterations progress. Although HCMCI obtains an APRMSE a little smaller than that of DHIWCF, its average ACEE is relatively higher, as shown in Figure 12, especially in the case $L=1$. Therefore, the proposed DHIWCF makes a good tradeoff between estimation accuracy and consensus on local estimates.
6.3.3. Performance Comparison in Large-Scale Sparse Sensor Networks
This experiment is designed to test the performance of DHIWCF in large-scale sparse sensor networks. Assume that the target of interest is tracked by 100 sensors, which are randomly located within the $500\times 500$ space. The communication range of each sensor is set to be $\mathit{R}_{c}=10\sqrt{N_{s}}$. As is shown in Figure 13, there are 10 sensor nodes and 90 communication nodes in the surveillance area. The sensor nodes are able to observe the target, while the communication nodes act as relays of information among distant nodes and have no observation ability [24]. As is shown in Figure 13, most of the nodes in the network are naive about the target’s state, which brings great challenges to target tracking.
In Figure 14, the estimated tracks obtained by different algorithms with a single consensus iteration in a certain Monte Carlo run are plotted. It can be seen that DHIWCF and CKF perform much better than ICF and HCMCI. Especially when the target is subject to relatively large process noise (for instance, when the target suddenly changes its moving direction), the DHIWCF estimate recovers toward that of the CKF more quickly. The results in Figure 15 further suggest that with limited consensus iterations, DHIWCF is able to obtain more accurate estimates than ICF and HCMCI in that it has a relatively lower PRMSE than its counterparts. With respect to estimation consistency, it is shown in Figure 16 that the NEES curve of ICF lies above the concentration region, while the NEES curves of the remaining algorithms are always within or below the concentration region. Therefore, both HCMCI and DHIWCF show sound consistency of local estimates.
To compare the overall performance of the distributed algorithms under discussion, Table 1 gives the APRMSE for different algorithms versus the number of consensus iterations. Compared with ICF and HCMCI, the proposed DHIWCF has a lower APRMSE. The advantage is most obvious for $L\le 3$ consensus iterations. This implies that DHIWCF is relatively more accurate. The computational time relative to that of the CKF is reported in Table 2, where RCT denotes the relative computation time. The proposed DHIWCF runs faster than HCMCI because it requires fewer information exchanges. Although ICF takes less time to run, its lower estimation accuracy and poor consistency make it a poor choice for estimating the state of interest.
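The size of the accuracy advantage can be read off Table 1 with simple arithmetic. The sketch below copies the APRMSE values from Table 1 and computes the relative APRMSE reduction of DHIWCF with respect to a baseline; the function name `relative_reduction` is hypothetical and the measure itself is only one possible way to summarize the table.

```python
# APRMSE (m) values copied from Table 1, keyed by the number of consensus iterations L.
APRMSE = {
    1: {"ICF": 22.87, "DHIWCF": 10.44, "HCMCI": 26.24},
    2: {"ICF": 14.80, "DHIWCF": 9.23, "HCMCI": 11.93},
    3: {"ICF": 12.28, "DHIWCF": 8.37, "HCMCI": 8.64},
    4: {"ICF": 11.05, "DHIWCF": 7.63, "HCMCI": 7.93},
    5: {"ICF": 10.30, "DHIWCF": 7.46, "HCMCI": 7.78},
    10: {"ICF": 8.56, "DHIWCF": 6.87, "HCMCI": 7.25},
}

def relative_reduction(num_iters, baseline):
    """Fraction by which DHIWCF lowers the APRMSE of `baseline` at a given L."""
    row = APRMSE[num_iters]
    return (row[baseline] - row["DHIWCF"]) / row[baseline]

for num_iters in sorted(APRMSE):
    print(num_iters,
          round(100.0 * relative_reduction(num_iters, "ICF"), 1),
          round(100.0 * relative_reduction(num_iters, "HCMCI"), 1))
```

With a single consensus iteration the reduction exceeds one half for both baselines, while the gap with respect to HCMCI narrows to a few percent once more iterations are allowed.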
7. Conclusions
This paper considers the problem of distributed state estimation in the presence of naive nodes with constrained communication resources. A novel distributed hybrid information weighted consensus filter is proposed, in which each node exploits not only the measurement information but also the prior estimate information from its immediate neighbors to update its local posterior estimate. The proposed DHIWCF is able to solve the problem under consideration without any knowledge of global parameters, and it preserves the consistency of local estimates while achieving relatively high estimation accuracy and satisfactory consensus. Theoretical analysis with regard to the consistency of local estimates, stability, and convergence of the estimator is also provided. The experimental results indicate that with limited consensus iterations, the proposed DHIWCF is much more accurate and reaches better consensus than the existing algorithms. In addition, DHIWCF preserves good consistency of local estimates in the experiments. Even when only a single consensus iteration is allowed, the proposed DHIWCF still performs much better. If more consensus iterations are available, the proposed DHIWCF approaches the performance of the centralized scheme. Future research will address distributed state estimation in mobile sensor networks, consensus protocols with event-triggered communication, more efficient design of consensus weights, and distributed nonlinear filtering problems and their stability analysis.
Author Contributions
Conceptualization, Y.L. and Y.H.; Data curation, Z.D.; Formal analysis, J.L.; Funding acquisition, Y.H.; Methodology, J.L.; Project administration, Y.L.; Software, J.L., Y.L. and Z.D.; Visualization, K.D.; Writing—original draft, J.L.; Writing—review and editing, J.L., Y.L. and K.D.
Funding
This research was co-supported by the National Natural Science Foundation of China (Nos. 61471383, 91538201, 61531020, 61790550, 61671463 and 61790552).
Acknowledgments
The authors would like to thank the editors and anonymous reviewers for the valuable comments and suggestions.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Marelli, D.; Zamani, M.; Fu, M.; Ninness, B. Distributed Kalman filter in a network of linear systems. Syst. Control Lett. 2018, 116, 71–77.
2. Battistelli, G.; Chisci, L.; Selvi, D. A distributed Kalman filter with event-triggered communication and guaranteed stability. Automatica 2018, 93, 75–82.
3. Kalman, R.E. A New Approach to Linear Filtering and Prediction Problems. J. Basic Eng. 1960, 82, 35–45.
4. Kamal, A.T.; Ding, C.; Song, B.; Farrell, J.A.; Roy-Chowdhury, A.K. A Generalized Kalman Consensus Filter for Wide-Area Video Networks. In Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference, Orlando, FL, USA, 12–15 December 2011; pp. 7863–7869.
5. Olfati-Saber, R. Distributed Kalman Filtering for Sensor Networks. In Proceedings of the 46th IEEE Conference on Decision and Control, New Orleans, LA, USA, 12–14 December 2007; pp. 5492–5498.
6. Olfati-Saber, R. Kalman-Consensus Filter: Optimality, Stability, and Performance. In Proceedings of the 48th IEEE Conference on Decision and Control (CDC) Held Jointly with 2009 28th Chinese Control Conference, Shanghai, China, 15–18 December 2009; pp. 7036–7042.
7. Amini-Omam, M.; Torkamani-Azar, F.; Ghorashi, S.A. Generalised Kalman-consensus filter. IET Signal Process. 2017, 11, 495–502.
8. Deshmukh, R.; Kwon, C.; Hwang, I. Optimal Discrete-Time Kalman Consensus Filter. In Proceedings of the 2017 American Control Conference (ACC), Seattle, WA, USA, 24–26 May 2017; pp. 5801–5806.
9. Yao, P.; Liu, G.; Liu, Y. Average information-weighted consensus filter for target tracking in distributed sensor networks with naivety issues. Int. J. Adapt. Control 2018, 5, 681–699.
10. Chong, C.; Chang, K.; Mori, S. Comparison of Optimal Distributed Estimation and Consensus Filtering. In Proceedings of the 19th International Conference on Information Fusion, Heidelberg, Germany, 5–8 July 2016; pp. 1034–1041.
11. Liu, Y.; Liu, J.; Xu, C.; Qi, L.; Sun, S.; Ding, Z. Consensus Algorithm for Distributed State Estimation in Multi-Clusters Sensor Network. In Proceedings of the 20th International Conference on Information Fusion, Xi’an, China, 10–13 July 2017; pp. 1605–1609.
12. Rastgar, F.; Rahmani, M. Consensus-based distributed robust filtering for multisensor systems with stochastic uncertainties. IEEE Sens. J. 2018, 18, 7611–7618.
13. Ji, H.; Lewis, F.L.; Hou, Z.; Mikulski, D. Distributed information-weighted Kalman consensus filter for sensor networks. Automatica 2017, 77, 18–30.
14. Liu, Q.; Wang, Z.; He, X.; Zhou, D.H. On Kalman-consensus filtering with random link failures over sensor networks. IEEE Trans. Autom. Control 2018, 63, 2701–2708.
15. Soatti, G.; Nicoli, M.; Savazzi, S.; Spagnolini, U. Consensus-based algorithms for distributed network-state estimation and localization. IEEE Trans. Signal Inf. Proc. Over Netw. 2017, 3, 430–444.
16. Kamal, A.T.; Farrell, J.A.; Roy-Chowdhury, A.K. Information weighted consensus filters and their application in distributed camera networks. IEEE Trans. Autom. Control 2013, 58, 3112–3125.
17. Tang, W.; Zhang, G.; Zeng, J.; Yue, Y. Information weighted consensus-based distributed particle filter for large-scale sparse wireless sensor networks. IET Commun. 2014, 8, 3113–3121.
18. Kamal, A.T. Information Weighted Consensus for Distributed Estimation in Vision Networks. Ph.D. Dissertation, University of California Riverside, Riverside, CA, USA, 2013.
19. Liu, G.; Tian, G.; Zhao, Y. Information Weighted Consensus Filtering with Improved Convergence Rate. In Proceedings of the IEEE 35th Chinese Control Conference (CCC), Chengdu, China, 27–29 July 2016; pp. 8356–8359.
20. Bin, J.; Khanh, D.P.; Erik, B.; Dan, S.; Zhonghai, W.; Genshe, C. Cooperative space object tracking using space-based optical sensors via consensus-based filters. IEEE Trans. Aerosp. Electron. Syst. 2016, 52, 1908–1936.
21. Jia, B.; Pham, K.D.; Blasch, E.; Shen, D.; Chen, G. Consensus-Based Auction Algorithm for Distributed Sensor Management in Space Object Tracking. In Proceedings of the 2017 IEEE Aerospace Conference, Big Sky, MT, USA, 4–11 March 2017; pp. 1–8.
22. Kamal, A.T.; Farrell, J.A.; Roy-Chowdhury, A.K. Information Weighted Consensus. In Proceedings of the 51st IEEE Conference on Decision and Control, Maui, HI, USA, 10–13 December 2012; pp. 2732–2737.
23. Shang, Y. Resilient consensus of switched multi-agent systems. Syst. Control Lett. 2018, 122, 12–18.
24. Shang, Y. Resilient Multiscale Coordination Control against Adversarial Nodes. Energies 2018, 11, 1844.
25. Olfati-Saber, R.; Fax, J.A.; Murray, R.M. Consensus and cooperation in networked multi-agent systems. IEEE Proc. 2007, 95, 215–233.
26. Battistelli, G.; Chisci, L.; Mugnai, G.; Farina, A.; Graziano, A. Consensus-based linear and nonlinear filtering. IEEE Trans. Autom. Control 2015, 60, 1410–1415.
27. Battistelli, G.; Chisci, L.; Mugnai, G.; Farina, A.; Graziano, A. Consensus-Based Algorithms for Distributed Filtering. In Proceedings of the IEEE 51st IEEE Conference on Decision and Control (CDC), Maui, HI, USA, 10–13 December 2012; pp. 794–799.
28. Battistelli, G.; Chisci, L. Kullback–Leibler average, consensus on probability densities, and distributed state estimation with guaranteed stability. Automatica 2014, 50, 707–718.
29. Julier, S.J.; Uhlmann, J.K. A Non-Divergent Estimation Algorithm in the Presence of Unknown Correlations. In Proceedings of the 1997 American Control Conference, Albuquerque, NM, USA, 6 June 1997; pp. 2369–2373.
30. Wang, S.; Ren, W. On the convergence conditions of distributed dynamic state estimation using sensor networks: A unified framework. IEEE Trans. Control Syst. Technol. 2018, 26, 1300–1316.
31. Chen, Q.; Yin, C.; Zhou, J.; Wang, Y.; Wang, X.; Chen, C. Hybrid consensus-based cubature Kalman filtering for distributed state estimation in sensor networks. IEEE Sens. J. 2018, 18, 4561–4569.
32. Li, T.; Corchado, J.M.; Prieto, J. Convergence of Distributed Flooding and Its Application for Distributed Bayesian Filtering. IEEE Trans. Signal Inf. Proc. Over Netw. 2017, 3, 580–591.
33. Xiao, L.; Boyd, S. Fast Linear Iterations for Distributed Averaging. In Proceedings of the 42nd IEEE International Conference on Decision and Control, Maui, HI, USA, 9–12 December 2003; Volume 5, pp. 4997–5002.
34. Yilun, S. Finite-Time Weighted Average Consensus and Generalized Consensus Over a Subset. IEEE Access 2016, 4, 2615–2620.
35. Ren, W.; Beard, R.W.; Kingston, D.B. Multi-Agent Kalman Consensus with Relative Uncertainty. In Proceedings of the American Control Conference, Portland, OR, USA, 8–10 June 2005; pp. 1865–1870.
36. Alighanbari, M.; How, J.P. Unbiased Kalman consensus algorithm. J. Aerosp. Comput. Inf. Commun. 2008, 5, 298–311.
37. Motion, L. General Decentralized Data Fusion with Covariance Intersection. In Handbook of Multisensor Data Fusion; CRC Press: Boca Raton, FL, USA, 2001; pp. 1–40.
38. Niehsen, W. Information Fusion Based on Fast Covariance Intersection Filtering. In Proceedings of the Fifth International Conference on Information Fusion, Annapolis, MD, USA, 8–16 July 2002; pp. 901–904.
39. Li, T.; Fan, H.; García, J.; Corchado, J.M. Second-order statistics analysis and comparison between arithmetic and geometric average fusion: Application to multi-sensor target tracking. Inf. Fusion 2019, 51, 233–243.
40. Calafiore, G.C.; Abrate, F. Distributed linear estimation over sensor networks. Int. J. Control 2009, 82, 868–882.
41. Ren, W.; Beard, R.W. Consensus seeking in multi-agent systems under dynamically changing interaction topologies. IEEE Trans. Autom. Control 2005, 50, 655–661.
Figure 2.
State estimation results: distributed hybrid information weighted consensus filter (DHIWCF) versus centralized Kalman filter (CKF).
Figure 3.
Position root mean squared error (PRMSE) averaged over all 8 nodes and 200 Monte Carlo runs.
Figure 4.
The averaged position root mean squared error (APRMSE) averaged over all nodes, all time steps and all Monte Carlo runs.
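Table 1.
APRMSE (m) of different algorithms versus the number of consensus iterations $L$.

| $L$ | CKF | ICF | DHIWCF | HCMCI |
|---|---|---|---|---|
| 1 | 3.68 | 22.87 | 10.44 | 26.24 |
| 2 | 3.68 | 14.80 | 9.23 | 11.93 |
| 3 | 3.68 | 12.28 | 8.37 | 8.64 |
| 4 | 3.68 | 11.05 | 7.63 | 7.93 |
| 5 | 3.68 | 10.30 | 7.46 | 7.78 |
| 10 | 3.68 | 8.56 | 6.87 | 7.25 |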
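Table 2.
Relative computation time (RCT) of different algorithms with respect to the CKF versus the number of consensus iterations $L$.

| RCT | CKF | ICF | DHIWCF | HCMCI |
|---|---|---|---|---|
| $L=1$ | 1 | 5.26 | 7.92 | 9.49 |
| $L=2$ | 1 | 6.58 | 10.84 | 15.55 |
| $L=3$ | 1 | 7.30 | 13.71 | 21.57 |
| $L=4$ | 1 | 8.34 | 16.68 | 27.04 |
| $L=5$ | 1 | 9.39 | 19.55 | 32.25 |
| $L=10$ | 1 | 14.18 | 34.43 | 60.05 |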
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).