The Cramér–Rao Bounds and Sensor Selection for Nonlinear Systems with Uncertain Observations

This paper considers the problems of the posterior Cramér–Rao bound and sensor selection for multi-sensor nonlinear systems with uncertain observations. In order to effectively overcome the difficulties caused by uncertainty, we investigate two methods to derive the posterior Cramér–Rao bound. The first method is based on the recursive formula of the Cramér–Rao bound and the Gaussian mixture model. Nevertheless, it needs to compute a complex integral based on the joint probability density function of the sensor measurements and the target state. The computation burden of this method is relatively high, especially in large sensor networks. Inspired by the idea of the expectation maximization algorithm, the second method is to introduce some 0–1 latent variables to deal with the Gaussian mixture model. Since the regular condition of the posterior Cramér–Rao bound is unsatisfied for the discrete uncertain system, we use some continuous variables to approximate the discrete latent variables. Then, a new Cramér–Rao bound can be achieved by a limiting process of the Cramér–Rao bound of the continuous system. It avoids the complex integral, which can reduce the computation burden. Based on the new posterior Cramér–Rao bound, the optimal solution of the sensor selection problem can be derived analytically. Thus, it can be used to deal with the sensor selection of a large-scale sensor networks. Two typical numerical examples verify the effectiveness of the proposed methods.


Introduction
In practical problems, we always encounter some sensors have the uncertain measurement subjected to random interference, natural interruptions or sensor failures. Using the mode parameters without considering the uncertainty is unavailable, and there are a lot of researchers that have studied the state estimation with uncertain measurement, such as [1][2][3][4]. In this paper, we consider the uncertainty caused by occlusions, i.e., the sensors may not be able to observe the target when blocked by some obstacles [5]. For the linear dynamic systems involving uncertainty in [6,7], the authors use a Kalman filter to track the target. However, it is difficult to obtain the optimal estimation for a nonlinear uncertain dynamic system, but we are particular interested in measuring their efficiency. For this purpose, it is natural to compare a lower bound of the estimation error, which gives an indication of performance limitations. Moreover, it can be used to determine whether imposed performance requirements are realistic or not.
The most popular lower bound is the well-known Cramér-Rao bound (CRB). In time-invariant statistical models, the estimated parameter vector is usually assumed to be real-valued (non-random). The lower bound is given by the inverse of the Fisher information matrix. When we deal with the time-varying systems, the estimated parameter vector is modeled randomly. A lower bound that is analogous to the CRB for random parameters is derived in [8], and this bound is also known as the Van Trees version of the CRB, or referred to as posterior CRB (PCRB). In fact, the underlying static random system needs to satisfy the regularity condition, which is absolute integrability of the first two derivatives of all related probability density functions. The first derivation of a sequential PCRB version applicable to discrete-time dynamic system filtering is done in [9] and then extended in [10][11][12]. The most general form of sequential PCRB for discrete-time nonlinear systems is presented in [13]. Together with the original static form of the CRB, these results serve as a basis for a large number of applications [14][15][16].
Most of the papers on PCRB are obtained without considering the uncertainty in the dynamic systems. When the sensors have uncertain measurements, we need to consider the influence of the uncertainty [17,18]. The CRB is presented in [19,20] to target tracking with detection probability smaller than one. If the uncertain measurement is prone to discretely-distributed faults, a Cramér-Rao-type bound is shown in [21]. Actually, the authors in [22,23] have considered uncertainty as the mixed Gaussian probabilistic model, where the sensor observation is assumed to contain only noise if the sensor cannot sense the target. Therefore, we hope to derive a recursive PCRB based on the uncertain model of the Gaussian mixture distribution.
Since the PCRB needs to compute the Fisher information, which is obtained by the derivatives of the log likelihood function of the Gaussian mixture model, and it is much more difficult than the case of a single Gaussian distribution. The reason is that the presence of the summation that occurs inside of the logarithm, and the PCRB of the Gaussian mixture model needs to compute the complex integral, which is with respect to the joint probability density function of the sensor measurements and the target state. These reasons motivate us to research another approach to derive the PCRB.
In large wireless sensor networks (WSNs), sensors are battery-powered devices with limited signal processing capabilities [24,25]. In such situations, it is inefficient to utilize all the sensors including the uninformative ones, which is hardly helpful to the tracking task but still consumes resources. This issue has been researched and shown via the development of sensor selection schemes, whose goal is to select the best non-redundant set of sensors for the tracking task while satisfying the resource constraints [26,27]. The previous research [28,29] on sensor selection assumes that the target tracking process does not have any interruptions. As the sensor observations are quite uncertain, we need to consider the sensor selection based on the proposed PCRB.
In this paper, we use two methods to derive the PCRB to effectively overcome the difficulties caused by uncertainty. The first method is based on the recursive formula of the Cramér-Rao bound and the Gaussian mixture model. Nevertheless, it needs to compute a complex integral based on the joint probability density function of the sensor measurements and the target state, which leads to the computation burden of this method being relatively high, especially in large sensor networks, so that it is not better using this PCRB as a measure criteria of the sensor selection. In order to reduce the computation burden and deal with the sensor selection of a large-scale sensor networks, our contributions are as follows: • Inspired by the idea of the expectation maximization algorithm, we introduce some 0-1 latent variables to treat the Gaussian mixture model. Since the regular condition of the posterior Cramér-Rao bound is unsatisfied in the discrete uncertain system, we use some continuous variables to approximate the discrete latent variables, then a new Cramér-Rao bound can be achieved by a limiting process of the Cramér-Rao bound of the continuous system. The Cramér-Rao bound avoids the complex integral with a less computation burden.

•
Based on the proposed posterior Cramér-Rao bound, the sensor selection problems for the nonlinear uncertain dynamic system can be efficiently solved, and the optimal solution of the sensor selection problem can be derived analytically. Thus, it can be used to deal with the sensor selection for the large-scale sensor networks.
The remainder of this paper is organized as follows. The system uncertain model is defined and the problem is formulated in Section 2. The PCRB for the dynamic system with uncertain observations is detailed and justified in Section 3. The optimal sensor selection with uncertain observations is shown in Section 4. Two numerical examples are presented in Section 5. Finally, the conclusions are offered in the final section.

Problem Formulation
Consider the L-sensor nonlinear dynamic systems with the uncertain observations [5,30], where p i k is the sensing probability of sensor i, x k is the state of system at time k, y i k is the measurement at the ith sensor, i = 1, . . . , L, f k (x k−1 ) is the nonlinear state function, and h i k (x k ) is the nonlinear measurement function of x k at the ith sensor. w k and v i k are the state noise and the measurement noise, respectively, and they are mutually independent. v i k is assumed to be independent across time steps and across sensors. The measurement information of the ith sensor is denoted by Y i k = {y i 1 , y i 2 , ..., y i k }. Assume that w k and v i k are white Gaussian noise with N (0, Q k ) and N (0, R i k ), i = 1, . . . , L, respectively, where, Q k and R i k are the corresponding covariance matrices. We also assume that the initial state x 0 ∼ N (x 0 , Σ 0 ), and, if x k is given, then the measurement y i k follows the Gaussian distribution N (h i k (x k ), R i k ) with probability p i k , and follows the Gaussian distribution N (0, R i k ) with probability 1 − p i k , i.e., Obviously, the conditional probability density function is a Gaussian mixture distribution, which ishard to calculate the PCRB. This difficult problem motivates us to introduce a hidden state variable, which draws lessons from the idea of the expectation maximization (EM) algorithm [31].
We introduce the 0-1 hidden state variables I i k , i = 1, 2, ..., L, which indicate whether the dynamic system has uncertainty. In other words, if I i k = 1, then y i k = h i k (x k ) + v i k , and I i k = 0, then y i k = v i k . Now, we transfer the nonlinear systems (1) and (2) as follows: Then, the compact form for Equations (4) and (5) can be written as follows: , which means a Bernoulli distribution with probability parameter p i k , if P(w i k = 1) = p i k and P(w i k = 0) = 1 − p i k . The process noise is independent of the uncertainty. Then, we assume w k andw i k , i = 1, 2, ..., L are mutually independent. Since the PCRB is an important criterion of sensor selection, we drive two PCRBs of the uncertain dynamic systems (1), (2), (6) and (7) in Sections 3 and 4, respectively. The former is accurate, but it is difficult to be computed. Thus, the latter is derived by introduced some hidden state variables, which avoids the complex integral and can reduce the computation burden. Finally, based on the second PCRB, we hope to obtain the analytically optimal solution of the sensor selection problem, so that it can be applied to the large-scale sensor selection problem for the uncertain dynamic systems.

The Posterior Cramér-Rao Bound with Uncertain Observations
In this section, we mainly discuss two methods to calculate the PCRB of multiple sensors. The first method is based on the nonlinear dynamic system with uncertain observations (1) and (2) and Gaussian mixture model [5,13,15,32]. The other approach is based on the nonlinear dynamic system (6) and (7) motivated by the EM algorithm [33].
Let θ be a r-dimensional estimated random parameter, z represents a vector of measured data, let p(z, θ) be the joint probability density of the pair (z, θ), and let g(z) be a function of z, which is an estimate of θ. Let ∆ and ∇ be operators of the first and second-order partial derivatives, respectively, The PCRB on the estimate error has the form where is the (Fisher) information matrix denoted by Van Trees [8]. For example, if the posterior distribution of θ conditioned on z is Gaussian with meanθ z and a covariance matric Σ z . Then, the information matrix (8)

and the information matrix J is correspondingly divided into blocks
Then, it can be easily shown that the covariance of estimation of θ β is lower bounded by the right-lower block of J −1 , i.e., where we assume that J −1 αα exists. Denoted J β = J ββ − J βα J −1 αα J αβ , which is called the information submatrix for β. Now, for nonlinear dynamic systems with uncertain observations (1) and (2), the following proposition gives a method to compute the information submatrix J k recursively. Proposition 1. The Fisher information submatrix J k for the estimating state vectors {x k } obeys the recursion: with where Proof. Equations (1) and (2) together with p(x 0 ) determine the joint probability distribution of The conditional probability densities p(x k |x k−1 ) and p(y i k |x k ) can be calculated by Equations (1) and (2), respectively. Denote p k = p(X k , Y k ), by Equation (16), we can obtain the formula about p k+1 as follows: Therefore, The information submatrix J k for x k can be obtained as follows: T , then the posterior information matrix for X k+1 can be written as the following block form by Equation (18), where 0 stands for zero blocks of appropriate dimensions, and D 11 k , D 12 k , D 22 k are calculated as follows: Then, the information submatrix J k+1 for x k+1 can be computed as Based on the definition of J k in (20), we can obtain the desired recursion (9). Since the state noise and the measurement noise are Gaussian with zero mean and invertible covariance matrices Q k and R i k , i = 1, . . . , L, respectively. Moreover, the dynamic systems have the uncertain observations. From these assumptions and Equation (3), it follows that where c 1 is a constant. Therefore, D 11 k , D 12 k , D 22 k can be simplified to (11)-(14).
From Equations (14) and (15), we see that the appearance of the summation inside of the logarithm, and the computation of D k 22 is related to the joint probability density function of the sensor measurements y k+1 and the target state x k+1 , then D k 22 is not easy to calculate. These reasons motivate us to study another approach to derive the PCRB.
Based on the equivalence between the systems (1)- (2) and (6)-(7), we can derive PCRB for the dynamic systems (6) and (7) by introducing a hidden variable I k , and the new PCRB may be easier to compute. Since the second derivation for the discrete augmented variable I k do not exist, then we bring in a continuous random variableĨ k to approximate the 0-1 variable I k . The augmented state vectoȓ Therefore, the new system can be expressed as follows: . . , L, then the limit of I k is the state variable I k when σ → 0, i.e., lim σ→0Ĩ k = I k .
LetJ k represents the PCRB about the approximated augment vectorx k of systems (23)-(25), respectively. Then, we can easily get the following conclusion: Based on Lemmas 1 and 2, for nonlinear dynamic system with the uncertain observations (6) and (7), it is easy to see thatJ −1 k can also represent a CRB for the estimation error covariance matrix of vector x k . Proposition 2. At time k + 1, the Fisher information submatrixJ k+1 of x k+1 for the multi-sensor uncertain systems (6) and (7) is computed according to the following recursion: Proof. According to Lemma 1 and the derivation of Proposition 1, the new augmented state vectorx k has the following PCRB for systems (6) and (7): whereD 11 k ,D 12 k ,D 22 k are denoted as follows: In order to obtain the lower bound for x k , it is necessary for us to calculate the following probability densities, according to Equations (6) and (7), where c 3 is a constant, and the first equality follows from the independence and the second follows from (2). The another probability density is as follows: where c 4 is a constant. Sincex k = [x k ,Ĩ k ] T , we use Equations (33)-(34) and Lemma 1, and the suitable partitioned expressions forD 11 k ,D 12 k ,D 22 k are obtained: whereD 11 k ,D 12 k are denoted in (28) and (29), while C 12 , C 21 and C 22 are calculated as follows: If we divideJ k as the following block matrix then according to (32) and (35)-(37), the value ofJ k+1 is Since the matrix C 22 is the function of σ, it is shown in [21] that Using Lemma 1, we can obtain (26), and the matrixD 22 k can be computed as

Remark 1.
Note that PCRBs derived in Propositions 1 and 2 have different forms. The first one is optimal. The second one is only approximately optimal with less computational burden. Since it may be approximated from above or below, which one is lower cannot be judged. The simulation in Section 6 shows that they are almost equal and the computational complexity of the approximate bound is less than that of the accurate bound.

Remark 2.
In the case of p = 1 and L = 1, the multi-sensor dynamic systems (1) and (2) has the certain observations. Obviously, the PCRB derived by the method in [13] is consistent with that derived in Proposition 2.

Sensor Selection with Uncertain Observations
In large sensor networks, it is an important problem to manage the communication resources efficiently. The calculation of PCRB by Proposition 1 needs to use the joint probability density function of the sensor measurements and the target state, which leads to the computational burden being heavy, so that it is detrimental to be used as a measure criteria of the sensor selection. In this section, we consider the problem of sensor selection by Proposition 2.
For the nonlinear dynamic system at time k, assume that s sensors will be selected from L sensors by maximizing the Fisher information matrix, then they will send their measurements or local estimates to the fusion center. Finally, the fusion center makes the estimates for the state. In order to select the optimal s sensors, we need to introduce a selection vector s k = [s 1 k , . . . , s L k ] T ∈ {0, 1} L . If the ith sensor is selected, let s i k = 1; otherwise, s i k = 0, i = 1, . . . , L. According to the derivation of the Fisher information matrix in Section 3, the selection vector modifies the log conditional probability density logp(y k |x k ) as [34] In fact, the selected variable s k only has an effect onD 22 k of Proposition 2. Then,D 22 k can be written asD Therefore, the information matrix of x k+1 is the function of the selected variable s k . Now, the sensor selection problem can be expressed as the following optimization problem: where "tr" means "trace", which is the sum of squares of semiaxes lengths of the Fisher information matrix. "s.t". means "subjected to".

Remark 3.
In fact, the objective function in (43) should be matrixJ k+1 (s k+1 ). Then, the problem (43)-(45) is a matrix optimization problem, which is considered in the sense that if s * k+1 is an optimal solution. Then, for an arbitrary feasible solution s k+1 , we haveJ k+1 (s k+1 ) J k+1 (s * k+1 ), i.e.,J k+1 (s k+1 ) −J k+1 (s * k+1 ) is a positive semidefinite matrix. There are two reasons to choose trace function as the objective function. First, it is a linear function, which helps us to easily derive the optimal solution. Second, some researchers [26][27][28] have proved that it has many advantages to apply to sensor selection, such as, if the primal matrix optimization problem has an optimal solution andD 12 k in (29) is invertible, then the matrix optimization problem for sensor selection can be equivalently transformed to this convex optimization problem (43)-(45).

Simulation
In this section, we provide two examples to compare the different PCRB by Proposition 1 with Proposition 2, and select the optimal sensors by Proposition 3.
Example 1: Consider an uncertain nonlinear dynamic system for the mobile robot. At time k, the mobile robot pose is described with thestate vector x k = [x k y k θ k ], where x k and y k are the coordinates on a 2D plane relative to an external coordinate frame, and θ k is the heading angle. We use the control commands u k = [∆d k , ∆θ k ] to determine the motion of the mobile robot, where ∆d k is the incremental distance robot (in meters) and ∆θ k is the incremental change in heading angle (in degrees). The robot motion can be described as follows [35]: where ∆d k = 5, ∆θ k = 5. The state equation is defined as f k = [ f 1,k , f 2,k , f 3,k ] T , and then the state model is The measurement equation is f or i = 1, . . . , L, p i T is the position of the ith sensor. In the simulation, we consider the WSN shown in Figure 1, which has L = 6 × 6 = 36 sensors deployed in the area 100 × 100 m 2 [5]. The noise covariances are set as Q k = diag([0.1 2 , 0.1 2 , 3 2 ]), R i k = diag([1 2 , 1 2 ]). In the example, the initial state of the robot starts is [8,8,1] and the initial covariance matrix is P 0 = diag([10, 10, 2]) [35]. The sampling length is assigned to f lag = 50. Here, the number of Monte Carlo (MC) simulation is MC = 200. The following simulation results include three parts: the first part is about the trajectory of the mobile robot and PCRB of the state estimation, the second part is about the average computation time, and the third part is about the PCRB with different sensing probability p.
• Figure 1 shows the trajectory of the mobile robot and the location of the L sensors. Figures 2 and 3 show that the PCRB of position along the xand y-directions based on Proposition 1 and Proposition 2, respectively. From Figures 2 and 3, we can see the different PCRBs are shown to converge to the same values. However, the PCRB changes so much in the first seconds, and there are two possible reasons. First, the dynamic system is nonlinear. It may cause the algorithm to require some time to be convergent. Second, the initial variance may not be given better, such that it is far away from the convergence point.

•
The average computation time of calculating PCRB based on Proposition 1 and Proposition 2 is presented in Figure 4. From Figure 4, obviously, when the number of the sensors increases, the computational complexity of Proposition 1 is much higher than that of Proposition 2, and the average computation time of PCRB by Proposition 2 increases slowly. The reason may be that the expression of PCRB based on Proposition 2 has a more concise form where theD 22 k is easier to compute. Thus, Proposition 2 is more suitable for the sensor selection in the large-scale sensor networks.

•
In Figure 5, the average PCRB of 20 time steps is plotted as a function of number of sensors. It shows that the PCRB obtained by Proposition 1 is the same as that based on Proposition 2. The larger p is, the smaller the number of required sensors. The reason may be that the sensors can take more observation information, when the sensor probability p is larger.  Example 2: In order to manage the communication resources efficiently in large wireless sensor networks, we need to select some appropriate sensors. Thus, let us consider the above dynamic system (50) and (51) and the WSNs [35]. In general, if the sensors are close to the target, which may have higher sensing probabilities compared to other sensors in the WSN, then it is highly likely to select those sensors, owing to being both closer to target and higher sensing probability. Here, we consider a relatively difficult case that the sensors around the target have relatively low sensing probabilities. Then, we compare our algorithm in Proposition 3 with the recent two methods given in [5,28].
In this example, we present two cases with different number of sensors in WSN. Firstly, we consider L = 36 and s k = 15. Moreover, let p i k = 0.1, i = 7, 8,9,10,13,14,15,16,20,21,22,23,24, and the other sensing probabilities are between 0.8 and 1. Secondly, we also consider L = 49 and s k = 15. Moreover, let p i k = 0.1, i = 14, . . . , 18, 22, . . . , 26, 30, . . . , 34, and the other sensing probabilities are between 0.8 and 1. The following simulation results contain three parts. The first part is about the sensor selection in the application of wireless sensor network, the second part is about the mean squared error based on the selected sensors, and the third part is about the computation time.
• Figures 6 and 7 present the location of L = 36 and L = 49 sensors, respectively. The target is showed at the time 10 s, and we use our algorithm in Proposition 3 to select the optimal s k = 15 sensors. When the uncertainty in the dynamic system is ignored, the recent method in [28] can be used to select the required sensors, and the results are shown in Figures 8 and 9. Comparing Figure 6 with Figure 8, some sensors are close to the target, such as sensor 8 and sensor 15, but they are not selected in Figure 6. In Figure 8, they are selected and the only closer sensors can be selected. The reason is that the sensing probability of sensor 8 and sensor 15 is very low, and they may be not given us much useful information, although they are close to the target. Comparing Figure 7 with Figure 9, it has a similar phenomenon, such as sensor 16 and 31 not being selected in Figure 7, but they are selected in Figure 9.

•
In Figures 10 and 11, the mean squared errors of position in xand y-directions are plotted for the algorithm given in Proposition 3 and the algorithms in [5,28]. It shows that our algorithm can derive the best performance. The reason is that our algorithm considers the influence of uncertain observation, and the optimal selected sensors are obtained. Although the algorithm in [5] considers the uncertain observation, it is difficult to obtain the optimal selected sensors, since it involves relaxing the variable {0, 1} to the interval [0, 1]. From Figures 12 and 13, we can also see that the proposed method also performs best in the case of L = 49, thus the performance of the new method is stable with the increase of the number of sensors.

•
The computation times of obtaining PCRB are plotted in Figures 14 and 15 for the three algorithms, respectively. Figures 14 and 15 show that the computation time of the method in Proposition 3 is much smaller than that of the other two methods. The reason is that the method in Proposition 3 is an analytical solution. Therefore, the proposed algorithm in Proposition 3 is more suitable for the large sensor networks.
x-axis coordinates (m) The algorithm in [31] The algorithm in [29] The algorithm in Proposition 3 The algorithm in [31] The algorithm in [29] The algorithm in Proposition 3 The algorithm in [31] The algorithm in [29] The algorithm in Proposition 3 The algorithm in [31] The algorithm in [29] The algorithm in Proposition 3 The algorithm in [31] The algorithm in [29] The algorithm in Proposition 3 The algorithm in [31] The algorithm in [29] The algorithm in Proposition 3 Figure 15. The computation time of obtaining PCRB is plotted as function of time steps with L = 49 sensors.

Conclusions
This paper has proposed two methods to derive the PCRB to effectively overcome the difficulties caused by uncertainty. The first method is based on the recursive formula of the Cramér-Rao bound and the Gaussian mixture model. Nevertheless, it needs to compute a complex integral based on the joint probability density function of the sensor measurements and the target state. The computational burden of this method is relatively high, especially in large sensor networks. Inspired by the idea of the expectation maximization algorithm, the second method is to introduce some 0-1 latent variables to treat the Gaussian mixture model. Since the regular condition of the posterior Cramér-Rao bound is unsatisfied for the discrete uncertain system, we use some continuous variables to approximate the discrete latent variables. Then, a new Cramér-Rao bound can be achieved by a limiting process of the Cramér-Rao bound of the continuous system. It avoids the complex integral, which can reduce the computation burden. Thus, the sensor selection problems for the nonlinear uncertain dynamic system with linear equality or inequality constraints can be efficiently solved, and the optimal solution of the sensor selection problem can be derived analytically. Thus, it can be used to deal with the sensor selection of large-scale sensor networks. Two typical numerical examples verify the effectiveness of the proposed methods.