Bayesian Cramér-Rao Lower Bounds for Prediction and Smoothing of Nonlinear TASD Systems

The performance evaluation of state estimators for nonlinear regular systems, in which the current measurement depends directly only on the current state, has been widely studied using the Bayesian Cramér-Rao lower bound (BCRLB). In practice, however, the measurements of many nonlinear systems are two-adjacent-states dependent (TASD), i.e., the current measurement depends directly on both the current state and the most recent previous state. In this paper, we first develop the recursive BCRLBs for the prediction and smoothing of nonlinear systems with TASD measurements. A comparison between the recursive BCRLBs for TASD systems and those for nonlinear regular systems is provided. Then, the recursive BCRLBs for the prediction and smoothing of two special types of TASD systems, in which the original measurement noises are autocorrelated or cross-correlated with the process noises at one time step apart, are presented, respectively. Illustrative examples in radar target tracking show the effectiveness of the proposed recursive BCRLBs for the prediction and smoothing of TASD systems.


Introduction
Filtering, prediction and smoothing have attracted wide attention in many engineering applications, such as target tracking [1,2], signal processing [3], sensor registration [4], econometric forecasting [5], localization and navigation [6,7], etc. For filtering, the Kalman filter (KF) [8] is optimal for linear Gaussian systems in the sense of minimum mean squared error (MMSE). However, most real-world system models are nonlinear, which violates the assumptions of the Kalman filter. To deal with this, many nonlinear filters have been developed. The extended Kalman filter (EKF) [9] is the most well known; it approximates nonlinear systems as linear systems by the first-order Taylor series expansion of the nonlinear dynamic and/or measurement equations. The divided difference filter (DDF) was proposed in [10] using the Stirling interpolation formula. DDFs include the first-order divided difference filter (DD1) and the second-order divided difference filter (DD2), depending on the interpolation order. Moreover, some other nonlinear filters have also been proposed, including the unscented Kalman filter (UKF) [11,12], the quadrature Kalman filter (QKF) [13], the cubature Kalman filter (CKF) [14,15], etc. All these nonlinear filters use different approximation techniques, such as function approximation and moment approximation [16]. Another type of nonlinear filter is the particle filter (PF) [17,18], which uses the sequential Monte Carlo method to generate random sample points to approximate the posterior density.
Prediction is also very important, since it helps people make decisions in advance and guard against potential risks. Following the same ideas as the filters above, various predictors have been studied, e.g., the Kalman predictor (KP) [19], the extended Kalman predictor (EKP) [20], the unscented Kalman predictor (UKP) [21], the cubature Kalman predictor, etc. For performance evaluation conditioned on a specific measurement sequence, a conditional posterior Cramér-Rao lower bound (CPCRLB) was proposed in [50], which is dependent on the actual measurements.
Compared with the BCRLB, the CPCRLB can provide performance evaluation for a particular state realization of a nonlinear system and better criteria for online sensor selection. In practice, TASD systems may also involve unknown nonrandom parameters. For the performance evaluation of joint state and parameter estimation for nonlinear parametric TASD systems, a recursive joint CRLB (JCRLB) was studied in [51].
Equally important as the CPCRLB is the BCRLB. It depends only on the structures and parameters of the dynamic and measurement models, not on the specific measurement realization. As a result, BCRLBs can be computed offline. The BCRLB for the filtering of the general form of TASD systems has been obtained as a special case of the JCRLB in [51] when the parameter set is empty. However, the BCRLBs for the prediction and smoothing of the general form of TASD systems have not been studied yet. This paper aims to obtain the BCRLBs for the prediction and smoothing of such nonlinear systems. First, we develop the recursive BCRLBs for the prediction and smoothing of general TASD systems. A comparison between the BCRLBs for TASD systems and regular systems is also made, and specific and simplified forms of the BCRLBs for additive Gaussian noise cases are provided. Second, we study specific BCRLBs for the prediction and smoothing of two special types of TASD systems, with autocorrelated measurement noises and with process and measurement noises cross-correlated at one time step apart, respectively.
The rest of this paper is organized as follows. Section 2 formulates the BCRLB problem for nonlinear systems with TASD measurements. Section 3 develops the recursions of BCRLB for the prediction and smoothing of general TASD systems. Section 4 presents specific BCRLBs for two special types of nonlinear systems with TASD measurements. In Section 5, some illustrative examples in radar target tracking are provided to verify the effectiveness of the proposed BCRLBs. Section 6 concludes the paper.

Problem Formulation
Consider the following general discrete-time nonlinear system with TASD measurements:

x_{k+1} = f_k(x_k, w_k),  (1)
z_k = h_k(x_k, x_{k-1}, v_k),  (2)

where x_k ∈ R^n and z_k ∈ R^m are the state and measurement at time k, respectively, f_k is the nonlinear dynamic function, h_k is the nonlinear TASD measurement function, and the process noise w_k and the measurement noise v_k are mutually independent white sequences with probability density functions (PDFs) p(w_k) and p(v_k), respectively. We assume that the initial state x_0 is independent of the process and measurement noise sequences, with PDF p(x_0).
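The TASD structure above can be illustrated with a minimal numerical sketch; the scalar functions f and h below, and all noise levels, are hypothetical choices for illustration, not the paper's models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar instances of f_k and h_k (illustrative only):
#   x_{k+1} = f(x_k, w_k)          = 0.9*x_k + w_k
#   z_k     = h(x_k, x_{k-1}, v_k) = x_k + 0.5*x_{k-1} + v_k
# The measurement depends directly on the two most recent states (TASD).
def f(x, w):
    return 0.9 * x + w

def h(x, x_prev, v):
    return x + 0.5 * x_prev + v

K = 50
x = np.zeros(K + 1)
z = np.zeros(K + 1)
x[0] = rng.normal(0.0, 1.0)      # initial state x_0 ~ p(x_0)
for k in range(1, K + 1):
    w = rng.normal(0.0, 0.1)     # white process noise w_{k-1}
    v = rng.normal(0.0, 0.2)     # white measurement noise v_k
    x[k] = f(x[k - 1], w)
    z[k] = h(x[k], x[k - 1], v)
```

Note that a regular measurement model is recovered by simply dropping the x_prev term in h.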
Definition 1. Define X_k = [x_0^T, · · · , x_k^T]^T and Z_k = [z_1^T, · · · , z_k^T]^T as the accumulated state and measurement vectors up to time k, respectively. The superscript "T" denotes the transpose of a vector or matrix.

Definition 2. Define X̂_{j|k} and x̂_{j|k} as estimates of X_j and x_j given the measurements Z_k, respectively. x̂_{j|k} is a state estimate for filtering, prediction or smoothing when j = k, j > k or j < k, respectively.

Definition 3. The mean square error (MSE) of X̂_{j|k} is defined as the expectation of the outer product of the estimation error X_j − X̂_{j|k}.

To initialize the recursions of the FIMs for prediction and smoothing, the recursion of the FIM J_{k|k} for filtering is required. It can be obtained from Corollary 3 of [51], as shown in the following lemma.

Lemma 3. The FIM J_{k|k} for filtering obeys the following recursion [51], stated in terms of the information submatrices D_{k+1}^{m,n} and E_{k+1}^{m,n} of (3) with m, n ∈ {k, k + 1}.

BCRLB for Prediction
Theorem 1. The FIMs J_{j+1|k} and J_{j|k} for prediction are related to each other through the following recursion.

Proof. See Appendix A.
BCRLB for Smoothing

Theorem 2. The FIM J_{j|k} for smoothing can be obtained by the following backward recursion for j = k − 1, k − 2, · · · , 0, initialized by the FIM J_{k|k} for filtering.
Proof. See Appendix B.

Comparison with the BCRLBs for Nonlinear Regular Systems
For nonlinear regular systems, the measurement z_k only depends on the state x_k directly, i.e., z_k = h_k(x_k, v_k). Clearly, nonlinear regular systems are special cases of the nonlinear TASD systems (2). As a result, the likelihood function p(z_{j+1}|x_{j+1}, x_j) for TASD systems in (3) reduces to p(z_{j+1}|x_{j+1}). Substituting the reduced terms in (10) into (8), the recursion of the FIM for the smoothing of TASD systems reduces to exactly the recursion of the FIM for the smoothing of nonlinear regular systems in [45]. That is, the recursion of the FIM for the smoothing of nonlinear regular systems is a special case of the recursion of the FIM for the smoothing of nonlinear TASD systems.
For the FIM of prediction, it can be seen that the FIMs for prediction in (5) of TASD systems are governed by the same recursive equations as the FIMs for regular systems in [45], except that J j|k , j = k, k + 1, k + 2, · · · , is different. This is because predictions for both TASD systems and regular systems only depend on the same dynamic Equation (1).
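This observation can be checked numerically in the linear Gaussian case: with no measurements entering the prediction recursion, the FIM recursion is simply the information form of the predictor covariance recursion, so J_{j|k}^{-1} reproduces P_{j|k}. The transition matrix F, noise covariance Q, and filtering MSE matrix below are hypothetical values chosen for this sketch:

```python
import numpy as np

# Linear Gaussian sketch: for pure prediction the FIM obeys
#   J_{j+1|k} = (Q + F J_{j|k}^{-1} F^T)^{-1},
# the information form of the predictor covariance recursion
#   P_{j+1|k} = F P_{j|k} F^T + Q.
F = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # constant-velocity transition (hypothetical)
Q = np.diag([0.01, 0.04])           # invertible process noise covariance
P_kk = np.diag([4.0, 1.0])          # assumed filtering MSE matrix J_{k|k}^{-1}

J = np.linalg.inv(P_kk)             # initialize the FIM with J_{k|k}
P = P_kk.copy()
for m in range(5):                  # 1- to 5-step prediction
    J = np.linalg.inv(Q + F @ np.linalg.inv(J) @ F.T)
    P = F @ P @ F.T + Q
    assert np.allclose(np.linalg.inv(J), P)  # bound is attained: P = J^{-1}
```

The per-step assertion confirms that the two recursions stay identical, matching the remark that only the initialization J_{k|k} distinguishes TASD from regular systems in prediction.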
Next, we study specific and simplified BCRLBs for TASD systems with additive Gaussian noises.

BCRLBs for TASD Systems with Additive Gaussian Noise
Assume that the nonlinear systems (1) and (2) are driven by additive Gaussian noises, where w_k ∼ N(0, Q_k), v_k ∼ N(0, R_k), and the covariance matrices Q_k and R_k are invertible. Then the D's and E's of (3) used in the recursions of the FIMs for prediction and smoothing are simplified accordingly.

Assume further that the systems (12) and (13) are reduced to a linear Gaussian system, where w_k ∼ N(0, Q_k), v_k ∼ N(0, R_k), and the covariance matrices Q_k and R_k are invertible. Then the D's and E's of (3) used in the recursions of the FIMs for prediction and smoothing are further simplified.

Remark 1. If we rewrite the linear TASD systems (15) and (16) in the following augmented form, where zero blocks have been left empty and w*_k = [w_k^T, w_{k−1}^T]^T, then the process noise w*_k in (18) is correlated with its adjacent noises w*_{k−1} and w*_{k+1}, but uncorrelated with {w*_0, · · · , w*_{k−2}, w*_{k+2}, · · · }. For this special type of linear system, how to obtain the BCRLBs is still unknown.

Recursive BCRLBs for Two Special Types of Nonlinear TASD Systems
Two special types of nonlinear systems, in which the measurement noises are autocorrelated or cross-correlated with the process noises at one time step apart, can be regarded as nonlinear TASD systems of the form (1) and (2). These two types of nonlinear systems are very common in engineering applications. For example, in target-tracking systems, a high radar measurement rate results in autocorrelated measurement noises [29], and the discretization of continuous systems can induce cross-correlation between the process and measurement noises at one time step apart [35]. In navigation systems, multi-path errors and weak GPS signals make measurement noises autocorrelated [31], and vibration effects on an aircraft may result in cross-correlation between the process and measurement noises [36]. Next, specific recursive BCRLBs for the prediction and smoothing of these two types of systems are obtained by applying the theorems in Section 3.

BCRLBs for Systems with Autocorrelated Measurement Noises
Consider the following nonlinear system with autocorrelated measurement noises:

x_{k+1} = f_k(x_k, w_k),  (20)
z_k = l_k(x_k) + e_k,  (21)
e_k = Ψ_{k−1} e_{k−1} + ξ_{k−1},  (22)

where l_k is a nonlinear measurement function and e_k is an autocorrelated measurement noise satisfying the first-order autoregressive (AR) model (22) [38], in which Ψ_{k−1} is the known correlation parameter. The process noise w_k and the driving noise ξ_{k−1} are mutually independent white noise sequences, and both are independent of the initial state x_0 as well.
To obtain the BCRLBs for the prediction and smoothing of nonlinear systems with autocorrelated measurement noises, a TASD measurement equation is first constructed by differencing two adjacent measurements:

z̄_k = z_k − Ψ_{k−1} z_{k−1}.  (23)

Then, we get a pseudo measurement equation depending on two adjacent states,

z̄_k = l_k(x_k) − Ψ_{k−1} l_{k−1}(x_{k−1}) + v_k,  (24)

where v_k = ξ_{k−1}. Clearly, the pseudo measurement noise v_k in (24) is white and independent of the process noise w_k and the initial state x_0.
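The differencing construction can be verified numerically. The sketch below uses a hypothetical scalar measurement function and AR parameter (tanh and Ψ = 0.8 are illustrative choices), and checks that the recovered pseudo noise equals the white driving noise while the original noise is strongly autocorrelated:

```python
import numpy as np

rng = np.random.default_rng(1)

# Scalar sketch of measurement differencing for AR(1) measurement noise:
#   z_k    = l(x_k) + e_k,  e_k = Psi*e_{k-1} + xi_{k-1}
#   zbar_k = z_k - Psi*z_{k-1} = l(x_k) - Psi*l(x_{k-1}) + xi_{k-1}
Psi = 0.8
l = np.tanh                      # hypothetical measurement function
K = 20000
x = rng.normal(size=K)           # states (details irrelevant to the noise test)
xi = rng.normal(0.0, 0.3, K)     # white driving noise
e = np.zeros(K)
for k in range(1, K):
    e[k] = Psi * e[k - 1] + xi[k - 1]
z = l(x) + e

zbar = z[1:] - Psi * z[:-1]                 # differenced (pseudo) measurement
v = zbar - (l(x[1:]) - Psi * l(x[:-1]))     # recovered pseudo noise
assert np.allclose(v, xi[:-1])              # v_k is exactly the driving noise

# Original noise is autocorrelated; pseudo noise is (empirically) white:
rho_e = np.corrcoef(e[1:], e[:-1])[0, 1]
rho_v = np.corrcoef(v[1:], v[:-1])[0, 1]
assert abs(rho_e) > 0.5 and abs(rho_v) < 0.05
```

The pseudo measurement thus depends on the two adjacent states x_k and x_{k−1}, which is exactly the TASD structure.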
From the above, we know that the system (20)–(22) is equivalent to the TASD system (20) and (24). Applying Theorems 1 and 2 to this TASD system, we can get the BCRLBs for the prediction and smoothing of nonlinear systems with autocorrelated measurement noises.
Next, we discuss some specific and simplified recursions of FIMs for the prediction and smoothing of nonlinear and linear systems with autocorrelated measurement noises when the noises are Gaussian.

Theorem 3.
For the nonlinear systems (20)–(22), if the process noise w_k ∼ N(0, Q_k) and the driving noise ξ_k ∼ N(0, R_k), then the D's and E's of (3) used in the recursions of the FIMs for prediction and smoothing are simplified accordingly.

Proof. See Appendix C.

Corollary 1.
Assume that the systems (20)–(22) are reduced to a linear Gaussian system. Then the D's and E's of (25) in Theorem 3 are further simplified.

Theorem 4. For the linear Gaussian systems (26)–(28) with autocorrelated measurement noises, the inverse of the FIM J_{k+m|k} for m-step prediction in Corollary 1 is equal to the MSE matrix P_{k+m|k} of the optimal prediction, m ≥ 1, i.e., P_{k+m|k} = J_{k+m|k}^{−1}.

Proof. See Appendix D.
Since P_{k+m|k} = J_{k+m|k}^{−1}, m ≥ 1, the optimal predictors can attain the BCRLBs for prediction proposed in Corollary 1, i.e., the optimal predictors are efficient estimators for the linear Gaussian systems (26)–(28) with autocorrelated measurement noises.

BCRLBs for Systems with Noises Cross-Correlated at One Time Step Apart
Consider the following nonlinear system with noises cross-correlated at one time step apart:

x_{k+1} = f_k(x_k, w_k),  (31)
z_k = l_k(x_k) + e_k,  (32)

where w_k ∼ N(0, Q_k), e_k ∼ N(0, E_k), and the two noises are cross-correlated at one time step apart [39], satisfying E[w_k e_j^T] = U_k δ_{k,j−1}, where δ_{k,j−1} is the Kronecker delta function. Both w_k and e_k are independent of the initial state x_0. To obtain the BCRLBs for the prediction and smoothing of nonlinear systems with noises cross-correlated at one time step apart, as in [50], a TASD measurement equation depending on the two adjacent states x_k and x_{k−1} is constructed as (33), with pseudo measurement function h_k(x_k, x_{k−1}) and pseudo measurement noise v_k. Clearly, the pseudo measurement noise v_k is uncorrelated with the process noise w_k.
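The one-step-apart cross-correlation structure can be simulated by conditioning e_{k+1} on w_k. The scalar values of Q, E and U below are hypothetical, and the moment conditions E[w_k e_{k+1}] = U, E[w_k e_k] = 0 and var(e_k) = E are checked empirically:

```python
import numpy as np

rng = np.random.default_rng(2)

# Sketch: sample scalar noises with E[w_k e_{k+1}] = U (one-step-apart
# cross-correlation) by writing e_{k+1} = (U/Q) w_k + eta_{k+1}, where
# eta is an independent residual with variance E - U^2/Q.
Q, E, U = 0.04, 0.09, 0.03        # var(w), var(e), cross-covariance (hypothetical)
K = 200000
w = rng.normal(0.0, np.sqrt(Q), K)
eta = rng.normal(0.0, np.sqrt(E - U**2 / Q), K)
e = np.empty(K)
e[0] = rng.normal(0.0, np.sqrt(E))
e[1:] = (U / Q) * w[:-1] + eta[1:]   # e_{k+1} correlated only with w_k

# Empirical check of the stated moment conditions:
assert abs(np.mean(w[:-1] * e[1:]) - U) < 0.002   # E[w_k e_{k+1}] ~= U
assert abs(np.mean(w * e)) < 0.002                # same-time uncorrelated
assert abs(np.var(e[1:]) - E) < 0.002             # var(e_k) ~= E
```

Since e_{k+1} carries a component of w_k, a measurement at time k + 1 indirectly informs the transition from x_k, which is why the pseudo measurement construction naturally involves two adjacent states.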

Proposition 1.
For the reconstructed TASD systems (31) and (33), h k (x k , x k−1 ) is independent of the pseudo measurement noise v k .
Proof. First, from the noise independence assumptions, we know that x_{k−1} is independent of e_k and w_{k−1}. Therefore, h_k(x_k, x_{k−1}) is independent of the pseudo measurement noise v_k. This completes the proof.
Proposition 1 shows that the reconstructed TASD systems (31) and (33) satisfies the independence assumption of the TASD systems in Section 2.
From the above, we know that the system (31) and (32) is equivalent to the TASD system (31) and (33). Applying Theorems 1 and 2 to this TASD system, the BCRLBs for the prediction and smoothing of nonlinear systems in which the measurement noise is cross-correlated with the process noise at one time step apart can be obtained.
Next, we discuss some specific and simplified recursions of FIMs for the prediction and smoothing of nonlinear and linear systems with Gaussian process and measurement noises cross-correlated at one time step apart.
Theorem 5. For the nonlinear systems (31) and (32), if the process noise w_k ∼ N(0, Q_k) and the measurement noise e_k ∼ N(0, E_k), then the D's and E's of (3) used in the recursions of the FIMs for prediction and smoothing are simplified accordingly.

Corollary 2. Assume that the systems (31) and (32) are reduced to a linear Gaussian system. Then the D's and E's of (34) in Theorem 5 are further simplified.

Theorem 6. For the linear Gaussian systems (35) and (36) with cross-correlated process and measurement noises at one time step apart, the inverse of the FIM J_{k+m|k} for m-step prediction in Corollary 2 is equal to the MSE matrix P_{k+m|k} of the optimal prediction, m ≥ 1, i.e., P_{k+m|k} = J_{k+m|k}^{−1}.

Proof. See Appendix E.
Since P_{k+m|k} = J_{k+m|k}^{−1}, m ≥ 1, the optimal predictors can attain the BCRLBs for prediction proposed in Corollary 2, i.e., the optimal predictors are efficient estimators for the linear Gaussian systems (35) and (36) with cross-correlated process and measurement noises at one time step apart.

Illustrative Examples
In this section, illustrative examples in radar target tracking are presented to demonstrate the effectiveness of the proposed recursive BCRLBs for the prediction and smoothing of nonlinear TASD systems.
Consider a target with nearly constant turn (NCT) motion in a 2D plane [14,40,48,53]. The target motion model is given by (39), where x_k = [x_k, ẋ_k, y_k, ẏ_k]^T is the state vector, T = 1 s is the sampling interval, ω = 2°/s is the turning rate, and the process noise covariance is determined by the power spectral density S_w = 0.1 m²s⁻³. Assume that a 2D radar is located at the origin of the plane. The measurement model is given by (41), where the radar measurement vector z_{k+1} is composed of the range measurement r^m_{k+1} and the bearing measurement θ^m_{k+1}, and e_{k+1} is the measurement noise.
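The NCT dynamics and the radar measurement function can be sketched as follows, using the standard coordinated-turn transition matrix for the state ordering [x, ẋ, y, ẏ]; the noise-free propagation is illustrative only (the paper's process noise term is omitted here):

```python
import numpy as np

# Sketch of the NCT motion and radar range/bearing measurement models.
T = 1.0                               # sampling interval (s)
w_rate = np.deg2rad(2.0)              # turning rate, 2 deg/s in rad/s
s, c = np.sin(w_rate * T), np.cos(w_rate * T)
F = np.array([[1, s / w_rate,       0, -(1 - c) / w_rate],
              [0, c,                0, -s],
              [0, (1 - c) / w_rate, 1, s / w_rate],
              [0, s,                0, c]])     # coordinated-turn transition

def measure(x):
    """Range and bearing of state x for a radar at the origin."""
    r = np.hypot(x[0], x[2])
    theta = np.arctan2(x[2], x[0])
    return np.array([r, theta])

x = np.array([1000.0, 120.0, 1000.0, 0.0])      # initial state (Example 1)
for _ in range(20):
    x = F @ x                                   # noise-free propagation sketch
z = measure(x)                                  # noise-free radar measurement
```

Because F rotates the velocity components, the speed (here 120 m/s) is preserved along the noise-free trajectory, which is the defining property of constant-turn motion.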

Example 1: Autocorrelated Measurement Noises
In this example, we assume that the measurement noise sequence e_{k+1} in (41) is first-order autocorrelated and modeled as a first-order AR process, where I denotes the 2 × 2 identity matrix, the driving noise ξ_k ∼ N(0, R_k) with R_k = diag(σ_r²(ξ), σ_θ²(ξ)), σ_r(ξ) = 30 m and σ_θ(ξ) = 30 mrad. Further, w_k and ξ_k are mutually independent. The initial state x_0 ∼ N(x̄_0, P_0) with x̄_0 = [1000 m, 120 m/s, 1000 m, 0 m/s]^T and P_0 = diag(10,000 m², 100 m²/s², 10,000 m², 10 m²/s²). To show the effectiveness of the proposed BCRLBs in this radar target tracking example with autocorrelated measurement noises, we use the cubature Kalman filter (CKF) [37], cubature Kalman predictor (CKP) [37] and cubature Kalman smoother (CKS) [38] to obtain the state estimates. These estimators generate an augmented measurement to decorrelate the autocorrelated measurement noises instead of using the first-order linearization method. Meanwhile, these Gaussian approximate estimators can obtain accurate estimates at very low computational cost, especially in the high-dimensional case with additive Gaussian noises. The RMSEs and BCRLBs are obtained over 500 Monte Carlo runs. Figure 1 shows the RMSEs versus the √BCRLBs for position and velocity estimation. It can be seen that the proposed BCRLBs provide lower bounds on the MSEs of the CKP and CKS. Moreover, the gaps between the RMSEs of the CKP and CKS and the √BCRLBs for one-step prediction and fixed-interval smoothing are very small, which means that the CKP and CKS are close to being efficient. In addition, the √BCRLB for one-step prediction lies above the √BCRLB for filtering, and the RMSE of the CKP lies above the RMSE of the CKF. This is because prediction only depends on the dynamic model, whereas filtering depends on both the dynamic and measurement models. Since smoothing uses both past and future information, the √BCRLB for fixed-interval smoothing is lower than the √BCRLB for filtering, and the RMSE of the CKS is lower than the RMSE of the CKF.
Figure 2 shows the √ BCRLBs for multi-step prediction, i.e., 1-step to 5-step prediction. It can be seen that the more steps we predict ahead, the larger the √ BCRLB for prediction is. This is because if we take more prediction steps, the predictions for position and velocity will be less accurate. Figure 3 shows the √ BCRLBs for fixed-lag and fixed-interval smoothing. It can be seen that the √ BCRLB for 1-step fixed-lag smoothing is the worst and the √ BCRLB for fixed-interval smoothing is the best. This is because the smoothing estimation becomes more and more accurate as the length of the data interval increases.
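The ordering of the smoothing bounds can be illustrated with a linear Gaussian toy problem, for which the bounds coincide with the Kalman filter and RTS smoother covariances; all matrices below (F, H, Q, R, P) are hypothetical values for this sketch:

```python
import numpy as np

# Linear Gaussian toy: smoothing bounds tighten as more future data is used.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = np.diag([0.05, 0.2])
R = np.array([[9.0]])
P = np.diag([25.0, 4.0])

Pf, Pp = [], []                       # filtered / one-step predicted covariances
K_steps = 30
for _ in range(K_steps):
    Pm = F @ P @ F.T + Q              # predict
    Pp.append(Pm)
    S = H @ Pm @ H.T + R
    K = Pm @ H.T @ np.linalg.inv(S)   # Kalman gain
    P = Pm - K @ S @ K.T              # update
    Pf.append(P)

# RTS backward pass: P_{j|N} = P_{j|j} + G (P_{j+1|N} - P_{j+1|j}) G^T
Ps = [None] * K_steps
Ps[-1] = Pf[-1]
for j in range(K_steps - 2, -1, -1):
    G = Pf[j] @ F.T @ np.linalg.inv(Pp[j + 1])
    Ps[j] = Pf[j] + G @ (Ps[j + 1] - Pp[j + 1]) @ G.T

# Smoothing with future data is at least as tight as filtering at every step:
assert all(Ps[j][0, 0] <= Pf[j][0, 0] + 1e-9 for j in range(K_steps))
```

In the middle of the interval, where the full backward pass has absorbed the most future data, the gap between the smoothed and filtered position variances is largest, mirroring the fixed-interval versus fixed-lag ordering described above.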

Example 2: Cross-Correlated Process and Measurement Noises at One Time Step Apart
In this example, we assume that the process noise sequence w_k in (39) is cross-correlated with the measurement noise sequence e_k in (41) at one time step apart. To show the effectiveness of the proposed BCRLBs in this radar target tracking example with cross-correlated process and measurement noises at one time step apart, we use the cubature Kalman filter (CKF), cubature Kalman predictor (CKP) and cubature Kalman smoother (CKS) in [40] to obtain the state estimates. These estimators decorrelate the process and measurement noises by reconstructing a pseudo measurement equation. Compared with the Monte Carlo approximation method, these Gaussian approximate estimators give an effective balance between estimation accuracy and computational cost. A total of 500 Monte Carlo runs are performed to obtain the RMSEs and BCRLBs. Figure 4 shows the RMSEs of the CKF, CKP and CKS versus the three corresponding √BCRLBs, i.e., for filtering, one-step prediction and fixed-interval smoothing. It can be seen that the RMSEs of the CKP and CKS are bounded from below by their corresponding √BCRLBs. It can also be observed that the gaps between the RMSEs of the CKP and CKS and their corresponding √BCRLBs are very small, which indicates that these estimators are close to being efficient. Moreover, the √BCRLB for one-step prediction lies above the √BCRLB for filtering, and the RMSE of the CKP lies above the RMSE of the CKF, because prediction uses less information than filtering. Since smoothing uses data within the whole interval, the √BCRLB for fixed-interval smoothing is lower than the √BCRLB for filtering, and the RMSE of the CKS is lower than the RMSE of the CKF. Figure 5 shows the √BCRLBs for multi-step prediction. The √BCRLB for prediction grows as the prediction step increases, because predicting more steps ahead makes the position and velocity predictions less accurate.
Figure 6 shows the √ BCRLBs for fixed-lag and fixed-interval smoothing. Clearly, smoothing becomes more accurate as the length of the data interval increases. Hence, the √ BCRLB for 1-step fixed-lag smoothing is the worst. In contrast, the √ BCRLB for fixed-interval smoothing is the best.

Conclusions
In this paper, we have proposed recursive BCRLBs for the prediction and smoothing of nonlinear dynamic systems with TASD measurements, i.e., the current measurement depends on both the current and the most recent previous state directly. A comparison with the recursive BCRLBs for nonlinear regular systems, in which the current measurement only depends on the current state directly, has been made. It is found that the BCRLB for the smoothing of regular systems is a special case of the newly proposed BCRLB, and the recursive BCRLBs for the prediction of TASD systems have the same forms as the BCRLBs for the prediction of regular systems except that the FIMs are different. This is because prediction only depends on the dynamic model, which is the same for both of them. Specific and simplified forms of the BCRLBs for the additive Gaussian noise cases have also been given. In addition, the recursive BCRLBs for the prediction and smoothing of two special types of nonlinear systems with TASD measurements, in which the original measurement noises are autocorrelated or cross-correlated with the process noises at one time step apart, have been presented, respectively. It is proven that the optimal linear predictors are efficient estimators if these two special types of nonlinear TASD systems are linear Gaussian.

Data Availability Statement:
The authors declare that the data that support the findings of this study are available from the authors upon request.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript.

Appendix A. Proof of Theorem 1

For the FIM J_{j+1|k}, consider the joint PDF of X_{j+1} and Z_k. Partition X_j as X_j = [(X_{j−1})^T, x_j^T]^T and partition J_{j|k} accordingly. Since J_{j|k}^{−1} is equal to the n × n lower-right block of (J_{j|k})^{−1}, from the inversion of a partitioned matrix [24], the FIM about x_j can be obtained. Similarly, we can obtain the corresponding quantity for x_{j+1}. Then, J_{j+1|k} can be rewritten accordingly. Since the prediction FIM J_{j+1|k} is the inverse of the lower-right n × n submatrix of (A7), we have the stated recursion. This completes the proof.

Appendix B. Proof of Theorem 2
For the FIM J_{k|k}, consider the joint PDF of X_k and Z_k at an arbitrary time k. Similar to (A7), by using (A8), we can partition J_{k|k}, where zero blocks have been left empty and the block matrix T_{j|j} is defined accordingly. Since J_{j|k}^{−1} is the lower-right block of [(J_{k|k})^{−1}]_{11} defined in (7), from (A9) and the inversion of a partitioned matrix [24], we obtain the corresponding expression. Substituting (A12) into (A11) and using (A13) yields the intermediate form. Then, using the matrix inversion lemma [24], the FIM J_{j|k} is obtained. Substituting (4) into (A15) completes the proof.

Appendix C. Proof of Theorem 3
From the assumption that the noises are additive Gaussian white noises, we have the Gaussian form of the transition PDF, where c_2 is a constant. Thus, the partial derivatives of ln p(x_{k+1}|x_k) follow directly. The remaining terms D_{k+1}^{k+1,k}, D_{k+1}^{k+1,k+1}, E_{k+1}^{k,k}, E_{k+1}^{k+1,k} and E_{k+1}^{k+1,k+1} can be obtained similarly. This completes the proof.

Appendix D. Proof of Theorem 4
Applying the optimal filter [24] to the linear Gaussian systems (26)–(28), we obtain the filtering recursions. For simplicity, we introduce the auxiliary quantities below. From (4) and the matrix inversion lemma [24], we obtain (A30). The inverse of B_{22}^k in (A30) can be rewritten accordingly, and the inverse of B_{11}^k follows. Then, from (A30), (A32), (A33) and (A35), the inverse of J_{k+1|k+1} is obtained. Using (6) and Corollary 1, the FIM for one-step prediction can be obtained. For the optimal one-step predictor, the MSE matrix P_{k+1|k} is given accordingly. Then, combining (A31) and (A37), we have P_{k+1|k} = J_{k+1|k}^{−1}. Using (6) and Corollary 1, the FIM for two-step prediction can be written down. For the optimal two-step predictor, one has P_{k+2|k} = E[x̃_{k+2|k} x̃_{k+2|k}^T | Z_k]. Then it follows from (A39), (A40) and (A42) that P_{k+2|k} = J_{k+2|k}^{−1}. Similarly, we can prove that P_{k+m|k} = J_{k+m|k}^{−1}, m ≥ 3. This completes the proof.

Appendix E. Proof of Theorem 6
For the linear Gaussian systems (35) and (36), from [49], we have P_{k|k} = J_{k|k}^{−1}. Using (6) and Corollary 2, the FIM for one-step prediction can be written down. For the optimal one-step predictor, the MSE matrix P_{k+1|k} is given accordingly. From (A44), (A45) and P_{k|k} = J_{k|k}^{−1}, we can obtain P_{k+1|k} = J_{k+1|k}^{−1}. Using (6) and Corollary 2, the FIM for two-step prediction is given similarly. For the optimal two-step predictor, the corresponding MSE matrix follows. Then it follows from (A46), (A47) and (A49) that P_{k+2|k} = J_{k+2|k}^{−1}. Similarly, we can prove that P_{k+m|k} = J_{k+m|k}^{−1}, m ≥ 3. This completes the proof.