Particle Filter for Randomly Delayed Measurements with Unknown Latency Probability

This paper focuses on developing a particle filter based solution for randomly delayed measurements with an unknown latency probability. A generalized measurement model that includes measurements randomly delayed by an arbitrary but fixed maximum number of time steps along with random packet drops is proposed. Owing to random delays and packet drops in receiving the measurements, the measurement noise sequence becomes correlated. A model for the modified noise is formulated and subsequently its probability density function (pdf) is derived. The recursion equation for the importance weights is developed using pdf of the modified measurement noise in the presence of random delays. Offline and online algorithms for identification of the unknown latency parameter using the maximum likelihood criterion are proposed. Further, this work explores the conditions that ensure the convergence of the proposed particle filter. Finally, three numerical examples, one with a non-stationary growth model and two others with target tracking, are simulated to show the effectiveness and the superiority of the proposed filter over the state-of-the-art.


Introduction
State estimation for nonlinear discrete-time stochastic systems has received considerable attention from researchers because of its application in various fields of science, including navigation and localization [1,2], surveillance [3], agriculture [4], econometrics [5], and meteorology [6], for example. The Bayesian approach [7] gives a recursive relationship for the computation of the posterior probability density functions (pdf) of the unobserved states. But the computation of the posterior pdf in case of a nonlinear system is often numerically intractable, and hence suboptimal approximations of these pdf are often used. The particle filter (PF) is a powerful sequential Monte Carlo method under the Bayesian framework to solve nonlinear and non-Gaussian estimation problems by approximating the posterior pdf empirically [8]. The particle filter often outperforms other approximate Bayesian filters such as the extended Kalman filter (EKF) and the grid-based filters in solving nonlinear state estimation problems [9]. However, most works on the EKF [10] and as well as on the traditional PF [8,9,11] typically assume that measurements are available at each time step without any delay. In practice, in many aerospace and underwater target tracking [12], control [13] and communication [14] subsystems, random delays in receiving the measurements are inevitable. These delays, usually caused by the limitations of the network channel, need to be accounted for while designing the state estimator.
In the literature, the random delays have been addressed in the context of linear estimators [15][16][17][18][19][20]. A linear networked estimator is proposed in Reference [21] to tackle irregularly-spaced and delayed measurements in a multisensor environment. On the other hand, the research on random delays and packet drops in nonlinear state estimation is limited. In Reference [22] and Reference [23], improved versions of the EKF and the unscented Kalman filter (UKF) are proposed for one-time step and two-time step randomly delayed measurements. In Reference [24], quadrature filters have been modified to solve nonlinear filtering problem with one-step randomly delayed measurements. In Reference [25], the cubature Kalman filter (CKF) [26] is used to tackle one-step randomly delayed measurements for nonlinear systems. In Reference [27], a methodology to solve nonlinear estimation problems with multi-step randomly delayed measurements is proposed. However, all these non-linear filters are restricted to Gaussian approximations. Moreover, they assume that the latency probability of delayed measurements is known. In Reference [28] and Reference [29], a modified PF that deals with one-step randomly delayed measurement with unknown latency probability and a PF for multi-step randomly delayed measurements with a known latency probability are presented, respectively. In References [30,31], the estimation of the unknown latency parameter with one-step randomly delayed measurements is addressed using data log-likelihood function within the Expectation-Maximization (EM) framework. However, none of these works considered the presence of random packet drops. Further, H ∞ filtering techniques are used to tackle network-induced delays and packet drops in References [32][33][34].
The specific contributions of this paper to the state-of-the art, specifically over References [28][29][30] are as follows: (i) We propose a new PF with an explicit expression for the importance weight for randomly delayed measurements of any number of time steps with an unknown latency probability and in the presence of packet drops. In Reference [29], the design of a PF with multi-step randomly delayed measurements is addressed. However, the latency probability is assumed to be known and packet drop is not considered while deriving the expression for the importance weight. (ii) The latency parameter for the random delays and packet drops is assumed to be unknown and we present a method to estimate it both in offline and online manners by maximizing the likelihood function. The sequential Monte Carlo (SMC) method is used here to approximate the likelihood function in the presence of randomly delayed measurements of any number of time steps and packet drops. In References [28,30,31] the latency probability for the measurements with random delays of maximum one step is estimated without considering packet drops. Moreover, while Gaussian approximation is used in the E-steps of the EM framework in References [30,31], the SMC approximation is used in Reference [28], but only for measurements with random delays of maximum one time step and without considering any packet drops. (iii) Due to presence of random delays and packet drops, the measurement noise sequence at different time steps becomes correlated. This work first formulates the modified noise model and derives a general pdf for the modified noise. The proposed PF is then developed using the pdf of the modified measurement noise. In References [25,35], randomly delayed measurements with the correlated measurement noise for a nonlinear system is addressed. However, they have considered a maximum delay of one time step and used the Gaussian approximation to develop the filtering algorithm.
Finally, with the help of three numerical examples, the effectiveness and the superiority of the proposed PF are demonstrated in comparison with the state-of-the-art algorithms. Table 1 lists the features of the previous works and of the proposed work.

Work
Random Delays Packet Drops Latency Estimation Filtering [28] Single step × SMC [29] Multi-step × × SMC [30] Single-step × Gaussian [31] Single-step × Gaussian Proposed work Multi-step SMC The rest of this paper is organized as follows. The problem statement is defined in Section 2. In Section 3, the modified PF is proposed and its convergence is discussed. Section 4 deals with the estimation of the unknown latency probability. In Section 5, simulation results are presented to demonstrate the superiority of the proposed PF. Finally, in Section 6, some conclusions are discussed.

Problem Statement
Consider a nonlinear dynamic system that can be described by the following equations: State equation: x Measurement equation: where x k ∈ n x denotes the state vector of the system and z k ∈ n z is the measurement at any discrete time k ∈ (0, 1, · · · ), while q k−1 ∈ n x and v k ∈ n z are mutually independent white noises with arbitrary but known pdf. Here, we consider the case where actual measurement received at a particular time step may be a randomly delayed measurement from a previous time step. This delay (in number of integer time steps) can be between 0 and N at the kth sampling instant. If any measurement gets delayed by more than N steps, that measurement is discarded and no measurement is received at the estimator. Here, N is the maximum (in number of integer time steps) delay that is determined as discussed in Sections 3.2 and 3.3.
To model delayed measurements at the kth instant, we choose the independent and identically distributed Bernoulli random numbers β i k (i = 1, 2, · · · , N + 1) that take values either 0 or 1 with an unknown probability P(β i where p is the unknown latency parameter. If y k is the measurement received at the kth instant [27], then where Here, β 0 k is considered to be 1. A measurement received at the kth time instant is j step delayed if α j k = 1. Additionally, at time instant k, at most one of α j k (0 ≤ j ≤ N) can be 1. If all α j k are zeros, the estimator buffer keeps the measurement y k−1 received at the previous step. This results in the loss of a measurement (packet drop) when that measurement is delayed by more than N steps due to buffer-size limitation.

Remark 1.
Bernoulli random variable β i k and its function α j k are used to represent the real-time randomness of delays in measurements in practical systems [15]. Inclusion of the possibility that at a time step k no measurement may be received (when the delay is longer than N steps) corresponds to the practical limit on the buffer size in real systems. In our algorithm, if a received measurement matches with any of the measurements in the buffer, it is discarded.

Remark 2.
The latency parameter of received measurements, p, that is, the mean of random variable β i k is unknown in real systems. Contrary to the assumption of single-step delay in Reference [28], the unknown latency probability in this paper is for arbitrary step delays along with the packet drops due to buffer limitation.
Further, since both the ideal measurement and the measurement noise are modified in (3), y k needs to be rewritten for its subsequent use in characterizing the densities. Hereafter in this article, h k (x k , k) and f k (x k , k) will be written as h k (x k ) and f k (x k ) respectively, for brevity. Then, (3) can be restructured as Here, G1 k h k−j (x k−j ), α j k and G2 k G1 k−1 (·), α j k , G2 k−1 (·) denote the ideal measurement parts (without any noise) of y k from the non-delayed measurements z k−N:k and the previous step measurement y k−1 , respectively, and are defined as where G1 1 (·) = h 1 (x 1 ) and G2 1 (·) = 0. Again, v k is the additive measurement noise in (3) defined as where v k−1 is the noise received along with observation y k−1 and v 1 = v 1 . Now, the objective is to develop a PF algorithm for the system in (1) with measurement model (3) that assumes the knowledge of latency probability p. Additionally, we propose offline as well as online algorithms to estimate p.

Particle Filter
In a sequential importance sampling filter, the posterior probability density function P(x 0:k |z 1:k ) is replaced by its equivalent series of weighed particles, which can be represented as [9] P(x 0:k |z 1: where particles {x i 0:k } N s i=1 are drawn from a proposal density q(x 0:k |z 1:k ) and then the weights of the particles are chosen using the importance principle. The unnormalized importance weight of the ith particle is given by The recursive weight update at each time step is given by P(x 0:k |z 1:k ) = P(z k |x 0:k )P(x 0:k |z 1:k−1 ) P(z k |z 1:k−1 ) where P(z k |z 1:k−1 ) is a normalizing constant. Similarly, the proposal density is assumed to be decomposed as q(x 0:k |z 1:k ) = q(x k |x 0:k−1 , z 1:k )q(x 0:k−1 |z 1:k−1 ). (12) Assuming that the state vector x k follows the Markov process, the importance weight of (10), with the help of (11) and (12), can be written as where the predicted density P(x i k |x i k−1 ) and the likelihood density P(z k |x i k ) can be evaluated using the system model, previously estimated posterior P(x i k−1 |z 1:k−1 ) and the joint noise density

Modified PF for Randomly Delayed Measurements
A recursive computation of the importance weights can be obtained for a nonlinear system with measurement model of (3). From (3), it can be seen that y k is stochastically dependent on the non-delayed measurements z k−N:k and the previous step measurement y k−1 and, therefore, we need to relax the standard assumption of P(z k |x 1:k , z 1:k−1 ) = P(z k |x k ) as discussed below. Assumption 1. The received measurement y k , conditionally on x k−N:k and y k−1 , is independent of the state vectors x 1:k−N−1 and the measurements y 1:k−2 , that is, P(y k |x 1:k , y 1:k−1 ) = P(y k |x k−N:k , y k−1 ).

Lemma 1.
Recursion equation of the importance weight w i k for model (1) and (3), can be obtained as where x i k is drawn from the proposal density q(x k |x i 0:k−1 , y 1:k ).
Proof. Using the Bayes' theorem and assuming that the states do not depend on the future measurements, the proposal density can be decomposed as q(x 0:k |y 1:k ) = q(x k |x 1:k−1 , y 1:k )q(x 1:k−1 |y 1:k−1 ).
Particles x i k and x i 1:k−1 can be sampled from q(x k |x 1:k−1 , y 1:k ) and q(x 1:k−1 |y 1:k−1 ), respectively. Again, using the Bayes' rule, the joint pdf, P(x 1:k , y 1:k ), can be decomposed as follows: By Assumption 1 and the first-order Markov property of the state vectors, (15) can be rewritten as Using (14) and (16), the importance weight can be written as Now, with the help of (17), w i k can be finally written as (13). Now, the predicted density P(x i k |x i k−1 ) and the likelihood density P(y k |x i k−N:k , y k−1 ) can be characterized using the joint noise density P(q k−1 , v k |x k−1 ) [7,36]. As given in Section 2, q k−1 and v k are independent noise processes with for all integer values of i and j. Hence, using the independence property and (8), it can be shown that q k−1 and the modified measurement noise v k are also independent. Therefore, assuming q k−1 and v k are independent of the previous state x k−1 , we can decompose the joint noise density as P(q k−1 , v k |x k−1 ) = P(q k−1 )P(v k ). Moreover, given that the pdf of q k−1 is known, P(x i k |x i k−1 ) can be evaluated, whereas the pdf of v k is unknown and needs to be calculated for the evaluation of the likelihood.
Further, for the computation of the likelihood density P(y k |x k−N:k , y k−1 ), the probability related to the number of random delays in the received measurement needs to be evaluated. Note that the probability of a received measurement being delayed by j time steps, at any instant k, is [27] Note that as the number of delay steps (j) increases, the associated probability (P(α j k = 1)) decreases. The probability that the estimator receives y k−1 at the kth instant of time is [27] It can be observed that for a high value of p, N should be kept sufficiently large to reduce the probability of a packet being lost.

Lemma 2.
The likelihood density P(y k |x i k−N:k , y k−1 ) can be computed recursively as where P v k−j (·) represents the pdf of the measurement noise v k−j .
Proof. Let γ k be a Bernoulli random variable that denotes the event that a measurement is received (with a step delay between 0 and N). The probability that a measurement is received with a delay less than or equal to N steps, is The probability that no measurement is received and the estimator keeps measurement y k−1 , is Now, as the modified likelihood density should be characterized by the pdf of the modified measurement noise v k , using (5), we have where P v k (·) denotes the pdf of v k . Further, assuming that v k , conditionally on v k−N:k and v k−1 , is an independent noise sequence over time, it can be calculated using the known pdf of v k−N:k and the intermediate Bernoulli random variable γ k as follows: (19) and (20) Using the expression for v k in (8) Similarly, when γ k = 0 (i.e., ∑ N j=0 α j k = 0), the measurement noise v k = v k−1 and we have Substituting (23) and (24) into (22) and then, (22) and (21) results in (3) and (7), we have (y . Now, rewriting (25) for particle i, we have (applying (21) for k := k − 1 to the second term on the right hand side) (18) is similar to the sum of the product densities used in the probabilistic data association (PDA) algorithm [37,38]. Further, it can be observed that if there are no random delays and packet drops in the received measurements (i.e., N = 0 and p = 0), (18)

Remark 3. Note that
), the expression for the likelihood density of the standard PF.
In this work, to mitigate the effect of degeneracy on the importance weights of particles, the resampling is carried out at each step. A general proposal density q(x k |x 0:k−1 , y 1:k ) obtained from a nonlinear filter such as the EKF or the UKF [39], can be used to draw samples. However, the predicted density P(x k |x k−1 ) is a common choice to implement the sequential importance resampling (SIR) PF [9] despite the fact that the current measurement is not used to locate new samples. Note that the current measurement in this work is randomly delayed and is unaccounted for in the proposal density, which may not necessarily push the sampled particles towards higher likelihood regions. On the other hand, P(x k |x k−1 ) includes the measurements up to the last time-step that have been corrected with the modified importance weight. The steps to implement a standard SIR filter [9] with the modified importance weight for this work are outlined in Algorithm 1. It uses a standard resampling technique, for example, multinomial [40] or, systematic [9] to suppress the particles with negligible weight. A flow diagram to obtain the modified PF estimate is shown in Figure 1. (13) and (18) -Normalize the importance weight:

Convergence of the PF for Randomly Delayed Measurements
In this subsection, we explore the conditions for the convergence of the modified PF derived for randomly delayed measurements. A PF is said to be converging if its empirical approximation follows a mean square error of order 1/N s at each step of filtering [8]. The prime requisite for simple convergence is that likelihood function P(z k |·) should be upper bounded, that is, P(z k |·) < ∞, for all x k ∈ n x [8]. The following lemma is an extension of the results in Reference [8] for our case.
then, ∀k ≥ N, there exists c k/k independent of N s , such that for any where x i 0:k are the unweighted particles obtained using the modified PF algorithm.
Proof. Attributing to the random delays and packet drops in measurements, we investigate the impact of the modified likelihood density on the simple convergence of the particle filter that is otherwise converging with non-delayed measurements. That is, P(z k |x k ) < ∞, ∀x k ∈ n x and z k ∈ n z , is given. Now, using (2), we can write Thus, the pdf of noise v k , that is, , is bounded for all its real-valued inputs. Rearranging the terms of (18) on the both sides, we have As v k is a stationary process, its pdf is not affected by time shift, that is, if P v k (·) is bounded, P v k−j (·) must also be bounded. Again, since p j (1 − p) < 1 for all values of j, we can write Given that N is a finite number, (28) and (29) can be used to establish (26). Now, results of Reference [8] can be used to verify (27).
Theorem 3 of Reference [8] suggests that c k|k is independent of the number of particles N s , but represents the dependency of mixed dynamics of the system on the initial conditions or past values. That is, if the optimal filter associated with the system dynamics has a long memory, c k|k will continue to increase the mean square error with each step. In our case, the filter does have a memory due to arbitrary delays in measurements. Therefore, to avoid the large mean square error of convergence, the value of N should be kept small. On the other hand, to counter the increase in the value of c k|k , number of particles N s needs to be large.

Identification of Latency Probability
In practice, when a set of randomly delayed measurements is given, we may not know channel properties and its random parameters. Hence, the latency probability that is required to design the filter may be unknown. Here, we use the ML criterion to identify the unknown latency probability for received measurements. This method involves the maximization of the joint density P p (y 1:m ) of the received measurements, which is a function of latency parameter p represented by [28] p = arg max p∈[0,1] P p (y 1 , · · · , y m ), (30) where m is the number of measurements used for the identification of parameter p andp is the estimated value of p. Now, we assume that the first received measurement y 1 is independent of parameter p and is equal to z 1 . Again, using the Bayes' theorem the above joint pdf can be reformulated as P p (y 1 , · · · , y m ) = P(y 1 ) m ∏ k=2 P p (y k |y 1:k−1 ).
For computational simplicity the above maximization of likelihood is expressed in terms of the log-likelihood (LL). The LL of (31) can be formulated as where L p (y 1:m ) is the LL function of the received measurements. Now, to solve the maximization problem of (31), first the computation of likelihood P p (y k |y 1:k−1 ) and then the maximization of L p (y 1:m ) need to be carried out.

Computation of Likelihood Density
State likelihood can be used to compute P p (y k |y 1:k−1 ) with the help of the SMC approximation [41]. We can express the likelihood density P p (y k |y 1:k−1 ) as the marginal density of a joint pdf that includes delayed measurement and previous states that are correlated using the Bayes' theorem and Assumption 1 as follows: P p (y k |y 1:k−1 ) = · · · P p (y k , x k−N:k |y 1:k−1 )dx k · · · dx k−N = · · · P p (y k |x k−N:k , y 1:k−1 )P p (x k−N:k |y 1:k−1 )dx k · · · dx k−N = · · · P p (y k |x k−N:k , y k−1 )P p (x k−N:k |y 1:k−1 )dx k · · · dx k−N .
Using Bayes' rule, the joint state prior P p (x k−N:k |y 1:k−1 ) can be decomposed as Since the state vectors follow the first-order Markov property, using the chain rule and (9), we can write the joint pdf as where x i k−N , x i k−N+1 , · · · , and x i k are the unweighted particles and drawn from P p (x k−N |x k−N−1 , y k−N ), P p (x k−N+1 |x k−N , y k−N+1 ), · · · , and P(x k |x k−1 ), respectively. The accuracy of the SMC approximation in this case depends on the choice of the proposal density, the number of particles and the value of N. This would be equivalent to that of the modified SIR as discussed in Section 3.3. Now, substituting (34) into (33), P p (y k |y 1:k−1 ) can be computed aŝ where P p (y k |x i k−N:k , y k−1 ) is given in (18). Algorithm 2 illustrates the steps to compute the LL function.
Algorithm 2 Computation of log-likelihood function.

Maximization of Log-Likelihood Function
Substituting (35) into (32), we can rewrite the LL function in (32) aŝ where y 1 is independent of parameter p and can be neglected in the maximization of the likelihood density. Equation (36) can be maximized numerically over p ∈ [0, 1]. There are two options for the numerical search of the latency parameter: offline and online identification. In the offline method, we can use more measurements for higher parameter estimation accuracy. A higher value of m and smaller steps (sl) for p can yield a more accurate estimate of the latency probability, at the expense of higher computational burden. The offline algorithm can only be started after the first m measurements and it can be repeated with further measurements to improve the estimate. Algorithm 3 outlines the steps for offline identification.
In case of online identification, the latency probability is estimated at each time step and the estimated value of the parameter is used in the proposed PF algorithm. The running average of the estimated values at each step can be evaluated to improve the accuracy of the identified parameter. Algorithm 4 outlines the online method.  • Select the value for sl • for (p = 0 : sl : 1)

Computational Complexity
Attributing to the use of numerical search method to maximize the likelihood, the estimation of latency parameter is a computationally involved process. However, once the latency parameter is estimated, the computational cost of the proposed PF algorithm is not substantially higher than that of the standard PF. In the offline mode of identification, the proposed particle filter will run 1/sl times slower than the standard PF, where sl (0 < sl < 1) is the step length used for the search algorithm. That is, if the standard PF is O(N s n 2 x ) [42], then proposed PF in offline mode is O(N s n 2 x /sl). Similarly, for online identification, the modified particle filter will be O(N s n 2 x k), where k is the time step.

Simulation Results
To demonstrate the superiority of the proposed PF over the standard PF and the state-of-the-art algorithms for randomly delayed measurements, three different types of filtering problems are simulated. Although any general proposal density [39] can be chosen for the implementation of the proposed PF, in our simulations, the transitional prior density P(x k |x k−1 ) is used as the proposal density for its simplicity. The latency probability of delayed measurements is first identified by maximizing (36) over p ∈ [0, 1] with the help of the proposed PF algorithm. Subsequently, the estimated probability p is used to implement the proposed PF for the given problems. To ensure a fair comparison of the proposed PF with the other state-of-the-art algorithms, two sets of simulation are carried out for each problem based on the measurement model used in the chosen filtering algorithm. In the first set of simulations, the proposed PF with estimated latency probability is compared against the standard PF and the cubature Kalman filter for the randomly delayed measurements (CKF-RD) [27] by using the randomly delayed measurements with packet drops. The proposed PF with other than the true value of N is also implemented to investigate the impact of selecting a wrong value for the maximum number of delay steps. The performances of all filters are compared in terms of the root mean square error (RMSE). Note that the proposed PF with the wrong value of N and the CKF-RD are implemented with the true value of the latency probability. In the second set of simulations, the proposed PF with the estimated latency probability is compared against the PF for multi-step randomly delay measurements (PF-MD) [29] by using the set of randomly delayed measurements with no packet drops. To demonstrate the importance of the latency estimation, PF-MD is implemented with both the true latency probability and the incorrect latency probability. Further, recognizing the practical limitation on the buffer-length and to avoid the large convergence error as discussed in Section 3.3, the true value of the maximum step delay is taken as N = 2. However, the simulation work can be carried out with the larger value of N as shown for Problem 1. For brevity, the case with the larger N for other problems is not included in the paper.
MATLAB 2017a software is used to carry out the simulation work. No in-built subroutine for the PF is used and all the figures presented in this work are generated by running the standard instructions in MATLAB following the Algorithms 1-4. Both the truth and measurements are generated prior to filter implementation. The filtering algorithm uses the software generated measurement values for the PF to implement.

Problem 1
We consider a time-varying growth model that is widely used in literature because of its non-stationary property for validation of performances by nonlinear filters [9,23,43,44]. Nonlinear dynamics of the system are as follows: where q k−1 and v k are independent zero mean Gaussian processes with E[q 2 k ] = 10 and E[v 2 k ] = 1, respectively. The initial state x 0 ∼ N (0, 1), and the received measurements, y 1:k are generated by using (37) and (3) with N = 2. The distribution of the initial estimated state P(x 0 ) ∼ N (0, 1) and the number of particles used for the simulation of this problem is, N s = 1000.
Offline estimation of the latency probability is carried out by using Algorithm 3 with sl = 0.01 and m = 500. The latency probability (p) at the end of each ensemble is calculated and plotted in Figure 2a. For this case, when the true value of p is 0.5, the mean of the estimated latency probability over 100 ensembles is calculated as 0.481. Online estimates of the latency probability are shown in Figure 2b. Here, the estimated probability at each time step is the running average of the estimated probabilities.
The proposed PF is implemented with N = 1 and N = 2 on the measurements as described above, and the results are compared against that of the CKF-RD and the standard PF. To compare the results, the RMSE calculated over 100 Monte Carlo (MC) runs are plotted over 50 time steps in Figure 3a. It can been seen that the proposed PF with N = 2 outperforms the other three filters. Moreover, it is interesting to observe that the performance of the proposed PF with the wrong value of the maximum number of step delay (N = 1) is closer to that of the proposed PF with N = 2. Further, the average RMSE calculated over 50 time steps for different values of true latency probability is shown in Figure 3b. As expected, for a higher probability value where packet drop is more likely, filter designed for N = 2 performs better than the other filters. To observe the impact on the performance of the proposed filter when the maximum number of step delay is increased, we take a case of N = 3. Figure 4 shows the RMSE for the different filter while considering the true latency, p = 0.50. The average RMSE of the proposed filter in Figure 3a with N = 3 is 8.79 whereas that in Figure 4 with N = 2 is 7.48. The increase in the length of memory for the proposed filter explains the increase in the RMSE.
For simulation with the set of randomly delayed measurements without any packet drops, the RMSE value for PF-MD with the true latency (PF-MD(TL)), the proposed PF and PF-MD with the incorrect latency (PF-MD(IL)) are plotted in Figure 5. It can be observed that the RMSE value of the proposed PF and PF-MD(TL) remain almost the same whereas the filtering performance of PF-MD degrades when the latency probability is unknown and considered an incorrect value.  Table 2 shows the comparison of the computational burden for the different filters. To make the computational comparison independent of the software and clock speed used for simulation, we have calculated the relative computational time for each filter. All computational time is calculated with respect to the computational time of the standard PF algorithm. Note that the given computational time for the proposed PF is after the estimation of the latency parameter. However, the estimation of the latency parameter requires 1/sl times running of the proposed PF algorithm.

Problem 2
Consider a ground surveillance problem where a moving target is tracked from a noisy sensor mounted on a moving platform at an approximately known altitude by using only bearing angle observations. This bearing-only tracking (BOT) problem has mainly two components, namely, the target kinematics and the tracking platform kinematics. A representative scenario is given in References [3,45]. The tracking platform motion may be represented by the following equations: x tp,k =x tp,k + ∆x tp,k , y tp,k =ȳ tp,k + ∆y tp,k , k = 0, 1, · · · , n step , where x tp,k and y tp,k represent the X and Y coordinates of the tracking platform at kth time-step, respectively.x tp,k andȳ tp,k are the known mean coordinates of the platform position and ∆x tp,k and ∆y tp,k are independent zero-mean Gaussian white noises with variances, r x = 1 m 2 and r y = 1 m 2 , respectively. The mean values for position coordinates (in meters) arex tp,k = 4kT andȳ tp,k = 20, where T is sampling time for the discretization expressed in seconds (s). The target moves in X direction according to following discrete state space relations. where with x 1,k and x 2,k denoting the position (in m) and velocity (in m/s), respectively, of the target. q k−1 is an independent zero-mean white Gaussian noise with variance r q = 0.01 m 2 s 4 . The initial true states are assumed to be x 0 = [80 1] T . The sensor measurement is given by where z m,k = h(x tp,k , y tp,k , x 1,k ) = arctan y tp,k x 1,k − x tp,k is the angle between the X axis and the line of sight from the sensor to target and v k is independent Gaussian white noise with zero mean and variance r v = (3 • ) 2 . For the implementation of CKF-RD, the measurement noise statistics, on account of the uncertainties in the position of the observer platform, need to be modified as given in Reference [3].
The randomly delayed measurements with packet drops, y 1:k , are generated for 21s by using (3) and (39) with N = 2. The number of particles used for approximating the pdf is N s = 3000. At the beginning of simulation, the latency probability of received measurements is identified offline with T = 0.05 s, m = 400, and sl = 0.01. Latency probability at the end of each ensemble is calculated. The mean value of estimated p over 100 ensembles is calculated as 0.460, whereas its true value is 0.5. For online identification, T is taken as 0.1s and the running average of latency probability is calculated at each time step. The plots for the offline and online estimations are shown in the Figure 6a,b, respectively. Now, the proposed PF with N = 1 and N = 2 are implemented with the true and estimated latency probabilities, respectively. The results of CKF-RD and the standard PF for same set of delayed measurements are compared against that of proposed PF. The RMSE of four filters with the true latency probability p = 0.5 and sampling time T = 0.2 s, which have been calculated over 100 MC runs, are plotted in Figure 7a,b. From the plots, it can be seen that using a filter designed for delayed measurements is a better choice than a standard PF where the delays are not accounted.  Further, the average RMSE is calculated over 105 time steps for the different values of p. The average RMSE for two states, x 1,k and x 2,k are plotted in Figure 8a,b respectively. It can be seen that the difference in performance of the proposed PF and the standard PF becomes pronounced as the value of probability increases. However, at higher probabilities, the performance of the proposed filter also gets deteriorated on account of the higher rate of the packet loss.
To compare the performance of the proposed PF with the estimated latency, and PF-MD with the unknown latency, randomly delayed measurements with no packet drops are generated using the measurement model of Reference [29]. The RMSE for three filters are plotted in Figure 9. It can be observed that estimation of the latency affects the filtering process and the use of its incorrect value degrades the performance of a filter. The relative computational burden is given in the Table 3.

Problem 3
Consider a coordinated turn model for an aerospace target tracking problem for an aircraft that executes a maneuvering turn in a two-dimensional plane with a fixed, but unknown turn rate Ω. This uses the bearing and the range measurement observed from a radar to estimate the kinematics of the target. The five-dimensional state vector for the kinematics of aircraft is considered as x k = [ζζ ηη Ω] , where ζ and η represent positions, andζ andη represent velocities along the X and Y co-ordinates, respectively. The discrete-time state dynamics of the aircraft is given as [46]: where T is the time between two successive measurements. q k−1 is a zero mean Gaussian noise with covariance Q = diag[q 1 M q 1 M q 2 T], where q 1 and q 2 are the noise intensity parameters, . The range, r, and bearing, θ are the measurement available for target tracking, which are being measured by a radar placed at the origin. The noise-corrupted measurements can be expressed as where v k is an independent zero-mean Gaussian noise with covariance R = diag[σ 2 r σ 2 θ ]. Data for the different parameters used in this simulation are given in Table 4. Initial values for the state and covariance are x 0 = [1000 m 300 ms −1 1000 m 0 ms −1 −3 • s −1 ] T and P 0|0 = diag[100 m 2 10 m 2 s −2 100 m 2 10 m 2 s −2 100 mrad 2 s −2 ], respectively. The randomly delayed measurements with packet drops, y 1:k , are generated by using (41) and (3) with N = 2 and a probability, p = 0.5. The number of particles used for approximating the pdf is N s = 8000. In the beginning of simulation, the latency probability of received measurements is identified offline with m = 200 and sl = 0.01. The latency probability at the end of each ensemble is calculated and plotted in Figure 10a. The mean value of the estimated probabilitiesp over 100 ensembles is calculated as 0.4864 against its true value 0.5. Figure 10b shows the online estimation of latency probability. To track the maneuvering target, the proposed PF with N = 2 and N = 1 (a wrong value of the maximum number of delay steps) are implemented. To demonstrate the superiority of the proposed PF, simulation results of the CKF-RD and the standard PF for the same set of measurements and with the true value of latency probability, are compared against that of the proposed PF. The RMSE of the position, velocity and turn rate are the performance metrics as defined in Reference [46]. The RMSE of position, velocity and turn rate for the four filters with p = 0.5, are plotted in Figure 11a-c, respectively. Since the standard PF does not consider the correlation of current measurement with previously observed states due to random delays, it diverges, particularly in case of the RMSE in position. It can also be observed in the RMSE plots that the proposed PF with N = 1, which has accounted at least partially for the presence of random delays, performs better than the standard PF. This is significant since the maximum number of delay steps will have to be decided by the user and setting N = 1 might still yield some benefit over assuming no delay. Note that the difference in RMSE of position between the proposed PF and the standard PF or, the CKF-RD is considerable and is worth the relatively higher computational cost paid to obtain it. Further, to observe the impact of latency probability on the performances of the different filters, the time-averaged value of the RMSEs of the position, velocity and turn rate are plotted against different values of the p in Figure 12a-c, respectively. It can be seen that with high value of probability, that is, when the random delays are more likely, the standard PF and proposed PF with N = 1 become significantly less applicable than the proposed PF with N = 2. It can also be observed that when the probability is high with insufficiently higher value of N, the rate of packet drops increases and even the performance of the proposed PF (N = 2) gets deteriorated.
The performance of the proposed PF and PF-MD is compared in terms of the RMSE in Figure 13. To be in synchronization with the model of Reference [29], the measurements used for the estimation are without any packet drops. It can be observed from the plot that without the knowledge of latency, the performance of PF-MD degrades considerably, which suggests estimation of the latency is needed as it is presented in the proposed PF. Table 5 shows the relative computational burden for different filters.

Conclusions and Discussion
The standard PF loses its applicability if measurements are randomly delayed at the receiver. The random delays are possible in a system where the sensor and the receiver are connected through a network with bandwidth limitation.
In this paper, a recursion equation of the importance weight was developed to handle delays in measurements. A practical measurement model based on i.i.d. Bernoulli random variables, which includes the possibility of random delays in receiving the observations along with the packet drop situation if any measurement suffers a delay more than the data buffer length, was adopted. Moreover, the latency probability of the received measurements is usually unknown in practical cases.
Hence, this paper presented a method to identify the latency probability in the randomly delayed measurement environment. Further, this paper explored the conditions to ensure the convergence of the proposed PF and the trade-off in selecting the maximum delay.
To validate the performance of the proposed filter and demonstrate its superiority, three numerical examples are simulated using the standard PF, the CKF-RD, the proposed PF with wrong selection of maximum delay and the proposed PF with the correct value of maximum delay. Simulation results show that a filter designed for the delayed measurements, even considering less delay than the actual, performs better than the standard PF. If the random delay is more likely in a system, number of the maximum possible delay should be chosen such that it strikes a balance between avoiding the information loss and minimizing the convergence error. Further, the proposed PF is compared with PF-MD, which is designed for the multi-step randomly delayed measurements without any packet drops and assumes that the latency probability is known. Simulation results showed that the performance of a filter depends on the correct value of latency and its estimation is necessary for the better accuracy of the filtering process.
Author Contributions: R.K.T., S.B., P.D. and T.K. conceptualized the work and developed the methodology. Software implementation is done by R.K.T. The results both theoretical and simulation are validated by S.B., P.D. and T.K. The first version of the draft is prepared by R.K.T. and later it was reviewed and modified by S.B., P.D. and T.K. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by Visvesvaraya PhD Scheme, MeitY, Govt. of India, MEITY-PHD-2530. The first and second authors gratefully acknowledge the financial support extended by the organization.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: