Adaptive Particle Filter for Nonparametric Estimation with Measurement Uncertainty in Wireless Sensor Networks

Particle filters (PFs) are widely used for nonlinear signal processing in wireless sensor networks (WSNs). However, the measurement uncertainty makes the WSN observations unreliable to the actual case and also degrades the estimation accuracy of the PFs. In addition to the algorithm design, few works focus on improving the likelihood calculation method, since it can be pre-assumed by a given distribution model. In this paper, we propose a novel PF method, which is based on a new likelihood fusion method for WSNs and can further improve the estimation performance. We firstly use a dynamic Gaussian model to describe the nonparametric features of the measurement uncertainty. Then, we propose a likelihood adaptation method that employs the prior information and a belief factor to reduce the measurement noise. The optimal belief factor is attained by deriving the minimum Kullback–Leibler divergence. The likelihood adaptation method can be integrated into any PFs, and we use our method to develop three versions of adaptive PFs for a target tracking system using wireless sensor network. The simulation and experimental results demonstrate that our likelihood adaptation method has greatly improved the estimation performance of PFs in a high noise environment. In addition, the adaptive PFs are highly adaptable to the environment without imposing computational complexity.


Introduction
With the advancement of wireless sensor network (WSNs) and RFID technologies, the demands for indoor location information have promoted the extensive study of indoor localization methods over the past few years [1]. There are increasing requirements of location-based services (LBS) in practical applications, such as indoor navigation systems, location-based social networks, indoor robot and healthcare [2]. The reliability of the location accuracy is a major concern when providing the location information. Additionally, the target position can be obtained via range-based measurement, such as time-of-arrival (TOA), time-difference-of-arrival (TDOA) and received signal strength (RSS), by using WSNs or RFID. In a dynamic high noise environment, the multi-path and non-line-of-sight (NLOS) effects cause high interference during the signal transmission, and the range measurement contains high error. In this case, the PF is a promising estimation method in the nonlinear non-Gaussian environment due to its high accuracy and is widely applied in the indoor localization systems.
Our goal is to reduce the instantaneous unknown noise in LF according to our analysis, and the major contributions of our work are four-fold: (1) We propose a likelihood adaptation method, which is based on the DGM. The method combines the prior information and a tuning parameter to reduce the instantaneous unknown error: the prior information is the predicted measurement based on the predicted state; and the belief factor θ is the tuning parameter, which adapts the likelihood function to a more accurate one. By tuning θ, the impact of unknown error for likelihood calculation is reduced; (2) In order to obtain the optimal performance, we use the Kullback-Leibler divergence (KLD), which is an efficient metric to compare two distributions, to derive the optimal θ. The optimal θ can achieve the minimum KLD and attain the lowest estimation error of the PF; (3) We formulate the optimal performance of the adaptive method by using the Cramér-Rao lower bound (CRLB) analysis. The analytical results indicate that our method outperforms the conventional PF for the nonparametric measurement error models; (4) Three versions of PFs are improved based on our adaptation method, which are the BPF, the GPF and the CPF. The improved PFs are evaluated in the simulations and the real-world WSN indoor localization experiments. The results demonstrate that the proposed algorithms effectively reduce the estimation error and have robust performance in a high noisy wireless environment.
The rest of the paper is organized as follows: Section 2 analyzes the measurement uncertainty effect of the likelihood function in the PF algorithms. Section 3 describes a dynamic Gaussian modeling method for the uncertainty and outlier measurement errors. Section 4 proposes the likelihood adaptation method, and the adaptive PFs are designed in Section 5. The theoretical analysis based on CRLB is described in Section 6. Section 7 presents the simulation and real experiment results of adaptive PFs in the indoor localization systems. Finally, Section 8 concludes this paper.

State-Space Model
In general, the state-space model is applied to describe the nonlinear estimation problem based on the hidden Markov process. The estimated state vector x t is evolved from the current observation z t and the previous state x t−1 , which are defined as the prediction equation and the update equation. The prediction equation is: x where f t () is the prediction function, x t−1 is the previous state and q t is the prediction noise, which implies the prediction error of x t . Here, we assume that q t is additive to the prediction. At time t, the system obtains a noisy measurement vector z t . The relationship between x t and z t follows: where h t () is the measurement function and v t is the measurement noise at time t. In this paper, we employ the TOA measurement for range information. The TOA is a simple and straightforward range measurement method for indoor localization. The distance is proportional to the transmission delay, and the noise comes from the additive delay, which is introduced by the multi-path, NLOS and unsynchronized distributed clock within the sensors. The TOA is generally formulated as: where τ is the transmission delay, c is the microwave speed, a = [s x , s y ] T is the position vector of the anchor and ||x t − a|| indicates the distance between the target and the anchor. Considering the noise, the range measurement for each anchor j can be formulated as: where x t = [p t x p t y ] is the target position and a j = [s j x , s j y ] T is the position of anchor j. For a 2D positioning system, the distance measurement function can be expressed as: where N indicates the number of anchors. Accordingly, p(z t |x t ) is defined as the measurement likelihood for z t , which is essential to the whole estimation.

Particle Filter
PF consists of three parts: the first part is to generate a set of state samples, which are called particles; the second part is to calculate the associated weight for each particle; the final part is to re-sample the particles and to eliminate the unimportant particles. The estimated state is obtained by calculating the weighted average values of the particles. PF first generates a particle set i and N s denote the particle number and the total number, respectively. The purpose of using I s (x i t ) is to generate high quality particles and then to attain the accurate estimation. Since the particles represent the possible state in the state space, it is necessary to sample a finite particles with high probability that can represent the state due to the limited computational overhead. Thus, the high quality particles are the samples with a high probability generated and calculated from I s (x i t ). Based on I s (x i t ), most particles are drawn only in the interested area, and a few particles are sampled on the other places. Thus, the posterior PDF p(x t |z t ) of state x t can be approximated by using the delta function: For the Bayesian estimation in a non-Gaussian environment, it hardly provides the analytical formulation. Thus, the approximation is used with the set of particles, which is the main advantage of the PF [26]. Additionally, the weight of each particle can be calculated based on the Bayes rule [27]: where p(x i t |x i t−1 ) is the transition probability for particle x i t and p(z t |x i t ) is the measurement likelihood of z t for x i t . The importance density function I s ( , which evolves from the previous samples {x i t−1 } N s i=1 , and 0 : t − 1 denotes the time sequence from the initial state to the previous state. The PF estimation relies on the high quality particles and associated weight calculations. The weight is obtained via Bayes' rule, which considers the prior probability and the measurement likelihood. The likelihood relates to the measurement and the noise distributions. A proper error distribution can lead to an accurate estimation not only for the PF, but also for other probability-based algorithms, e.g., maximum likelihood (ML) and maximum a posterior (MAP) algorithms.

The Measurement Uncertainty
In PFs, the measurement likelihood for each particle is calculated as [26]: where z i t = h t (x i t ) and p() is the PDF of measurement noise v t with variables of z i t . The additive measurement noise is assumed to be zero mean and follows a Gaussian distribution. However, the statistical results indicate that the noise is not zero mean due to heavy outlier values [28]. Thus, other PDFs are applied for measurement noise modeling, e.g., exponential, Rayleigh, Weibull or Gamma distribution [29]. If such a PDF is known to the system, the PF can obtain an accurate estimation, which approaches the optimal estimation with an increasing number of particles, as CRLB indicates. However, the PDF is not always known or precise for real applications. In real applications, the measurement error is modeled by statistical methods based on the sample dataset. On the one hand, the parameters of the error model can be changed if the sample dataset is changed. On the other hand, the outlier values can deviate from the error distribution, which makes the error model inaccurate for the PF estimation.
For instance, we construct an experiment to collect all of the ranging measurement noise throughout the whole building. During such an experiment, we employ a robot moving automatically through every possible position of the building, including the hallway, the office room and classroom in a building at Freie University Berlin. Then, all of the ranging errors are collected, and the histogram is constructed. The error data can be attained freely via the website of [30]. The large positive noise, which may contribute a heavy tail for a histogram, is shown in Figure 1. In Figure 1, the histogram can be divided into two parts: the left part is similar to a Gaussian distribution, for which the mean and variance can be calculated; the right part is a kind of heavy tail, which represents the outlier values. To describe the histogram, researchers use an arbitrary assumed error model or a combined model to approach the real application. However, this makes the estimation algorithms sensitive to the environment. If the environment is changed, the model should be adapted accordingly; otherwise, it can lead to the wrong estimation.

Dynamic Gaussian Model
To address the measurement uncertainty problem and to make the PF robust to environmental change, we firstly introduce a dynamic Gaussian model (DGM) to the likelihood calculation. The main idea of the DGM is to classify the measurement error distribution into two parts: the first part is the expected parametric distribution (EPD), which is a pre-assumed distribution and known to the system. The EPD is obtained based on the knowledge or experiences of the system design, and the parameters can be attained via the system model or some pre-assumptions. The most popular EPD is the normal distribution. The second part is the non-parametric distribution (NPD), which is unknown to the system and for which it is hard to get the parameters. The conventional PFs only use EPD as the pre-assumed distribution model for estimation. The NPD is the compensation for the EPD when the PFs are in the dynamic environment without the knowledge of the noise model. The histogram of the EPD and the NPD is depicted in Figure 1. In Figure 1, the error value between −5 and 5 can mainly be modeled as the normal distribution, which is the EPD and denoted by the solid curve. For the outlier values, another Gaussian distribution NPD attempts to cover such values, which is depicted by the dashed curve.
The DGM is the drifted EPD, which is deviated by the instantaneous NPD value. Here, we use v t = w t + n t , where w t is the EPD error and n t is the NPD error. Then, v t follows the EPD, but deviated by n t . The DGM is different from the Gaussian mixture model. The Gaussian mixture model uses different Gaussian distributions to fit the non-Gaussian distributions. The results are still a static distribution model. However, DGM is dynamic and does not attempt to fit the error histogram, because the NPD part is not an exact model to fit the outlier values. It is rather a fuzzy function to cover such values. In addition, the Gaussian mixture model is a parametric model for the entirety of the error samples, and the DGM only contains a single instantaneous value of the NPD instead of the entirety of the data samples. In general, if n t = 0, v t follows the EPD. However, for a typical measurement, v t consists of a drift value w t that deviates from the pre-assumed normal distribution to a certain distance. Then, the EPD should be adapted dynamically according to the instantaneous value of the NPD.
The DGM for a typical NPD error is depicted in Figure 2. In Figure 2, the solid curve represents the EPD, which is a zero mean normal distribution. When the instantaneous value of the NPD is introduced into the measurement noise, the EPD is deviated. Then, we obtain the likelihood calculation: where an unpredictable instantaneous noise n t is introduced into the likelihood function. Equation (9) is illustrated as the dashed curve in Figure 2, which is a biased non-zero mean Gaussian distribution. It is deviated from the original assumption due to considering the instantaneous value, n t . If n t is zero, the likelihood function p A (z t |x i t ) is the exact EPD of measurement noise: where p A represents the actual assumed probability; then, we would have optimal filtering with the increasing number of particles. However, in most real cases, n t is not zero, and w t is deviated by n t in Equation (10), which leads to inaccurate estimation. When n t becomes larger, the gap between the two curves is increasing, as shown in Figure 2b, which degrades the estimation accuracy significantly. Therefore, our goal is to develop an adaptation method to mitigate n t and approach the likelihood calculation of the exact assumed distribution.

Likelihood Adaptation
We attempt to reduce the impact of the NPD and calculate the LF p(z t |x i t ) based on the DGM in order to improve the estimation accuracy. Our adaptation method consists of two steps: the first step is to obtain a predicted measurementẑ t according to the previous state; the second step is to adapt the LF based onẑ t and a belief factor θ, which is a tuning parameter for adaptation. The structure of the adaptive PF, which integrates with the predicted measurement and θ, is illustrated in Figure 3. In the original PF, the LF is determined by the EPD, whereas in our algorithm, the likelihood calculation is based on the DGM. Belief factor θ is used to adapt the value betweenẑ t and z t .

Measurement Prediction
Sampling Likelihood Weight Posterior PDF & Estimation

Sensing Data
Belief Factor Predicted Measurement Figure 3. The architecture of the particle filter integrated with the adaptive likelihood method. The common particle filter does not contain the prior measurement and the belief factor.

Predicted Measurement
The predicted measurement is derived based on the prediction state and is the reference for the real measurement. The calculation steps are as follows:x t denotes the prediction value of x t : where x t−1 is the estimation at previous time t − 1. When considering the processing noise q t , we denotex t as:x where q t is assumed to be the additive noise and follows normal distribution q t ∼ N (0, Q t ); Q t is the covariance at time t. Then, we obtain a predicted measurement for: sensors: in whichẑ t indicates the prediction of measurement derived fromx t . The predicted measurementẑ t is not the actual measurement, but is used as prior information for measurement likelihood adaptation.

Belief Factor and Measurement Adaptation
Belief factor θ is the tuning parameter forẑ t , and it is used to adapt the measurement z t to reduce n t . Note that our goal is to achieve Equation (10). Since Equation (10) can not be achieved directly, we useẑ t and z t to approach Equation (10) with θ. Then, the adaptive likelihood function p AL (z t |x i t ) is constructed as:

Optimal Belief Factor and Likelihood Estimation
Since we intend to compare our adapted DGM to the EPD, the effective evaluation method is KLD [31]. KLD, which also is denoted as relative entropy, quantifies the difference between two distributions. If p 1 (x) and p 2 (x) indicate two different distributions, KLD is formulated as: where D KL (p 1 ||p 2 ) denotes the KLD. The KLD is a non-negative distance between two different distributions, which is In addition, KLD is a convex function [32]. We employ the KLD as a metric to find optimal θ, which minimizes the distance between p A (z t |x t ) and p AL (z t |x t ). Here, we use p A (z t |x t ) as the objective distribution and employ p AL (z t |x t ) as the tuning distribution with parameter θ. Then, the KLD function is constructed as: Then, optimal θ is attained with minimum D KL (p A ||p AL ): If p A () and p AL () are based on the same Gaussian distribution function, then Equation (16) is expressed as [33]: where R t is the covariance of v t . Since R t is independent of θ, the objective function is simplified as: which turns out to be a least-squares approximation problem [34]. Sinceẑ t is the nonlinear functions of the prediction noise q t according to Equation (13), it is difficult to obtain an analytical optimal result. Besides, the computational complexity is increased accordingly to obtain an optimal solution. Thus, linearizing the object function is preferred. We use first order Taylor series expansion at x t to linearize Equation (13):ẑ where ∂h t (x t )/∂x t is the partial differential of h t (x t ) with respect to x t . Additionally, substitute Equation (20) and Equations (2) into (19); we obtain: Therefore, the problem is converted into a linear optimization problem, which is solvable analytically by expressing the objective as the convex quadratic function [34]: where Q t is the covariance of q t and R t is the covariance of n t based on the assumed NPD model. Then, the optimal θ can be obtained if and only if: Then, the unique θ is derived: Since θ is the belief factor for the predicted measurement, Equation (24) indicates that when NPD error is high with a large R t ,ẑ t offers more contribution than the noisy measurement. In other words, when the prediction covariance Q t is larger than R t , our method should assign more belief to z t . In this case,ẑ t is useless and can introduce more estimation error. Thus, when the measurement error is quite small and the prediction error exceeds the measurement error, using our method will introduce extra estimation error. Then, the conventional methods can outperform our algorithms. This means that our method is useful when the measurement uncertainty is higher than the predicted state uncertainty, and the optimal θ exists. Fortunately, in the indoor wireless tracking system, the measurement uncertainty and outlier values are always high. In addition, if the measurement noise is low, the prediction error is also small, due to the performance of the Bayesian filtering estimation. Therefore, our method can effectively improve the estimation accuracy in the noisy environment.

Adaptive Particle Filter
According to the architecture in Figure 3, we integrate our adaptation method with three main PFs, which are BPF, GPF and CPF, and design new PF versions: adaptive BPF (A-BPF), adaptive GPF (A-GPF) and adaptive CPF (A-CPF).

Adaptive Bootstrap Particle Filter
In A-BPF, the prediction statex t is obtained through: where the previous state x t−1 is calculated based on: Then, we obtain the predicted measurementẑ t according to Equation (13).
When the measurement z t is available, the adapted measurement likelihood p(z t |x i t ) for each particle is calculated as Equation (14). Additionally, then, the particle weight is calculated and normalized as w i t ∝ w i t−1 p AL (z t |x i t ), which determines the posterior PDF of the estimated state x t . Finally, x t is attained: The importance sampling and resampling parts are still the same, and the predicted measurement and optimal θ is easy to obtain based on the analytical formulation. Therefore, our likelihood adaptation does not introduce much computational complexity to the original PFs. The algorithm of A-BPF is presented in Algorithm 1. Our adaptation method is implemented in the importance sampling part and the weight adaptation part of PF.

Adaptive Gaussian Particle Filter
The GPF approximates the estimated PDF by Gaussian distributions using the PF method. The GPF assumes that PDF of the state follows a Gaussian distribution, and it samples particles according to the estimated PDF [11]. Therefore, only the mean and covariance of the estimated PDF are calculated and propagated. Due to the simplicity, the GPF is widely used in distributed PF applications [12].
Particles are drawn from the Gaussian distribution functions, where µ t is the mean value of estimated state and Q t is the covariance of PDF. The Gaussian PDF evolves according to the transition model: x t = f t (x t−1 ). Thus, the covariance is assumed to propagate to the next time step, which is Q t|t−1 = Q t−1 . Then, the initial weight for each particle is calculated as: When the measurements are available, the weight for each particle updates: where p(z t |x i t ) is the likelihood.

Algorithm 1 Adaptive bootstrap particle filter (A-BPF).
Prediction: In the A-GPF, the LF fuses both the prior information and current data based on the DGM. The optimal θ is derived based on the previous estimated Q t|t−1 and current NPD covariance R t . Thus, calculating the optimal θ is a recursive procedure in A-GPF. The procedure of A-GPF is illustrated in Algorithm 2.

Algorithm 2 Adaptive Gaussian particle filter (A-GPF).
Prediction: The difference between A-BPF and A-GPF is: the prediction error in A-BPF follows an arbitrary assumed distribution, whereas A-GPF uses the Gaussian distribution to indicate such a distribution.
The assumed distribution in A-BPF is obtained based on the statistical analysis, and the estimation of A-BPF can be accurate if the assumed distribution is correct.

Adaptive Constraint Particle Filter
The CPF randomly samples particles not only based on both the assumed distribution and some constraint conditions, e.g., is the constraint functions [9]. The constraint conditions guarantee the particle generated in the target region with a very high probability by allocating a higher previous weight. In this case, both the previous weight and adaptive parameters are maintained during the estimation, where the complexity is still high. Thus, we use the constraint conditions only to sample the particles, and the weights are calculated according to the adaptive likelihoods. Then, the complexity of A-CPF is reduced.
The constraint conditions can be set up according to different applications. We will detail the conditions for the range-based target tracking in the next section. After sampling the particles, the prediction is obtained based on the prediction function. Then, our adaptation method is used, and the weight calculation follows the same procedure as A-BPF, which is illustrated in Algorithm 3. The major difference between A-BPF and A-CPF is the additional constrains for particles, which can improve the performance of BPF with additional information.

Algorithm 3 Adaptive constraint particle filter (A-CPF).
Constraint Sampling: Resampling: {x i t , w i t } N s i=1 ; end for

Performance Comparison
Note that the performances of the proposed APFs are different and suitable for different scenarios. In general, all of the APFs can effectively reduce the estimation error in the high noise environment, which will be proven in Section 6. The A-BPF reduces the estimation error by using arbitrary prior information. Such information is set up since the initial sampling scheme. If the sampling scheme and prior information are accurate, the estimation results can be good. However, the system cannot adjust the prior information at every time step. Thus, the A-BPF cannot further reduce the error if the measurement error is small. The A-GPF is quite suitable for adjusting itself during the recursive estimation. At each time step, A-GPF calculates the estimation covariance using the current particles and propagates to the next time step as the prior information. This is quite flexible for a dynamic environment. However, the disadvantage of the A-GPF is that there is some error in the covariance calculation. Thus, if the measurement error is too high, the performance of the A-GPF is not as good as A-BPF. The most robust feature for the dynamic environment is A-CPF, which employs constraint conditions to restrict error. In a high noise environment, the estimation error of the A-CPF can be controlled. However, the A-CPF has an inherent error from the constraint condition. Thus, it is still not as good as the A-BPF if the measurement error is small.

CRLB Analysis
In this section, we provide the benefits of the adaptive PFs by deriving the CRLB. The CRLB, which is given by the inverse of the Fisher information matrix (FIM), sets the lower limit for the variance (or covariance matrix) of any unbiased estimators of an unknown parameter (or unknown parameters) [35].

FIM of the Adapted Likelihood
For the adaptive PFs, the prior information is integrated into the likelihood function. If the measurement noise also follows a Gaussian distribution, we can easily derive the covariance of adapted measurement: If the measurement noises are independent of each anchor, the covariance matrix can be simplified asR t = diag N×N (R j ), whereR j can be derived independently in each anchor. Then, I D (x t ) is formulated as: for the detailed formulation, please refer to the Appendix.

FIM of Recursive Estimation
The FIM of recursive Bayesian estimation has been initially formulated by Petr et al. [36], which is given as: where where I(x t ) is a part of I([x t x t−1 ] T ); A, B and C are parts of I([x t x t−1 ] T ), which represent the previous and current information. For the detailed description, please refer to the Appendix.

Analytical Results
We set up several simulations to evaluate the performance improvement of the adaptive PF. The playing field is 100 × 100 m 2 . We randomly deploy 100 sensors. The target chooses a random path walking through the playing field. The range data are obtained through the TOA technique, whose range information is formulated according to Equation (4). The EPD for each sensor follows zero mean Gaussian distribution w j t ∼ σ 2 w , where σ w = 1 m. Additionally, set the NPD of each sensor as n j t ∼ N (1, σ 2 n ), where σ n = 3 m. Root mean square error (RMSE) is mainly compared in the simulation as the estimation accuracy metric. For the CRLB analysis, the RMSE indicates the root value of the unbiased covariance. To compare with our proposed method, we use the common recursive CRLB form in [36], where the measurement noise vector is the combination of the EPD and NPD v t ∼ N (1, R t + W t ). It should be noted that the the measurement noise in the simulation is not simply the sum of two distribution. EPD is the general error distribution, which is added during the whole simulation. However, NPD is applied to represent outlier values of the simulation. Thus, during each time step, we randomly add an NPD error or not. In this case, the NPD error is a few outlier values that can influence the measurement, and DGM must be applied for the PF estimation. The sequential estimation results are illustrated in Figure 4.  We also adapt the measurement noise environment to examine the performance change for each process. The NPD covariance for each anchor changes from R j = (0.5 m) 2 to R j = (5.5 m) 2 . The root mean squared error (RMSE) results of the two CRLBs are depicted in Figure 5. When the NPD covariance is small, the adaptive method cannot improve estimation, but increases the estimation error. In this case, it would rather use only the measurement data to obtain the estimation. However, with increasing the NPD covariance, the adaptive method improves the estimation gradually and outperforms the recursive Bayesian method when the error is quite high. Thus, our method is feasible for a harsh environment with high uncertainty.

Simulation Setup
Our scheme is evaluated in the simulation of target tracking using the wireless sensor networks. We randomly deploy 100 sensors in a two-dimensional square 100 × 100 m 2 region. One target runs through a random path with a constant speed. The target periodically broadcasts the request signals to the sensors, and the sensors respond to the request. Then, the ranging values are calculated according to the time-of-arrival (TOA) of the responses. Each sensor node j is assigned coordinations [s j x , s j y ] T , and the target state at time t is is the velocity vector and T is the transpose operator. In the simulation, we assume that the measurement noise consists of the EPD and the NPD values, and the measurement noise for each sensor node is identical independent distributed (i.i.d). The EPD noise follows the zero mean Gaussian distribution, where w j t ∼ N (0, σ 2 w ) and σ w = 1 m. Additionally, the NPD noise also follows a Gaussian distribution, where n j t ∼ N (1, σ 2 n ) and σ n = 3 m. Then, the DGM can be illustrated as shown in Figure 6.

Particle Filter Modeling
We employ a linear model as the prediction function, and Equation (1) can be expressed as: where F t is the linear transition matrix: where ∆T indicates the time interval.
In the particle sampling stage, the samples are initially randomly generated and propagated according to the linear transition model in BPF and A-BPF. In GPF and A-GPF, the particles are sampled based on the updated PDF. In CPF, the particles are constrained to some conditions. For range-based localization systems, the constraint region can be drawn by the bounding box algorithm, which is demonstrated to be robust to measurement noise [37]. Using z j t to denote the j-th range measurement for node j with the position [s j x , s j y ] T in the two-dimensional playing field, then the geometric constraint region is: max{s where N is the number of sensor nodes. Then, the particles are sampled within this region.

Optimal Belief Factor for A-BPF
First, we compare the estimation accuracy of BPF to our algorithm, A-BPF, to analyze the relationship between θ and the estimation accuracy. We implement A-BPF and vary θ from 0 to 1 to verify whether θ has an optimal value with minimum estimation error. The algorithms are tested in the different measurement noise scenarios, in which the NPD noise variance σ 2 n is tuned from 0.5 2 to 5 2 . The results of our simulation are averaged by over 1000 Monte-Carlo trials. We randomly deploy sensor nodes in every trial. In this simulation, all algorithms generate 1000 particles at each time step t. Figure 7 illustrates the root mean square error (RMSE) of each algorithm with some different measurement noise scenarios. In Figure 7, the solid line represents the RMSE of BPF, and the dashed line indicates the RMSE of A-BPF with different θ, in which θ changes from 0 to 1 with an interval of 0.05. When θ = 0, A-BPF totally believes the noisy measurement and has the same accuracy as BPF. The RMSE of A-BPF is decreasing when θ increases from 0, which indicates that our adaptation method can improve the measurement likelihood and estimation accuracy. As indicated in each figure in Figure 7, A-BPF has a minimum RMSE with optimal θ in different measurement noise scenarios. When θ is larger than the optimal value, the RMSE of A-BPF begins to increase. Additionally, when θ approaches 1, it is over tuned and causes high estimation error.
When the measurement nose is low, e.g., in Figure 7a, A-BPF does not improve much, even with the optimal θ. However, when the measurement noise is quite large, e.g., in Figure 7b, A-BPF can effectively reduce the RMSE with the optimal θ. Additionally, the gap between A-BPF and BPF is much larger in Figure 7b than in Figure 7a. Besides, the optimal θ increases when the NPD noise rises, e.g., in Figure 7a, θ is 0.3, and it becomes 0.8 in Figure 7b.
The comparison of θ obtained in the simulation and derived in A-BPF is depicted in Figure 8. The solid curve is the optimal θ in the simulations, and the dashed curve is the optimal θ based on Equation (24) with different measurement noise. As shown in Figure 8, the optimal θ based on our calculation in Equation (24) is slightly different from the simulation results, because the optimal θ is derived according to the approximation in Equation (20). Since the optimal θ is approaching the simulation results, it is still suitable for implementation. Figure 8 also depicts that the optimal θ increases with the rise of measurement noise. It testifies for Equation (24) that θ rises when the measurement noise becomes larger. In this case, A-BPF assigns more belief for the predicted measurement for adaptation.  Figure 9 illustrates the RMSE comparison for BPF, A-BPF, GPF, A-GPF, CPF and A-CPF, and the CRLB for the adaptive filters is also depicted as the optimal indicator. The measurement noise covariance varies from 0.5 2 to 5.5 2 , and we use the optimal θ based on Equation (24) as the belief factor in the adaptive PFs. When the measurement error is small, the performances of the PFs are similar. Furthermore, we can find that the adaptive PFs have slightly higher error than the original PFs when the noise is too low. This is due to the inherent sampling method, which generate samples with higher statistical covariance than the measurement noise. In this case, our adaptation method relies on such covariance and causes extra error in the final estimation, which is as illustrated in the previous sections. When the error increases, the estimation errors of BPF, GPF and CPF rise accordingly. Especially, CPF has the highest RMSE due to the imprecise likelihood calculation, although it has constraints. This indicates that the measurement noise is the major impact that influences the estimation performance of PFs. The adaptive PFs have better performance. A-BPF and A-GPF have similar estimation accuracy. The RMSE of A-CPF is the lowest and mostly approaching the CRLB. We also compare the estimation performance of PFs with varying the particle numbers. As Crisan et al. indicate, the estimation error should converge to zero with the increasing number of particles [38]. However, simulation results show that the estimation accuracy is corrupt with high measurement noise. When the error is small, as illustrated in Figure 10a, the RMSEs of PFs can converge to a very low value with the increased particle number, except BPF and CPF, which illustrates that the measurement noise can influence the convergence of PFs, although not much. In this case, our adaptive method does not improve much, and A-GPF even has a higher RMSE than GPF. When the measurement noise begins to rise, increasing the particle number of the original PFs cannot improve the estimation accuracy. In Figure 10b, the RMSEs of BPF and CPF begin to rise when the number of particles exceed 100, and GPF does not improve either. This implies that high measurement error leads to a significantly inaccurate likelihood calculation, which degrades the estimation. In addition, CPF outperforming BPF, which is mentioned in [9], only occurs in the low noise case. However, our adaptive method can improve the estimation accuracy and make PFs converge to a low RMSE. In Figure 10, RMSEs of three adaptive PFs decrease with the rising particle number. RMSEs can converge to a very low value even when the measurement error is high. Therefore, our method can reduce the imprecise measurement effect and achieve a better performance.

The Real Indoor Experiment Test-Bed
We also employ a reference system for indoor localization test-beds to examine our proposed algorithms. In this system, we deployed 17 wireless sensor nodes, either along the corridor or in the offices of the research building. A robot carrying a sensor node as a target moved along the corridor of the building with constant speed while recording its own positions [30]. In the previous simulations, we assume that the noise follows an independent identical distribution. However, in the real indoor environment, the wireless signals are propagated through complicated channels. The LOS and NLOS range measurements are mixed for a single target. Thus, a specific distribution for each sensor or anchor is necessary to derive accurate estimation. We still use the DGM to model the measurement noise. All sensors are integrated with the nanoPAN 5375 RF module with a 2.4-GHz transceiver and a 1-Mbps data rate for range measurement, LPC 2387 as the micro-controller and a CC1101 900-MHz transceiver as the radio transceiver for communication. The data collected from sensor nodes are also range measurement values, which are based on TOA. There is an inherent error when the 3D localization is projected into the 2D map, due to the height difference between the sensors and target [39]. We adapt the heights of the sensors to be the the same as the robot to reduce the impact of the height difference in the 3D world. In this case, we can draw an exact 2D trajectory within the map. Figure 11 depicts the map of our experimental building. The triangles, which are randomly deployed, mark the sensor node positions.
We also implement the six algorithms mentioned above to compare their performance. Each algorithm maintains 200 particles, and several trajectories are examined. In Figure 12, we present one estimated trajectory. We compare the estimation performance by evaluating the RMSE. The RMSE results of the six algorithms are listed in Table 1. As shown in Table 1, the mean average errors (MAE), which are the average performance, for adaptive PF are less than the original PFs. Additionally, the adaptive PFs have about 1 m less than the original PFs in the RMSE. The max error, which is the worst case for estimation, in the original PFs is more than 6 m and even 8 m in CPF. However, adaptive PFs have about less than 6.91 m, and the max error can even reach 4.0649 m in A-BPF in Table 1, which is much more precise than the original PFs. The overall cumulative distribution functions (CDF) for both trajectories are plotted in Figure 13. Based on the comparison, our adaptive PFs generally outperform the original PFs and show robust performance in the dynamic indoor environment.

Conclusions
In this paper, we use a DGM, which combines the EPD and NPD, to describe the hybrid LOS/NLOS ranging noise in the indoor environment. Based on this model, we propose an adaptation method by introducing the predicted measurement and its belief factor θ. The optimal θ is derived and implemented into our proposed adaptive algorithms, A-BPF, A-GPF and A-CPF. The estimation performance of the adaptive PFs is proven by the CRLB analysis. In the simulation, we observed that some analytical conclusion for PFs are not suitable for the high measurement noise scenarios, and we verify that the adaptive PFs improve the estimation accuracy with different measurement noise environments and that the optimal θ derived in our method approaches the actual value. The real indoor localization experimental results indicate that, comparing to the original PFs, our algorithms can effectively reduce the estimation error, especially in a high measurement noise environment. Besides, A-CPF is more accurate for target tracking than the other filters. and CRLB is just the inverse of FIM; the estimation covariance cannot be lower than it: Note that the CRLB relies on the system model and the noise distribution. In general, using different information can lead to different results. For instance, if the system only applies the measurement model, the CRLB is higher than the recursive CRLB, which applies the previous information iteratively. In our work, we combine the previous information into the measurement model. When the model and related error distribution are changed, a new CRLB is derived accordingly.
Since p(x t , z t ) = p(z t |x t )p(x t ) based on the Bayesian theorem, it is easily seen that I(x t ) can be decomposed into two parts: where I D (x t ) represents the information obtained from measurement data and I P (x t ) represents the prior information. Firstly, we decompose I D (x t ) using the chain rule as: is the Jacobian matrix of x t : where N indicates the dimension of vector z t , x t = [p t x p t y ]. Note that we employ the measurement function according to Equation (4), then we have: and I H is the FIM conditioned on h t (x t ): which is the measurement covariance R −1 t : then the I D (x t ) for conventional CRLB is: then, we derive the FIM of the adaptive likelihood and recursive formulation based on Equation (A10).
For the adaptive PFs, the prior information is integrated into the likelihood function. Since the measurement is adapted by the prior information, I D is changed by using the adapted measurement z t , which is the combination of predicted measurement and observed data. Then, the error covariance does not follow the original error distribution, but changes accordingly. If the measurement noise also follows a Gaussian distribution, we can easily derive the covariance of adapted measurement: If the measurement noise is independent of each anchor, the covariance matrix can be simplified asR t = diag N×N (R j ), whereR j can be derived independently in each anchor. Similar to Equation (A10), I D (x t ) is formulated as:

. FIM of Recursive Estimation
The FIM of recursive Bayesian estimation has been initially formulated by Petr et al. [36]. The formulation of the FIM, which indicates the estimation procedure of the conventional PFs, is similar to Equation (A4). The FIM of measurement data can be attained according to Equation (A11). Taking the prior information in I P (x t ) into account, the prior probability is the transition probability p(x t |x t−1 ). However, both x t and x t−1 should be estimated during the process. Thus, the information about x t−1 is also not accurate. Then, the estimated parameters for Fisher information is [x t , x t−1 ] T , and the related FIM including I D (x t , x t−1 ) and I P (x t , x t−1 ) should be reformulated.
For I D (x t ) computation, it is extended to be a 4 × 4 matrix, which is: Since the current probability is independent of the previous state, ∂ ln p(x t ,z t ) ∂x t−1 = 0. Then, we obtain: where 0 is a 2 × 2 zero matrix. For the recursive process, I(x t ) is calculated recursively, which involves I(x t−1 ) as the previous FIM. Then, I P (x t , x t−1 ) is expressed as: then, I(x t , x t−1 ) is given: I(x t , x t−1 ) = I P (x t , x t−1 ) + I D (x t , x t−1 ) = A B B T C + I(x t−1 ) (A16) where: where I(x t ) is a part of I([x t , x t−1 ] T ); A, B and C are parts of I([x t , x t−1 ] T ), which represents the previous and current information [36]. Then, I(x t ) is given based on Schurcompletion: