Outlier Detection in GNSS Pseudo-Range/Doppler Measurements for Robust Localization

In urban areas or space-constrained environments with obstacles, vehicle localization using Global Navigation Satellite System (GNSS) data is hindered by Non-Line Of Sight (NLOS) and multipath receptions. These phenomena induce faulty data that disrupt the precise localization of the GNSS receiver. In this study, we detect the outliers among the observations, Pseudo-Range (PR) and/or Doppler measurements, and we evaluate how discarding them improves the localization. We specify a contrario modeling for GNSS raw data to derive an algorithm that partitions the dataset between inliers and outliers. Then, only the inlier data are considered in the localization process performed either through a classical Particle Filter (PF) or a Rao-Blackwellization (RB) approach. Both localization algorithms exclusively use GNSS data, but they differ by the way Doppler measurements are processed. An experiment has been performed with a GPS receiver aboard a vehicle. Results show that the proposed algorithms are able to detect the ‘outliers’ in the raw data while being robust to non-Gaussian noise and to intermittent satellite blockage. We compare the performance results achieved either estimating only PR outliers or estimating both PR and Doppler outliers. The best localization is achieved using the RB approach coupled with PR-Doppler outlier estimation.


Introduction
The Global Navigation Satellite Systems (GNSS), such as the Global Positioning Systems (GPS), have been developed to provide an absolute location on an Earth-Centered Earth-Fixed (ECEF) [1]. These sensors became very popular for autonomous navigation [2] and applications of Intelligent Transportation Systems (ITS) thanks to the worldwide coverage of these constellations and the rather low cost of the receivers. Even if several works have proposed to combine GNSS data with other information sources, either sensors (e.g., Inertial Measurement Unit (IMU) [3]) or prior information (e.g., maps [4,5]), there is still a need to improve the performance of GNSS-only localization. Indeed, even in the perspective of fusion with other data, the accuracy of the GNSS estimation will impact the location result. Then, this study focuses on GNSS-only localization.
Early works estimated the receiver location based on GNSS Pseudo-Range (PR) data. Recently, the estimation of the instantaneous velocity that may be derived from Doppler measurements has been proposed. For instance, [6] introduces both the PR and Doppler measurements in the Extended Kalman Filter (EKF). These Doppler measurements may be particularly helpful in constrained environments where the number of usable observations may drop. Indeed, in space-constrained areas, the obstacles (buildings, trees) reflect the signals sent by the satellites, inducing Non-Line Of Sight (NLOS) and multipath receptions. The corrupted measurements are characterized by a positive bias that increases the estimated satellite-receiver distance in a faulty way, which is difficult to model and to correct. The

Observation Model
In this study, we consider two pieces of information provided by GNSS satellite S j . The first one is the pseudo-range ρ j that is related to the distance between the receiver and S j . Denoting by upper-script the transpose operator, the receiver position is denoted x r = (e r , n r , u r ) in the ENU (East, North, Up) coordinate local system, and the S j position is denoted x S j = e S j , n S j , u S j . We choose the ENU frame for its wide use in land navigation (since it allows us to process the 'up' coordinates separately). Then, the pseudo-range depends on x r , x S j , δ t the time bias (difference) between the two unsynchronized clocks of the satellite and receiver, respectively, c the light velocity and random noise j : Equation (1) is the simplest version of the PR observation equation. It does not represent multipath or NLOS receptions, so that they can be detected as deviations relatively to this model.
The second information piece is the Doppler measurement that is related to the receiver velocitẏ x r = (ė r ,ṅ r ,u r ) . Denotingẋ S j = ė S j ,ṅ S j ,u S j , the S i satellite velocity that is determined using broadcast ephemeris [28],ρ j = (ẋ r −ẋ S j ) · a j − cδ t + j (2) whereρ j , called the "PR rate", is equal to c D j f 1 , with f 1 = 1.575 GHz and D j the Doppler observation (in Hz) provided by S j , a j is to the unit vector collinear to the straight line through the receiver and satellite S j (a j = x S j −x r x S j −x r ), "·" denotes the dot product,δ t the clock drift and j random noise.

Localization Problem
For location estimation, different systems of equations may be considered depending on the used data: PR, Doppler measurements or both data.
Firstly, only using PR, at least four observations are required to estimate vector x r and time bias δ t by solving the system of Equation (1).
Secondly, only using Doppler measurements, theoretically vectorẋ r , time driftδ t and vector x r could be estimated, since they all appear in Equation (2). However, two hindrances to this approach are: (i) per epoch, at least seven observations from different satellites would be required, which is incompatible with robustness to satellite blockage in constrained environments; (ii) Equation (2)'s sensitivity to x r is rather poor, since x r is involved only through a j . Thus, practically, PR measurements are also used to derive an estimation of x r and then an estimation of a j , denotedã j , which is introduced in Equation (2):ρ j +ẋ S j ·ã j =ẋ r ·ã j − cδ t + j The third and last approach consist of considering simultaneously PR and Doppler measurements. The vector ξ 1 = e r ,ė r , n r ,ṅ r , u r ,u r , δ t ,δ t gathers the parameters involved in Equations (1) and (2). Having linearized Equations (1) and (2), the resolution of the derived system (having at least eight equations) can be achieved using the Gauss-Newton iterative algorithm. Specifically, if X S andẊ S denote the matrices gathering the vectors x S j andẋ S j , respectively, and the increment δξ 1 (k) to sum to previous estimate ξ 1 (k−1) (k being the iteration number) is: where z = (ρ 1 · · · ρ n ,ρ 1 · · ·ρ n ) is the vector of observations, z (k−1) is the estimation of z computed from previous (iteration k − 1) state vector ξ 1 (k−1) and H is the Jacobian matrix of the Equations (1) and (2) system (see [29] for more details).

Dynamic Model
In order to increase the robustness of the estimation, this latter may be done considering not only one epoch, but several epochs. Then, the data acquired at the different epochs should be linked through a model. In [30], the authors propose a polynomial dynamic model fitted on a time interval, including multiple epochs. Limiting ourselves to the first order, PR measurements are related using a dynamic model involving GNSS, the receiver location and speed, so that the ξ 1 vector already defined is suitable. However, considering also Doppler measurements, the acceleration should be introduced in the dynamic model, and the considered state vector becomes ξ 2 = e r ,ė r ,ë r , n r ,ṅ r ,n r , u r ,u r , δ t ,δ t .
State vectors, either ξ 1 or ξ 2 , at different instants are linearly linked through transition matrices M i,dt of the considered dynamic models, defined as follows: where the superscript τ denotes the anti-diagonal transpose operator (the transpose of the matrix with respect to the anti-diagonal). Denoting ξ i,t , i ∈ {1, 2}, the state vector at t, where o dt i+1 is the error (approximation) of the considered dynamic model. Using the dynamic Model Equation (7), we are now able to compute the expected measurements (PR or Doppler) at different instants. Specifically, denoting T = t k = t + kdt, k ∈ 0, . . . , n ep − 1 the set of epochs considered for the estimation of the solution, X S i ,t k = e S i,t k , n S i,t k , u S i,t k the satellite S i location at instant t k and the expected pseudo-rangeρ i (t k ) from S i at t k may be derived from ξ 1 : where the subscript in matrix notation [ ] l 1 ,l 2 ,l 3 denotes the restriction of the matrix or vector to rows l 1 , l 2 and l 3 and ||v|| is the norm of vector v.
In a similar way, the Doppler measurement expected at t k from S i may be derived from ξ 2 : Then, using classical regression, the state vector optimal valuesξ 1,t andξ 2,t are those minimizing the quadratic errors:ξ where I (t k ) is the set of the indices of the satellites providing measurements at t k and β is a weighting factor between the residues associated with PR and Doppler data, respectively. In previous equations, the minimization is performed considering all of the measurements (PR and/or Doppler ones) available for the considered set of epochs T . However, some of these measurements may correspond to outliers, and might then bias the estimation. In the following part the paper, in addition to the acronym PR, we use the abbreviations "Dp" for "Doppler measurement" and "(PR,Dp)" for "both PR and Doppler measurements".

Proposed Approach
In the presence of outliers, several strategies have been proposed. Robust methods aim at automatically mitigating the weight of these outliers in the estimation. For instance, PF or its variants belonging to the class of robust estimators can theoretically cope with outliers simply by giving a very small weight to the generated particles. However, if this filter has proven its efficiency against noise, we will see that too many outliers jeopardize the filter stability. Then, in the case of GPS data processing, some statistical tests have been proposed to detect the outliers, e.g., [31]. The most simple to cope with these outliers is simply to discard them from the data measurements (just as if the corresponding satellites were blocked). This is the strategy of the standard Fault Detection and Exclusion (FDE) technique implemented in the GPS receivers (even if they can only cope with at most one erroneous measurement [32]). More sophisticated strategies have also been proposed, e.g., [15,33], that aim at correcting the outliers. However, in this study, we do not consider such strategies, because we focus on the following basic main questions: • For the localization problem, are Doppler measurements less subject to outliers than PR measurements? • Does the presence of outliers also impact robust localization algorithms, such as PF or the Rao-Blackwell Particle Filter? • In the affirmative case, is it worth detecting and discarding these outliers?
Then, in the localization algorithm, we add an outlier detection step that will select the data (among those available) involved in the location estimation. Specifically, considering filtering algorithms with two steps, prediction and estimation, the outlier detection step is inserted before the estimation step.
Outliers are searched either only in the PR dataset or also in the Dp dataset, depending on the assumption on Dp robustness: • If Doppler measurements are assumed reliable like in [1], they are directly used to deriveẋ r , and the outlier detection is performed only within the PR set. • Otherwise, we assume like [7] that, even if the Doppler measurements are less distorted by NLOS reception than PR measurements (and thus, more reliable), both Doppler and PR observations are contaminated by multipaths. Then, the outlier detection is applied for (PR,Dp), so that only Dp inliers are used to deriveẋ r (and only PR inliers are considered for the estimation step further).

Outlier Detection
The outlier detection is performed using the a contrario approach that we proposed in [23] extended to the case of (PR,Dp). The a contrario approach detects the inliers as observations that are "too regular to occur by chance". "Chance" is measured through the Number of False Alarms (NFA), based on two items: a model, called the "naive" model, that represents the statistics of the outliers (the H 0 hypotheses in statistical decision theory) and a measurement that will allow the distinction of inlier and outlier sets under the "naive" model assumption. In [23], we have proposed and compared two "naive" models leading to two NFA criteria for partitioning the data between inliers and outliers. However, these models only deal with PR measurements. In this study, we extend to (PR,Dp) the first NFA criterion that experimentally leads to slightly better results than the second NFA criterion.
Before presenting the extended algorithm, let us specify the equations used and the notations. Assuming a value of ξ i denotedξ i and the satellite features (location and velocity), we are able to compute using Equations (8) and (9) the expected value of PR or Doppler measurement. Then, we can compare these expected ones to the actually observed ones. By definition, the residues are the differences between computed measurements (under theξ i hypothesis) and the observed ones: r i (t k ) associated with the PR observation at t k , ρ i (t k ), is: whereρ i (t k |ξ i ) is computed using Equation (8), andṙ i (t k ) associated with Doppler observation at t k , whereρ i (t k |ξ i ) is computed using Equation (9). In order to give the same weight to both kinds of measurements, PR and Doppler ones, r i (t k ) andṙ i (t k ) are normalized by their standard deviation, σ PR and σ Dp , respectively, and gathered into vector R: with M the cardinality of (PR,Dp) set. As for Equations (10) and (11), several epochs are considered. This allows us to increase the number of available data, as well as the quality of the estimation, provided that the dynamic model (Equations (8) and (9)) used to 'align' the data acquired at different epochs is sufficiently accurate. The number of considered epochs, n ep , is then a compromise between data availability and dynamic model approximation. In the following, S PR and S Dp denote the sets of available observations (PR and Doppler measurements, respectively) over the considered interval of epochs T .
Let us now consider a subset of measurements noted D in the whole set of observations S PR , S Dp . Given D and R (Equation (14)), δ 2 D is defined as the sum of the squares of R components for indices j belonging to D (indeed, S PR , S Dp measurements being indexed, D also corresponds to a set of indices). Then, according to [23], δ 2 D allows us to quantify the consistency of D through the NFA measure (associated with the Gaussian naive model N (0, σ)): where Γ is the Gamma function, |.| is the cardinality set operator and η 1 is a normalization term that allows us to control the average number of false alarms [17]. The χ 2 test using the SSE (Sum of Squared Error) is used in the classical RAIM (Receiver Autonomous Integrity Monitoring) method [10,34] to detect the presence of erroneous data. However, it requires an a priori parameter, namely the probability of false alarm P FA , to threshold the values of SSE. Conversely, using the NFA criterion, we are free from the fitting of a threshold parameter, since the solution is derived by optimization of the NFA function: the subset of inliers is the subset of measurements that allows us to reach the minimal value of NFA. Let us underline the difference between the parameter σ involved in the naive model and a threshold parameter: whereas a set of inliers obtained by thresholding will be very sensitive to the used threshold value, we have shown [23] that the subset D that minimizes the NFA value is very robust to naive model parameter σ.
Algorithm 1 presents the extended version of Algorithm 1 of [23] that allows us to find the subset D minimizing the NFA criterion. Here, the input parameters are the observation sets S PR and S Dp (possibly empty if Doppler measurements are not considered), the number of iterations N test , the parameter σ of naive model M, the standard deviations, σ PR and σ Dp , for the residue normalization and the binary parameter I Dp that is equal to zero or one, depending on the kind of processed data: only PR data or (PR,Dp), respectively. The output parameters are the subset D of the inliers and the estimation ofξ i .
Following the a contrario RANSAC principle (e.g., [35]), the algorithm performs different estimations or tests (loop until N test ) in order to select the best one according to the NFA criterion. Then, for each test, it performs the three following steps. First, the data selection step consists of randomly drawing d = 8 elements in S PR (the set of PR observations) or, if I Dp , d = 10 elements in S PR and S Dp (the set of Doppler measurements). The numbers eight and 10 correspond to the minimum number of observations to estimateξ 1 orξ 2 further. S PR and S Dp include any available observations performed during the considered interval of n ep last epochs. According to [23], the random drawing of observations is biased in order to favor the drawing of favorable configurations of satellites. Since we use a sliding window over epochs, there is an overlapping between the sets of considered epochs for the estimation at two successive instants. Therefore, from the processing of the previous instant, we know the inliers corresponding to previous n ep − 1 epochs. Then, like in [23], random drawing is constrained, such that: (i) there is at least one measurement per epoch; (ii) for epochs before the last one, the PR /Doppler measurements are chosen among the already detected inliers; (iii) the selection of different satellites is favored.
These d observations are used to derive a preliminary solutionξ 1 orξ 2 (depending on the I Dp value). To derive this solution, a regularization term may be added to Equation (8) or Equation (9), allowing both better conditioning of the problem and the receiver trajectory being smoother. Considering the regularization term, instead of Equation (10), we have to solve Equation (16): (16) and instead of Equation (11), we have to solve Equation (17): In Equation (16) and Equation (17), ξ i,t|t−1 , i ∈ {1, 2}, is the predicted vector state according to dynamic Model (7); abs (v) returns the vector of the absolute values of v components; and , is the vector of the regularization parameters (λ weights the importance of the deviation between estimatedξ i and predicted state vector ξ i,t|t−1 ). The Appendix specifies the derivation of Algorithm 1:ξ i Estimation Based on the NFA Criterion with Inputs S PR , S Dp , σ, σ PR , σ Dp , I Dp , t, N test and Outputsξ and D. 6 Initialize the vectors δ min [i] and NFA [i] and the scalar NFA min to +∞; 7 for 1 to N test do 8 if I Dp = 0 then 9 From S PR and S Dp , respectively, draw randomly d 2 observations x S j , y S j , z S j , ρ j and d , estimateξ 2 using Equation (11) or Equation (17) for a regularized solution; 11 Using Equations (12), (13) and (14), compute the vector of normalized residues, denoted R ; 12 else 13 From S PR , draw randomly d observations x S j , y S j , z S j , ρ j , denoted by {o 1 , ..., o d }; 14 From {o 1 , ..., o d }, estimateξ 1 using Equation (10) or Equation (16) for a regularized solution; 15 Using Equation (12), compute the vector of residues, noted R; 16 end 17 Sort R by increasing values into a vector R = r j j∈{1...M} with π (.) the index permutation function, such that ∀j ∈ {1...M} , r j = R π(j) ;
The second part of the algorithm computes the non-null residues for all of the other (not drawn) observations, either only PR or (PR,Dp). Having increasingly sorted the vector of residues, the last part of the algorithm computes the minimum NFA values by varying the cardinality of D (increasing from d + 1 to M). δ min [i] is a vector that stores the values of the minimal quadratic errors (sum of the squares of the residues) for every cardinality of subset D. Indeed, for a given cardinality of D, the NFAvalue is minimum for minimum value of quadratic error d 2 D that is achieved considering the |D| lowest values of residues (hence, the sorting of R). Then, NFA min [i] is a vector that stores the NFA values corresponding to δ min [i]; NFA min is the minimum among NFA min [i] , ∀i ∈ {d + 1, . . . , M}. The inlier subset is the set D achieving the NFA min value. Finally, state vectorξ 1 orξ 2 is estimated from D and Equation (18) (18) where R j is the residue provided by Equation (14). Algorithm 1 has a linear complexity with N test . For one iteration, the complexity mainly comes from state vector estimation (Algorithm A1, Appendix). The complexity of this latter depends on d: matrix inversion and matrix multiplication are in O(d 3 ). Then, the complexity of the sorting of R is in O(Mlog(M)). For NFA(PR,Dp), d = 10, and M varies in [12,33] considering a temporal window of three epochs. Therefore, to control the computation time, one should fit the parameter N test .
Finally, note that, even if Algorithm 1 provides estimations of GNSS receiver localization parameters, the proposed coupling between Algorithm 1 and the robust localization algorithm (PF/RBPF presented in the next section) is only done in terms of data selection. Indeed, in Algorithm 1, the provided estimation only aims at evaluating the consistency of a subset of data, whereas PF/RBPF allows for non-linear/non-Gaussian data filtering that exploits some classic a priori parameteron the smoothness of the trajectories. Such an independence between the detection step (Algorithm 1) and the filtering step (PF/RBPF) increases the robustness of the global localization algorithm.

Localization Algorithm
The particle filter, also called the Sequential Monte Carlo (SMC) method, is a numerical method that consists of approximating the posterior probability p(x t |z t ) (probability of the state x t given the set of observations z t ) using a sufficient number of particles x i t . A particle represents a state vector solution, and the associated weight w i t represents its likelihood. Such a representation based on samples/particles allows us to approximate and deal with any statistical distribution of error, especially non-parametric ones and non-Gaussian ones.

SIR-PF
The Sequential Importance Resampling (SIR) particle filter [13], also known as the "bootstrap filter", is the most popular method to solve the non-linear filtering problem.
For SIR-PF, the number of the required particles is directly linked to the dimensionality of the state vector. In order to keep a reasonable number of particles (bounded to a few thousands), we assume that either the altitude is constant, as is often in urban environments, or it is known as in our case from the outputξ of Algorithm 1, so that it has not been introduced in the state vector. For the same reasons, velocity is also excluded from the state vector (conversely to the RBPF state vector presented in the next section). Then, the SIR-PF particles are x i t = (e i t , n i t , δ i t ) , where i denotes the particle index and t is the epoch.

Prediction
Step This step, sometimes called PF time update, aims at providing an estimation of the state vector at the next time step. Note that if here, we place it at the beginning of iteration at time t, it can equivalently be placed at the end of iteration at t − 1.
To predict the next position of the particle, we need an estimation of the velocityẋ r . Since,ẋ r is not part of the state vector, it should be provided by external data. Using GPS-only data, we consider Doppler measurements to deriveẋ r : Doppler measurements at time t − 1 provide PR rates from which we derive the receiver velocityẋ r = (ė r ,ṅ r ,u r ) using Equation (3). In order to comply with common notations in the transportation and navigation community,ẋ r can be equivalently represented in terms of norm and orientation: V i t−1 = sqrt(ė 2 r +ṅ 2 r ) and θ i t−1 = arctan (ṅ ṙ e r ), respectively. Then, we predict the next state at t of the i-th particle according to: where the time step dt is equal to one and ν is the prediction noise associated with each component of the state vector. Indeed, as a stochastic approach, PF is based on stochastic simulations provided here by the addition (to the deterministic predictions state vectors) of a Gaussian noise with zero mean and standard deviation (σ e , σ n , σ δ t ).
Note that in our case, the velocity used for prediction is estimated from (ė r ,ṅ r ) at t − 1. Instead of using Doppler measurements at t − 1, we could have used those acquired at t. However, since the prediction is between t − 1 and t, it will not provide necessarily a more accurate prediction. In comparison with the RBPF (presented in Section 3.2.2), let us underline that the velocity estimation is performed as an external process to the SIR-PF itself, since velocity is not a part of the state vector.

Estimation Step
This step, sometimes called PF measurement update, aims at correcting the prediction step estimate according to the observations. Since velocity is not represented in the state vector x i t , the posterior probability of our SIR-PF is only computed relatively to the PR measurements. It is denoted p(ρ t |x i t ) with ρ t the vector of ρ i observed at t. The update process of weights w i t is a weighting function of their previous values [36] by the observation likelihood function p(ρ t |x i t ): In most cases, because of computational constraints, the likelihood function p(ρ t |x i t ) is approximated by a multivariate Gaussian density. Finally, normalization of the weights is performed so that ∑ k i=1 w i t = 1. Having updated the weights, the 'optimal' state vectorx t is derived as the weighted sum of all particles:x

Resampling
This step aims at preventing the degeneracy of the algorithm, in particular to avoid that computer resources are consumed by "unlikely" particles. During this step, a threshold is computed [36] to partition the set of the particles according to their weight [13]. Having removed the particles that present lower weight than the considered threshold, the remaining particles are duplicated in order to keep a constant number of particles, and all of the weights are reinitialized to a constant value (reciprocal of the total number of particles).

Rao-Blackwellised PF
In previous PF, the velocity was estimated directly from Doppler measurements (being 'outside' of the PF estimation step, it does not take into account previous estimations of the PF prediction step). This boils down to assuming no noise on Doppler measurements. In order to avoid such an assumption and to be more realistic, we extend the state vector from (e, n, δ t ) to e r , n r , δ t ,ė r ,ṅ r ,δ t ,ë r ,n r , , i.e., its dimensionality increases from three to eight.
However, standard PF would require a very important number of particles to explore the whole space of solutions, and the PF would become intractable. On the other hand, the Rao-Blackwellization approach [37,38] was proposed both to reduce the complexity and to better approximate the solution in case of convex functions. It is based on the idea that splitting the state vector allows us to decrease the approximate error by exploiting linear substructures [25]. A classic case corresponds to the splitting of the initial state vector into two sub-vectors, one being estimated analytically and the other one by importance sampling (e.g., PF). Thus, the number of particles required for precise estimation remains tractable thanks to the lower dimensionality of the non-linear subsystem [25,38].
Considering our problem, we split the system of eight components describing the prediction step equations into two sub-systems, a linear and a non-linear one, as follows. The equations involving PR observations (Equation (1)) are non-linear leading to a non-linear system for deriving GPS position. On the other hand, the velocity estimation knowing the position of the receiver and the Doppler measurements is achieved solving a linear system (Equation (3)). Thus, we define the two state vectors x p f = (e r , n r , δ t ) and x k f = ė r ,ṅ r ,δ t ,ë r ,n r .
The posterior probability of the RBPF is factorized: where z t still denotes the set of observations. The first term is solved analytically using EKF, and the second term is estimated by Monte Carlo sampling using PF. Then, in RBPF, we can keep the same number of particles as in Section 3.2.1, while considering also the receiver velocity in the state vector and filtering it. The proposed model for RBPF is triangular: where I n×n is the square identity matrix of dimensionality n, 0 m×n is the rectangular zero matrix of dimensionality m × n, Q p f and Q k f are the covariance matrices of the noise, which is assumed zero mean Gaussian (for notation shortness, we omitted the time dependency for covariance matrices) and A p f ,dt and A k f ,dt are the transition matrices defined as follows: The non-linear part is processed using the same PF presented in Section 3.2.1 to estimate the state vector of each particle x p f (i),t and its associated weight w i t . The linear part is processed using an EKF applied to the state vector x k f (i),t of each particle recursively. EKF involves two main steps:

Prediction Step
This step occurs between the prediction step and the estimation step of the SIR-PF. We define intermediate variables, (27) where y t is interpreted as an error measurement and L t and N t are intermediate matrices modeling the impact of the non-linear system on the linear estimation. Then, where P t−1|t−1 is the covariance matrix of x k f t . Note that if the A p f t matrix is null, previous equations boil down to Kalman's filter prediction step. Note that, since the prediction step presented in Section 3.2.1 is involved in Equation (27), the current prediction step occurs after the prediction of the non-linear part of RBPF.

Estimation Step
This step occurs between the estimation step and the resampling of the SIR-PF. It is the classical correction step of the extended Kalman filter.
where C t is the observation matrix of Doppler measurements derived from Equation (3). This analytical correction of thex k f ,t|t subvector is independent from the estimation ofx p f ,t|t that is performed according to the estimation step presented in Section 3.2.1 (Equation (20)).
One of the objectives of this study was to check the interest of removing outliers from the datasets, either PR or (PR,Dp). This can be achieved by comparing the localization results obtained using outlier detection coupled with PF or RBPF.

Experiment and Results
In order to test our localization method, we have acquired data in constrained environments: an urban canyon and forest, characterized by NLOS reception. Figure 1b shows the receiver trajectory in the South of Paris (France). It is 5 km long for an experiment duration of 11 min.  Figure 1a shows the used experimental vehicle ZOE that is equipped with two low cost GPS and one high cost GPS. The two low cost GPS are GARMIN 18x and UBLOX EVK-5T, which are single-frequency receivers delivering the positioning solution at 1 Hz. The high cost GPS is an APS-3 multi-frequency and multi-constellation receiver (L1/L2/L2C GPS, GLONASS and satellite-based augmentation system (SBAS)) that belongs to the GPS class RTK (Real-Time Kinematic). This latter has a sampling frequency equal to 1 Hz, and its location accuracy is equal to 1 cm, according to factory specifications in the case of the "fixed solution". This solution was available during 41% of the experiment (cf. Figure 1c), whereas the two other solutions, 'RTK float' and 'differential', whose precision may drop until 40 cm, were available during 39% and 20%, respectively. APS-3 is used for two purposes: to establish the ground truth and to get the raw data used in the localization algorithms. However, the considered raw data are not post-processed by APS-3.

Platform and Parameters
The GARMIN 18x has an accuracy (measured by the root mean square error) equal to 5 m in location and 0.05 m/s in velocity. Finally, the UBLOX EVK-5T acquires only PR measurements (no Doppler measurements) and is specified to have a location accuracy of 3 m in the static case and an open area.
The configuration of the satellites during the experiment is shown in Figure 2. The number of available satellites varies between four and 11 with an average equal to nine.
For the used algorithms, the parameters are:

Localization Results
The global performance of the localization is represented in terms of the cumulative distribution curve: the better the result, the greater the area below the cumulative distribution curve. In this study, we consider eleven localization processes. Two of them are GPS solutions themselves: either the UBLOX or the GARMIN GPS. The GARMIN and UBLOX EVK-5T solutions are plotted just as references, since it would be unfair to compare high cost and low cost GPS. However, we note that the GARMIN solution seems rather interesting, and even if the GARMIN algorithm is unknown, we may guess that it uses preprocessing of the measurements. For instance, if it uses the satellite elevation mask (discarding the satellites having an elevation lower than 15 • ), according to Figure 2, the satellites S 4 , S 10 , S 11 , S 31 and S 32 will not be used, which corresponds to frequent outliers, as we will see further.
The other processes correspond to different versions of the extended Kalman filter, the particle filter and the Rao-Blackwellised PF: without removing any outliers, by coupling it with the PR outlier detection or with the (PR,Dp) outlier detection. In the three filters (EKF, PF and RBPF), the initial solution is provided either by the least mean square solution or by the output of Algorithm 1 when there is an outlier detection step. For comparison, we also implement a recent robust outlier method called ORKF (Outlier Robust Kalman Filter) [39]. It is similar to the EKF, except that the covariance of the observation noise is estimated recursively inside the estimation step (releasing the assumption on the measurement precision). Figure 3 and Table 1 allow us to draw the following conclusions: • Among the implemented algorithms, the Particle Filter (PF) provides rather disappointing results with an error lower than 6 m in only 55% of cases. This relatively bad performance of PF, against EKF for instance, is probably due to the fact that the velocity is not part of the state vector; it is not at all filtered, conversely to the case of the EKF. • The ORKF has better performance than the simplest version of PF and the classical EKF, and similar performance to EKF + NFA (PR) and EKF + NFA (PR + Dp) when the errors are less than 6 m.
• By removing the PR outliers at the entry of the filters, EKF + NFA (PR) and PF + NFA (PR) allow for much better localization than the 'all-data' EKF, PF or even ORKF for errors lower than 6 m. Besides, if EKF + NFA (PR) still performs better than PF + NFA (PR) for errors lower than 6 m, the gap has narrowed, and in terms of errors lower than 3 m, PF + NFA (PR) outperforms EKF + NFA (PR). • By removing also the Dp outliers, PF + NFA (PR,Dp) provides better results than the previous methods. For instance, its 95th percentile corresponds to an error lower than 9 m, whereas PF + NFA (PR) percentile error is 11.5 m. This clearly illustrates the interest of removing also the Doppler outliers, especially as they are not filtered (by the estimation step of PF). Conversely, in the case of the EKF where velocities are filtered, the effect of removing Dp outliers is less clear: it appears just for errors lower than 9 m. • By removing the PR outliers, RBPF + NFA (PR) has the same performance in localization as the PF + NFA (PR,Dp) version (see Table 1). This can be explained by the fact that, by filtering the velocity estimation, RBPF is rather robust to outliers in Doppler measurements. It also outperforms EKF + NFA (PR). • Finally, removing also the Dp outliers, RBPF + NFA(PR,Dp) outperforms all of the other results.
According to Table 1, if the performance for PR + NFA (PR,Dp) and the two RBPFs is close under 3 m, a higher level of confidence is achieved by RBPF + NFA (PR,Dp) for errors lower than 6 m and 9 m.    Table 2 shows the global precision of the localization. Precision was evaluated through three indicators: the Norm 1 norm, the Norm 2 and the mean and standard deviation of errors. Norm 1 and Norm 2 can be computed on east and north coordinates: precisely, denoting i the error of the position at instant i along a given direction (east or north),  Figure 3: Among the implemented algorithms, when using all-data, EKF and ORKF show good performance, and when removing the outliers (either PR or (PR,Dp)), RBPF outperforms the other approaches. The best results are obtained for NFA (PR,Dp) coupled with RBPF, even if the interest of removing outliers can also be noticed in the case of EKF or PF. Finally, to quantify the improvement due to the NFA outlier detection, we run RBPF with an elevation mask removing satellites below 15 • (as is usually done on most GNSS receiver devices). The results are: Norm 1 = (2.82, 3.10), Norm 2 = (4.68, 4.38) and (µ loc , σ loc ) = (4.64, 4.42). As expected, localization is less accurate than RBPF + NFA (PR,Dp) or even RBPF + NFA (PR), showing that the satellite elevation criterion does not exactly fit the outlier detection. Table 2. Localization error (in m) on (east, north) coordinates, Norm 1 and Norm 2 of error, error mean and standard deviation: comparison of the four versions of KF, the five versions of particle filters and the two GPS solutions on our 11 min 40 s experiment.  Table 3 shows the localization error computed on the three subparts of the trajectory corresponding to the three RTK solution qualities. The localization results are those obtained using RBPF with removal of outliers, either in the PR dataset or in the (PR,Dp) one (we focus on the best results), and the considered errors are computed as previously in terms of Norm 1 , Norm 2 on east and north coordinates and the mean and standard deviation of the distance between estimation and ground truth. From Table 3, we observe a 'correlation' between the quality of the localization result and the RTK quality: localization is more precise on the RTK fixed part than on the RTK float part, and the differential part presents the worst localization results. There are two interpretations of such a fact: (i) the imprecision of the ground truth in the case of RTK float or the differential solutions introduces a supplementary error that slightly degrades the estimated precision of the localization; (ii) the RTK fixed solution occurs mainly in open areas (whereas the RTK float solution also occurs in an urban environment and the differential solution in the forest part; cf. Figure 1b and 1c) where localization is generally good. Indeed, looking at the localization precision distribution versus the RTK solution for other methods , we also note that the results are more precise on the RTK fixed part of the trajectory.

Validation of the Outlier Estimation
In this section, we aim at checking the efficiency of Algorithm 1 in terms of outlier detection. The tricky part is the derivation of a 'ground truth' in terms of outliers. First of all, note that the definition of an outlier itself depends on the adopted point of view: from the statistical point of view, an outlier is a measurement considerably dissimilar or inconsistent with the remainder of the data [40], whereas from the physical point of view and according to the considered application, an outlier is then a measurement affected by multipath or NLOS reception. In this study, we adopt the statistical definition, and we derive an estimation of the biases, like in [7], as follows.
Among the (PR,Dp) set, we want to derive the subset of observations that behave as outliers from the statistical point of view. The only "ground truth" we have is the receiver position that is provided by the APS-3 GPS + GLONASS RTK. The construction of a "ground truth" about outliers from this ground truth about receiver localization proceeds in two steps: (i) firstly, estimation of the biases between observed measurements and expected ones; (ii) secondly, analysis of the biases to classify them as induced by outliers or by inliers.

Bias Estimation for Qualitative Analysis
For the first step, we have to estimate the 'expected' measurements from the receiver localization ground truth. This latter allows us to derive the Euclidean distance between the satellite S j and the receiver position, d j . However, we still need to estimate the clock biasδ t . In [7], the equation (ρ j − d j ) was used. However, the mean estimator is not robust to outlier presence nor to the fact that the oscillator embedded on GPS receivers is not stable nor accurate. Then, we rather use the M estimator [41] as a simple solution among robust estimator class: Assuming e the vector of residues of clock bias estimation (e j = ρ j − d j − cδ t ), α(e j ) is the weight coefficient defined by α e j = 1 |ej| , and the optimal clock bias cδ t and the PR bias estimate∆m j are: In a similar way, we derive the biases∆ṁ j on Doppler measurements knowing both the velocity and location of the GPS receiver. Figures 4 and 5 allow us to check qualitatively the consistency between the large biases (either in PR or Doppler measurements) and NFA detected outliers. Specifically, they show the temporal variation of the estimated biases for PR and Doppler observations of each satellite, and the values detected as outliers by the NFA algorithm are pointed out (by a red marker). We also observe that the estimated signal in Equation (33) is probably affected by atmospheric and electronic noises that differ from one satellite to another. This satellite specificity induces different biases even between consistent curves (e.g., see the S 17 , S 20 and S 23 curves in Figure 4). In the Dp case, the estimation is less affected by atmospheric noise, so that the peaks in Figure 5 corresponding to potential NLOS reception or multipaths reception appear clearly.

Bias Classification for Quantitative Analysis
To evaluate quantitatively the efficiency of outlier detection, we have to label the previously estimated biases either as (induced by) "outlier" or as "inlier". Such a labeling was done only for thẽ ∆m j (due to the objective difficulty of labeling the∆ṁ j ) by a human operator as follows. For every epoch t of the experiment, a bias∆m j is labeled "outlier" if, at t, it appears neither consistent with the average bias of the considered satellite nor with the other satellite biases. Practically, a thresholding step relative to the median value of all∆m j at t is first applied (the test of consistency with other satellites), then followed by visual inspection of the selected biases. For instance, at time 351 s, even if S 23 presents a rather important∆m j value, only S 1 and S 4 are labeled as outliers. Even if previous labeling includes a subjective part, we assume that it is statistically unbiased, and we use it to analyze statistically outlier detection results.
From previous bias labeling, on the one hand, and inlier set D provided by Algorithm 1, we compute the numbers of True Positives (TP, PR ∈ D with bias label "inlier"), false alarms called False Positives (FP, PR ∈ D with bias label "outlier"), misdetections called False Negatives (FN, PR / ∈ D with bias label "inlier") and True Negatives (TN, PR / ∈ D with bias label "outlier"). From these statistics, the accuracy  Table 4 shows the obtained values. The two presented coupling ways between particle filter and outlier detection either restricted to the PR measurements or for (PR,Dp) are called "NFA (PR)" and "NFA (PR,Dp)", respectively. By comparing these two approaches, we note that the performance of both of them is very high. Besides, they appear very close given the statistical imprecision and the labeling process.

Correlation between PR and Doppler Outliers
Having, at least qualitatively, positively assessed the outlier detection, we can interpret its result also in terms of the occurrence of Doppler outliers.
In terms of global statistics and correlation between PR and Doppler outliers, during the experiment, NFA (PR) excludes 9.83% of available PR observations, whereas NFA (PR,Dp) discards less PR observations (8.37%), but discards 2.85% of available Doppler observations. Among the Doppler outliers, 54% are also PR outliers. Thus, one can deduce that, according to these statistics, in constrained environments, Doppler measurements present three-times less outliers than PR measurements, but they, nevertheless, suffer from NLOS or multipath phenomena.

Conclusions
In this paper, a new approach able to cope with a significant number of outliers was presented for GNSS positioning. Based on a contrario modeling, the Number of False Alarms (NFA) criterion allows us to partition the pseudo-range and Doppler measurements between inliers and outliers. Then, detected outliers are removed from the dataset to achieve robust estimation of receiver position and velocity. Two models based on particle filtering have been considered for the localization process. The first model (PF) only filters the receiver position, whereas the second model (RBPF) is a more complete filter that handles receiver position and velocity and using both PR and Doppler observations in its estimation step. Tests have been performed in the case of a receiver on board a vehicle traveling in urban canyons and forest areas. Results show that, by excluding erroneous measurements and filtering the noise of the observations, more accurate localization is achieved.
Future work will deal with the optimization of the time processing and memory. The a contrario approach may be parallelized, since the N test loop in Algorithm 1 may be run independently, and the comparison of the results to get the solution minimizing NFA may be only done at the end of the algorithm. Besides, the prediction part of the RBPF can also be processed simultaneously with the outlier detection. We will also investigate a stronger coupling between particle filtering and a contrario estimation. The proposed a contrario detection algorithm will not only provide the partition between inliers and outliers, but it could also provide an estimate of the state vector (used to interpret the measurements) that may be combined with the particle filter estimate in a fusion process. Finally, we aim at using a more sophisticated observation model instead of basic Equation (1), e.g., including the atmospheric effect, to analyze the detected outliers and, when possible, to correct them.

Appendix
Equation (18) is solved using Algorithm A1. It is presented in the case of ξ 2 estimation, but the case of ξ 1 may be derived as a specific case. The input data are the observations, for the previous solution ξ (2,t|t−1) and the regularization parameter vector λ. Like in most practical applications, the regularization parameter is fitted (or learned) using a first set of data. In our case, we do not regularize clock bias and clock drift, δ t andδ t , so that the corresponding λ coefficients are set to zero. This algorithm involves the computation of the Jacobian that is as follows.
Let us define two non-linear functions: where X = (e r , n r , u r , δt,ė r ,ṅ r ,u r ,δt,ë r ,n r ) is the unknown state vector. Then: E r = e r +ė r dt +ë r dt 2 2 , N r = n r +ṅ r dt +n r dt 2 2 , U r = u r +u r dt, E r =ė r +ë r dt, N r =ṅ r +n r dt, ∆t = δt +δtdt, a e i = e S i −E r R i , a n i = Where dt is the time update and R i = (E r − e S i ) 2 + (N r − n S i ) 2 + (U r − u S i ) 2 is the distance receiver/satellite for all S i , i ∈ {1 . . . n}.
The Jacobian is: · · · ∂ f n (X) ∂n r ∂g 1 (X) ∂e r ∂g 1 (X) ∂n r · · · ∂g 1 (X) ∂g n (X) ∂e r ∂g n (X) ∂n r · · · ∂g n (X) where: ∂n r = dt 2 2 ∂ f i (X) ∂n r and: ∂ė r = a e i , ∂g i (X) ∂ṅ r = a n i ∂g i (X) ∂u r = a u i , ∂g i (X) ∂δt = −cdt ∂g i (X) ∂ë r = a e i dt, ∂g i (X) ∂n r = a n i dt The derivatives of the unit vector are given by: Algorithm A1: Estimation Based on the non-Linear Solver and Including the Regularization Term, Inputs: Matrix S, Whose Lines are the Observations x S i , y S i , z S i ,ẋ S i ,ẏ S i ,ż S i ,ρ i , ξ (2,t|t−1) , λ and Outputsξ 2 ; the Same Algorithm to Estimateξ 1 Omitting Dp Measurementsρ.