A Robust Localization , Slip Estimation , and Compensation System for WMR in the Indoor Environments

A novel approach is proposed for the path tracking of a Wheeled Mobile Robot (WMR) in the presence of an unknown lateral slip. Much of the existing work has assumed pure rolling conditions between the wheel and ground. Under the pure rolling conditions, the wheels of a WMR are supposed to roll without slipping. Complex wheel-ground interactions, acceleration and steering system noise are the factors which cause WMR wheel slip. A basic research problem in this context is localization and slip estimation of WMR from a stream of noisy sensors data when the robot is moving on a slippery surface, or moving at a high speed. DecaWave based ranging system and Particle Filter (PF) are good candidates to estimate the location of WMR indoors and outdoors. Unfortunately, wheel-slip of WMR limits the ultimate performance that can be achieved by real-world implementation of the PF, because location estimation systems typically partially rely on the robot heading. A small error in the WMR heading leads to a large error in location estimation of the PF because of its cumulative nature. In order to enhance the tracking and localization performance of the PF in the environments where the main reason for an error in the PF location estimation is angular noise, two methods were used for heading estimation of the WMR (1): Reinforcement Learning (RL) and (2): Location-based Heading Estimation (LHE). Trilateration is applied to DecaWave based ranging system for calculating the probable location of WMR, this noisy location along with PF current mean is used to estimate the WMR heading by using the above two methods. Beside the WMR location calculation, DecaWave based ranging system is also used to update the PF weights. The localization and tracking performance of the PF is significantly improved through incorporating heading error in localization by applying RL and LHE. Desired trajectory information is then used to develop an algorithm for extracting the lateral slip along Xand Y-axis from the PF estimated position of the WMR, the lateral slip along Xand Y-axis is then used to take some corrective measures. Lateral slip information is also used to find the direction along which WMR has to move to get back along the desired trajectory. Simulation results show that our proposed LHE and RL heading estimation methods significantly improve the PF localization and tracking performance on a slippery surface in both indoor and outdoor environments. The simulation results also show that the accurate locations of WMR and desired path information are used to estimate and compensate the lateral slip.


Introduction
Wheeled Mobile Robot (WMR) has a broad spectrum of applications in both civilian and military engineering areas, such as search and rescue, mine clearance, scout, superstores, restaurants, and space exploration [1,2].Wheel slip becomes a problem when the pure rolling conditions are not satisfied.Under the pure rolling assumption, the WMR's wheels are supposed to roll without slipping.This pure rolling assumption is not satisfied when the WMR either has complex wheel-ground interactions, accelerating, decelerating, steering system has noise, or the WMR is moving on a slippery surface.Because of the above wheel slip reasons during navigation, the WMR can significantly deviate from the planned trajectory.Ignoring the effect of wheel slip may result in a larger difference between actual and command heading of the WMR, instability of tracking and localization system and a complete mission failure.So, to bridge a gap between planning and navigation, WMR slip estimation is required in order to take corrective measures afterward.There are many slip-estimation mechanisms to improve the localization, navigation, and tracking performance [3][4][5][6][7].Inertial Measurement Unit (IMU)-based localization and slip estimation [8], fuzzy-controller and current-sensing based slip estimation [9,10], Kalman and Particle Filter based slip estimation [11,12], and vision-based slip estimation [13].Most of the above wheel slip estimation techniques have some limitations.Vision-based slip estimation is not effective in an indoor environment because of light variation.A major disadvantage of using the IMU is, it typically suffers from accumulated error.Current-sensing based wheel slip estimation works only for longitudinal slip estimation.Furthermore, current-sensing based slip estimation requires some knowledge about the physical features of the earth surface.Kalman Filter based slip estimation is only applicable to linear systems.For indoor slippery surfaces tracking and localization of the WMR is more challenging than outdoor because of the presence of multipath fading and non-line-of-sight (NLOS) conditions.A comprehensive study is presented on design and hardware selection of robots [14,15].Different researchers have come up with different localization technologies for indoor environment; WiFi [16], ultrasonic [17,18], RFID tags [19][20][21], FM [22], magnetic field [23,24], IMU [25], cameras, lasers and sonar [26,27].Most of these technologies and solutions have some disadvantages and limitations especially when they are used in the indoor environment.WiFi signal strength fluctuates in a time-variant indoor environment.Vision sensors provide better performance in comparison to other types of sensors for indoor autonomous navigation of robots because of low cost, low power, high resolution and capturing ability, but they are sensitive to the light variations [28].Ultrasonic sensors (US) are cheap, low power and small in size but they suffer from having a short range and relatively low accuracy in distance measurements [29].Laser rangefinders provide better resolution and longer detection range but they are expensive, bulkier and harmful [30].GPS is not considered a good solution for the indoor environment because of weak signal coverage [31].Compared to the above technologies, localization and tracking with Ultra-Wide Band(UWB) radio technology, in an indoor environment with multipath fading and NLOS conditions, is considered to be a robust solution [32,33].Since no localization sensor takes 100% accurate measurements, it is of immense importance, to fuse the sensed location information along with their uncertainties, from different types of sensors.Bayesian filtering is a powerful tool for handling measurement uncertainties and multi-sensor fusion [34].One best Bayesian filtering based candidate to manage measurement uncertainty is Particle Filtering (PF).It is able to manage non-Gaussian uncertainties that typically appear in the presence of NLOS conditions [35,36], it is used for solving localization, tracking, and navigation problems [37].PF is a Monte Carlo estimation algorithm in which a posteriori probability density function is constructed by using weighted particles to make it suitable for the state estimation of non-Gaussian systems [38].PF is a recursive state estimation approach; it is represented by a set of discrete weighted target states, called "particles".Based on the observation gathered, at discrete time intervals, the set of particles are updated in such a way that the particles will ideally converge on the real target state.In order to track a WMR, an estimate of the WMR's position based on measurements of relative range and angles to its position from fixed landmarks is required.Distance (range) noise has minimal, while angular noise has an exponential impact on the localization and tracking accuracy of the PF because of its cumulative nature [39].Therefore, a small uncertainty in angle can severely degrade localization, navigation and tracking performance of the PF.
This article proposes, (1): Location-based Heading Estimation (LHE) and ( 2): Reinforcement Learning (RL), for heading estimation of the Wheeled Mobile Robot (WMR) in the presence of angular noise.WMR wheel slip is incorporated in PF location estimation by applying the above two methods of heading estimation.The contributions of this work are many-folds.First, wheel slip is incorporated into the WMR location which enhances the localization and tracking performance of PF on the slippery ground surface.Secondly, low-power consumption, cheap and small form factor of integrated circuit DecaWave-based system for the ranging and heading estimation of WMR is proposed, which is a good solution for resource-constrained robots.Third, an algorithm is developed which uses the accurate location of WMR and desired path information for estimating and compensating the robot lateral slip along Xand Y-axis.The lateral slip information is used to find the direction along which WMR should move to get back to the desired path and compensate the wheel slip.Simulation results validate that our proposed system not only localize and track a WMR, which is moving on a slippery surface or at a high speed but also estimate and compensate the wheel slip.It can be used for both indoor and outdoor, slippery and sandy, ground surfaces and any other type of terrain which cause robot wheel slip.The remaining of this article is organized as follow.Section 2 explains the basic theory and background knowledge, Section 3 provides an introduction of DecaWave based Real Time Location System (RTLS), Section 4 presents the results and discussion, and finally Section 5 provides the conclusions.

Bayes Filter
At the core of probabilistic robotics is the idea of dynamic state estimation from a noisy sensor data, that are not directly observable, but that can be inferred.In Bayes filters, a state at time k is represented by a random variables x k , this plays a predominant role in probabilistic robotics by inferring the real state (x k ) of the system from the observations z k .In location estimation problems, state is an object's or person's two or three dimensional location, where observations about the state are gathered from the location sensors.System state estimation means getting the state equation of the system through the observation.According to Bayes rule state-space can be represented as follow: In the above state equation, state of the system at time k is represented by the vector x k , state noise vector at time k − 1 is represented by v k−1 , and a non-linear and time-dependent function describing the evolution of the state vector is represented by g k .The state vector x k is assumed to be unobservable or inferential.Noisy measurements (z k ) provides the information about the state, which are modeled by the equation: In Equation (2), the measurement process is described by a non-linear and time-dependent function (h k ), and the measurement noise vector is represented by n k .In a Bayesian estimation, state vector at time k is estimated from all the measurements (z 1:k ) up to and including time k.At each point in time, a probability distribution over x k , called belief, bel(x k ) = p(x k |z 1:k ), represents the uncertainty.In a Bayesian statistics, this problem can be formalized as the computation of the posterior probability distribution p(x k |z 1:k ).This posterior probability distribution can be recursively calculated in two steps, prediction, and update.Bayes filters sequentially estimate the posterior density (p(x k |z 1:k )) over the random variable x k conditioned on all information contained in the noisy sensors data.Bayes rule provides a convenient way to infer the real state through the inverse of the posterior probability, p(z 1:k |x k ), along with the prior probability p(x).The belief bel(x k ) is defined by the posterior density over the state space (x k ) conditioned on all noisy sensor data available at time k: bel This posterior (bel(x k )) is contained on all the past measurements, z 1 , z 2 , . . ., z k , a sequence of discrete time observations provided by the location sensors.Because of the increased number of sensor measurements, in general, the computational complexity of such posterior densities grows exponentially over the time.Bayes filters assume that the dynamic system follow the first order Markov model, in order to make the computation tractable.According to the Markov assumption the location sensor measurements depend only on an object's current location (Equation ( 4)), and that an object's physical location at time k depends only on the previous state x k−1 (Equation ( 5)).
Under the Markov assumption, belief in Equation ( 3) can be efficiently calculated without loss of information.There are two steps in the Bayesian rule, prediction, and update.

Prediction
Prediction process is based on the prior acquired information to predict the state.Prior probability density (p(x k |z k−1 )) of the state is calculated through the state Equation (1).The probability distribution p(x k |z 1:k−1 ) can be thought of as a prior over the random variable x k before receiving the most latest observation, z k .In the prediction step, p(x k |z 1:k−1 ) is calculated from the filtering distribution

Process Updating
The updating process is modifying the prior probability density by the latest observation to obtain a posteriori density.The prior probability density is updated by the latest observation, z k , using the Bayes' rule, to obtain the posterior over the random variable, x k .The posterior probabilities p(x k |z 1:k ) can be obtained according to prior p(x k |z 1:k−1 ).It is only a prediction process in the last step, and this step amends the last step of the forecast through the observation of the k moments.The posterior probability obtained by this step is then brought into the next prediction process to form a recurrence.
Generally the calculation in the prediction and update process of Bayes filtering is not feasible analytically, hence they need for approximate techniques such as Monte Carlo sampling.

Sequential Monte Carlo Estimation (Particle Filters)
Since the prediction and update steps of the optimal filtering are analytically not tractable, one has to use the approximation techniques, such as Monte Carlo sampling.The most basic Monte Carlo method used for this purpose is Sequential Importance Sampling (SIS).The state space is divided into many parts, in which the weighted particles are filled according to some probability distribution.The posterior density is empirically represented by a weighted sum of N p samples randomly drawn from the posterior probability distribution.Consider a sequence of N p independent and identically distributed random samples x (i) n from a posterior distribution, then the Monte Carlo estimate (an estimation of the true posterior) is given by In Monte Carlo method, δ( n ) is defined as the Dirac function.When N p is very large, p(x n |z n ) approximates the true posterior p(x n |z n ).In Bayesian inference, the posterior mean of a nonlinear function can be estimated as follow: According to Monte Carlo assumption, N p independent random samples are drawn from

Sequential Importance Sampling (SIS)
Since sampling from the true posterior is usually impossible, therefore the idea of Importance Sampling (IS) is used, the importance distribution is given by x Where Equation ( 11) can be rewritten as follow By drawing the independent and identically distributed samples {x n } from q(x n |z 1:n ), Equation ( 13) can be approximated by the following Equation Equation ( 14) is called important function or importance density.Where Wn (x The central idea in Sequential Importance Sampling (SIS) is updating the particles x (i) n and their weights n in such a way that they would approximate the true posterior distribution at the next time step, n.To perform the update process, the following assumption for factorizing the importance distribution at time n, is adopted: The posterior p(x 0:n |z 0:n ) can also be factorized as The importance weights n can be recursively updated as given below In the above Equation (18), observations (z 0:n ) are gathered through the DecaWave based ranging system, which will be discussed in the following sections.

DecaWave-Based RTLS
A DecaWave based Real Time Location System (RTLS) was tested in an indoor environment to locate the position of a Tagged Moving Object (TMO) and update the Particle Filter (PF) weights.The RTLS is based on DecaWave (DWM1000), which benefit from the advantages of Ultra-Wide Band (UWB) radio technology, such as long-ranging, robustness in multipath environments, and low power consumption.It is also cheap and has small form factors of IC technology.It integrates antenna, clock circuitry, Radio Frequency (RF) circuitry and power management in a single module.It has a continuous 360°of visibility and nearly real-time response.Its low weight, small size, and ultra-low power transmission make it a very suitable solution for resource-constrained and mini-robots.Compared to other indoor systems, it offers high performance in noisy environments, coexists with current narrowband and wideband radio services and shows robustness for multi-path fading.The DWM1000 module is based on DecaWave's DW1000, a multi-channel transceiver based on UWB radio communications allows very accurately time-stamping of messages as they leave from and arrive at the wireless transceiver [40,41].It has a line of sight (LOS) 350 m and non-line of sight (NLOS) 40 m ranges in the indoor and outdoor environments, allowing it to be deployed in wireless local area network access points.This RTLS has fixed units to be configured as an "Anchor", and the dynamic one as a "Tag".The location of a Tagged Moving Object (TMO) is established using the 3 fixed anchors with known locations around the area in which the tag is located.It was found experimentally the established location has an average error in the range of (0-30) cm.This noisy location information is used for estimating the heading of a tagged wheeled mobile robot, which is moving on a slippery surface or moving at high speed.Beside the location estimation of a TMO, these anchors are simultaneously used as landmarks and ranging sensors to update the PF weights.Simulations results show the estimated heading used by Particle Filter (PF) improve the localization and tracking accuracy dramatically.To determine the absolute position of a TMO in 2-D, it is necessary to determine how far away it is from the 3 anchors.The distance between any anchor and Tagged Moving Object (TMO) is calculated from the Time of Flight (TOF).All TOF based systems work on the basis of determining the time it takes for a signal to propagate between a transmitter and a receiver.Once this time is known accurately then the distance between the receiver and transmitter can be determined since the speed of propagation of radio waves in the air is known.With this knowledge and some relatively simple mathematics (trilateration), it is possible to calculate the location of a TMO.
Figure 1a1 shows a simple two-dimensional fixed infrastructure of the Real-Time Location System (RTLS), with 3 anchors (A 1 , A 2 , and A 3 ) and one tag (T 1 ), which can be easily extended to three-dimensional.The tag (T 1 ) was connected to a computer through a Wifi-interface while the 3 anchors were placed at a 1-meter height from the ground in 3 corners of the square (500 cm 2 ) workspace.The anchor transmits a message to the tag (T 1 ) and records a time the message left its antenna (t 1 ).The tag receives the message and sends back a reply to all the anchors.The anchor records the time it receives the reply (t 2 ).The anchor then calculates the time difference T = t 2 − t 1 , and distance using the formula d i = cT/2, where c is the speed of light in the air and i = 1, 2 and 3.The distances d 1 , d 2 , and d 3 are used to calculate x and y coordinates of the tag (T 1 ) by using trilateration algorithm.RTLS was used to localize the Tagged Moving Object (TMO) as discussed above by calculating its x and y coordinates, the x and y were then sent by the tag through a Wifi interface and received by the computer.Several experiments were performed to observe the maximum and average localization error of Real-Time Location System (RTLS).Three anchors were placed one meter above the ground.A 500 cm 2 and 1000 cm 2 indoor spaces were chosen for the experiments.First a person, holding a tag in hands one meter above the ground in a 500 cm 2 , slowly move along the diagonal line between (0, 0) cm to (500, 500) cm, the experiment was performed 5 times to know the sample accuracy of RTLS, sample x, y values are shown in Figure 1b, by the black line.Then the anchors were placed in a 1000 cm 2 space, and the person holding a tag in hands move along different paths with speed equal to and higher than human normal motion, sample path pattern are shown in Figure 1c-e.During the experiments a maximum value of error in x, y was found to be about 50 cm, while the average error was found to be around 30 cm.

Wheeled Mobile Robot
We used 4-WMRs with two different types of steering systems for our experiments: (1) differential steering (DS), and (2) Ackermann steering (AS).Both types of WMRs are lightweight (about 2 kg), are battery powered, and have limited resources.

Differential Steering
The function of Differential Steering (DS) is achieved by applying more or less drive torque to one side of the vehicle than the other.A 4-wheeled mobile robot with a DS system that was used for the slip experiments is shown in Figure 1a2, this Wheeled Mobile Robot (WMR) can move around by 360°.It has a small camera, ultrasonic sensors, compass, and a tag; all these components are connected to a Raspberry Pi 3 board with a mini ubuntu installed on it.The tag is part of the DecaWave based ranging system discussed above.Tag establishes the real-time location of WMR with the help of fixed landmarks (anchors).

Ackermann Steering
The Ackermann Steering (AS) mechanism is a geometric arrangement of linkages in the steering of a vehicle designed to turn the inner and outer wheels at the appropriate angles.Features of the Ackermann based steering system robot that was used for the experiments are almost the same as the Differential based steering system robot discussed above, except the steering mechanism.Unlike the DS, AS is not very flexible to turn around 360°.During the experiments, it was observed that DS perform better than AS on a slippery ground surface.

Simulation Environment
The simulation environment is consist of; Ubuntu 16.04, Python3, and Keras with Tensorflow as a back-end.We used Turtle graphics to display simulations results on the computer screen.Keras is a high-level neural networks application programming interface, written in Python and capable of running on top of the TensorFlow.

Results and Discussion
The angular uncertainties have an exponential impact on the localization and tracking performance of the Particle Filter (PF); angular sensors are subjected to accumulated errors introduced by wheel slippage or other uncertainties that may perturb the course of the Wheeled Mobile Robot (WMR).When the WMR is perturbed from the desired trajectory, because of the wheel slip, the DecaWave based Real Time Location System (RTLS) is used to find its position.This position is not accurate and has some noise.Deep Q-Neural Network (DQNN) approach of Reinforcement Learning (RL) and Location-based Heading Estimation (LHE) are used to estimate heading of the WMR from the noisy location of WMR (provided by the RTLS) and the current weighted mean of the PF.PF localization accuracy is dramatically improved by the above heading estimation methods in the environment which cause WMR wheels slip.Desired path information and PF estimated the location of WMR is then used to estimate and compensate WMR's wheel slip.Since the heading and slip estimation methods are based on location information, therefore, they are independent of the number of wheels and shape of the robot.

Distance Noise Effect on PF Performance
Weights of the particles are updated based on the distance of Wheeled Mobile Robot (WMR) to the nearest landmark (anchor).For this simulation we used 4 landmarks at (0, 0), (500, 0), (0, 500), (500, 500) in a virtual workspace of 500 cm 2 .Noise values in the range of (0-30) cm were added in the distance of the Wheeled Mobile Robot (WMR) from the nearest landmark in accordance with the observed noise of Real-Time Location System (RTLS) discussed above.In order to measure the average error in Particle Filter Estimated Position (PFEP), the tag was allowed to move along a line between (0, 0) to (500, 500) in the workspace by making 45°angle along X-axis.The experiment was repeated several times for different values of noise added.The bar chart in Figure 2 shows errors in the moving object real position with respect to the distance uncertainty.The bar chart shows distance uncertainty has a minimal impact on PF location estimation for the following measurements.The errors have almost a linear distribution, for (0-25) cm uncertainty in the distance.The average error in moving object real position induced by the distance uncertainty was measured to be less than 6cm.

Angular Noise Effect on PF Localization Performance
Particle Filter (PF) localization performance is severely degraded by the angular noise because of its cumulative nature.To quantify the angular noise effect on PF localization performance, several computer simulations and experiments were performed.A small Wheeled Mobile Robot (WMR), which can only move forward, was used to measure the cumulative angular noise.A 10 m 2 , indoor space was considered for the experiments and the robot was allowed to move along a path inside the workspace, the experiment was revised 10 times.It was observed that the robot has a difference of 25-30°with the initial commanded angle, which can severely degrade the localization, navigation and tracking performance of PF.Computer simulations were performed to confirm the angular noise effect on PF localization performance.As shown in Figure 5a, in the absence of angular noise robot real path (red line) is in strong correlation with its PF location estimation (green line).In Figure 5a, the commanded angle of the robot is 45°with zero angular noise, while in Figure 5b the commanded angle is 45°with 0.1°of angular noise added at each motion step.Compared to Figure 5a with the zero angular noise, in Figure 5b because of the non-zero angular noise the robot followed a curved path (red line) instead of a linear path.This is because, since PF is using 45°of angle for location estimation without knowing about the angular noise, therefore PF location estimation is severely degraded as shown in Figure 5b by the green line.

WMR Heading Estimation with RL
To mitigate the angular noise effect on the Particle Filter (PF) localization performance we proposed Deep Q-Network (DQN) approach of Reinforcement Learning (RL) for heading estimation of Wheeled Mobile Robot (WMR), that slip because of moving on a slippery surface or sudden changes in speed.The estimated heading will incorporate the real world angular noise into the WMR position and improve the performance of Particle Filter (PF), in a drift-prone environment.A novel idea for incorporating the angular noise into the PF location estimation was introduced.Heading of the WMR was estimated from the location information using RL.The network was trained in such a way that a randomly moving object(Object1) was tracked by another moving object(Object2) by taking one of the several actions; 0: Right, 1: Top Right, 2: UP, 3: Top Left, 4: Left, 5: Bottom Left, 6: Down, 7: Bottom Right, in a two-dimensional square working space of 500 cm 2 .A part of the 500 cm 2 workspace is shown in Figure 3 for demonstration purpose.The actions were awarded based on the distance between Object1 and Object2.If the distance is less than or equal to 5 cm the reward is 1000 otherwise it is −1.The output of DQN is a number from 0 to 7, as shown in Figure 4a.This output is associated with angles as (0,0/360), (1,45), (2,90), (3,135), (4,180), (5,225), (6,270) and (7,315).Neural network was used by RL as a function approximation tool for heading estimation.In RL process, a function that accepts a state s and returns the value of that state as v(s) is the value function.Similarly, there is an action-value function Q(s, a) that accepts a state s and an action a and returns the value of taking that action given that state.After taking every action, the moving agent gets an evaluation feedback r, as well as perceives the new state.The update rule for learning state-action values is given below In the above function, γ is a parameter called discount factor, it determines how much each future reward is taken into consideration for updating our Q-function.The trained network model was applied to the PF for estimating heading of the WMR.The input of DQN was taken as a random point from a 30 cm 2 region where the robot is expected to exist, and the current weighted mean of the PF.The 30 cm 2 region was considered in accordance with the noise pattern observation of Real-Time Location System (RTLS) discussed above.In the simulation environment, Object1 was considered as a WMR with angular noise, and Object2 was considered as a current mean of the PF.Since most of the particles will be around the PF current mean, that's why mean is considered as Object2 to track the WMR.In the beginning, since the angular noise is not very large, the PF will use the commanded angle of the robot.As the angular noise starts to accumulate the robot starts to rely on the DQN estimated angle.After application of this DQN the cumulative angular noise was incorporated into the PF location estimation and hence the performance of the PF was dramatically improved for the conditions that cause WMR wheel slip.It's worth mentioning, that although the neural network was trained for 500 cm 2 region, it can be applied to any larger dimensional space by scaling the XY-coordinates of WMR and current mean positions by applying the following scaling formulas sx, sy = if w >= 500 and h >= 500 , SF x = w/500, SF y = h/500 sx = x/SF x and sy = y/SF y The above expression is used when the width (w) and height (h) of the WMR's workspace is greater than 500 cm, SF x and SF y are scaling factors along Xand Y-axis respectively.x and y are the coordinates of WMR and current mean of PF while sx and sy are the scaled versions of x and y.

DQN Parameters
The Deep Q-Network (DQN) discussed above, has an input layer of size 4, 4-hidden layers of sizes 200, 180, 150, 100 and an output layer of size 8 as shown in Figure 4a.The activation function of input and hidden layers is "relu" while that of the output layer is linear.The loss function is mse.X r , Y r is a random point from a 30 cm 2 region, which represents the position of the Wheeled Mobile Robot (WMR), X m , Y m is current weighted mean of the PF, while [0-7] is the output of the DQN.γ is 0.95, exploration-rate is 1.0, exploration-minimum is 0.01, exploration-decay is 0.995, sample batch size is 100 and number of episodes are 10,000.Number of steps in a single episode versus the number of episodes is shown in Figure 4b.Simulation results in Figure 5 show that angular error has an exponential impact on the performance of PF.Simulation results also show that the cumulative effect of angular noise can be mitigated using Reinforcement Learning (RL).A 500 cm 2 virtual environment was considered for simulation.0°angular noise was taken as a reference to compare the effect of cumulative angular noise, and angular noise mitigation through RL.In Figure 5a-c, the red lines show the Moving Object Real Position (MORP) and the green line shows the PF Estimated Position (PFEP).In Figure 5a, the green line shows that with a small noise in distance and no noise in the angle there is a negligibly small error in the PFEP.To show the cumulative angular noise effect on the PF accuracy, small noise of (0.1°) was added into the robot angle and increased it by 0.1°as the robot moves on, the angular noise is accumulated which severely degrades the location estimation performance of the PF shown by the green line in Figure 5b. Figure 5c shows that error in PFEP was mitigated using a Deep Q-Network (DQN) through estimating the heading of the moving object from location information.Figure 5d, compares, error of the cases a, b and c respectively.In Figure 5d, the red line shows the exponential effect of the cumulative angular error on PF performance, the green line shows that the effect of the cumulative error was reduced by using a DQN, while the blue line is a reference for showing that when there is no angular noise, error in PFEP is negligibly small.

Location-based Heading Estimation (LHE)
In Figure 6, the red circle shows the noisy location (x, y) of the Wheeled Mobile Robot (WMR) provided by the Real-Time Location System (RTLS), and the blue circle shows the weighted mean of the Particle Filter (PF).A random noise (according to Figure 1 experiments) in the range of [−15, +15] cm was added to the actual position of the WMR, the pattern of noise added in the actual position was added according to the ranging system noise observed as discussed above.In Figure 6a-d 6e.This estimated angle can be used by the PF for tracking and localization, in the environment where the robot has cumulative angular noise because of wheel slipping and noise in the robot steering system.The heading is estimated from robot location and not turning of wheels or motion direction of the robot.Therefore even if the robot heading has an error because of the wheels slip or steering system noise the estimated heading and the PF location estimation is very accurate.The following simulation results show the performance of the location-based method, for moving object heading estimation from the noisy location information provided by the Real-Time Location System (RTLS).The estimated heading information is used for improving PF's localization and tracking performance.In Figure 7e the black line shows the pattern of noise added to the location (x, y) of the robot, the blue line shows the PF Estimated Position (PFEP), the red line shows the Moving Object Real Position(MORP) and the green line shows the angular noise added.The pattern of noise added in MORP is in accordance with the RTLS noise pattern observed experimentally, described in Figure 1.Angular noise in Figure 7b was added in the original commanded angle of Wheeled Mobile Robot (WMR).WMR was supposed to move along a diagonal path from (0, 0) to (0, 500) by making 45°a ngle with X-axis.But as shown in Figure 7a,c, because of the angular noise added, the WMR follow a curved path instead of a diagonal path and deviates from its original path shown by the red line.
As shown by the blue line in Figure 7c, the PFEP is very close to the real location of the WMR, although the angle was estimated from the noisy location information provided by the RTLS.Zoom in portions of Figure 7c,e are shown in Figure 7d,f.The angular and distance noise effect on the PF localization and tracking performance, and the above two angular error mitigation models (RL and LHE) results are summarized in the Table 1.   5 and 7. Table 1 shows that our proposed Reinforcement Learning (RL) and Location-based heading estimation methods successfully mitigate the localization error induced by the angular noise.The left column of Table 1 shows the angular noise impact on the Particle Filter (PF) performance while the right column shows distance noise impact on the PF localization performance.Simulation results show that distance uncertainties have minimal impact on PF localization performance while angular uncertainties have an exponential impact on PF localization performance.In the first row of the left column when there is no angular noise, the average localization error is 9.93 cm.A 0.1°a ngular error was induced in each motion step of the Wheeled Mobile Robot(WMR), which keeps on accumulating up to 54.6°, this angular noise-induced localization error of 98.41 cm in the PF location estimation shown in the second row of left column.The last two rows of left column show that location error induced by the cumulative angular noise was significantly reduced by applying RL and LHE methods for heading estimation.Simulation results show that the proposed heading estimation methods significantly enhance the PF localization and tracking performance in the environment and conditions which results in robot wheel slip.

RL and LHE Heading Estimation in Large Dimensions
As discussed above that although Deep Q-Network (DQN) based model was trained for 500 cm 2 space it can be applied to any dimensional space by scaling of XY-coordinates.Similarly, the Location-based Heading Estimation (LHE) can also be applied to any dimensional space without scaling.In Figure 8, both methods were applied for estimating heading of Wheeled Mobile Robot (WMR) in the presence of angular noise.A 1000 cm 2 space was chosen for simulation, 0.1°of angular noise was added in the WMR heading at each motion step, the angular noise was accumulated which induced a large error in PF location estimation.In Figure 8, the black lines show paths followed by the WMR, while green lines show PF location estimation.In Figure 8a,a1, there is zero angular noise in robot heading, and hence the PF localization has a strong correlation with the original path followed by the WMR.In Figure 8b,c, 0.1°of angular noise was added in WMR's heading at each step of motion, because of the accumulated angular noise the PF localization performance was severely degraded which is shown by the green lines.Figure 8d,d1, shows that when the Reinforcement Learning (RL) based heading estimation was used, PF location estimation (green line) shows a strong correlation with the original path (black line) followed by the WMR.Similarly, when LHE was applied for heading estimation in the presence of angular noise, the PF estimated location (green line) has a strong correlation with the original path (black line) of WMR.Simulation results show that the WMR wheel slip which is caused by the angular noise can be incorporated into localization by using our proposed heading estimation methods.Then the PF localization and desired path information can be used to estimate and compensate the WMR lateral slip.In the above expressions (x dp , y dp ) is a point on the desired path while (x p f , y p f ) is a point on the last index of PF estimation array.WMR is below the desired path if n = 0, and above the desired path if n = 1.A simple case of the desired path (Figure 9a, redline) is considered for the sake of demonstrating the slip estimation algorithm, although this algorithm can be applied to any linear and non-linear two-dimensional motion.Because of the angular noise, WMR tends to move above the desired path in Figure 9a, and hence for each PF location estimation (x p f , y p f ) of the WMR, the condition of n = 1 is satisfied for this scenario.Once the value of n is known, then the following formulas are used to find the nearest point on the desired path to the current PF location estimation (x p f , y p f ) of the WMR.

index =
abs(Y adp -y p f )).argmin(), if n = 1 abs(X adp -x p f )).argmin(), if n = 0 In the above formulas, Y adp , X adp are arrays of the Yand X-coordinates of the desired path respectively."abs" mean absolute value, "argmin" will return index of the value in the Y adp or X adp which has minimum diffrence the y p f or x p f respectively.The formulas return index (index y ) of the nearest Y-coordinate of y p f if n = 1 and index (index x ) of the nearest X-coordinate of x p f if n = 0.As shows in Figure 9a, the nearest Y-coordinate of the PF location estimation at A (x p f , y p f ) is the Y-coordinate (y dp ) of the point B (x dp , y dp ) on the desired path.Figure 9a shows that the difference y p f − y dp ≈ 0 for points A and B, while x dp − x p f = X error is a positive value.For the case of n = 1 and index = index y the following expression calculate the deviation from the desired path along Xand Y-axis for the WMR's current PF estimated location (x p f , y p f ).
x error , y error = In the above expression max(Y adp ) is maximum Y-coordinates of the desired path represented by the Y max in Figure 9a.For the second case when n = 0 and index = index x the following expression calculate the deviation of WMR from the desired path along Xand Y-axis.
x error , y error = In the above 2 expressions x error , y error are deviations of WMR from the desired path along Xand Y-axis.Furthermore, atan2(y error , x error ) is used to find the direction along which the robot has to move along to get back to the desired path and compensate the wheel slip caused by the angular noise.As shown by the black line in Figure 9b, that lateral slip deviation (y error ), is almost zero till C(y p f − y dp ≈ 0), after point C, along with x error , y error also increases.Redline in Figure 9b shows the angle of the direction that WMR should move along to get back to the desired path.For the situation of Figure 9a, the angle values are either zero or negative but only their absolute values are shown by the red line in Figure 9b, so 15°= −15°in the clockwise direction or +345°in the anticlockwise direction.Redline in Figure 9b shows the angle that WMR should move with to get back along the desired path is zero until point C, and then its value decreases to reach −15.A map of the atan2(y error , x error ) is provided in Figure 6e.
The above lateral wheels slip estimation method is simple to implement, it is robust and computationally not expensive.This WMR's wheels slip estimation can be applied to a robot with any numbers of wheels and robot of any shape.This method is applicable both in the indoor and outdoor environment.The method can include the measurement uncertainty of sensors used for slip estimation.It can be applied to both linear and non-linear problems.It can be implemented to estimate wheel slip of a robot moving on any type of terrain.Table 2 shows that our proposed method outperforms many state of the art methods in many aspects.

KF
Simple to be implemented, only those sensor's output can be used, that results in Gaussian form output.It can be applied to systems with know initial state.Can only be applied to linear problems.

EKF
It is simple to be implemented, stable, and able to deal with nonlinear input and measurement noise.The high computational effort required in the definition of the Jacobian matrices suffer from the intrinsic linearization errors.
High against input and measurement noise.
Dynamic-linear or non-linear IS VSA estimated by fusing data from IS and GPS.It typically suffers from accumulated error, not good for indoor because of GPS signal weak coverage.
High robustness against changes of conditions, but low against measurement noise.Kinematic SMO VSA estimated using this observer in its linear or non-linear form, according to the type of vehicle model adopted.SMO features a faster convergence speed than EKF because it does not need to deal with massive matrix computation.
High robustness to model uncertainty and system noise Dynamic-both linear and non-linear CS Estimates wheel slippage from motor current.Works only in the direction of motion, but not laterally, and it requires some knowledge of the terrain.

RL+LHE
Simple to implement, robust, stable, deal with input and measurement noise, suitable for both indoor and outdoor.Independent of the number of WMR's Wheels and shape.Not suffer from accumulated error.
High robustness to model uncertainty and system noise.
Both linear and non-linear

Conclusions
An attempt is made to quantify the distance and angular noise effect on the performance of PF.The results show the minimal effect of distance noise while the exponential effect of angular noise on the PF performance.Reinforcement Learning (RL) and Location-based Heading Estimation (LHE) are proposed for mitigating angular noise effect on Particle Filter (PF) localization performance, in a drift-prone environment.Heading of the Wheeled Mobile Robot (WMR) was estimated from the noisy location information provided by the Real-Time Location System (RTLS) using trilateration, and the current weighted mean of the PF by using RL and LHE.The estimated heading was then used to improve the localization performance of the PF in the presence of angular noise.By using the proposed solutions, angular noise caused by the slip can be implicitly incorporated into the PF location estimation.This is a very robust algorithm for the environment with robot wheel slip on a slippery or sandy ground, and a robot with noisy steering systems.The algorithm can incorporate both linear and non-linear angular noise of any magnitude.Furthermore, an algorithm is proposed for WMR's wheel slip estimation and compensation by using the desired path information and PF location estimation.A DecaWave based ranging system is proposed, which benefits from the advantages of UWB radio communication, such as low power consumption, robustness in the multipath padding environments, and long-ranging.This ranging system is a good solution for resources-constrained robots in the indoor environment.Simulation results show that the PF localization performance in the presence of angular noise that results from the WMR's wheels slip is significantly improved by application of the above two methods.Simulation results also show that wheel lateral slip can be accurately estimated from the PF location estimation and desired path information.Lateral slip information can be used to make decisions for moving WMR back toward the desired path.

Figure 1 .
Figure 1.(a1) Real-Time Location System (RTLS) with fixed two dimensional infrastructure, and localization of a tag (T 1 ) with three fixed anchors (A 1 , A 2 , and A 3 ).(a2) Wheeled mobile robot (WMR) with the differential steering system; (b-e) shows a sample data from the RTLS experiments.

Figure 2 .
Figure 2. Y-axis shows an average error in particle filter estimated position (PFEP) induced by the distance noise, while X-axis shows noise added in distance between moving object and the nearest landmark.

Figure 3 .
Figure 3. White square surrounded by the gray squares is Object2; arrows represent the eight possible actions in a state for moving to the next state, and the dark-blue square is Object1 that needs to be tracked.

Figure 4 .
Figure 4. (a) (X r , Y r ) is a random point from a 30 cm 2 region, (X m , Y m ) is the current weighted mean of the particle filter (PF), and [0-7] is the output of Deep Q-Network (DQN).(b) Y-axis shows number of steps (NOS) in a single episode, while X-axis shows number of episodes (NOE).
Figure 4b shows Number Of Steps (NOS) taken by Object2 to arrive the terminal state, decreases with the increased Number Of Episodes (NOE), and hence the network is converged very fast.The network training model shows that the particles around the PF mean will follow the dynamic terminal state (WMR).

Figure 5 .
Figure 5.The red lines show moving-object real position (MORP), while the green line shows the particle filter estimated position (PFEP) when there was (a) 0°, and (b) 0.1°of noise added to the angle of the moving object at each motion step.(c) The red line shows MORP, while the green line shows the PFEP when reinforcement learning (RL) was used to estimate the heading from MORP with noise added to it, as well as PFEP.(d) The red line shows the exponential error growth because of cumulative angular error, the green line shows the RL angular error mitigation, and the blue line shows error in PFEP when there was 0°of error in the moving object angle.
are not the four quadrants of X/Y-plane, instead, they are the 4 configurations of robot position relative to the PF's weighted mean in the square workspace of 500 cm 2 .Depending on the current position (x, y) of moving object relative to the current weighted mean the PF (x m , y m ), there are 4 configurations [a-d], these configurations show the possible current direction of motion of the WMR.A atan2(d x , d y ) function is used for estimating the moving object heading, where d x = (x − x m ) for b and d; and d x = −(x m − x) for a and c.Similar arguments apply for d y .This function takes two arguments dy and dx, which represents a point on X/Y-plane, excluding the origin.The returned angle is in radian and is positive for y > 0, it is confined to the interval (−π, π], a map of the atan2(d x , d y ) is shown in Figure

Figure 6 .
Figure 6.Red circle shows moving object's noisy position, blue circle shows the current weighted mean of the particle filter (PF), and arrows show the direction of moving object relative to the current weighted mean of the PF.

Figure 7 .
Figure 7. (a) A comparative picture of moving object's real position (MORP) and particle filter estimated position (PFEP).(b) Cumulative angular noise versus distance.(c) A clear picture of MORP with noise and PFEP.A zoomed-in portion of (c) is shown in (d).(e) The noise added in MORP, a zoomed-in a portion of which is shown in (f).

Figure 8 .
Figure 8. Black lines show original path followed by a wheeled mobile robot (WMR); green lines show the particle filter (PF) location estimation.(a,a1) Zero angular noise.(b,c) Green lines show PF location estimation of the paths followed by WMR in the presence of cumulative angular noise.(d1,e1) Shows PF location estimation of the paths in (d,e) followed by WMR through the application of reinforcement learning (RL) and Location-based Heading Estimation (LHE) in the presence of angular noise.

4. 6 .
Location-Based Slip Estimation and Compensation n = 1, if x dp -x p f >= 0 and y dp -y p f <= 0 0, if x dp -x p f <= 0 and y dp -y p f >= 0

Figure 9 .
Figure 9. (a) Green line is the particle filter (PF) location estimation of the wheeled mobile robot (WMR), and red line shows the desired path.(b) The X-axis is similar to the X-axis of (a); X error and Y error are the WMR's slip from the desired path along Xand Y-axes, respectively.The red line shows the absolute value of the angle the WMR should move with to return back along the desired path.

Table 1 .
Comparison of distance and angular noise effects on particle filter (PF) performance.PF performance enhancement using Reinforcement Learning (RL) and a Location-based Heading Estimation (LHE) for angular noise mitigation.

Table 1 ,
summarize the angular and distance noise effect on PF localization performance presented in Figures