Multi-Stage Pedestrian Positioning Using Filtered WiFi Scanner Data in an Urban Road Environment

With the widespread application of wireless sensor networks, low-speed traffic positioning based on the received signal strength indicator (RSSI) from personal devices with WiFi broadcasts has attracted considerable attention. This study presents a new range-based localization method for outdoor pedestrian positioning that combines offline RSSI-distance estimation with real-time continuous position fitting, and achieves high positioning accuracy in the urban road environment. At the offline stage, a piecewise polynomial regression model (PPRM) is proposed to formulate the Euclidean distance between the targets and WiFi scanners, replacing the common propagation model (PM). The online stage includes three procedures. First, a constant velocity Kalman filter (CVKF) is developed to smooth the real-time RSSI time series and estimate the target-detector distance. Second, a least squares Taylor series expansion (LS-TSE) is developed to calculate the actual 2-dimensional coordinate, replacing the existing trilateral localization. Third, a trajectory-based unscented Kalman filter (UKF) is introduced to smooth the estimated positioning points. In field tests in Guangzhou, China, the combined CVKF and PPRM achieved a distance estimation error of less than 1.98 m with a probability of 90% or larger, which outperforms the existing propagation model. In addition, the online method achieved an average positioning error of 1.67 m, which is much better than classical methods.


Introduction
Outdoor pedestrian positioning in the urban road environment is a challenging but significant topic in the field of intelligent transportation systems (ITS), with great application prospects in transportation planning, security monitoring, pedestrian counting, traffic signal optimization and traffic guidance. Various moving sensors have been developed based on passive and active positioning technologies for capturing mobile target movement dynamics [1]. However, due to complicating factors such as unknown target motion and the outdoor surrounding environment, it is difficult to locate and track large numbers of mobile targets, such as vehicles, pedestrians, and cyclists, in cluttered urban road environments.
Generally, the Global Positioning System (GPS) is considered to be a dominant technology for outdoor localization because of its worldwide availability and high positioning accuracy. However, it requires users to install and operate a mobile application to spontaneously transmit GPS data to a remote center, which is inconvenient and a serious challenge to battery-powered user devices in real life. Meanwhile, GPS-based approaches are also unreliable or even unavailable in dense urban areas owing

• To solve the problem that it is difficult to accurately establish the RSSI-distance relationship in the RSSI-based positioning scheme, we propose a novel PPRM to capture the uncertainty of RSSI fluctuation. By contrast with the traditional propagation model (PM) or polynomial regression model (PRM), the developed RSSI-distance relationship can be formulated as a dynamic nth-degree polynomial to improve Euclidean distance estimation for pedestrian localization.
• Different from the previous filtering algorithms for the indoor environment, we propose a new CVKF fusion algorithm to handle real-time RSSI fluctuations for outdoor pedestrian positioning, and show that CVKF + PPRM can further improve the distance estimation accuracy.
• We design an entire multi-stage pedestrian positioning system that combines PPRM, CVKF, LS-TSE and UKF, and that achieves high positioning accuracy in an urban road environment. By contrast with the GPS-based method, which requires users to install software and initiate positioning requests, this positioning scheme based on WiFi scanner data can locate pedestrians in real time to help transportation agencies better monitor abnormal situations in pedestrian flow and behavior.
The remainder of this paper is organized as follows. Section 2 briefly reviews the related work. The detailed methodology is formulated in Section 3. Section 4 describes field experiments and analyzes the localization results. Finally, conclusions and future work are drawn in Section 5.

Related Work
The use of WiFi positioning technologies has been widely discussed by many researchers in the past decade, and the most commonly adopted localization prototype is to use RSSI [14,15]. In terms of whether distance estimation is required or not, schemes can be divided into two groups: range-free and range-based methods [15].
The range-free localization methods do not need to utilize the physical distance to determine terminal location. One of the most widely used range-free methods is fingerprinting localization, which mainly investigates the difference between the received mobile device's fingerprint from multiple WiFi scanners and the reference points (RPs) [27] or the occurrence probability at the same location [28]. The performance of these methods depends on the quantity of reference points adopted per unit area, namely, the reference point density. In order to relieve the training burden while maintaining the performance, many scholars have proposed various improvement measures. Liu et al. [29] proposed a transfer learning-based framework to enhance the scalability of fingerprint-based indoor localization by reducing the offline training cost without affecting the accuracy of localization. Sun et al. [30] proposed a Gaussian process regression model to predict the spatial distribution of signal strength fingerprints in uncalibrated areas, so as to augment the fingerprints when the reference points are limited. Kuo et al. [31] enriched the reference points by interpolating some virtual points and creating their fingerprints. Generally, these measures might partly reduce the fingerprint calibration effort, but they will fail for travelers' localization on a large-scale urban road network.
By contrast, the range-based methods include RSSI signal filtering and estimation of the physical distance from WiFi sensors to targets, namely the sensor-target distance. In practice, the RSSI signals collected from user devices are very inaccurate and unstable due to noise disturbance in the outdoor environment. Thus, some filtering algorithms have been reported to cope with this noise, such as the Kalman filter [32], Bayesian filter [33], and particle filter [34]. Meanwhile, it is also difficult to accurately calculate the non-linear relationship between RSSI and sensor-target distance due to the cluttered urban road propagation environment. One of the typical estimators is the radio propagation model based on the log-distance path loss function, which uses logarithms to describe the RSSI-distance relationship [35,36]. However, methods of this kind are very sensitive to the surrounding environment. Therefore, some researchers have reported improved methods, such as the general regression neural network (GRNN) [32], polynomial regression [11], curve fitting [35] and the segmentation fitting method [37]. The main known disadvantage of the machine-learning methods is that they generally take multiple iterations to converge to the expected solution and rely on a huge training dataset. The authors of [38] conducted a comparative evaluation of four models (log-distance path loss model, exponential model, power model, and polynomial regression model), and found that polynomial regression gave the best results.
In this paper, our proposed two-stage method can achieve high-accuracy positioning for low-speed pedestrians by using a fitted RSSI-distance function that considers the non-linear signal-distance path loss and the continuous target movement tracks along urban roads.

Model Development
The range-based localization scheme generally can be divided into two stages: the offline stage and the online stage. At the offline stage, the RSSI-distance relationship is formulated and calibrated, whereas at the online stage, the real-time RSSI data of a mobile device is collected and processed for localization. In this section, we present the details of the proposed RSSI-based localization scheme in the urban road environment. For simplicity, this paper only considers localization in 2-D space and the positioning of one moving target (a person), and lets $L_t = (x_t, y_t)$ denote the mobile target location at time t.

System Overview
Suppose that there are M WiFi detectors and W mobile targets corresponding to W media access control (MAC) addresses in the road environment. Let $L_m = (x_m, y_m)$ denote the location coordinates of the mth WiFi detector, m = 1, 2, . . . , M. For each WiFi detector, the algorithm searches for the same MAC address in the adjacent detector set. Notably, one can easily extract the data series of a MAC address from multiple adjacent WiFi detectors, including the RSSI, the timestamp, and the MAC identification of the related WiFi detector. The RSSI dataset sorted by timestamp is denoted by $s^{MW}_t = \{ s^{mw}_t \mid m = 1, \ldots, M,\ w = 1, \ldots, W \}$, where $s^{mw}_t$ denotes the RSSI measurement value of the mth WiFi detector for the wth MAC address at time t. The objective of the RSSI-based localization scheme is to estimate the mobile target position on the road network according to all RSSI measurement vectors $s^{MW}_{0,\ldots,t} = \{ s^{mw}_{0,\ldots,t} \mid m = 1, \ldots, M,\ w = 1, \ldots, W \}$. The proposed range-based localization scheme is illustrated in Figure 1, and mainly consists of four steps belonging to two stages. The first step is the offline training stage, which formulates the RSSI-distance relationship by using PPRM to realize automatic segmentation of the fitting function. A specific nth-degree polynomial is calibrated for each segmentation, combining a Gaussian filter and a piecewise function in the training process in order to capture the RSSI propagation characteristics of the urban environment as faithfully as possible. The last three steps belong to the online positioning stage. At the second step, the combination of CVKF and PPRM is used to filter the real-time RSSI value and estimate the sensor-target distance via the RSSI-distance mapping estimated at the first step. The third step calculates the user's position in real time with the LS-TSE algorithm. Finally, the fourth step smooths the position estimate via UKF by using historical points. Notably, the pre-processing Gaussian filter and the CVKF algorithm in Figure 1 serve different functions. The former is performed during the offline training phase in order to reduce the fluctuation of the RSSI value at a stationary observation point. The latter is applied during the online positioning phase to reduce the fluctuation of the captured real-time RSSI value.
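The per-detector, per-MAC time series described above can be assembled with a simple grouping step. The following is a minimal sketch, not the authors' implementation; the record layout `(detector_id, mac, timestamp, rssi)` and the function name `group_rssi_series` are assumptions made for illustration.

```python
from collections import defaultdict


def group_rssi_series(records):
    """Group scanner records into one time-sorted RSSI series per (detector, MAC).

    `records` is an iterable of (detector_id, mac, timestamp, rssi) tuples;
    this layout is a hypothetical stand-in for the scanner data format.
    """
    series = defaultdict(list)
    for detector_id, mac, timestamp, rssi in records:
        series[(detector_id, mac)].append((timestamp, rssi))
    for key in series:
        # Sort each series by timestamp so that it can be filtered in order.
        series[key].sort(key=lambda item: item[0])
    return dict(series)


# Illustrative records only (not measured data).
records = [
    (1, "aa:bb", 2, -61.0),
    (2, "aa:bb", 1, -70.0),
    (1, "aa:bb", 1, -60.0),
]
grouped = group_rssi_series(records)
```

Each value of `grouped` then plays the role of one series $s^{mw}_{0,\ldots,t}$ that is passed to the online filtering step.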

Existing Propagation Model
The common propagation model (PM) for describing the relationship between RSSI and sensor-target distance is the log-distance path loss model [35,39]. By using signal samples obtained at several observed points, the PM coefficients can be calibrated. Given an observed RSSI, the sensor-target distance can be estimated based on the calibrated PM as follows [35]:

$$PL(d) = PL(d_0) - 10\gamma \log_{10}\left(\frac{d}{d_0}\right) + N(0, \sigma^2) \quad (1)$$

where $\gamma$ represents a path-loss parameter related to the specific wireless transmission environment; $PL(d_0)$ is the RSSI at the reference distance $d_0$; $N(0, \sigma^2)$ is a zero-mean Gaussian random variable with variance $\sigma^2$; and $PL(d)$ denotes the detected RSSI corresponding to the transmission distance d. Generally, PM with ideal propagation might work well in line-of-sight (LOS) scenarios in a spacious outdoor environment. However, this model is too simple to obtain an accurate distance in the cluttered urban road environment due to signal reflection, shadowing and multipath propagation from skyscrapers, tunnels, vehicles, concrete walls and other construction materials.
Recently, some researchers have reported that the polynomial regression model (PRM) has a good ability to describe the RSSI-distance relationship [38]. Unfortunately, the attenuation speed of RSSI strength changes once the transmission distance exceeds a certain threshold. Consequently, the goodness-of-fit of the PRM gradually declines, and a tiny deviation of RSSI can lead to a much larger error in distance computation [37]. To overcome this problem, this paper proposes the PPRM to formulate the relationship between RSSI and distance for outdoor localization.
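To make the PM baseline concrete, the sketch below inverts the noise-free log-distance form $PL(d) = PL(d_0) - 10\gamma \log_{10}(d/d_0)$ to obtain a distance from an observed RSSI. The function names and the parameter values ($PL(d_0) = -40$ dBm, $\gamma = 2.5$, $d_0 = 1$ m) are illustrative assumptions, not calibrated values from this study.

```python
import math


def pm_rssi(d, pl_d0, gamma, d0=1.0):
    """Noise-free RSSI predicted by the log-distance path loss model."""
    return pl_d0 - 10.0 * gamma * math.log10(d / d0)


def pm_distance(rssi, pl_d0, gamma, d0=1.0):
    """Invert the model: distance implied by an observed RSSI."""
    return d0 * 10.0 ** ((pl_d0 - rssi) / (10.0 * gamma))


# Hypothetical parameters, chosen only to exercise the round trip.
rssi_at_12m = pm_rssi(12.0, pl_d0=-40.0, gamma=2.5)
recovered = pm_distance(rssi_at_12m, pl_d0=-40.0, gamma=2.5)
```

Because the distance depends exponentially on the RSSI, a small RSSI deviation at long range maps to a large distance error, which is the sensitivity that motivates replacing PM.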

Piecewise Polynomial Regression Model (PPRM)
By contrast with the existing PM and PRM, the proposed PPRM assumes that the RSSI-distance relationship can be formulated as a different nth-degree polynomial at each level of RSSI, and the polynomial coefficients are calibrated by using a training dataset. Moreover, PPRM can automatically divide RSSI into different levels to calibrate a series of polynomial regression functions. Correspondingly, this study assumes that there are K different piecewise segmentations, and lets $\Phi^w_k(s)$ denote the fitting function at the kth segmentation for the wth mobile device. To simplify the model description, we take one detector and one target as an illustration. Let $s_j$ denote the RSSI of the jth observed point after Gaussian filtering. In this paper, we use a group of linearly independent polynomial functions to build the fitting function:

$$\Phi(s) = \sum_{i=0}^{n} a_i s^i \quad (2)$$

where $\Phi(s)$ is the estimated distance from the target to a given WiFi detector; n denotes the fitting degree; and $a_i$ represents the coefficient of the nth-degree polynomial fitting function, where i = 0, 1, . . . , n. The polynomial coefficients can be estimated based on the training dataset. Moreover, the indicator λ marks the active segmentation: if $s_j$ belongs to the kth segmentation, λ = 1; otherwise, λ = 0. The detailed procedure of the PPRM method in this study is as follows:

Step 1: Pre-process the RSSI value via Gaussian filter.
Generally, the collected raw RSSI data fluctuates sharply because of the randomness of radio frequency (RF) signals. Even at the same location, RSSI will fluctuate up and down quickly by 0 to 10 dBm [40]. However, numerous repeated experiments showed that the distribution of RSSI at a certain point can be seen as a Gaussian distribution [37]. During the offline training phase, we assume that there are P raw data at the jth observed point from the same sensor. For the pth raw data (p = 1, 2, . . . , P), the probability density function (PDF) is computed by

$$f(r^j_p) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(r^j_p - \mu)^2}{2\sigma^2}\right) \quad (3)$$

where $\mu$ and $\sigma^2$ denote the mean and variance of RSSI, respectively, and $r^j_p$ means the pth raw RSSI data at the jth observed point. Generally, the Gaussian filter threshold is set to 0.6 [37], and the RSSI of the jth observed point after Gaussian filtering, $s_j$, is computed from the filtered samples following [37].

Step 2: Solve the nth-degree polynomial fitting problem based on the least squares principle.
The sum of the squared model fitting errors is computed by:

$$E(a_0, a_1, \ldots, a_n) = \sum_{j=1}^{J} \left( \sum_{i=0}^{n} a_i s_j^i - d_j \right)^2 \quad (5)$$

where J is the number of observed points; $d_j$ represents the real distance from the jth observed point to the WiFi detector, j = 1, 2, . . . , J; and $s_j$ is the RSSI value at the jth observed point. Here, E is a quadratic function of the coefficients $a_i$, so a solution for $a_i$ can be obtained by the least squares principle. In detail, we set the partial derivatives of $E(a_0, a_1, \ldots, a_n)$ with respect to each coefficient $a_i$ to zero:

$$\frac{\partial E}{\partial a_i} = 2 \sum_{j=1}^{J} \left( \sum_{l=0}^{n} a_l s_j^l - d_j \right) s_j^i = 0, \quad i = 0, 1, \ldots, n \quad (6)$$

Equation (6) can then be converted into matrix form and solved for the polynomial fitting coefficients $B = [a_0\ a_1\ \ldots\ a_n]^T$. The traditional way to determine the optimal fitting function is to choose the nth-degree polynomial with the least mean error (ME) [11]. Although we can search for the fitting function with the least ME, the fluctuation of the residual is still relatively large. Therefore, this study employs the sum of the square fluctuation errors (SSFE) as the selection criterion, following [41]. In the SSFE expression for the nth-degree polynomial, $H^2$ is the SSFE; $\hat{d}_j$ denotes the distance estimate from the jth observed point to the detector; $d_1$ means the real value of the first observed point; and $\hat{d}_1$ represents the distance estimate of the first observed point. The nth-degree polynomial with the least SSFE is then selected as the optimal polynomial function.

Step 3: Piecewise polynomial fitting segmentation.
The error of each estimated point $F_j$ and the mean error F are defined as:

$$F_j = \left| \hat{d}_j - d_j \right| \quad (10)$$

$$F = \frac{1}{J} \sum_{j=1}^{J} F_j \quad (11)$$

If the errors of three consecutive points are all greater than the mean error, that is,

$$F_j > F, \quad F_{j+1} > F, \quad F_{j+2} > F \quad (12)$$

then a new piecewise function starts from the jth observed point and the procedure turns to Step 4; otherwise, it turns to Step 5. Accordingly, the first data point of the (k+1)th segmentation fitting function is set to $s_j$.

Step 4: Re-calibrate the optimal nth-degree polynomial fitting function for the new segmentation.
Make a new piecewise function from the point $s_j$ to the last observed point. Re-calibrate the optimal nth-degree polynomial fitting function for the new segmentation based on Step 2, and then repeat Step 3.

Step 5: Merge all optimal fitting functions.
According to the segmentations obtained by the previous four steps, the series of optimal fitting functions is merged into a piecewise polynomial fitting model.
The complete procedure of PPRM is given in Figure 2.
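Steps 2 and 3 can be sketched in a few lines. The code below is a simplified illustration rather than the authors' implementation: it fits a polynomial by least squares with NumPy, and applies the rule that three consecutive errors above the mean start a new segmentation. The function names and the synthetic RSSI-distance data are assumptions, and the SSFE-based degree selection is omitted because its formula is not reproduced here.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_polynomial(s, d, degree):
    """Least-squares fit of d ~ sum_i a_i * s**i; returns [a_0, ..., a_n]."""
    return P.polyfit(s, d, degree)


def estimate_distance(coeffs, s):
    """Evaluate the fitted polynomial at RSSI value(s) s."""
    return P.polyval(s, coeffs)


def find_breakpoint(errors):
    """Index j of the first run of three consecutive errors above the mean.

    Returns None when no such run exists, i.e. no new segmentation is needed.
    """
    errors = np.asarray(errors, dtype=float)
    above = errors > errors.mean()
    for j in range(len(errors) - 2):
        if above[j] and above[j + 1] and above[j + 2]:
            return j
    return None


# Synthetic, noise-free data used only to exercise the functions.
s = np.linspace(-80.0, -40.0, 9)
d = 0.01 * s**2 + 0.2 * s + 5.0
coeffs = fit_polynomial(s, d, 2)
errors = np.abs(estimate_distance(coeffs, s) - d)
```

In a full PPRM loop, the index returned by `find_breakpoint` would mark where the data is split, and `fit_polynomial` would be called again on the remaining points.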

Target Positioning Based on Real-Time Data
At the online positioning stage, this study proposes a combination of CVKF and PPRM to filter the real-time RSSI value and estimate the Euclidean distance between the target and the WiFi detector. In addition, we use UKF to filter the estimated positioning points obtained by LS-TSE. The UKF output is regarded as the final target position.

Real-Time Data Filtering Based on Constant Velocity Kalman Filter (CVKF)
In practice, there is a significant fluctuation in the RSSI value collected by the WiFi detector, so it cannot be directly used for positioning. The Kalman filter [32] is a useful signal processing algorithm for removing superimposed noise. However, if the error of the first several observed RSSI values is very large, it will have a serious impact on the position accuracy. Therefore, the RSSI value must be pre-processed before using the Kalman filter. A variety of mobility models have previously been described in the literature, such as the constant-velocity, constant-acceleration, Singer acceleration and mean-adaptive acceleration models [42]. In our previous work [40], we also proposed a constant velocity Kalman filtering fusion algorithm for noise reduction. In this study, we employ the constant velocity algorithm to smooth the real-time RSSI. The estimation and prediction stages are formulated in terms of the following quantities: $s_{prev(t)}$ represents the real-time RSSI measured value at interval t from the WiFi signal collection module in Figure 1; $s_{pred(t)}$ is the predicted value; $s_{est(t)}$ is the smoothed value; $V_{est(t)}$ is the smoothed range rate of the RSSI; $V_{pred(t)}$ means the predicted range rate of the RSSI; α and β are the gain constants; and $T_S$ denotes the duration of the update time interval. The smaller the variable α is, the higher the confidence of the predicted value $s_{pred(t)}$ will be, which means the measured error of RSSI is very small. If β is too large, the filtering response will be slow because the confidence in the newly measured value is reduced. Then, the RSSI sequence smoothed by the constant velocity algorithm is input into the Kalman filter to obtain the final smoothed RSSI series [32]. Next, we substitute the series into the calibrated PPRM model to achieve a more accurate distance estimation from the moving target to the WiFi detectors, as shown in Figure 1.
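The constant velocity smoothing stage can be read as an α-β filter. The sketch below assumes the textbook α-β update (correct the predicted value and rate with the measurement residual, then extrapolate one interval ahead), which may differ in detail from the authors' expressions. The gains, the zero initial rate and the function name are illustrative assumptions.

```python
def constant_velocity_smooth(measurements, alpha, beta, ts=1.0):
    """Smooth an RSSI sequence with a textbook alpha-beta filter.

    alpha and beta are the gain constants and ts is the update interval.
    The first measurement initialises the prediction, and the initial
    rate is taken as zero (an assumption of this sketch).
    """
    s_pred = measurements[0]
    v_pred = 0.0
    smoothed = []
    for s_meas in measurements:
        residual = s_meas - s_pred
        # Estimation stage: correct the prediction with the residual.
        s_est = s_pred + alpha * residual
        v_est = v_pred + (beta / ts) * residual
        smoothed.append(s_est)
        # Prediction stage: extrapolate with the constant-velocity model.
        s_pred = s_est + ts * v_est
        v_pred = v_est
    return smoothed


# Illustrative sequence with a single 20 dBm spike.
out = constant_velocity_smooth([-60.0, -60.0, -40.0, -60.0], alpha=0.5, beta=0.1)
```

With α = 0.5 the spike at the third sample is halved in the smoothed output, since the estimate moves only half-way from the prediction toward the measurement.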

Collaborative Positioning Based on Least Squares Taylor Series Expansion (LS-TSE)
According to the distances from the target to several adjacent WiFi detectors estimated by PPRM at the online phase, we can calculate the target location. In fact, it is a collaborative positioning problem with multiple WiFi detectors. Thus, this study proposes LS-TSE to solve it. Firstly, assume the coordinate of the target location is (x, y) and the coordinates of the WiFi detectors are $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$; then the following equations can be obtained:

$$\begin{cases} \sqrt{(x - x_1)^2 + (y - y_1)^2} = z_1 \\ \sqrt{(x - x_2)^2 + (y - y_2)^2} = z_2 \\ \quad \vdots \\ \sqrt{(x - x_m)^2 + (y - y_m)^2} = z_m \end{cases} \quad (17)$$

where $z_m$ means the distance from the target to the mth WiFi detector estimated by PPRM. From the over-determined equations above, we can find that the greater the number of WiFi detectors is, the higher the positioning accuracy is. Equation (17) can be solved by many methods, such as the least squares method (LSM) [43]. Although the LSM-based positioning algorithm minimizes the sum of squared errors, it cannot guarantee that any individual estimated point is optimal, which can lead to a series of high-error position estimates [37]. To deal with this issue, this paper employs the Taylor series expansion to estimate the target location. Nevertheless, to ensure the convergence and timeliness of the algorithm, the Taylor series expansion needs an initial target location that does not deviate too far from the actual position. Therefore, the initial position obtained by LSM is taken as the input of the Taylor series expansion in this study.
Assume the initial position in the iteration is $(x', y')$, and let the corresponding distances from the initial position to the WiFi detectors be $d_1, d_2, \ldots, d_m$, respectively. The actual coordinate can be expressed as the sum of the coordinates obtained by LSM and the position offset:

$$x = x' + \Delta x, \quad y = y' + \Delta y \quad (18)$$

Meanwhile, a first-order Taylor series expansion of the distance equations around $(x', y')$ is used to reduce computational complexity, which yields a linear system in the offsets Δx and Δy that is solved by least squares. Each iteration checks whether |Δx| + |Δy| is less than the threshold δ (set to 0.01 [37]); if not, the estimate is updated by the offset and the iteration is repeated.
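The two-step idea (linear least squares start, then Taylor refinement) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the initial position is obtained by subtracting the last range equation from the others, the refinement is a first-order (Gauss-Newton style) update solved through the 2x2 normal equations, and the rectangular detector layout and target position in the example are hypothetical.

```python
import math


def solve_least_squares_2d(rows, rhs):
    """Solve min ||A p - b|| for a two-column A via the 2x2 normal equations."""
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def lsm_initial_position(anchors, z):
    """Linearised start: subtract the last squared range equation from the rest."""
    xm, ym = anchors[-1]
    zm = z[-1]
    rows, rhs = [], []
    for (xi, yi), zi in zip(anchors[:-1], z[:-1]):
        rows.append((2.0 * (xi - xm), 2.0 * (yi - ym)))
        rhs.append(xi**2 + yi**2 - xm**2 - ym**2 - zi**2 + zm**2)
    return solve_least_squares_2d(rows, rhs)


def taylor_refine(anchors, z, start, delta=0.01, max_iter=50):
    """Iterate first-order Taylor corrections until |dx| + |dy| < delta."""
    x, y = start
    for _ in range(max_iter):
        rows, rhs = [], []
        for (xi, yi), zi in zip(anchors, z):
            di = max(math.hypot(x - xi, y - yi), 1e-9)  # guard against /0
            rows.append(((x - xi) / di, (y - yi) / di))  # partial derivatives
            rhs.append(zi - di)                          # range residual
        dx, dy = solve_least_squares_2d(rows, rhs)
        x, y = x + dx, y + dy
        if abs(dx) + abs(dy) < delta:
            break
    return x, y


# Hypothetical layout: four detectors on a 25 m x 8 m rectangle.
anchors = [(0.0, 0.0), (25.0, 0.0), (0.0, 8.0), (25.0, 8.0)]
true_pos = (10.0, 3.0)
z = [math.hypot(true_pos[0] - ax, true_pos[1] - ay) for ax, ay in anchors]
est = taylor_refine(anchors, z, lsm_initial_position(anchors, z))
```

With exact ranges the linear start already recovers the target; with noisy ranges the refinement step adjusts the estimate using all detectors at once.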

Positioning Optimization Based on Unscented Kalman Filter (UKF)
Based on the aforementioned methods, we can obtain the estimated location coordinate of each moving target at each moment t. In the urban road environment, the movement of a pedestrian is continuous, and the position at time t is strongly correlated with that at the previous interval t−1. In [32–34], filtering techniques have been used to improve the localization performance when the accuracy is unsatisfactory. Basically, they applied the previous localization coordinates as model inputs to restrict the possible movement trace and thus reduce the error. In [32], the authors demonstrated that UKF is superior to the Kalman filter (KF) and extended Kalman filter (EKF) in solving moving target-tracking problems. Therefore, this study uses UKF to smooth the estimated points obtained by LS-TSE. In the UKF, it is necessary to carefully define the noise covariance matrix Q, the measurement noise covariance matrix R, and the covariance matrix G before performing the prediction and updating steps. In this study, the initial values of matrices Q, R and G are the same as in [32]. Finally, the UKF output $(\hat{x}_t, \hat{y}_t)$ is regarded as the ultimate positioning coordinate.

Experiment Description
To evaluate the performance and application of the proposed localization scheme, we carried out field experiments on 23 January 2019 at the Wushan campus, South China University of Technology, Guangzhou. The site is located on Zhujiangnan Rd around the Jiaotong Building. Figure 3 plots the floor layout, where a 2-D coordinate system is used to describe the coordinates of each point, and the origin is chosen as the right-bottom corner point of the road. As shown in Figure 3, two high buildings (Zhonggongjiaoyu building of 27 floors on the left, and Jiaotong building of 6 floors on the right) line both sides of the road, and other obstacles are around the site, such as cars, pedestrians, bicycles, trees, trucks, metal bars, etc.
Integrated WiFi detectors of type DS-007, manufactured by Chengdu DataSky Company of China, have been proved to be suitable for outdoor environments [40]. The field experiments in our testbed showed that the effective coverage area of a DS-007 detector can be approximated as a sphere with a radius of 30 m or larger. In our experiments, we deployed four WiFi detectors, an Android terminal as the user device (Galaxy Note 2 manufactured by Samsung, Seoul, Korea) and a notebook computer as the data collection server (Xiaoxin Air 13 manufactured by Lenovo, Beijing, China). According to the width and length of the target area, four WiFi detectors were deployed at the same height on both sides of the road and their coordinates were set to (0, 0), (25, 0), (0, 8), (8,25) in meters, respectively. Also, the scanner data (MAC address, RSSI, and timestamp) collected by the WiFi detectors were uploaded to the data server in real time every second.
At the offline stage, we conducted field experiments on the 25 m-length road link from 0 to 25 m in 1 m steps, where 200 replicated measurements were taken at each tested point so as to reduce the random error. Therefore, the total size of data samples is 25 × 200 = 5000, which includes the calibrated target position and the corresponding WiFi scanner data, including the target's MAC address, timestamp and RSSI value. Among the data, 70% was selected as a training set to calibrate the RSSI-distance relationship model, and the remainder was used for fitting performance validation. At the online stage, the tester held the smartphone and walked at as constant a speed as possible. Several reference points with surveyed locations were used to generate the ground truth, and a stopwatch was used to record the time when the tester passed these reference points. Then, the ground-truth locations between two adjacent reference points were generated through interpolation, assuming that the tester walked at a constant speed. To eliminate the human body's impact on the experimental results, we unified the way of holding the smartphone in the experiments.
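The ground-truth generation described above amounts to linear interpolation in time between surveyed reference points. A minimal sketch follows; the reference times and coordinates are hypothetical values for illustration, and the function name is an assumption.

```python
def interpolate_ground_truth(ref_times, ref_points, t):
    """Position at time t, assuming constant speed between reference points.

    ref_times are the stopwatch times at the surveyed reference points
    (strictly increasing) and ref_points are their (x, y) coordinates.
    """
    for k in range(len(ref_times) - 1):
        t0, t1 = ref_times[k], ref_times[k + 1]
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)  # fraction of the segment covered
            (x0, y0), (x1, y1) = ref_points[k], ref_points[k + 1]
            return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))
    raise ValueError("t lies outside the surveyed time range")


# Hypothetical reference points along the road centre line (meters, seconds).
ref_times = [0.0, 10.0, 18.0]
ref_points = [(0.0, 4.0), (10.0, 4.0), (18.0, 4.0)]
midway = interpolate_ground_truth(ref_times, ref_points, 14.0)
```

Each one-second scanner timestamp can be passed through this function to obtain the reference position against which a positioning error is measured.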

Physical Distance Estimation Evaluation via Single Detector
The performance of PPRM-based distance estimation is discussed in this section. In the estimation, the candidate polynomial degrees are n = 2, 3, 4, 5, and the corresponding PPRM models are called 2-polynomial (POLY2), 3-polynomial (POLY3), 4-polynomial (POLY4), and 5-polynomial (POLY5), respectively. We compare our proposed algorithms with the traditional PM and PRM.

RSSI-Distance Formula Based on PPRM
In the first-stage fitting, the ME and SSFE of the nth-degree polynomial were calculated for the training set, and the result is shown in Figure 4(left). One can find that the ME of POLY2, POLY3, POLY4 and POLY5 are 1.58, 1.47, 1.40, 1.38, respectively, while the difference is no more than 0.2. Meanwhile, the SSFE of four fitting functions are 0.35, 0.54, 0.98 and 0.78, respectively, whereas the difference is up to 0.63. In order to minimize the fluctuation error of the fitting function, the polynomial with the least SSFE is chosen as the optimal fitting function. Therefore, POLY2 is chosen for the first fitting function. This result is consistent with the result of Zhuang et al. [11], who found that POLY2 is the optimal fitting function with better performance and the least computation load. However, we also find the RSSI distribution will dynamically change when the physical distance exceeds a certain value. Also, the individual estimated error and the mean error in POLY2 are calculated according to Equations (10) and (11) as shown in Table 1. For example, the ME is about 1.58 m, while the errors of three estimated points at 15, 14 and 13 m are 3.58, 2.70 and 2.08 m, respectively. According to the previous criterion in Equation (12), the errors of three consecutive estimated points are greater than the mean, and the piecewise polynomial fitting should be set at the 15 m point. This is consistent with the findings of the study [37], which further proves that when the transmission distance exceeds a certain threshold, the attenuation speed of RSSI strength will change. In the second-fitting phase, the ME and SSFE of the nth-degree polynomial are recalculated based on the corresponding training dataset as shown in Figure 4(right). The SSFE values of POLY2, POLY3, POLY4 and POLY5 are 1.23, 0.80, 0.47 and 0.58, respectively, while the difference is up to 0.76. Therefore, POLY4 is chosen for the second fitting function. 
This may be because the RSSI collected by far-side sensors fluctuates much more than that collected by near-side ones. In other words, if these errors are not corrected, the positioning accuracy will worsen. Table 2 summarizes the individual and mean errors of POLY4. The first fitting function from 0 to 15 m does not need to be split further, because at most two consecutive estimated points (8 and 9 m) have errors greater than the mean, which does not satisfy the criterion in Equation (12). Therefore, the RSSI-distance relationship is divided into two segments in this study: the first covers 0-15 m, and the second covers 16-25 m.
In the resulting RSSI-distance formula, s is the measured value of the RSSI. Notably, in the offline phase, s denotes the RSSI value after Gaussian filtering; in the online phase, s denotes the real-time RSSI value after CVKF filtering.
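The splitting criterion described above (a run of at least three consecutive estimated points whose error exceeds the mean error) can be illustrated with a minimal Python sketch. The function name `find_exceedance_runs` and the error values for 1-12 m are hypothetical; only the errors at 13, 14 and 15 m are the values reported in the text, and the exact form of Equation (12) is assumed rather than reproduced.

```python
import numpy as np

def find_exceedance_runs(distances, errors, min_run=3):
    """Return (start, end) distances of runs with at least `min_run`
    consecutive points whose error is greater than the mean error."""
    distances = np.asarray(distances, dtype=float)
    errors = np.asarray(errors, dtype=float)
    above = errors > errors.mean()
    runs, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i                      # a new run begins here
        elif not flag and start is not None:
            if i - start >= min_run:       # the run that just ended is long enough
                runs.append((float(distances[start]), float(distances[i - 1])))
            start = None
    if start is not None and len(above) - start >= min_run:
        runs.append((float(distances[start]), float(distances[-1])))
    return runs

# Hypothetical per-point errors (m) for 1-12 m; the last three values
# (13, 14, 15 m) are the ones quoted in the text.
distances = np.arange(1, 16)
errors = [0.9, 1.1, 1.0, 1.3, 1.2, 1.0, 1.4, 1.9, 1.8, 1.2, 1.1, 1.3,
          2.08, 2.70, 3.58]

runs = find_exceedance_runs(distances, errors)
print(runs)  # [(13.0, 15.0)]
```

In this made-up data set the points at 8 and 9 m also exceed the mean, but a run of only two points does not trigger a split, which mirrors the reasoning applied to the 0-15 m segment.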

Physical Distance Estimation at the Static Points
To validate the performance of PPRM when the target is stationary, three methods (PM, PRM and PPRM) are used to fit the RSSI-distance relationship in the urban road environment. The coefficients of PM have already been corrected, and the fitting degree of PRM is set to n = 2 [11]. Figure 5 shows that the proposed PPRM outperforms PM and PRM. It seems that the relationship between RSSI and distance does not obey the log path-loss model, owing to the effects of attenuation, reflection, multipath, etc. Meanwhile, Figure 6 depicts the cumulative distribution functions (CDFs) of the distance estimation errors obtained by PM, PRM and PPRM. The 90th percentile of the distance estimation error is no greater than 2.48 m for PPRM, compared with 4.46 m for PM and 2.79 m for PRM. Therefore, our tests indicate that PPRM provides more accurate and reliable distance estimation than the popular PM and PRM in the urban road environment.
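As a minimal sketch of how a 90th-percentile error and an empirical CDF of this kind can be computed, assuming the absolute difference between estimated and true distance is used as the error, consider the following Python snippet. The helper names and the sample data are hypothetical and are not taken from the experiments.

```python
import numpy as np

def error_percentile(true_dist, est_dist, q=90.0):
    """q-th percentile of the absolute distance estimation error (m)."""
    abs_err = np.abs(np.asarray(est_dist, float) - np.asarray(true_dist, float))
    return float(np.percentile(abs_err, q))

def empirical_cdf(abs_err):
    """Sorted errors and the fraction of samples at or below each one."""
    x = np.sort(np.asarray(abs_err, float))
    p = np.arange(1, x.size + 1) / x.size
    return x, p

# Hypothetical example: ten estimates whose errors are 1, 2, ..., 10 m.
true_dist = np.zeros(10)
est_dist = np.arange(1, 11, dtype=float)
p90 = error_percentile(true_dist, est_dist)
x, p = empirical_cdf(np.abs(est_dist - true_dist))
```

Note that `np.percentile` interpolates linearly between samples by default, so the reported percentile need not coincide with an observed error value.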
Sensors 2020, 20, x FOR PEER REVIEW 13 of 21

Analysis of Physical Distance Estimation from Real-Time Data
In practice, the signal propagation environment is complex and diverse, which may result in large fluctuations in the collected RSSI data. The ability to tolerate RSSI fluctuations is one of the important performance indexes of an RSSI-based localization system. In our previous work [40], we showed that the CVKF algorithm is a better filter for smoothing large RSSI fluctuations in the outdoor environment. To further improve the estimation accuracy, a combination of CVKF and PPRM is tested for fitting the RSSI-distance relationship and compared with PM, PRM and PPRM. In the field experiment, the tester held the mobile terminal and walked straight from the reference point (0, 0) to the reference point (25, 0), and then returned to the origin (0, 0). Figure 7 shows the measured data and the estimated distance, and Figure 8 depicts the CDFs of the distance estimation errors of PM, PRM, PPRM and CVKF+PPRM. The results show that the proposed CVKF+PPRM outperforms the other three models in real-time distance estimation while the target keeps moving. Notably, the average error obtained by CVKF+PPRM is only 0.93 m, which is lower than that of PM (2.05 m), PRM (1.37 m) and PPRM (1.27 m). In Figure 8, the 90th percentile of the distance estimation error is no greater than 1.98 m for CVKF+PPRM, compared with 4.24 m for PM, 2.84 m for PRM and 2.76 m for PPRM. To sum up, applying the PPRM fitting model after CVKF data filtering shows promise in improving dynamic estimation accuracy.
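To illustrate the kind of smoothing discussed above, the following Python sketch applies a generic constant velocity Kalman filter to a scalar RSSI series. It is not the formulation of [40]: the state layout (RSSI and its rate of change), the sampling interval `dt`, and the noise parameters `q` and `r` are assumptions chosen only for illustration.

```python
import numpy as np

def cv_kalman_smooth(rssi, dt=1.0, q=0.01, r=4.0):
    """Filter an RSSI series with a constant velocity model.

    State: [rssi, rssi_rate]. q scales the process noise and r is the
    measurement noise variance (both hypothetical tuning values).
    """
    rssi = np.asarray(rssi, dtype=float)
    F = np.array([[1.0, dt], [0.0, 1.0]])             # state transition
    H = np.array([[1.0, 0.0]])                        # only the RSSI is observed
    Q = q * np.array([[dt**4 / 4, dt**3 / 2],
                      [dt**3 / 2, dt**2]])            # white-noise acceleration model
    x = np.array([rssi[0], 0.0])                      # start at the first sample
    P = np.eye(2) * 10.0                              # loose initial covariance
    out = np.empty_like(rssi)
    for k, z in enumerate(rssi):
        # Prediction step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step
        innovation = z - (H @ x)[0]
        S = (H @ P @ H.T)[0, 0] + r
        K = (P @ H.T)[:, 0] / S                       # Kalman gain, shape (2,)
        x = x + K * innovation
        P = (np.eye(2) - np.outer(K, H[0])) @ P
        out[k] = x[0]
    return out

# Hypothetical noisy RSSI readings around -60 dBm.
rng = np.random.default_rng(0)
noisy = -60.0 + rng.normal(0.0, 3.0, size=200)
smoothed = cv_kalman_smooth(noisy)
```

The smoothed series would then be passed to the fitted RSSI-distance function; a smaller `q` relative to `r` gives stronger smoothing at the cost of a slower response to genuine changes.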

Analysis of Target Positioning Estimation Results via Multi-Detector
In the previous section, we demonstrated how to estimate the distance from the target to a specific WiFi detector, and showed that the CVKF+PPRM framework can handle the non-linear estimation problem associated with the RSSI-distance relationship. This section discusses the localization performance of LS-TSE fusion with multiple detectors. The average error (AE) and root mean square error (RMSE) are used to evaluate the proposed algorithms; they represent the closeness between the target's estimated position $(\hat{x}_t, \hat{y}_t)$ and the actual one $(x_t, y_t)$ at a specific time t. In detail, the corresponding measure of effectiveness (MOE) indexes are defined as follows [32].
Average localization error (error in X-Y estimates): Root mean squared error (RMSE):
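Under the usual definitions of these two indexes, and assuming the trajectory is sampled at N time steps (N is notation introduced here for illustration), they can be written as:

```latex
\begin{align}
\mathrm{AE}   &= \frac{1}{N}\sum_{t=1}^{N}
                 \sqrt{(\hat{x}_t - x_t)^2 + (\hat{y}_t - y_t)^2}, \\
\mathrm{RMSE} &= \sqrt{\frac{1}{N}\sum_{t=1}^{N}
                 \left[(\hat{x}_t - x_t)^2 + (\hat{y}_t - y_t)^2\right]}.
\end{align}
```

The per-axis versions follow by keeping only the X or the Y term inside the square root.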

As shown in Figure 9, the combined LS-TSE+UKF outperforms LS-TSE, trilateral localization (TRI) [40] and the least squares model (LSM) [43]. The TRI-based algorithm uses the RSSI data collected by the WiFi detectors located at the three reference points (0, 0), (25, 0) and (0, 8), and the UKF parameters are determined based on the literature [30]. The tester walks in straight segments from the start point (0, 1) to the end point (25, 7) via the intermediate points (13, 1) and (13, 7). Figure 9 illustrates the actual and estimated target trajectories in X-Y coordinates under LS-TSE+UKF, LS-TSE, LSM and TRI.
Meanwhile, Figure 10a-c compares the localization errors on the X-axis and Y-axis under the four algorithms. In the experiment, the target moves along the X-axis during the initial 30 s. The average positioning error of the proposed LS-TSE+UKF algorithm on the X-axis is only 1.14 m. However, it is worth noting that LS-TSE+UKF has a peak error between 5 s and 15 s, which might be due to the time lag of the UKF and its sensitivity to initial values. LS-TSE+UKF then begins to converge after 15 s, and the positioning error on the X-axis gradually drops. Meanwhile, the average positioning error of LS-TSE+UKF on the Y-axis is only 1.06 m, which is lower than that of the existing TRI algorithm (2.14 m), LSM algorithm (1.77 m) and LS-TSE algorithm (1.64 m).
Table 3 also shows the minimum, maximum and average localization errors as well as the RMSE on the X-axis, the Y-axis and in the X-Y plane. Figure 10d and Table 3 reveal that our proposed algorithm is able to closely follow the actual trajectory of the target. The average localization error of LS-TSE+UKF is 1.67 m, which is lower than that of the existing TRI algorithm (2.44 m), LSM algorithm (1.96 m) and LS-TSE algorithm (1.82 m). Moreover, the average RMSE of LS-TSE+UKF is approximately 33.67%, 10.81% and 5.04% lower than that of TRI, LSM and LS-TSE, respectively.
Meanwhile, the CDFs of the localization errors show that the proposed combination of LS-TSE and UKF is more reliable than the others. In detail, the LS-TSE+UKF localization scheme achieves an error of less than 2.99 m with a probability of 90% or more, compared with 5.33 m for TRI, 3.85 m for LSM and 3.62 m for LS-TSE. In summary, the proposed LS-TSE+UKF outperforms the other algorithms in the urban road environment.
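For readers unfamiliar with Taylor-series least squares positioning, the following Python sketch shows the generic idea: the range equations are linearized around the current position estimate, a least squares correction is solved, and the step is repeated until it becomes negligible. This is a textbook Gauss-Newton style iteration and not necessarily the exact LS-TSE formulation evaluated above. The fourth detector at (25, 8), the noise-free ranges and the function name are assumptions made for illustration.

```python
import numpy as np

def ls_taylor_position(anchors, ranges, x0, max_iter=20, tol=1e-6):
    """Estimate a 2-D position from ranges to known detector positions."""
    anchors = np.asarray(anchors, dtype=float)   # shape (m, 2)
    ranges = np.asarray(ranges, dtype=float)     # shape (m,)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        diff = x - anchors
        pred = np.maximum(np.linalg.norm(diff, axis=1), 1e-9)  # predicted ranges
        J = diff / pred[:, None]                 # first-order Taylor terms
        residual = ranges - pred
        delta, *_ = np.linalg.lstsq(J, residual, rcond=None)   # LS correction
        x = x + delta
        if np.linalg.norm(delta) < tol:
            break
    return x

# Three detector positions from the text plus one hypothetical detector.
anchors = np.array([[0.0, 0.0], [25.0, 0.0], [0.0, 8.0], [25.0, 8.0]])
true_pos = np.array([13.0, 1.0])
ranges = np.linalg.norm(anchors - true_pos, axis=1)   # noise-free for illustration
estimate = ls_taylor_position(anchors, ranges, x0=anchors.mean(axis=0))
```

With noisy ranges, the iteration returns the local least squares solution near the starting point, so a sensible initial guess (here the centroid of the detectors) matters.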

Complexity Discussion
Finally, we compare the average computing time of the four algorithms implemented in Matlab on a computer with a 2.2 GHz Intel Core i5-8500 processor, 8 GB of memory and the Windows 10 operating system. The average computing times of TRI, LSM, LS-TSE and LS-TSE+UKF are 0.286, 0.347, 2.685 and 3.214 s, respectively. The proposed approach is therefore roughly ten times slower than TRI and LSM, with most of the additional time spent in LS-TSE and only about 0.53 s added by the UKF. Given the prominent improvement of LS-TSE+UKF in localization accuracy over the other methods, such increased complexity may be acceptable.

Conclusions
In this paper, we have proposed a new range-based localization method that integrates the piecewise polynomial regression model (PPRM), constant velocity Kalman filter (CVKF), least squares Taylor series expansion (LS-TSE) and unscented Kalman filter (UKF) for low-speed pedestrian positioning using WiFi scanner data. Firstly, the proposed method uses CVKF to filter the real-time RSSI value and estimates the straight-line distance between the target and the WiFi detector via PPRM. Then, a UKF-based filtering algorithm is developed to smooth the subset of estimated 2-dimensional positioning points obtained by LS-TSE. Finally, the UKF output points are regarded as the ultimate positioning solution. The proposed model uses a trajectory-based fitting technique that considers the kinetic continuity of a pedestrian, which is capable of reducing peak errors and can achieve high positioning accuracy for moving pedestrians.
Field experiment results show that the combined CVKF and PPRM can achieve highly accurate Euclidean distance estimation, with an error of less than 1.98 m at a probability of 90% in the real-time test, which outperforms the existing propagation model (PM) (4.24 m), polynomial regression model (PRM) (2.84 m) and PPRM (2.76 m). Meanwhile, for 2-dimensional localization of a single low-speed moving target in the online phase, it achieves an average positioning error of 1.67 m, which is better than trilateral localization (TRI) (2.44 m), the least squares method (LSM) (1.96 m) and LS-TSE (1.82 m). Furthermore, our proposed localization method can be easily realized in practical applications and can promote the development of a robust WiFi-based positioning scheme.
In this study, the test environment is a typical urban road scenario with only a single moving pedestrian. In our future work, we will extend the experiments to different urban sidewalks with large numbers of pedestrians in order to validate the reliability and effectiveness of the proposed localization method under various geometric configurations.