Wi-Fi/MARG Integration for Indoor Pedestrian Localization

With the wide deployment of Wi-Fi networks, Wi-Fi based indoor localization systems that are deployed without any special hardware have caught significant attention and have become a currently practical technology. At the same time, the Magnetic, Angular Rate, and Gravity (MARG) sensors installed in commercial mobile devices can achieve highly-accurate localization in short time. Based on this, we design a novel indoor localization system by using built-in MARG sensors and a Wi-Fi module. The innovative contributions of this paper include the enhanced Pedestrian Dead Reckoning (PDR) and Wi-Fi localization approaches, and an Extended Kalman Particle Filter (EKPF) based fusion algorithm. A new Wi-Fi/MARG indoor localization system, including an Android based mobile client, a Web page for remote control, and a location server, is developed for real-time indoor pedestrian localization. The extensive experimental results show that the proposed system is featured with better localization performance, with the average error 0.85 m, than the one achieved by using the Wi-Fi module or MARG sensors solely.


Introduction
The maturation and popularization of wireless communication technology has led to the emergence and development of various Location-Based Services (LBSs). LBSs are significantly required by the pedestrian in indoor environments, especially for emergency applications. In outdoor environments, the Global Position System (GPS) has been widely used and is able to provide high-enough localization accuracy. Unfortunately, due to the serious signal attenuation from the satellites, the localization performance in indoor environments becomes greatly degraded and even unavailable. Although various indoor localization systems, such as the Wi-Fi Received Signal Strength (RSS) [1,2], ZigBee [3], ultra-wideband [4], radio frequency identification [5], Bluetooth [6], and inertial sensors [7] based indoor localization systems, have been significantly studied, the design of the highly-accurate and low-cost indoor localization system still forms an interesting topic.
Wi-Fi RSS based indoor localization has been recognized as one of the most popular and representative indoor localization technologies. In the indoor Wi-Fi environment, commercial mobile devices can be used to collect the Wi-Fi RSS with the built-in Wi-Fi module. The signal propagation model based indoor localization approach generally requires the precise modelling of the RSS and physical distance, which is difficult to obtain [8]. The location fingerprint based indoor localization approach does not require foreknowing the locations of Access Points (APs), and thereby it adapts to the environmental well. This approach involves two phases: the offline and online phases. In the offline phase, the Wi-Fi RSS from APs is collected at each Reference Point (RP), and then a radio map is constructed based on the Wi-Fi RSS, MAC addresses, and location coordinates. Then, in the online phase, the new Wi-Fi RSS is collected in a real-time manner to be matched against the pre-constructed radio map for the localization. However, this method often leads to large localization errors and location jumps due to the irregular fluctuation of the RSS which is caused by the complex indoor environment. Besides, the computation cost involved in fingerprint matching generally increases with the increase of the dimensions of the target environment.
At the same time, with the rapid development of the Micro-Electro-Mechanical System (MEMS) technology, the Magnetic, Angular Rate, and Gravity (MARG) sensors have been embedded in various handheld devices such as the smartphones. By using the MARG sensors, the Pedestrian Dead Reckoning (PDR) approach, which relies on the previous locations, stride length, and walking direction to calculate the current location of the pedestrian, has become very popular for pedestrian localization. Some previous studies use the empirical models about the walking state to estimate the stride length [9], meanwhile the existing heading estimation approaches [10][11][12][13] usually rely on the gyroscope solely or the fusion of the gyroscope and magnetometer to calculate the user heading. However, the stride length cannot be accurately estimated in each step when the pedestrian walks in changing scenarios and the heading calculation may also not be accurate when the measurement data suffer from magnetic interference and gyroscope drift. Although the PDR approach is able to achieve high localization accuracy in short time, the drift of the estimated locations will occur as the time goes on, since the inertial measurement unit easily suffers from the accumulative errors of the estimated stride length and walking direction. On the contrary, the Wi-Fi RSS based localization approach estimates the absolute locations of the pedestrian, which can help to mitigate the accumulative errors of the PDR. These two localization approaches are complementary to each other for the sake of improving the indoor localization accuracy.
Although a large number of fusion algorithms have been studied for integrating the PDR and Wi-Fi localization, many critical issues are still open. The current fusion algorithms mainly focus on deploying the Particle Filter (PF) [14,15], Kalman Filter (KF) [16], and the improved algorithm based on the previous two algorithms. The PF helps a lot in improving the localization accuracy, especially for the nonlinear and non-Gaussian systems, but it involves high computation cost. In addition, it is difficult to obtain the optimal Importance Density Function (IDF) which is used for generating the high-quality particles. The KF is based on the assumption that the noise obeys the Gaussian distribution with zero mean, which cannot be easily satisfied in the actual indoor environment.
In summary, the most important contribution of this paper is to design a novel indoor localization system by using built-in MARG sensors and Wi-Fi module. To improve the robustness and effectiveness of our system, we accomplish the design and optimization of three main modules, including the enhanced Wi-Fi localization, enhanced Pedestrian Dead Reckoning, and Extended Kalman Particle Filter (EKPF) based fusion algorithms. We propose integrating the map information and the results of the PDR and Wi-Fi localization by using the EKPF, which depends on the EKF to optimize the IDF with the purpose of reducing the number of particles and approximating the true posterior distribution. In the PDR, by considering magnetic interference and gyroscope drift, the complementary filter auxiliary Extended Kalman Filter (EKF) is used to integrate the tri-axial accelerometer, tri-axial gyroscope, and tri-axial magnetometer for the sake of estimating the heading of the pedestrian. In addition, we adopt a neural network to estimate the parameters of the stride length model as well as calculating the velocity per each second. For the Wi-Fi localization, we apply the improved affinity propagation clustering to save the computation time of fingerprint matching and reduce the number of large localization errors. This process is shown in Figure 1 and the detailed steps will be discussed in the following sections.
The rest of this paper is organized as follows. Section 2 introduces the related works. Section 3 describes the proposed system including the PDR and Wi-Fi localization approaches, as well as the related fusion algorithm. In Section 4, the extensive experiments are carried out in an actual indoor environment. Finally, Section 5 concludes the paper and provides some future directions.

Related Works
In recent decades, various low-cost inertial sensors have been equipped in the off-the-shelf smartphones due to many functional requirements, and meanwhile the Wi-Fi network has also been widely deployed in the indoor environment. Thus, the PDR and Wi-Fi localization approaches are recognized as the two most promising approaches used for indoor localization. The earliest work focusing on the fusion of the PDR and Wi-Fi localization is discussed in [17]. The authors achieve the integration of the inertial sensors and Wi-Fi module by using both the PF and KF. In concrete terms, as one of the most important parameters in the PDR, the heading of the pedestrian is estimated by the KF, which fuses the output of the gyroscope and the angle of trajectory calculated by the Wi-Fi localization. However, the approach is limited in terms of localization accuracy since their performance significantly depends on the robustness of Wi-Fi localization. In [18], the authors propose an adaptive integration frame to fuse localization information from the GPS/MARG and Wi-Fi/MARG integrated subsystems, and meanwhile use a Sequential Importance Resampling (SIR) PF approach to integrate the Wi-Fi RSS measurements and data from the MARG sensors. The particles are calculated by using the velocity and attitude coming from MARG sensors, while ignoring the impact of Wi-Fi observation data on particle sampling. In this paper, we depend on the EKF with observation data from the Wi-Fi and MARG sensors to optimize the IDF for the sake of reducing the number of particles and approximating the true posterior distribution.
The work studied in [19] focuses on integrating the gyroscope, accelerometer, and Wi-Fi fingerprint by using a complementary EKF. This approach estimates the heading of the pedestrian, based on the data from the gyroscope and the linear model between the step length and step frequency. An auto-calibration process is introduced in [20] to reduce the measurement error of heading direction as well as the estimation error of step length during movement.
Considering the variation of Wi-Fi RSS, computation cost involved in fingerprint matching, and utilization of the low-cost inertial sensors built-in smartphones, the refresh rate model of the Wi-Fi signal is introduced in [21] to optimize the Wi-Fi localization, and meanwhile the affinity propagation clustering is discussed in [22] to reduce the computation cost for the positioning. At the same time, an empirical model constructed from the individual height and peak-to-peak magnitude of the acceleration in each step is developed in [23] to estimate the stride length of the pedestrian. The quaternion based heading estimation approach [24] is capable of compensating the drift of heading estimation with low computation cost. Different from the approaches above, in this paper, we first rely on a new velocity calculation algorithm to reduce the error of speed estimation, and meanwhile integrate the tri-axial gyroscope, tri-axial accelerometer, and tri-axial magnetometer by using the complementary filter auxiliary EKF [25] to improve the accuracy of heading estimation. Second, we apply the improved affinity propagation clustering to reduce the number of large localization errors, as well as achieving the low computation complexity. Finally, the extend Kalman particle filter is applied to accomplish the Wi-Fi/MARG fusion, which is capable of eliminating the location jumps caused by the Wi-Fi localization and the accumulative errors of the PDR.

Pedestrian Dead Reckoning
In the PDR, the previous location, estimated stride length, and heading are used to reckon the current location of the pedestrian. To reduce the accumulative error, we propose an enhanced algorithm for the estimation of the velocity and walking direction.

Velocity Estimation
We rely on the data from the tri-axial accelerometer to estimate the velocity of the pedestrian, as shown in Figure 2. In the offline phase, we optimize the calibration coefficient of an empirical model by using the Back Propagation (BP) neural network [26], and then use this model to estimate the stride length. In concrete terms, we first calculate the relative magnitude of acceleration, which is defined as the difference between the local and measuring acceleration, namely Accnorm, and meanwhile use a low-pass filter to eliminate the noise involved in Accnorm. Second, we perform the step detection to extract the gait characteristics, such as the gait frequency and statistical quantities of acceleration magnitude, i.e., mean, variance, maximum, and minimum of Accnorm. Finally, we optimize the calibration coefficient based on the BP neural network. In the online phase, we estimate the stride length and velocity of the pedestrian by using the empirical model and step detection respectively in a real-time manner.

• Step Detection
Due to the error of the installation of inertial instruments, we rely on the periodic variation of Accnorm to detect the gait of the pedestrian. In Figure 3, the abnormal peaks are caused by the random self-dithering of the human body and the noise of sensors. By setting the interval between every two adjacent local maximum values as no smaller than the time threshold T threshold and the local maximum values as larger than the amplitude threshold Am threshold , the valid steps (with green circles) are detected in Figure 3. Considering that the pedestrian approximately takes two steps for each second and the updating rate of the data from the MARG sensors is 50 Hz, we set T threshold = 0.3 s and Am threshold = 0.8 m/s 2 .

• Velocity Calculation
The velocity estimation varies from step to step, especially when the stride length is treated as a constant. We use the empirical model in Equation (1) to estimate the stride length, L k , based on the acceleration measurement.
where acc k max and acc k min are the maximum and minimum values of Accnorm for the k-th step. ρ is the calibration coefficient which equals the ratio between the real distance d real and estimated distance d estimated . According to the biomechanical property, the stride length is determined by the step frequency and height of the pedestrian. The nonlinear model for the calibration coefficient ρ is constructed from the step frequency, f, height of the pedestrian, h, and mean and variance of accelerometer magnitude, Amean and Avar, as shown in Equation (2). In our system, the coefficient ρ is estimated by using the BP neural network.
(2) Figure 4 shows the block diagram of velocity calculation. By performing the step detection, we can obtain the number of steps, num t , in each second. In the t-th second, if num t = 0, the velocity is set as 0 m/s, which indicates that the pedestrian is not moving, if 0 < num t < 2, the velocity is set as 0.68 m/s, which indicates that the pedestrian walks for one step, and otherwise, we calculate the velocity by where L i t is the i-th stride length, which is calculated by Equation (1). interval i t is the number of sampling points between every two valid local maximum values. f s = 50 Hz is the updating rate of the data from the MARG sensors.

Heading Estimation
The state-of-the-art smartphones are generally equipped with the MARG sensors, like the tri-axial accelerometer, magnetometer, and gyroscope. Theoretically, the attitude of the carrier can be estimated by using the data in the gravity and geomagnetic fields, or the data from the gyroscope. However, the accelerometer and magnetometer are easily interfered with by the noise of sensors; meanwhile, the angular rate obtained from the gyroscope always drifts for a long time. To solve these problems, we adopt the complementary filter auxiliary EKF to estimate the heading of the pedestrian, as shown in Figure 5. • Initial quaternion calculation Based on the Euler theorem of the rigid body rotation [27], we use the quaternion matrix to describe the rotation of the body frame relative to the East-North-Up (ENU) reference frame, as shown in Equation (4).
is the attitude quaternion, which is set as a unit vector. We calculate the initial attitude quaternion Q init by using the data from the accelerometer and magnetometer, as shown in Equation (5).
where the initial measurement vectors from the tri-axial accelerometer and magnetometer are respectively. g is the local gravity acceleration. b n y and b n z are the northward and perpendicular components of the local earth magnetic field in the ENU reference frame respectively. Finally, by using the method in [28] to calculate the magnetic heading angle, the initial heading of the pedestrian is calculated by • Gyroscope error modification The accumulative error of angular rate which is caused by the drift of the gyroscope significantly deteriorates the precision of heading estimation. Although the attitude angular, which is calculated by using the data from the accelerometer and magnetometer, is not featured with high precision, it is able to remain stable for a long time. The gyroscope, accelerometer, and magnetometer can be used to complement each other since the gyroscope exhibits the high-pass property, while the accelerometer and magnetometer exhibit the low-pass property. Therefore, the complementary filter method, which relies on the complementary characteristics of the MARG sensors to modify the data from the gyroscope, improves the precision of attitude estimation. The angular velocity θ (k) in time domain is modified asθ where ω b (k) is a vector of the k-th measured angular rate. θ ref (k) is the k-th attitude angular which is calculated by using the data from the magnetometer. K is an adjustable parameter. However, the modified attitude angular cannot be easily obtained from Equation (7). To solve this problem, we select the Mahony based multi-axis complementary filter which is also used in [29] to modify the angular rate obtained from the gyroscope. For pedestrians with a low walking speed, the attitude angular always has small variation in every 0.02 s. Thus, we use the estimated attitude angular in the latest updating to construct the attitude matrix for the sake of modifying the drift of the gyroscope. Specifically, we standardize the newly collected measurement vectors in the acceleration and geomagnetic field, Acc b and Mag b , in the body coordinate frame, and then estimate the projection vectors in the acceleration and geomagnetic field, a s and m s , by using the quaternion rotation matrix. Since the measurement and projection vectors are in the same coordinate frame, the cross product of them can be described as We select the error vector e as the input of the Proportion Integral (PI) controller [30] to calculate the modification of angler rate δω , such that where Kp is the proportional integral coefficient. Ki is the control integral coefficient. ∆t is the integral time constant. e = e a + e m is the error in total.
After the value δω is obtained, the modified angular rate equals to where ω (k) is the k-th modified angular rate vector.
• Attitude angular estimation Since the data from the gyroscope can be used to estimate the attitude angular based on the differential equation of the rigid body motion, we use the EKF to estimate the attitude angular. In addition, we rely on the measurement vectors in the gravity and geomagnetic fields to modify the attitude angular.
By setting the state variable as a quaternion vector, Q, and measurement variables as the measurement vectors in the gravity and geomagnetic fields, the state and measurement equations can be constructed as where is the local earth magnetic field vector in the ENU reference frame. I is the identity matrix. T S is the system interval.
Finally, based on the recursive filtering process in EKF by Equation (12), including the correction phase and prediction phase, we can obtain the optimal quaternion vector in each second, and then use the newly updated quaternion to estimate the heading of the pedestrian by Equation (14).

Wi-Fi Localization
The Wi-Fi RSS is the superposition value of multiple signal paths, which easily suffers from the multipath interference. Figure 6 shows an example of the Cumulative Distribution Functions (CDFs) of RSS from two different APs at a given RP for 3 min. It can be obtained that the RSS is not stable, even in a short time duration, and the probability of RSS fluctuation around its mean within the range of [−5 dBm, 5 dBm] is higher than 90%. Furthermore, to examine the impact of the blocking of human body on RSS fluctuation, we design another experiment in which the RSS is collected under four different orientations, i.e., 0 • , 90 • , 180 • , 270 • , as shown in Figure 7. From these figures, we can find that (i) RSS fluctuation becomes significantly large and even reaches more than 20 dBm by the blocking of the human body; (ii) different orientations result in a different mean of RSS and different ranges of RSS fluctuation; and (iii) RSS from different APs at each RP generally has a different mean of RSS and different ranges of RSS fluctuation, which guarantees the effectiveness of Wi-Fi localization.
There are two phases involved in Wi-Fi localization, namely the offline phase and online phase. In the offline phase, we collect the RSS sequences labelled by timestamps to construct a radio map, in which the orientation of human body, MAC address of each AP, and location of each RP are also recorded. Then, based on the similarities of the RSS and the corresponding geographic location coordinates, affinity propagation clustering is performed to obtain a batch of clusters of RPs.
In the online phase, the orientations of human body are estimated by using the heading estimation approach, which has been carefully discussed in Section 3.1. Then, the newly collected RSS and the related estimated orientation are matched against the pre-constructed radio map to obtain the best-matching cluster of RPs. After that, the locations of the pedestrian are estimated by performing the WKNN on the selected RPs in the best-matching cluster.

• Radio map construction
To deal with the problem that the RSS collected under different orientations of the human body may be featured with significant variation, we construct eight radio maps, including the four under the orientations 0 • , 90 • , 180 • , 270 • , and the other four constructed by using the mean of two radio maps with the neighbouring orientations. By assuming that there are M APs and P RPs, the RSS vector is notated as where ψ k i,M 1 ≤ i ≤ P, 1 ≤ j ≤ M , 1 ≤ k ≤ 8 is the mean of RSS from the j-th AP, at the i-th RP, and under the k-th orientation. The radio map with respect to the k-th orientation, Λ k , is constructed as • Affinity propagation clustering For simplicity, we assume that the RSS vectors collected at the RPs i and j are ψ i and ψ j respectively. To describe the similarity of every two RPs, we set the similarity function as During affinity propagation clustering, we define two types of messages which are transmitted among different RPs, namely the responsibility message and availability message. The responsibility message, r(i, j), describes the credibility about how suitably the RP j serves as the clustering centre for the RP i, while the availability message, a(i, j), describes the credibility about how suitably the RP i selects the RP j as its corresponding clustering centre.
In general, the larger sum of the values r(i, j) and a(i, j) indicates the higher likelihood of the RP j to be selected as the clustering centre. In addition, the responsibility and availability messages are updated iteratively among different RPs until the optimal clustering centres and the corresponding clusters have been obtained. We define H k as the set of clustering centres and C k j as the set of RPs in the j-th cluster for the radio map with the k-th orientation. Considering that some clusters of RPs may contain the singular points, which are geographically far away from the clustering centres of the corresponding clusters, we use the outlier detection method [31] to eliminate the singular points after obtaining the set of clusters by the APC.

Online Phase
The online phase contains two main steps, namely the coarse localization step and fine localization step. The newly collected RSS vector is notated as In the coarse localization step, we rely on the estimated heading of the pedestrian, ϕ A , to obtain the best-matching radio map. The pseudo code of this process is shown in Figure 8. Then, we match the newly collected RSS vector ψ A against the selected radio map to obtain the best-matching cluster of RPs for the fine localization.
The clustering centre of the best-matching cluster is defined as where H s is the set of clustering centres for the selected radio map Λ s . ψ s j is the RSS vector collected at the j-th RP and in the radio map with the s-th orientation.
In the fine localization step, we perform the WKNN on some specific RPs in the best-matching cluster to locate the pedestrian. Specifically, we calculate the Euclidean distance between the newly collected RSS and the RSS at each RP in the best-matching cluster by where S ij i = 1, 2, · · · , M, j = 1, 2, · · · , n is the RSS collected from the i-th AP, at the j-th RP, and in the best-matching cluster. n is the number of RPs in the cluster C s j . ψ A,i is the newly collected RSS from the i-th AP. The estimated location of the pedestrian,L = (x,ỹ), is as follows.
where L i = (x i , y i ) (i = 1, 2, · · · , K) is the i-th selected RP with the smallest Euclidean distance in the best-matching cluster. K is the number of selected RPs used for the WKNN. ω i is the weight of the i-th selected RP.

Input:
: the estimated heading of the pedestrian : the radio map with respect to the k-th orientation k Λ  Figure 8. Pseudo code of the selection of the best-matching radio map.

Extended Kalman Particle Filter
The performance of the conventional particle filter degrades significantly when there is a large difference between the optimal IDF and real posterior distribution function [32]. To solve this problem, we use the EKF to obtain the optimal IDF with the purpose of guaranteeing that the particles are located in the high likelihood region. By fusing the Wi-Fi fingerprint information and the data from the MARG sensors, the proposed EKPF is able to reduce the number of particles with low computation cost, as well as improving the localization accuracy.
For the PDR, we construct the state equation in Equation (23); meanwhile, by assuming that the state variable is stable in a short time, we also construct the observation equation in Equation (24).
where X k is the state variable vector. Eloc k and Nloc k are the optimal eastern and northern locations. vel k and head k are the optimal velocity and heading of the pedestrian. W k is the matrix of processing noise which obeys the Gaussian distribution with the zero mean and covariance matrix Q w . are the current velocity and heading of the pedestrian estimated by using the PDR. V k is the matrix of observation noise which obeys the Gaussian distribution with the zero mean and covariance matrix R v .
The steps of the proposed EKPF are summarized as follows.
First of all, due to the fact that the PDR is a relative localization approach, we rely on the Wi-Fi localization to estimate the absolute initial location of the pedestrian, (Eloc 0 , Nloc 0 ). At the same time, the initial particles X i 0 , i ∈ 1, · · · , M , where M is the number of particles, satisfy where X 0 = Eloc 0 Nloc 0 0 0 T is the mean of the initial particles. p (X 0 ) is the proposal distribution function which obeys the Gaussian distribution. Q 0 is the covariance matrix of the initial processing noise which is used to describe the dispersion degree of the initial particles. The weight of the i-th initial particle is set as ω i 0 = 1/M. Second, we apply the EKF [33] to update the state of particles in Equations (26)-(30), and meanwhile fuse the measurement information to optimize the weights of particles in Equation (31) During the updating of the state of particles, the particles may be located at the incorrect locations due to the interference of noise. To solve this problem, we fuse the indoor map information to modify the trajectory of the pedestrian. Specifically, we define the particles which are located in the inaccessible areas as the invalid ones, and modify the weight of the i-th particle intô Finally, we adopt the residual resampling algorithm [34] to optimize the weights of particles further. Based on the optimal particlesX i k and the corresponding weightsq i k , we estimate the locations of the pedestrian byX

Environment Layout
The experiments are conducted on the same floor with the size of 64.6 m × 18.5 m in a building, as shown in Figure 9. There are five APs and 363 RPs which are labelled with red stars and black dots respectively in the target environment. The distance between every two neighbouring RPs in the green and yellow shaded areas is set as 0.6 m and 0.8 m. The origin of the coordinate frame is labelled with a red dot, while the X-and Y-axis point rightwards and upwards respectively.
The Galaxy S3 smartphone running under the Android 4.1.2 operation system is selected as the receiver. The interface of our developed software contains two parts, i.e., Android based mobile client (see Figure 10a) and Web page based remote control (see Figure 10b). The body coordinate frame with respect to the receiver is shown in Figure 11. The receiver collects the data from the Wi-Fi module and MARG sensors, and then transmits them to the location server through the Wi-Fi network. The location server estimates the locations of the pedestrian, and then returns the localization results to the receiver and remote control. The refresh rate of the data from the MARG sensors is 50 Hz and the updating rate of the localization is 1 Hz.

AP Location
Reference Point Origin of coordinate frame

Stride Length Estimation
To examine the performance of stride length estimation, we invite 10 volunteers with the height ranging from 150 cm to 181 cm for the testing.

• Testing design
Taking the difference of walking speed into account, each volunteer is required to walk along the same path with the length of 100 m under three different walking speeds, namely slow, normal, and fast, to train the stride length model. After the stride length model is obtained, each volunteer is required to walk along another path with the length of 140 m under the random walking speed for the testing.

• Data analysis
The two existing popular methods used to optimize the stride length model are summarized as follows. The first one [35], M 1 , is based on the ratio of the real and estimated walking distances to optimize the calibration coefficient, and then uses Equation (1) to estimate the stride length, L M 1 . The second one [9], M 2 , is based on the height and stride frequency of the pedestrian to construct a linear mathematical model to estimate the stride length, L M 2 . Different from the methods above, the proposed one, M 3 , is based on the BP neural network to optimize the calibration coefficient to estimate the stride length, L M 3 . In Figure 12, we use the box-plots to compare the localization errors by using M 1 , M 2 , and M 3 to estimate the stride length. From this figure, we can find that the interquartile range by M 1 is larger than the ones by M 2 and M 3 , which indicates that the localization error by M 1 is the largest. Based on the upper Whiske, the maximum localization error by M 3 is the smallest, whereas an abnormal localization error (with a red cross) is resulted by M 3 .

Heading Estimation
The testbed used for heading estimation is shown in Figure 13. The trajectory of the pedestrian is notated as A → B → C → D → E → F → A with the length of 126.56 m. To investigate the performance of the proposed Complementary Filter Auxiliary EKF (CFAEKF) for heading estimation, we compare it with two existing approaches: the EKF and Gradient Descent Algorithm (GDA) [30].  Figure 14 shows the CDFs of errors of heading estimation by the proposed CFAEKF and the conventional EKF and GDA. From this figure, we can find that the proposed CFAEKF performs best, with most errors of heading estimation falling within the range of 10 • , compared with the EKF and GDA. In addition, Figure 15 shows the statistical errors of heading estimation. From this figure, we can find that the 50th and 90th percentile errors by the EKF, GDA, and CFAEKF are 6.3 • , 5.4 • and 1.8 • , and 13.7 • , 14.5 • and 4.9 • respectively; meanwhile, the CFAEKF reduces the mean error by 57.89% and 58.84% from the ones by the EKF and GDA. To illustrate this result clearer, Figure 16 shows the result of heading estimation. From this figure, we can find that the result of heading estimation by the CFAEKF is more similar to the real heading, namely baseline, compared with the ones by the EKF and GDA.  For the CFAEKF, since the output of the gyroscope is modified by the PI controller, the selection of control parameters has a significant impact on the heading estimation. Figure 17 shows the CDFs of errors of heading estimation under different values Kp and Ki. From these figures, we can find that when the value Kp is larger than 1, the value Kp has a slight impact on the error of heading estimation.
When the value Ki is smaller than 100, the impact on the error of heading estimation of value Ki is slight, while as the value Ki increases to 1000, more than 10% of the errors are larger than 150 • . Due to the consideration of the sensitivity and stability of our system, we set Kp = 2 and Ki = 1.

Performance of Wi-Fi Localization
To investigate the performance of Wi-Fi localization, we design two experiments including the static and dynamic localization. Figures 18 and 19 show the testbeds for the static and dynamic localization respectively. For the static localization, we collect 10 RSS samples at each test point with random orientations of human body, while for the dynamic localization, the pedestrian walks along a given path A → B → C → D → E → F which is labelled by the blue solid line.  • Direction database The proposed radio map, namely the Direction Database (DD), is constructed from the radio maps with eight different orientations. The Mean Database (MD) is defined as the radio map which is constructed from the mean of the four radio maps with the orientations 0 • , 90 • , 180 • and 270 • . For the comparison, the conventional radio map, namely the Static Database (SD), is constructed by collecting the Wi-Fi RSS at each RP without the blocking of the human body in the fixed orientation.
To investigate the impact of different databases on the accuracy of the static localization, we show the CDFs of errors in Figure 20. From this figure, we can find that the localization performance by using the DD is better than the ones by the MD and SD under four different orientations. To illustrate this result clearer, we select five popular performance metrics, i.e., mean error, standard deviation of errors, maximum error, 67% error, and probability of errors over 5 m, to compare the statistical errors by these three types of database in Table 1. Due to the blocking of the human body, the probabilities of errors over 5 m by the MD and SD are higher compared with the DD; meanwhile, the DD reduces the mean error by 30.65% and 36.61% from the ones by the MD and SD under the orientation 2.  • Affinity propagation clustering Considering both the time overhead and localization accuracy, we propose using the improved affinity propagation clustering for the coarse localization and the inverse matching auxiliary WKNN for the fine localization, namely WKNN + Inverse matching + Improved APC. For the inverse matching method, we search for the K RPs which are geographically close to the one-step prediction location from Equation (26), estimate the RSS at the one-step prediction location by weighting the RSSs at the K selected RPs, and modify the collected RSS at the one-step prediction location by the estimated one.
Using the Wi-Fi RSS data collected on the given path (see Figure 19), we compare the CDFs of errors by using the WKNN, WKNN + Inverse matching, and the proposed WKNN + Inverse matching + Improved APC in Figure 21. From this figure, we can find that there is no error over 20 m by the WKNN + Inverse matching and WKNN + Inverse matching + Improved APC, which indicates that the inverse matching approach can effectively decrease the large error probability. In addition, the improved APC is able to reduce the large localization errors further since the process of coarse localization will discard the abnormal RPs which are far away from the corresponding clustering centres. In Table 2, we can find that the proposed approach decreases the mean error by 63.68% and 19.62%, and the 67% error by 61.47% and 20.75% from the ones by the WKNN and WKNN + Inverse matching respectively.

Performance of Fusion System
We compare the location tracking performance of the proposed fusion system and the conventional PDR-based Localization (PBL) and Wi-Fi based Localization (WBL) systems without data fusion in Figure 22. From this figure, we can find that the PBL involves the accumulative error for a long time, while the WBL contains many irregular large errors due to the disturbance of the Wi-Fi signal. As can be seen from Figure 23, the proposed fusion system performs best in localization accuracy due to the fact that it not only avoids the accumulative error by the PBL, but also eliminates the large errors by the WBL. In Table 3, it is found that the mean error by the PBL and WBL are 2.22 m and 6.79 m respectively, which is larger than the one by the proposed fusion system, 0.85 m. In addition, the proposed fusion system decreases the 67% error by 64.41% and 84.85% from the ones by the PBL and WBL respectively.
We continue to compare the location tracking performance of the proposed fusion system and the conventional fusion systems, EKF and Robust EKF (REKF) in Figure 24. Compared with the EKF and REKF, the proposed fusion system achieves the best location tracking performance, especially when the pedestrian makes a turn. The CDFs of errors by using different fusion systems are shown in Figure 25. Obviously, the proposed fusion system performs better in localization accuracy than the EKF and REKF since its particle updating scheme results in the well data fusion by combining the optimal IDF and map information. As illustrated in Table 4, the proposed fusion system decreases the mean error by 49.40% and 29.75% and standard deviation of errors by 63.64% and 38.03% from the ones by the EKF and REKF respectively.      Finally, we investigate the computation time required by the proposed fusion system. The target user walks along a given path A → B → C → D → E → F → E → D → C → B → A with the length of about 142 m, as shown in Figure 19. This process takes this user about two minutes. After that, we calculate the computation time in total involved in the localization process by the proposed system under different particle numbers. In our experiment, all the calculations are conducted on a desktop with the Intel (R) Core (TM) i3-3240 CPU and 4 GB RAM. As can be seen from Figure 27a, the computation time gradually increases with the increase in particle number, whereas the particle number has a slight impact on localization accuracy according to the CDFs of errors in Figure 26. The computation time required by the three different fusion systems mentioned above is shown in Figure 27b. Based on the results in Figure 25 and Figure 27b, we can find that although the proposed fusion system consumes a little more computation time compared with the EKF and REKF, its related localization accuracy is much higher than the other two.

Conclusions
In this paper, we propose a new Wi-Fi/MARG indoor localization system by fusing the data from the Wi-Fi module and MARG sensors, and then integrating the indoor map information to improve the localization accuracy. For the PDR, we employ the complementary filter auxiliary EKF to estimate the heading and the velocity of the pedestrian by using the BP neural network, while for the Wi-Fi localization, we rely on the improved affinity propagation clustering approach to improve the localization accuracy of the WKNN. The proposed integration system can avoid the accumulative error of the PDR, as well as reducing the large error probability of the Wi-Fi localization which is caused by the disturbance of the Wi-Fi signal. Furthermore, we rely on our developed software, including the Android based mobile client, Web page for remote control, and location server, to verity the effectiveness of the proposed system in the real indoor environment. In the future, we will focus on the design of an effective and efficient radio map construction approach to reduce the time and labor cost involved in the site survey of the large-scale indoor environment.