Improved Particle Filter in Machine Learning-Based BLE Fingerprinting Method to Reduce Indoor Location Estimation Errors

Abstract: Indoor location estimation methods based on position fingerprints are widely used by smartphone applications. These methods commonly use the RSSI (Received Signal Strength Indication) of radio signals as the position fingerprint. This paper proposes a particle filter design that reduces the estimation error of a machine learning-based indoor BLE location fingerprinting method. Unlike the general particle filter, whose weight function considers distance, the proposed system uses improved likelihood functions that consider coordinates. These functions are built from the mean and variance of the RSSI values at fingerprint points, and the particle filter is combined with the k-NN (k-Nearest Neighbor) algorithm to reduce the indoor positioning error. The initial position is estimated by the machine learning-based position fingerprinting method. In experiments comparing the proposed method with k-NN fingerprinting followed by general particle filter processing, and with fingerprinting based on k-NN or SVM (Support Vector Machine) alone, the proposed method showed a smaller minimum error and a better average error than the conventional methods.


Introduction
GPS (Global Positioning System) location estimation is widely used outdoors, but indoors, signal attenuation and refraction make GPS errors considerably larger [1]. Smartphones, which have a high penetration rate, are therefore often considered for indoor position estimation. Wireless signals such as Wi-Fi (Wireless Fidelity) [2], Bluetooth, and ZigBee [3] are frequently used, all of which use a transmitting terminal to measure the RSSI (Received Signal Strength Indication) of the signal. Since Bluetooth 4.0 introduced low-power technology, location estimation methods based on BLE (Bluetooth Low Energy) [4] have become more advantageous. Related studies have addressed BLE channel separation [5] and the combination of BLE with the dead reckoning method [6].
Indoor location estimation methods can be divided into two main categories: distance-based and position fingerprint-based. The distance-based estimation method relies on triangulation [7], which calculates distances from the observed RSSI values of multiple APs (Access Points). Its accuracy is insufficient because indoor walls and obstructions cause propagation loss in the RSSI [8].
The position fingerprinting method [9,10] estimates position from the distribution pattern of RSSI in the indoor environment. RSSI measurements are taken in advance at several locations and stored as position fingerprints (a collection of RSSI values for each location). Localization is performed by comparing the similarity of the position fingerprint RSSI with the actually measured RSSI. Several fingerprint-based methods have been proposed for indoor location estimation, including fingerprinting using CNN (Convolutional Neural Network) [11][12][13], k-NN (k-Nearest Neighbor) [14][15][16], and SVM (Support Vector Machine) [17].
In the CNN-based method, the RSSI values are transformed into an image that becomes a location fingerprint. By learning the pixel values of the image, the location can be estimated. The CNN-based method provides high accuracy in simulations but requires a large amount of training data, which makes it costly in practical experiments. In the method using k-NN, the k fingerprints nearest to the observation point are selected based on the measured RSSI values, and the location is estimated by calculating their average. In the k-NN-based method, the estimation accuracy is greatly affected by the distribution of RSSI noise. In the method using SVM, the RSSI values of the reference points are learned, and position estimation is transformed into a classification problem by a kernel function. The choices of the kernel function and its parameters greatly affect the accuracy of the estimation.
On the other hand, particle filters [18], which can handle nonlinear cases, have also been widely applied to position estimation for object tracking [19], post-processing, and other purposes using statistical models. The computational complexity of traditional particle filter methods is high when the number of particles is large, and methods that reduce this complexity have therefore been considered [20][21][22]. Related studies have shown that general particle filtering can filter the positions predicted by machine learning algorithms and improve their accuracy [22]. In [22], the weight function used in the particle filter considers distance. However, there has been little discussion of which weight functions are suitable for a particle filter used with the fingerprinting method.
This study proposes a fingerprint positioning method based on k-NN and improved particle filters, whose weight functions consider the coordinates. The proposed system first estimates the position using k-NN, and then corrects the estimated position using a particle filter with improved likelihood functions to reduce the error of k-NN caused by noise. In the proposed particle filter, improved likelihood functions are designed using the mean and variance of the RSSI values of fingerprints around the particles, and the results are comparatively analyzed. The proposed system reduces the error of k-NN and reduces the computational complexity of the method using a particle filter. Experiments demonstrate the effectiveness of the proposed system by comparing its estimation results with those of the conventional methods.

Position Fingerprinting Method
Indoors, radio signals are reflected and affected by many factors, such as obstacles, temperature, and human movement. As a result, distinct RSSI values arise at different locations.
RSSI values are measured and collected at as many indoor points as possible, and when estimating a location, the observed RSSI values are compared with the values stored for those points. Because this resembles the relationship between a person and a fingerprint, the approach is known as the position fingerprinting method [9,10]. The position fingerprinting method is divided into two parts: the "offline phase" and the "online phase". Figure 1 illustrates the procedure of the position fingerprinting method. In the offline phase, the experimental environment is first determined, classified, and displayed on a map. Multiple APs are set up in the environment in advance, and the RSSI values at each reference point within the entire map are measured sequentially.
In the online phase, the RSSI values collected by the terminal from the APs at the location point are compared with the fingerprint points stored in the location fingerprint database. The fingerprint point with the most similar RSSI characteristics is selected, and the coordinates of the location point are estimated by a location estimation algorithm based on the RSSI distribution rules.
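The offline phase can be sketched as follows. This is a minimal illustration, assuming that each fingerprint stores the per-beacon mean and variance of the RSSI samples recorded at a reference point; the function name `build_fingerprints` and the dictionary layout are hypothetical, not taken from the paper.

```python
from statistics import mean, pvariance

def build_fingerprints(samples):
    """Summarize raw RSSI samples into per-location fingerprints.

    samples: {(x, y): {beacon_id: [rssi, ...]}}   (hypothetical layout)
    returns: {(x, y): {beacon_id: (mean, variance)}}
    """
    return {
        pos: {b: (mean(values), pvariance(values)) for b, values in per_beacon.items()}
        for pos, per_beacon in samples.items()
    }

# Toy RSSI samples in dBm (made-up values for illustration only)
raw = {
    (0.0, 0.0): {0: [-60, -62], 1: [-71, -69]},
    (1.0, 0.0): {0: [-64, -66], 1: [-68, -68]},
}
database = build_fingerprints(raw)
```

Storing the variance alongside the mean is a design choice that lets later stages evaluate how plausible an observed RSSI value is at each reference point, rather than only how close it is to the mean.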

CNN-Based Location Estimation Method
CNN (Convolutional Neural Network) simulates the human brain's cognitive processing of external information and can extract complex structural features from large amounts of data to establish processing methods [23]. In [24], it was proposed to convert RSSI values into a single vector or matrix, to be processed as the pixel values of an image, and to use the features of RSSI values in indoor environments to learn the nonlinear relationship between RSSI and location coordinates.
Although the CNN-based approach can achieve high accuracy in simulation, the number of fingerprint points required reaches thousands to tens of thousands when the environment is large, which makes it very costly.

k-NN-Based Fingerprinting Method
k-NN (k-Nearest Neighbors) is a classification algorithm commonly used in machine learning. k-NN-based fingerprinting [14] is often applied to the multiclass classification problem corresponding to location estimation. It uses the fingerprint points around a location point, selected by calculating the Euclidean distance between the location point and the fingerprint points. Figure 2 illustrates the concept of the k-NN-based fingerprinting method.
In the fingerprinting method using k-NN, the estimation accuracy may be significantly affected by the distribution of fingerprint points and the noise of observed RSSI.
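The k-NN estimate can be sketched as below. This is a minimal sketch, assuming the fingerprint database is a list of (position, RSSI vector) pairs and that the position is the unweighted average of the k nearest fingerprints in RSSI space; the function name and data layout are hypothetical.

```python
import math

def knn_estimate(observed, fingerprints, k=4):
    """Return the mean (x, y) of the k fingerprints whose RSSI vectors are
    closest, in Euclidean distance, to the observed RSSI vector."""
    # fingerprints: list of ((x, y), [rssi per beacon]) pairs (hypothetical layout)
    ranked = sorted(fingerprints, key=lambda fp: math.dist(observed, fp[1]))
    nearest = ranked[:k]
    x = sum(pos[0] for pos, _ in nearest) / len(nearest)
    y = sum(pos[1] for pos, _ in nearest) / len(nearest)
    return x, y

# Toy database with four beacons, RSSI in dBm (made-up values)
database = [
    ((0.0, 0.0), [-60, -70, -80, -75]),
    ((1.0, 0.0), [-62, -68, -79, -74]),
    ((0.0, 1.0), [-61, -71, -77, -73]),
    ((1.0, 1.0), [-63, -69, -76, -72]),
    ((5.0, 5.0), [-85, -50, -55, -60]),
]
estimate = knn_estimate([-61, -70, -78, -74], database, k=4)
```

Because the estimate is an average of stored coordinates, a single noisy RSSI reading can pull a distant fingerprint into the k nearest set, which is the sensitivity to noise noted above.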

SVM-Based Fingerprinting Method
SVM (Support Vector Machine) [17] is an algorithm that handles classification and regression and performs well on high-dimensional, nonlinear pattern recognition problems [25]. Data classification is achieved by introducing a kernel function K(x, y) into the SVM to construct a separating hyperplane that maximizes the geometric margin in the high-dimensional space. The SVM is trained in the offline phase using data from the fingerprint database to determine the nonlinear relationship between RSSI values and location coordinates, and the localization calculation is then performed in the online phase.
A drawback is that the accuracy depends on the kernel function and its parameters, and the computational complexity becomes enormous as the training data increase.
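As an illustration of the kernel function K(x, y), the two kernels evaluated in the experiments of this paper (polynomial and RBF) can be sketched in plain Python. The polynomial form follows Equation (16); the RBF parameterization with a single parameter `gamma` is an assumption, since other equivalent parameterizations exist, and the default values are placeholders.

```python
import math

def dot(x, y):
    """Dot product <x, y> of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, p=1.0, q=2):
    """K(x, y) = (<x, y> + p)^q with p >= 0 and q a positive integer."""
    return (dot(x, y) + p) ** q

def rbf_kernel(x, y, gamma=0.1):
    """K(x, y) = exp(-gamma * ||x - y||^2); gamma is an assumed parameter name."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

The RBF kernel depends only on the distance between two RSSI vectors, so its value is 1 for identical vectors and decays toward 0 as they diverge, while the polynomial kernel grows with the dot product.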

State-Space Model
By building a state-space model as in Figure 3, indoor position estimation can be treated as a state estimation problem. The position of the measurement point at time t is considered to depend on the position and state at time t − 1. The discrete-time state-space model [26] can be used to formulate the position estimation problem.
The state equation (Equation (1)) represents the relationship between the states of the target at different moments. The observation equation (Equation (2)) represents the relationship between the observed values and the state of the target.
$\omega_t$ is the process noise, representing the uncertainty of the movement. $v_t$ is the observation noise, representing the error of the observation. $\omega_t$ and $v_t$ are independent of each other.
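A generic discrete-time state-space form consistent with this description can be written as below. The state symbol $x_t$, the observation symbol $y_t$, and the function names $f$ and $h$ are assumptions chosen for illustration; only $\omega_t$ and $v_t$ come from the text, and the exact form used in the paper may differ (for example, with additive noise).

```latex
\begin{align}
x_t &= f(x_{t-1}, \omega_t) && \text{(state transition with process noise } \omega_t\text{)} \\
y_t &= h(x_t, v_t)          && \text{(observation with observation noise } v_t\text{)}
\end{align}
```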

Bayesian Filter
Unlike fingerprinting methods using CNN, k-NN, and SVM, Bayesian filters [27] treat the state estimation problem in the state-space model as probabilistic inference and transform the location estimation of a positioning point into the calculation of a posterior probability density. Position estimation is performed by calculating the probability of positioning points and reference fingerprint points.
It is difficult to apply the Bayesian filter directly. Depending on the case, the Bayesian filter is realized as the Kalman filter (KF) [28] or the particle filter (PF) [29]. The Kalman filter applies to linear, Gaussian state spaces. In indoor positioning, however, the linear and Gaussian assumptions are difficult to satisfy, so particle filters, which handle nonlinear and non-Gaussian state spaces, are often used [30].

Particle Filter
The particle filter combines Bayesian estimation with Monte Carlo approximation [31] and uses N particle samples to represent the prediction and state distributions. The state distribution is simulated by the particle samples; the particle weights and position distribution at time t are adjusted using samples rather than integration operations, and the state distribution is updated according to the adjusted particles. Particle filtering consists of four parts: initialization, sequential importance sampling, weight update, and resampling. The two most important parts are sequential importance sampling [29] and resampling [32,33].

Configuration of the Proposed System
The procedure of the system proposed in this study is shown in Figure 4.

1. Measure the RSSI values and create fingerprint points using the position fingerprinting method.

2. Determine the initial state using the k-NN-based position estimation method based on the RSSI values of the location point.

3. Correct the coordinates estimated by k-NN using a particle filter, in the following steps.

(1) Initial state. At t = 0, determine the initial estimated location using the fingerprinting method with k-NN and generate N particles near the initial location, randomly scattered with a uniform distribution. Let $\omega^{(i)}_0$ be the initial weight of particle $p_i$.

(2) Movement of particles. Move the N particles. Based on an average stride length, each particle moves a distance of d = 0.65 m ± 0.1 m, where the ±0.1 m term is added noise, in a completely random direction. Particles whose coordinates fall outside the walls are deleted by comparing the particle coordinates after moving with the coordinates of the experimental environment. With the movement angle θ, the particle movement can be expressed as $xp^{(i)}_t = xp^{(i)}_{t-1} + d\cos\theta$, $yp^{(i)}_t = yp^{(i)}_{t-1} + d\sin\theta$, where θ ∈ (0, 2π).

(3) Weighting by the likelihood function. In this study, the designed likelihood function is $f_p(xp^{(i)}_t, yp^{(i)}_t)$, where the new weights are obtained from the distance between the particle and the surrounding fingerprint points. The weight of each particle is defined as the magnitude of its likelihood, $\omega^{(i)}_t = f_p(xp^{(i)}_t, yp^{(i)}_t)$.

(4) Normalization. For each particle, normalize the weight as $\omega^{(i)}_t \leftarrow \omega^{(i)}_t / \sum_{j} \omega^{(j)}_t$.

(5) Resampling. The weight $\omega^{(i)}_t$ of the resampled particles is set to 1/N. The state can then be predicted and the result of the position estimation output.
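One iteration of the particle filter steps above can be sketched as follows. This is a minimal sketch under several assumptions that the text does not fix: the stride noise is drawn uniformly from ±0.1 m, resampling is multinomial, and the position estimate is the weighted mean of the particles. The `likelihood` and `inside_room` callables are hypothetical hooks standing in for the likelihood function and the wall check.

```python
import math
import random

def particle_filter_step(particles, likelihood, inside_room,
                         stride=0.65, stride_noise=0.1):
    """Move, prune, weight, normalize, and resample a list of (x, y) particles.

    likelihood: callable (x, y) -> non-negative float (hypothetical hook)
    inside_room: callable (x, y) -> bool (hypothetical hook)
    Returns (resampled_particles, estimated_position).
    """
    n = len(particles)
    moved = []
    for x, y in particles:
        d = stride + random.uniform(-stride_noise, stride_noise)  # assumed uniform noise
        theta = random.uniform(0.0, 2.0 * math.pi)                # random direction
        nx, ny = x + d * math.cos(theta), y + d * math.sin(theta)
        if inside_room(nx, ny):        # delete particles that left the room
            moved.append((nx, ny))
    if not moved:                      # assumption: keep the old set if all were deleted
        moved = list(particles)

    weights = [likelihood(x, y) for x, y in moved]
    total = sum(weights)
    if total <= 0.0:                   # assumption: fall back to uniform weights
        weights = [1.0 / len(moved)] * len(moved)
    else:
        weights = [w / total for w in weights]   # normalization

    # Assumed estimator: weighted mean of the surviving particles
    estimate = (sum(w * x for w, (x, _) in zip(weights, moved)),
                sum(w * y for w, (_, y) in zip(weights, moved)))

    # Multinomial resampling back to n particles; each then carries weight 1/n
    resampled = random.choices(moved, weights=weights, k=n)
    return resampled, estimate
```

Resampling back to N particles compensates for the particles deleted at the walls, so the particle count stays constant across iterations.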

Design of Likelihood Functions
In this study, two likelihood functions are designed for calculating the likelihood of particles, in order to investigate the effect of weight functions that consider position coordinates. The likelihood functions calculate the particle weights using a distance-based weighting function together with a mixture normal distribution.

Generating a Mixed Normal Distribution
The RSSI distribution approximates a normal distribution, which allows the Gaussian Mixture Model [34] to be used to generate a mixture normal distribution. The mixture normal distribution is characterized by $\omega^{id}_{mix}$, $\mu^{id}_{mix}$, and $\sigma^{id}_{mix}$, which are the weight, mean, and variance of the generated mixture normal distribution, respectively. Furthermore, id indicates the ID of the BLE beacon.
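The density of a mixture normal distribution can be evaluated as sketched below. This is a generic Gaussian mixture density, assuming each component is given as a (weight, mean, variance) triple with weights summing to 1; how the paper indexes the components per beacon is not specified here, so the list layout and the example values are hypothetical.

```python
import math

def normal_pdf(r, mean, var):
    """Density of the normal distribution N(mean, var) at r; var is the variance."""
    return math.exp(-(r - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(r, components):
    """Density of a mixture of normals at r.

    components: list of (weight, mean, variance) triples (hypothetical layout).
    """
    return sum(w * normal_pdf(r, m, v) for w, m, v in components)

# Two-component example for one beacon, RSSI in dBm (made-up values)
beacon_mixture = [(0.5, -70.0, 4.0), (0.5, -60.0, 4.0)]
density = mixture_pdf(-65.0, beacon_mixture)
```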

Weighting Function by Distance
When locating at a given position, the distribution of fingerprint points indicates that the four fingerprint points around the particle can be used. Let $c_i(x, y)$ be the four surrounding fingerprint points and $d_i(x, y)$ be the distance from the particle to each fingerprint point. The weight value of the particle varies with the distance to the fingerprints.
The distance from the particle to each of the four surrounding fingerprint points is $d_i$ (i = 1, 2, 3, 4), and the following function is used to obtain the distance weighting function $\omega_i(x, y)$. The function is adapted from the classical sigmoid distance weighting function by adding coefficients and a logarithmic function, so that the curve falls more gently and the weights are close to 1 at a distance of zero.
The determination of the coefficient a is shown in Figure 5. Here, the distance weighting function $\omega_i(x, y)$ is expressed as follows:

Likelihood Function
Let id be the BLE ID, $R^{id}_{(x,y)}$ be the RSSI value measured from BLE beacon id at point (x, y), $\mu^{id}_{(x,y)}$ be its mean, and $\sigma^{id}_{(x,y)}$ be its variance; the distribution of $R^{id}_{(x,y)}$ is $N(\mu^{id}_{(x,y)}, \sigma^{id}_{(x,y)})$. Let $R^{id}_t$ be the RSSI value measured from the BLE beacon at time t. The following equation is used to determine the upper probability of each $R^{id}_t$.
Let D be the set of id's ordered by decreasing value of $F^{id}_{mix}(R)$. Furthermore, let $(xp^n_t, yp^n_t) = p(n)$. The distribution of $R^{id}_{(x,y)}$ is used to calculate the likelihood of the particles in p(n) when the particle is within a radius r of the BLE beacon. Thus, the likelihood function $f_p(xp^n_t, yp^n_t)$ of the particle corresponding to the coordinate $(xp^n_t, yp^n_t)$ is expressed as follows:

Likelihood Function A
The mean $\mu^{id}_{(x,y)}$ and variance $\sigma^{id}_{(x,y)}$ are used to calculate the likelihood at the fingerprint points around the particle, and the distance weighting function is then applied to these estimates to calculate the likelihood of the particle, resulting in a location estimate. Figure 6 shows the flow of calculating likelihood function A.

Likelihood Function B
Likelihood function B is more accurate, as it uses more reference points around the particle. An interpolated position fingerprint is generated with a distance-based weighting function. Using the mean $\mu^{id}_{(x,y)}$ and variance $\sigma^{id}_{(x,y)}$ of the interpolated data, the likelihood under the mixture normal distribution of the location points can be calculated. Figure 7 shows the flow of calculating likelihood function B. The following equation can be used to generate an interpolated fingerprint with a distance weighting function.
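The difference between the two likelihood designs can be sketched as below. This is an illustrative sketch only: `distance_weight` is a placeholder and is not the sigmoid-based weighting function of the paper, the per-beacon densities are multiplied under an assumed independence between beacons, and the weighted averages are one plausible way to combine the four surrounding fingerprint points. All function names and the `neighbors` layout are hypothetical.

```python
import math

def normal_pdf(r, mean, var):
    """Density of N(mean, var) at r; var is the variance."""
    return math.exp(-(r - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def distance_weight(d):
    """Placeholder weight (an assumption, not the paper's function):
    equals 1 at d = 0 and decreases with distance."""
    return 1.0 / (1.0 + d)

def rssi_likelihood(observed, means, variances):
    """Product of per-beacon normal densities (assumes independent beacons)."""
    return math.prod(normal_pdf(r, m, v) for r, m, v in zip(observed, means, variances))

def likelihood_a(particle, observed, neighbors):
    """Style A: evaluate the likelihood at each surrounding fingerprint point,
    then combine the results with distance weights.

    neighbors: list of ((x, y), means, variances), one entry per fingerprint point.
    """
    weights = [distance_weight(math.dist(particle, pos)) for pos, _, _ in neighbors]
    liks = [rssi_likelihood(observed, m, v) for _, m, v in neighbors]
    return sum(w * l for w, l in zip(weights, liks)) / sum(weights)

def likelihood_b(particle, observed, neighbors):
    """Style B: interpolate mean and variance at the particle with distance
    weights, then evaluate one likelihood on the interpolated fingerprint."""
    weights = [distance_weight(math.dist(particle, pos)) for pos, _, _ in neighbors]
    total = sum(weights)
    n_beacons = len(observed)
    means = [sum(w * nb[1][b] for w, nb in zip(weights, neighbors)) / total
             for b in range(n_beacons)]
    variances = [sum(w * nb[2][b] for w, nb in zip(weights, neighbors)) / total
                 for b in range(n_beacons)]
    return rssi_likelihood(observed, means, variances)
```

In this sketch, style A averages likelihoods while style B averages fingerprint statistics first; the two coincide when all surrounding fingerprints are identical and diverge as the fingerprints differ.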

Experiment
Indoor location estimation was performed using the proposed system based on RSSI values actually measured from each BLE beacon. The estimation accuracy of the proposed system is evaluated by comparing it with the existing methods using k-NN and SVM. The experiment was conducted in a university research room (about 7.8 m × 7.9 m × 2.8 m). Obstacles included desks, lockers, etc. No one other than the experimenter was present during the measurement. Four BLE beacons, $BLE_{id}$ (id = 0, 1, 2, 3), were placed at different locations on the ceiling. The height of the measurement terminal was fixed at 0.65 m when measuring RSSI values. The experiment environment is shown in Figure 8a. Specifications of the experiment devices are given in Table 1. RSSI values were measured evenly across 73 locations for 3 min each. Of these, 51 locations were used as training and reference points, and the remaining 22 locations were used as locating points. The distribution of reference and test fingerprints is shown in Figure 8b.

Location Estimation by SVM
In SVM-based location estimation methods (described in Section 2.1.3), accuracy is affected by the kernel function [37] and its parameters. Different kernel functions yield different SVM models.
There are four general types of kernel functions: linear, polynomial, radial basis function (RBF), and sigmoid. The polynomial kernel function and RBF are applied to the RSSI value distribution for nonlinear and non-Gaussian states [38]. Their equations are expressed as follows, where the angle brackets represent the dot product.
$K_{polynomial}(x, y) = (\langle x, y \rangle + p)^q, \quad p \geq 0,\ q \in \mathbb{N}^+$ (16)
In this study, grid search is used to determine the optimal parameters for the two kernel functions. Table 2 lists the location estimation errors of the two kernel functions; it can be seen that the radial basis kernel function is more accurate.
The proposed method first determines the initial location based on k-NN (Section 2.1.2), then uses a particle filter to improve the estimation accuracy of the k-NN-based method. k-NN was first evaluated on its own; Figure 9 shows how the value of k influences the accuracy. This study sets k = 4, which offers the best estimation accuracy.

Location Estimation by Proposed Method
The proposed system first estimates the position using a k-NN-based position fingerprinting method and then corrects the estimated position using a particle filter. Figure 10 shows the particle distribution and the variation in the coordinate positions estimated using k-NN for each minute.
Proposed A and proposed B use likelihood functions A and B, respectively. Figure 11 shows the error distribution in the room. Errors are higher near the walls and obstacles because the RSSI values are affected there.

Results and Error Comparison
Based on the above results, the specific errors of each method (Table 3) and the comparison of errors (Figure 12) are summarized. The proposed methods achieve higher average accuracy than k-NN and SVM. Furthermore, proposed B has a higher average accuracy than the general particle filtering process [39]. However, proposed A is more stable, with a smaller standard deviation. Proposed B has several more large errors near the walls than proposed A, but elsewhere its errors are relatively small. This makes the maximum error and variance of proposed B larger than those of proposed A, although the average error of proposed B is the smallest. The choice of method can therefore be matched to the needs of the location estimation environment. For example, proposed B can be used in a general environment, and proposed A in cases of signal instability.
To additionally validate the usability in larger environments, we used a BLE RSSI dataset [40]. This dataset contains RSSI values from multiple points, and we selected a number of consecutive points to simulate continuous RSSI measurements. The results of the different methods are shown in Table 4. It is worth noting that, because of the distribution pattern of fingerprints, k-NN sometimes yields points with zero error. Nevertheless, the proposed method improves the average error and stability.
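The error statistics compared in this section can be computed as sketched below. This is a minimal sketch, assuming the error of each test point is the Euclidean distance between the estimated and true coordinates and that the population standard deviation is used; the paper does not state which standard deviation it reports, and the function name is hypothetical.

```python
import math
from statistics import mean, pstdev

def error_summary(estimates, truths):
    """Summarize Euclidean localization errors (same unit as the coordinates)."""
    errors = [math.dist(e, t) for e, t in zip(estimates, truths)]
    return {
        "mean": mean(errors),
        "max": max(errors),
        "min": min(errors),
        "std": pstdev(errors),   # assumption: population standard deviation
    }

# Toy example with two test points (made-up coordinates in metres)
summary = error_summary([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
```

Reporting the maximum and standard deviation alongside the mean makes the trade-off visible between a method with the lowest average error and a method with fewer large outliers.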

Conclusions
This paper proposed a location estimation system that uses an improved particle filter to improve the indoor location accuracy of the BLE position fingerprinting method. Improved likelihood functions that consider position coordinates in the particle filter were designed based on the position fingerprinting method. In particular, by using likelihood function B, the method can extend position fingerprints with interpolated data and improve estimation accuracy. However, the location estimation error becomes large when the location is close to a wall or obstacle, probably because of errors in the extended position fingerprint.
The proposed method was compared experimentally with position fingerprinting methods based on k-NN or SVM alone, and it was confirmed that the proposed method improves estimation accuracy and stability.
In future studies, the accuracy of the proposed method should be demonstrated in different environments, e.g., in rooms of different shapes or sizes, in multiple rooms, and for multiple people. In addition, to further improve the estimation accuracy, a method for handling outliers in the RSSI values of reference fingerprints will be designed.

Figure 4. Configuration of the proposed system.

Position Correction with Particle Filter
Define a set of N particles $P = \{p_1, p_2, \cdots, p_N\}$. The coordinates of particle $p_i$ at time t are $(xp^{(i)}_t, yp^{(i)}_t)$ and its weight is $\omega^{(i)}_t$. The process of the k-NN-based particle filtering algorithm is as follows.

Figure 8. (a) Layout of furniture and location of beacons. (b) Layout of fingerprint points. The grids on the floor represent floor tiles with a side length of approximately 46.5 cm.

Figure 9. Errors vary with the value of k.

Figure 10. Distribution of particles and estimation of k-NN at different moments t (s).

Figure 11. Error distribution in the room. The gray areas represent lockers and the yellow areas represent desks. (a) Proposed A. (b) Proposed B.

Table 1. Specifications of experiment devices.

Table 2. Estimation errors of SVM-based method.

Table 3. Estimation errors by proposed and existing methods.

Table 4. Estimation errors on dataset.