An Information-Based Approach to Precision Analysis of Indoor WLAN Localization Using Location Fingerprint

1 Chongqing Key Lab of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; zhoumu@cqupt.edu.cn (M.Z.); tianzs@cqupt.edu.cn (Z.T.); zhangqiao6120@gmail.com (Q.Z.); hewei@cqupt.edu.cn (W.H.)
2 China Internet Research Lab, Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; wuhaibo@cstnet.cn
* Correspondence: qiufeng6906@gmail.com


Introduction
In the recent decade, there has been a growing interest in indoor localization techniques that are based on existing indoor wireless communication infrastructures and devices. Owing to its ease of implementation and cost efficiency [1,2], the indoor wireless local area network (WLAN) fingerprint-based localization approach is preferred over the conventional trilateration localization approaches, which are easily compromised by propagation path loss, multi-path fading and environmental shadowing. WLAN fingerprint-based localization generally involves two phases [3][4][5][6], namely the off-line phase and the on-line phase. In the off-line phase, the fingerprint database is built from the received signal strength (RSS) measurements, which are associated with the calibrated reference points (RPs) in the target environment. Then, in the on-line phase, the target locations are estimated by matching the newly-recorded RSS measurements against the pre-built fingerprint database [7].
Multi-path fading, environmental shadowing and channel interference result in significant temporal and spatial variations of RSS distributions and eventually lead to low localization precision [8]. To the best of our knowledge, very few works focus on the theoretical analysis of the WLAN fingerprint-based localization precision, although the fundamental limit of localization precision can help greatly in optimizing access point (AP) placement for highly-precise localization.
In this paper, we propose an analytical model to characterize the variation of RSSs under different signal distributions in order to derive the closed-form fundamental limit of the WLAN fingerprint-based localization precision based on the Fisher information matrix (FIM) [9][10][11][12][13]. Furthermore, the impact of the number and geographical locations of APs and of the environment noise on localization precision is also discussed.
The rest of this paper is structured as follows. The related works are given in Section 2. In Section 3, we introduce the system in detail, including the calculation of the fundamental limit of localization precision and the optimization of AP placement. Simulation results are provided in Section 4. In Section 5, some interesting discussions are presented. A case study is presented in Section 6. Finally, we conclude the paper in Section 7.

Related Work
In the past decade, the WLAN fingerprint-based localization precision has been studied extensively. The authors in [14] proposed that the number and locations of APs, the physical layout and the mean of RSS measurements at each RP have a significant impact on localization precision. To optimize AP placement, the existing approaches are generally based on the signal coverage, service connectivity, network throughput and transmission rate [15]. The authors in [16] devised a weighted kernel function by which the impact of different APs is differentiated. Specifically, larger weights are assigned to the APs that contribute more to the localization. The authors in [17] developed an approach to find the minimum number of APs needed to provide full coverage, as well as to conduct the localization with the errors falling into a given scope. Chen in [18] proposed an AP placement optimization approach that distinguishes every two WLAN RSS fingerprints to guarantee satisfactory signal coverage. The localized local discriminant embedding and AP selection (LLDE-APS) approach in [19] not only optimizes the AP placement, but also reduces the power consumption of the mobile terminal. Its main advantage is that it extracts the most discriminative RSS features for target localization. In [20], the authors proposed the max Euclidean distance (ED)-based AP placement optimization approach, aiming to achieve the maximal sum of the ED between every two location fingerprints in a WLAN environment. However, the existing works in the literature paid very little attention to the theoretical and analytical relations between the AP placement and localization precision. To solve this problem, we propose a novel AP placement optimization approach based on the relationship between the geographical locations of APs and localization errors, with the purpose of enhancing the localization precision, as well as guaranteeing the real-time capacity of localization.
In wireless sensor networks (WSNs), a variety of works focus on error analysis using the FIM. The authors in [21] presented the conditions under which the multipath delay can be used to improve the localization accuracy and introduced the FIM to investigate the highest achievable localization accuracy. The authors in [22] also studied the relationship between the multipath delay and time of arrival (TOA)-based localization accuracy based on the prior statistics of the errors. The authors in [23] relied on the Cramer-Rao lower bound (CRLB) to estimate the locations of sensors by using the unbiased Gaussian range estimation from the anchor nodes in the angle of arrival (AOA)-based localization. In this paper, we utilize the FIM to derive the fundamental limit of WLAN fingerprint-based localization precision under different signal distributions. This result can provide valuable insights into the improvement of fingerprint-based localization precision, as well as the overall design of the WLAN localization system. Furthermore, the simulated annealing (SA) algorithm [24,25] is used to search for the optimal AP locations, which correspond to the lowest fundamental limit of localization precision. Although the FIM has been widely considered in localization precision analysis, the existing literature mainly applied it to the situation in which the TOA, AOA or RSS following the Gaussian distribution is selected as the location metric. In contrast, in this paper, we aim to reveal the relations between the fundamental limit of localization precision and various signal distributions.
We clarify that the three main contributions of this paper are that: (i) the FIM is used to derive the fundamental limit of indoor WLAN localization precision; (ii) the theoretical analysis of the relationship between the localization errors and signal distributions is presented; and (iii) the SA algorithm is selected to search for the optimal AP locations, which correspond to the lowest fundamental limit of localization precision.

Entropy 2015, 17, 8031-8055

First, as discussed in [26], we assume that the RSSs in the non-line-of-sight (NLOS) environment follow the Rayleigh distribution, while the ones in the line-of-sight (LOS) environment follow the Gaussian distribution. We define the i-th user to be in the LOS environment for the m-th AP when no wall blocks the rectilinear propagation from the m-th AP to the receiver. On the contrary, if a wall blocks this path, the user is defined to be in the NLOS environment for the m-th AP. Secondly, we rely on the characteristics of the FIM to derive the fundamental limit of localization precision under the Gaussian signal distribution, Rayleigh signal distribution and mixed signal distribution, respectively. Thirdly, we select all of the candidate AP locations to construct the solution space. Then, we construct the objective function of the SA algorithm for AP placement optimization with the purpose of achieving the lowest fundamental limit of localization precision. Finally, we search for the optimal AP locations by the SA algorithm. The proposed system consists of two main modules: (i) the calculation of the fundamental limit of WLAN fingerprint-based localization precision; and (ii) the AP placement optimization by using the SA algorithm to achieve the lowest fundamental limit of localization precision. The flow chart of the system is shown in Figure 1.

Localization Precision vs. Signal Distributions
We select the COST231 model [27] to characterize the propagation property in an indoor WLAN environment. The COST231 model considers the large-scale path loss, as well as the signal penetration loss, such as the wall attenuation factor (WAF) and floor attenuation factor (FAF), as described in Equation (1).
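In Equation (1), $P$ and $P(d_0)$ stand for the RSSs recorded at the locations $d$ and $d_0$ meters from the AP, respectively. $d_0$ is the reference distance, and $\beta$ is the path loss exponent. $wP_w$ is the signal fading caused by $w$ walls. $\chi$ is the noise with the variance of $\sigma^2$. By assuming that $\hat{\theta}_i = (\hat{x}_i, \hat{y}_i)^T$ is the estimated location with respect to the i-th real location $\theta_i = (x_i, y_i)^T$, we can calculate the covariance matrix of $\hat{\theta}_i$ by:

$$\mathrm{Cov}(\hat{\theta}_i) = \begin{pmatrix} \sigma^2_{\hat{x}_i} & \sigma_{\hat{x}_i \hat{y}_i} \\ \sigma_{\hat{y}_i \hat{x}_i} & \sigma^2_{\hat{y}_i} \end{pmatrix} \qquad (2)$$

where $\sigma^2_{\hat{x}_i}$ and $\sigma^2_{\hat{y}_i}$ are the mean square errors (MSEs) of the estimated location in the X and Y coordinates, $\hat{x}_i$ and $\hat{y}_i$. $\sigma_{\hat{x}_i \hat{y}_i}$ and $\sigma_{\hat{y}_i \hat{x}_i}$ are the covariances between $\hat{x}_i$ and $\hat{y}_i$ and between $\hat{y}_i$ and $\hat{x}_i$, respectively.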
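The propagation model can be illustrated with a minimal sketch. It assumes that Equation (1) takes the usual log-distance form with a wall term, $P = P(d_0) - 10\beta \log_{10}(d/d_0) - wP_w + \chi$; this form, the function name `rss_dbm` and every default value below are assumptions made for illustration, not parameters taken from the paper.

```python
import math
import random


def rss_dbm(d, p_d0=-40.0, d0=1.0, beta=3.0, walls=0, wall_loss=3.0,
            sigma=0.0, rng=None):
    """RSS (dBm) at distance d under an ASSUMED log-distance model.

    p_d0      : RSS at the reference distance d0 (hypothetical value)
    beta      : path loss exponent
    walls     : number of walls w between the AP and the receiver
    wall_loss : attenuation per wall P_w in dB (hypothetical value)
    sigma     : standard deviation of the zero-mean Gaussian noise chi
    """
    if d <= 0 or d0 <= 0:
        raise ValueError("distances must be positive")
    rng = rng or random
    # chi is drawn only when a non-zero noise level is requested.
    noise = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
    return p_d0 - 10.0 * beta * math.log10(d / d0) - walls * wall_loss + noise


if __name__ == "__main__":
    # Noise-free RSS at 10 m, without walls and behind two walls.
    print(rss_dbm(10.0))
    print(rss_dbm(10.0, walls=2))
```

With the noise switched off, the sketch makes the two deterministic effects visible: each decade of distance costs $10\beta$ dB, and each wall subtracts a fixed $P_w$.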
Since the value $\theta_i$ is calculated from the estimate of $P$, the values $\theta_i$ and $P$ have the same variance $\sigma^2$ under the unbiased estimates of $P$ and $\theta_i$. Thus, the localization error increases as $\sigma$ increases. By setting $f_{\theta_i}(P)$ as the probability density function (PDF) of $P$ with respect to $\theta_i$, the expectation of the sharpness of $f_{\theta_i}(P)$ equals $1/\sigma^2$, as shown in Equation (3).
Then, we have: For the biased estimate of θ i , we easily obtain: we have var[U (θ

Analysis with Gaussian Signal Distribution
As the RSSs follow the Gaussian signal distribution, the joint PDF of RSSs is calculated by: where is the deviation of the RSSs collected by the receiver, represents the coordinate of the k-th AP and m is the AP number. We convert Equation (10) into: where . At this point, the geometric relationship between the i-th real location and the k-th AP is shown in Figure 2. Based on Equation (7), we have: where Based on this, we calculate that: Based on Equations (7) and (9), we can simplify Equations (13)-(16) into: Therefore, we can calculate $\{J(\theta_i)\}^{-1}$ by: where Based on Equation (6), we have: Finally, the fundamental limit of localization precision with respect to $\theta_i$ equals:
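To make the role of the FIM concrete, the following sketch evaluates the textbook Cramer-Rao bound for RSS-based localization under a log-distance path loss model with zero-mean Gaussian noise. It is an illustration under stated assumptions, not necessarily identical to the closed-form expression derived in this section: the function name `crlb_gaussian`, the default values of `beta` and `sigma`, and the singularity threshold are all hypothetical. Under these assumptions, the FIM is $J(\theta_i) = \left(\frac{10\beta}{\sigma \ln 10}\right)^2 \sum_k \frac{1}{d_{ik}^2} u_{ik} u_{ik}^T$ with $u_{ik} = (\cos\alpha_{ik}, \sin\alpha_{ik})^T$, and the bound on the position MSE is the trace of $J^{-1}$.

```python
import math


def crlb_gaussian(location, aps, beta=3.0, sigma=4.0):
    """Trace of the inverse FIM for one location (ASSUMED log-normal model).

    location : (x, y) of the real location
    aps      : iterable of (x, y) AP coordinates
    beta     : path loss exponent (hypothetical default)
    sigma    : noise standard deviation in dB (hypothetical default)
    Returns math.inf when the FIM is singular (location unidentifiable).
    """
    x, y = location
    scale = (10.0 * beta / (sigma * math.log(10.0))) ** 2
    jxx = jxy = jyy = 0.0
    for ax, ay in aps:
        dx, dy = x - ax, y - ay
        d2 = dx * dx + dy * dy
        if d2 == 0.0:
            raise ValueError("location coincides with an AP")
        # cos^2(alpha)/d^2 = dx^2/d^4, and similarly for the other entries.
        jxx += scale * dx * dx / (d2 * d2)
        jxy += scale * dx * dy / (d2 * d2)
        jyy += scale * dy * dy / (d2 * d2)
    det = jxx * jyy - jxy * jxy
    # Relative threshold guards against rounding in the near-singular case.
    if det <= 1e-12 * (jxx + jyy) ** 2:
        return math.inf
    # For a 2x2 matrix, trace(J^-1) = trace(J) / det(J).
    return (jxx + jyy) / det


if __name__ == "__main__":
    print(crlb_gaussian((0.0, 0.0), [(1.0, 0.0), (0.0, 1.0)]))
    print(crlb_gaussian((0.0, 0.0), [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]))
    print(crlb_gaussian((0.0, 0.0), [(1.0, 0.0), (2.0, 0.0)]))
```

In this sketch, adding a third AP can only lower the bound, and two APs seen under the same angle from the location give a singular FIM, so the bound is infinite.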

Analysis with Rayleigh Signal Distribution
As the RSSs follow the Rayleigh signal distribution, the joint PDF of RSSs is calculated by: where $\frac{4-\pi}{2}\sigma_2^2$ is the deviation of RSSs. We convert Equation (18) into: where . Similar to the previous discussion under the Gaussian signal distribution, we have: Based on Equations (7) and (9), we obtain: Finally, the fundamental limit of localization precision with respect to $\theta_i$ is: For simplicity, we define the average of the fundamental limit of localization precision for the target environment as:

$$V_{ave}(\theta) = \frac{1}{n} \sum_{i=1}^{n} V(\theta_i)$$

where $n$ is the number of RPs.

Impact of the AP Number
Based on Equations (20) and (28), the relationship between the fundamental limit of localization precision with m and m + 1 APs respectively is described as: c = ρ 1 and s for the Gaussian and Rayleigh signal distributions, respectively. Based on Equations (30) and (31), we observe that increasing the AP number reduces the fundamental limit of localization precision. Furthermore, the relationship $V_m(\theta_i) = V_{m+1}(\theta_i)$ holds when the AP locations are collinear (i.e., α i(m+1 …). Figures 3 and 4 show the cumulative distribution functions (CDFs) of errors with different AP numbers in the LOS and NLOS environments, respectively. As can be seen from these figures, the variation of the AP number has only a slight impact on the localization errors when the AP number is larger than three. Furthermore, compared to the Rayleigh signal distribution, the Gaussian signal distribution generally results in higher localization precision.

Impact of Noise Variance
Based on Equations (20) and (28), we can observe that the variation of $\sigma$ has a significant impact on the fundamental limit of localization precision. To illustrate this observation more clearly, we present the CDFs of errors with different variances of noise in the LOS and NLOS environments, respectively, as shown in Figures 5 and 6. From these figures, the increase of noise variance (i.e., $\sigma^2$) results in the decrease of localization precision, as expected.

Fundamental Limit with a Mixed Signal Distribution
In an actual indoor environment, the RSSs from different APs cannot be guaranteed to follow a single signal distribution. Therefore, without loss of generality, we assume that the RSSs from the first $m_1$ APs follow the Gaussian signal distribution, while the RSSs from the remaining $m_2$ APs follow the Rayleigh signal distribution. Based on this, we have: Thus, we have: Therefore, the fundamental limit of localization precision with respect to $\theta_i$ under the mixed signal distribution is calculated by:

AP Placement Optimization
As one of the most representative heuristic optimization algorithms, the SA algorithm applies the concept of the annealing process in metallurgy to conduct the optimization search. In the SA algorithm, an initial temperature is set before the annealing process. As the temperature drops, the SA algorithm iteratively searches for the optimal solution. Different from many existing optimization searching algorithms, like the hill climbing (HC) algorithm [30], the SA algorithm assigns an acceptance probability to each newly-obtained solution, rather than simply discarding a new solution that has a higher cost than the previous one. Based on this, the SA algorithm has a high probability of achieving the global optimum. Furthermore, the SA algorithm has been shown to consume much lower time overhead compared to the widely-used brute-force searching (BFS) [31] algorithms.
In our system, we rely on the SA algorithm to search for the optimal AP locations, which correspond to the smallest value of the objective function $f = V_{ave}(\theta)$, as shown in Figure 1. Specifically, we first define the parameters used in the SA algorithm, i.e., the starting temperature $T_0$, cooling factor $\alpha$, iteration number $N$ and ending temperature $T_s$. Second, we select all of the candidate AP locations to construct the solution space $W$. Third, we construct the initial solution $W_0 = W_{current}$, where $W_{current}$ is the solution in the current iteration, and then conduct the solution updating to obtain a new solution $W_{new}$ from $W_{current}$. Fourth, we search for the optimal solution in an iterative manner. The acceptance probability of each newly-obtained solution, $p$, is calculated by: where $l$ is a constant. $T$ is the temperature for the current iteration. $f(W_{new})$ and $f(W_{current})$ are the values of the objective function when the APs are located at $W_{new}$ and $W_{current}$, respectively. The temperature for the (z + 1)-th iteration equals $T_{z+1} = \alpha T_z$. To illustrate this process more clearly, the flow chart of the SA algorithm used for AP placement optimization is shown in Figure 7.
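The search procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the acceptance rule is assumed to be the standard Metropolis form $p = \exp\{-[f(W_{new}) - f(W_{current})]/(lT)\}$ for worse solutions, the neighbor move (swapping one selected candidate for an unused one) is a design choice made here, and the function name and all default parameter values are hypothetical.

```python
import math
import random


def simulated_annealing(candidates, m, objective, t0=100.0, alpha=0.95,
                        n_iter=50, t_end=1e-3, l=1.0, seed=0):
    """Pick m AP locations from candidates that minimise objective.

    t0, alpha, n_iter, t_end play the roles of T_0, the cooling factor,
    the iteration number N and the ending temperature T_s.
    The Metropolis acceptance rule is an ASSUMPTION of this sketch.
    """
    rng = random.Random(seed)
    current = rng.sample(candidates, m)          # initial solution W_0
    f_cur = objective(current)
    best, f_best = list(current), f_cur
    t = t0
    while t > t_end:
        for _ in range(n_iter):
            unused = [c for c in candidates if c not in current]
            if not unused:
                break
            # Neighbor move: replace one selected location by an unused one.
            new = list(current)
            new[rng.randrange(m)] = rng.choice(unused)
            f_new = objective(new)
            delta = f_new - f_cur
            # Always accept improvements; accept worse solutions with
            # probability exp(-delta / (l * t)).
            if delta <= 0 or rng.random() < math.exp(-delta / (l * t)):
                current, f_cur = new, f_new
                if f_cur < f_best:
                    best, f_best = list(current), f_cur
        t *= alpha                                # T_{z+1} = alpha * T_z
    return best, f_best


if __name__ == "__main__":
    # Toy objective: the cost of a placement is the sum of its labels.
    placement, cost = simulated_annealing(list(range(10)), 3, sum)
    print(sorted(placement), cost)
```

In the setting of this paper, `objective` would be a function that returns the average fundamental limit $V_{ave}(\theta)$ for a given set of AP locations; the toy objective in the usage example only exercises the search loop.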

Simulation Results
We conduct the simulations in two typical environments, the regular LOS and irregular NLOS environments. For simplicity, we select the k-nearest neighbor (k-NN) algorithm [2] to examine the error performance of the proposed approach. The k-NN algorithm computes the Euclidean distances in signal space between the newly-recorded RSSs and the pre-stored RSSs in the fingerprint database and then calculates the geometrical center of the k-nearest neighbors as the estimated location. The parameters, which are also used in [32], are shown in Table 1. Figure 9 shows the CDFs of errors achieved by the LLDE-APS [19], conventional max ED-based [20], symmetric and proposed fundamental limit-based AP placement optimization approaches. As the simplest approach, the symmetric AP placement optimization approach divides the target environment into a batch of subareas with approximately the same dimensions and places an AP at the geometrical center of each subarea. As can be seen from Figure 9, the proposed approach provides smaller localization errors than the conventional ones when the AP number is larger than two.
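The k-NN estimator used in these simulations can be sketched in a few lines. The data layout (a list of `((x, y), rss_vector)` pairs), the function name `knn_locate` and the sample values are assumptions made for illustration.

```python
import math


def knn_locate(rss, fingerprints, k=3):
    """Estimate a location as the centroid of the k nearest fingerprints.

    rss          : newly-recorded RSS vector, one value per AP
    fingerprints : list of ((x, y), rss_vector) pairs, one per RP
    k            : number of nearest neighbors in signal space
    """
    if k <= 0 or not fingerprints:
        raise ValueError("k must be positive and the database non-empty")
    # Rank the RPs by Euclidean distance in signal space.
    ranked = sorted(fingerprints, key=lambda fp: math.dist(rss, fp[1]))
    nearest = ranked[:k]
    # The geometrical center of the k nearest RPs is the estimate.
    x = sum(loc[0] for loc, _ in nearest) / len(nearest)
    y = sum(loc[1] for loc, _ in nearest) / len(nearest)
    return (x, y)


if __name__ == "__main__":
    database = [
        ((0.0, 0.0), [-40.0, -70.0]),
        ((2.0, 0.0), [-50.0, -60.0]),
        ((0.0, 2.0), [-45.0, -65.0]),
        ((10.0, 10.0), [-90.0, -30.0]),
    ]
    print(knn_locate([-44.0, -66.0], database, k=1))
    print(knn_locate([-44.0, -66.0], database, k=3))
```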

Irregular NLOS Environment
Figure 10 shows the irregular NLOS environment with dimensions of 36 m by 21 m. We uniformly calibrated 176 RPs (with •'s) with an interval of 1 m in the lobby. In this environment, there are in total six office rooms, two washrooms, three corridors, one lobby, one staircase and two elevators. The six areas where the candidate APs can be placed are denoted as Subareas 1-6. Figure 11 shows the CDFs of errors by using the previously-introduced four AP placement optimization approaches as the AP number increases from one to four. The results in Figures 9 and 11 indicate that, in both the regular LOS and irregular NLOS environments, the proposed approach can achieve high WLAN fingerprint-based localization precision.

Computation Overhead
We conduct all of the computation on a desktop with an Intel(R) Core(TM) i3-3220 processor, 4 GB RAM and the Windows 8 operating system. We compare the time overhead of the max ED-based AP placement optimization and the proposed AP optimization as the AP number increases from one to four. By using the SA algorithm, the time overhead involved in the max ED-based AP placement optimization is much higher than that of the proposed AP optimization, since the ratios of time overhead are much larger than one, as shown in Figure 12. Furthermore, the increase of the AP number enlarges the ratios of the time overhead. Based on this, the proposed approach is more efficient for the scenario with a large number of APs used for localization. Figure 13 compares the time overhead required by the BFS, SA and HC algorithms under different numbers of APs. From Figure 13, we observe that the time overhead of the HC algorithm is the lowest among these three algorithms, but it very easily falls into a local optimum. Different from the HC algorithm, the SA algorithm is based on a global searching scheme and, meanwhile, requires lower time overhead compared to the BFS algorithm. Therefore, the SA algorithm generally offers a better trade-off than the HC and BFS algorithms, especially under a large number of APs.

Discussion
The interesting questions about the difference in the placement of the optimal AP locations, as well as the localization errors under different signal distributions remain to be answered.
Figure 14 shows the results of the placement of the optimal AP locations under the Gaussian, Rayleigh and mixed signal distributions, respectively, in the NLOS environment. From these figures, we can find that the optimal AP locations, which correspond to the lowest fundamental limit of localization precision, are generally non-collinearly distributed. To illustrate this result clearly, we take the mixed signal distribution as an example. By connecting every two most adjacent optimal AP locations with a purple line, the purple lines should be collinear when the optimal AP locations are collinearly distributed. Figure 15 compares the CDFs of errors with respect to the Gaussian, Rayleigh and mixed signal distributions, respectively, in the NLOS environment. As can be seen from these figures, the Gaussian signal distribution generally brings the smallest localization errors, while the largest errors result from the Rayleigh signal distribution. Based on this, we can make a reasonable conjecture that a high localization precision is most likely to be provided when the APs with the LOS property to the receiver are used for localization.

Case Study
Based on the previous discussion, we find that the proposed approach can help much in designing a highly-precise localization system. In this section, we focus on a case study in a real indoor WLAN environment. Specifically, the positioning precision and the corresponding time overhead are investigated under different AP numbers, AP optimization approaches and signal distributions.
As shown in Figure 16, all of the experiments are conducted in a real indoor WLAN environment with dimensions of 57 m × 25 m on the same floor in the Yi Fu building at Chongqing University of Posts and Telecommunications (CQUPT). The 10 candidate AP locations are denoted as 1, 2, ..., 10. The Samsung S7568 smartphone, which has our developed WLAN RSS scanner installed, and the D-link DAP-2310 AP are selected as the receiver and transmitter, respectively, as shown in Figure 17. To construct the fingerprint database, we uniformly calibrate 73 RPs in two straight corridors and one office room, namely Areas 1, 2 and 3.

Positioning Errors under Different Signal Distributions
We compare the mean of positioning errors corresponding to the groups of optimal candidate AP locations under the Gaussian, Rayleigh and mixed signal distributions, respectively, in Table 2. From this table, we observe that the group of optimal candidate AP locations under the mixed signal distribution achieves the lowest positioning errors compared to the Gaussian and Rayleigh signal distributions. Furthermore, the mixed signal distribution exhibits the same error performance as the Rayleigh signal distribution, which is explained by the fact that the RSSs in the target environment can be fitted well by the Rayleigh signal distribution, especially when the AP number is larger than three.

Time Overhead by Using Different AP Optimization Approaches
To examine the efficiency of the proposed approach, Figure 20 illustrates the time overhead involved in the proposed fundamental limits, LLDE-APS and max ED approaches, respectively, as the AP number increases from three to six. From this figure, we can find that the time overhead required by the proposed approach is significantly lower than that of the LLDE-APS or max ED approach. The reason is that the LLDE-APS or max ED approach requires traversing all of the RPs in each round of the iterative process, whereas the proposed approach optimizes the AP locations by calculating the closed-form solution to the fundamental limit of localization precision only once.

Extension to a Multi-Floor Environment
We continue to conduct the experiments in a typical indoor multi-floor environment (including the fourth and fifth floors in the Yi Fu building at CQUPT), as shown in Figure 21. These two floors have the same plane structure depicted in Figure 16. The difference from the previously discussed single-floor environment can be summarized in three ways: (i) the propagation model in the multi-floor environment considers not only the signal fading by the walls, but also the one by the floors; (ii) the estimated location is a two-dimensional coordinate in the single-floor environment, while the one in the multi-floor environment is a three-dimensional coordinate; and (iii) the FIM of the estimated location in the single-floor environment is a symmetric two by two matrix, while the one in the multi-floor environment is a symmetric three by three matrix. Based on this, the previous results under the single-floor condition can be easily extended to the multi-floor scenario. In addition, to perform the testing, the nine candidate AP locations are denoted as 1, 2, ..., 9, and the target environment is divided into five subareas, namely Areas 1, 2, 3, 4 and 5.
To construct the fingerprint database, we uniformly calibrate 53 and 73 RPs on the fourth and fifth floors, respectively. To investigate the performance of the proposed approach in the indoor multi-floor environment, Figure 22 compares the CDFs of errors by using the proposed fundamental limits, existing LLDE-APS and max ED approaches. This figure shows that the proposed approach generally achieves the highest positioning accuracy compared to the existing AP optimization approaches in the indoor multi-floor environment. Table 3 compares the mean of positioning errors corresponding to the groups of optimal candidate AP locations under the Gaussian, Rayleigh and mixed signal distributions, respectively. In this table, we can find that by using the groups of optimal candidate AP locations, the mixed signal distribution always achieves a smaller mean of positioning errors than the Gaussian or Rayleigh signal distribution. Furthermore, the groups of optimal candidate AP locations under different AP numbers are illustrated in Table 4.

Conclusions
In this paper, we proposed a novel information-based approach to analyze the localization precision, as well as to optimize the AP placement for indoor WLAN localization using the location fingerprint. We derived the fundamental limit of WLAN fingerprint-based localization precision by using the FIM and then relied on the SA algorithm to search for the optimal AP locations, which correspond to the lowest fundamental limit of localization precision. Compared to the widely-used max ED-based and symmetric AP placement approaches, the proposed approach performs better in terms of localization errors and time overhead. Moreover, experimental results are presented in order to support our claims. For future work, the information-based precision analysis of WLAN fingerprint-based localization in a multi-floor environment forms an interesting topic.

Figure 1. Flow chart of the system. Steps shown: derive the fundamental limit of localization precision under the Rayleigh and Gaussian signal distributions, respectively; derive the fundamental limit of localization precision under the mixed signal distribution; construct the objective function of the SA algorithm for AP placement optimization; select all the candidate AP locations to construct the solution space.

Figure 2. Geometric relationship between the i-th real location and the k-th AP.

Figure 3. CDFs of errors with different AP numbers in the LOS environment. (a) Under the Gaussian signal distribution; (b) under the Rayleigh signal distribution.

Figure 4. CDFs of errors with different AP numbers in the non-LOS (NLOS) environment. (a) Under the Gaussian signal distribution; (b) under the Rayleigh signal distribution.

Figure 5. CDFs of errors with different variances of noise in the LOS environment. (a) Under the Gaussian signal distribution; (b) under the Rayleigh signal distribution.

Figure 6. CDFs of errors with different variances of noise in the NLOS environment. (a) Under the Gaussian signal distribution; (b) under the Rayleigh signal distribution.

Figure 7. Flow chart of the SA algorithm used for AP placement optimization.

Figure 8 shows the regular LOS environment with dimensions of 12 m by 12 m. The 144 RPs (with •'s) with an interval of 1 m are uniformly calibrated in this environment.

Figure 9. CDFs of errors in the regular LOS environment. (a) With one AP; (b) with two APs; (c) with three APs; (d) with four APs.

Figure 11. CDFs of errors in the irregular NLOS environment. (a) With one AP; (b) with two APs; (c) with three APs; (d) with four APs.

Figure 12. Ratios of time overhead by the SA algorithm.

Figure 13. Time overhead by the brute-force searching (BFS), SA and hill climbing (HC) algorithms.

Figure 14. Placement of the optimal AP locations. (a) With three APs; (b) with four APs; (c) with five APs; (d) with six APs; (e) with seven APs; (f) with eight APs; (g) with nine APs; (h) with 10 APs.

Figure 15. CDFs of errors under different signal distributions. (a) With three APs; (b) with four APs; (c) with five APs; (d) with six APs; (e) with seven APs; (f) with eight APs; (g) with nine APs; (h) with 10 APs.

Figure 19. CDFs of errors by using different AP optimization approaches. (a) With three APs; (b) with four APs; (c) with five APs; (d) with six APs.

Figure 20. Time overhead by using different AP optimization approaches.

Figure 21. Layout of an indoor multi-floor environment.

Figure 22. CDFs of errors by using different AP optimization approaches. (a) With three APs; (b) with four APs; (c) with five APs; (d) with six APs.

Table 2. Mean of positioning errors under different signal distributions.

Table 3. Mean of positioning errors under different signal distributions.

Table 4. Groups of optimal candidate AP locations under different signal distributions.