An Effective Fingerprint-Based Indoor Positioning Algorithm Based on Extreme Values

Wi-Fi-based fingerprint indoor positioning technology has gained special attention, but the development of this technology has been full of challenges such as positioning time cost and positioning accuracy. Therefore, selecting reasonable Wireless Access Points (APs) for positioning is essential, as the more APs used for positioning, the higher the online computation, energy and time cost. Furthermore, the received signal strength (RSS) is easily affected by diverse interference (obstacles, multipath effects, etc.), decreasing the positioning accuracy. AP selection and positioning algorithms are proposed in this paper to solve these issues. The proposed AP selection algorithm fuses RSS distribution and interval overlap degree to select a small number of APs with high importance for positioning. The proposed positioning algorithm uses the location distance between reference points (RPs) to construct a circle and leverages extreme values (maximum and minimum values) of circles to determine the possibility that the test point (TP) appears in each circle, then it finds useful APs to determine the weight of RPs. Extensive experiments are conducted in two different areas, and the results show the effectiveness of the proposed algorithm.


Introduction
With the development of mobile devices, the demand for location-based services (LBS) is increasing [1,2]. An accurate outdoor location can be obtained by satellite signals. However, it is difficult to use satellite signals for indoor positioning due to the complexity of the indoor environment [3]. There are many indoor positioning technologies based on sensors, such as ultra-wideband (UWB) [4], Wi-Fi [5], Bluetooth [6] and vision [7]. Among these, Wi-Fi-based indoor positioning technology directly uses the existing Access Point (AP) to collect signals without installing additional equipment. Therefore, this positioning technology is a common solution for indoor positioning [8][9][10][11][12][13].
Wi-Fi-based indoor positioning technology can be divided into trilateration-based methods [14] and fingerprint-based methods. The former suffers from severe distance estimation errors under non-line-of-sight (NLOS) conditions, which makes its positioning accuracy worse than that of the latter [15]. Therefore, the fingerprint-based positioning method is a popular topic for indoor positioning.
The implementation of the fingerprint-based positioning method can be divided into two phases. Firstly, the implementer collects the Received Signal Strength (RSS) of APs at Reference Points (RPs) to construct an offline fingerprint map. Then, the Test Point (TP) location is estimated by matching online RSS at the TP with the fingerprint map.
However, RSS is easily affected by diverse interference (AP power outage or malfunction, obstacles, etc.). Some of the APs' information (AP location, propagation path, etc.) collected online may have changed compared with the information collected offline. This changed information can cause a large positioning error [16,17].
Therefore, removing the useless information is meaningful for positioning. Furthermore, the fluctuation of the signal is another major issue. The RSS of each AP at the RP should be an interval, not a fixed value, due to the measured noise. To comprehensively consider all collected RSS samplings, RADAR [18] and HORUS [10] have been proposed. However, they only consider the influence of fluctuation on offline fingerprint maps, ignoring the impact on online TPs. Generally, the online RSS received at the TP for positioning is only 1~2 sampling(s), making it hard to reflect the RSS interval of the TP. Therefore, the above issues may make the positioning accuracy decrease.
Considering the above issues, it is necessary to leverage all RSS samplings of APs at RPs for positioning because only the full set of received values can truly reflect the distribution of RSS. As for the few samplings received at the TP, it is hard to use them to construct its RSS distribution. However, the closer two points are in Euclidean space, the more similar their RSS distributions are [19]. The random RSS value of each AP received at the TP should therefore lie in the interval spanned by the RSS extreme values of the same AP at the RPs that surround the TP in Euclidean space. If some of the APs' information has changed, the RSS of these APs at the TP may instead fall in the RSS intervals of different RPs.
Based on the above ideas, a new fingerprint-based positioning algorithm is proposed. The proposed algorithm first leverages a circular boundary with radius ρ to judge the RSS similarity between RPs and finds the unchanged APs of each circle by comparing the RSS value of the TP with the RSS extreme values (maximum and minimum values) of the circles. Then, the circles with the largest number of unchanged APs are selected as Similar Circles (SCs). Next, the RPs in SCs are selected as Similar RPs (SRPs). Finally, the unchanged APs in each SC are intersected to obtain the Useful APs (UAPs), whose information has changed the least. Based on the UAPs selected, the weighted average of SRPs is found by a new weighting algorithm that considers the Wi-Fi propagation characteristics to obtain the final estimation of the TP.
To approximate the RSS extreme values of circles, the Gaussian Process Regression (GPR) algorithm and the Wi-Fi propagation model are adopted to fit the RSS values. Then, the collected RSS and the fit RSS are combined to better reflect the RSS extreme values in each circle.
Moreover, market-oriented commercial location-aware services need to estimate the user's location quickly and reduce energy loss. Therefore, an excellent commercial LBS system needs to consider the online positioning time.
Due to the increasing availability and high density of wireless APs in indoor environments, more and more networks and APs can be detected by mobile devices. This may cause a high time cost by using all detected APs for online positioning. Furthermore, APs with low recognition may weaken the uniqueness of fingerprints, resulting in poor positioning accuracy. It is beneficial to use the APs with large differences in RSS distribution between RPs in the area of interest. Therefore, we can select a small number of APs with high importance, instead of all APs, for positioning to shorten the online positioning time while maintaining accuracy.
A novel offline AP selection algorithm, DIOD, fusing the RSS Distribution and the Interval Overlap Degree, is proposed based on the above idea. This algorithm first uses the samplings at different RPs to calculate the overlap length (OI) of RSS intervals and then uses the RSS distribution to distinguish the importance of APs for positioning. Finally, a small number of APs with high importance are selected.
In short, the contributions of this paper are summarized as follows: (1) Positioning algorithm: The proposed positioning algorithm considers the dynamic characteristics of fingerprints and uses a new boundary, a circular boundary, to handle the signal noise. The GPR algorithm is adopted to fit the RSS value in each circle. Then, instead of using RSS Euclidean distance between SRPs and the TP, the proposed algorithm better considers the Wi-Fi propagation characteristics to calculate the RSS similarity between SRPs and the TP.
(2) AP selection algorithm: A new AP importance evaluation criterion based on IOD is proposed by considering RSS distribution rather than only the RSS value. The proposed algorithm can be leveraged to reduce the online positioning time.
We have implemented the proposed algorithm and conducted extensive experiments on two different experimental areas. The experimental results show that the proposed positioning algorithm can improve the positioning accuracy and the AP selection algorithm can obtain higher positioning accuracy by only using a small number of APs.
The rest of this paper is organized as follows. Related work is introduced in the next section. Section 3 presents the problem formulation, the preliminary experiment and the framework of the proposed algorithm. Section 4 introduces the proposed algorithm in detail, including the AP selection algorithm and the positioning algorithm. Section 5 presents the experimental results and related analysis. Section 6 presents a few further discussions. Section 7 concludes the paper.

Related Work
Some classic algorithms [10,18,20] for fingerprint-based indoor positioning have been proposed. The NN algorithm calculates the Euclidean distance of RSS between the TP and RPs, and the coordinate of the RP with the smallest distance is returned as estimation of the TP. RADAR [18] and WKNN [20] also directly calculate the RSS distance, but they select top k RPs with smaller distances as SRPs, and then the weighted average of the coordinates of SRPs is taken to calculate the estimation. HORUS [10] calculates the probability that the RSS of the TP appears at RPs and then selects the RP with the maximum likelihood as the estimated location of the TP.
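As an illustration of the weighted k-nearest-neighbour idea described above, the following is a minimal sketch rather than the implementation of any cited work. The function name `wknn`, the inverse-distance weighting and the small constant `eps` are assumptions made for this example, and the fingerprints are toy mean RSS vectors.

```python
import math

def wknn(fingerprints, locations, tp_rss, k=3, eps=1e-6):
    """Estimate a location from the k RPs closest to the TP in RSS space."""
    # RSS Euclidean distance between the TP and every RP
    dists = [math.dist(fp, tp_rss) for fp in fingerprints]
    nearest = sorted(range(len(dists)), key=dists.__getitem__)[:k]
    # Inverse-distance weights (assumed); eps avoids division by zero
    weights = [1.0 / (dists[i] + eps) for i in nearest]
    total = sum(weights)
    x = sum(w * locations[i][0] for w, i in zip(weights, nearest)) / total
    y = sum(w * locations[i][1] for w, i in zip(weights, nearest)) / total
    return x, y

# Toy data: mean RSS (dBm) of 2 APs at 3 RPs spaced 1.8 m apart
fingerprints = [[-40.0, -70.0], [-55.0, -55.0], [-70.0, -40.0]]
locations = [(0.0, 0.0), (1.8, 0.0), (3.6, 0.0)]
print(wknn(fingerprints, locations, [-55.0, -55.0], k=1))  # k = 1 reduces to NN
```

With `k=1` the sketch returns the coordinate of the single closest RP, which is the NN behaviour; larger `k` gives the weighted average used by WKNN-style methods.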
Based on the above traditional algorithms, some improved algorithms have been proposed. A novel weighted average algorithm [21] is proposed using the physics distance instead of the RSS distance. Logarithmic Gaussian Distance (LGD) [22] is used to calculate the RSS similarity between RPs and the TP. A penalty function for LGD is proposed [23] to suit the complex indoor environment. This algorithm can enhance positioning accuracy, but it performs worse under a dynamic environment. Clustering-based approaches [24][25][26] are commonly used for positioning, and these algorithms use a feature (mean RSS value, etc.) to divide the RPs into several sets. The classical clustering methods are K-means [24] and affinity propagation [25]. With the rapid development of machine learning, Refs. [27][28][29] propose to improve the positioning accuracy. Their common characteristic is that they all need a large amount of data for training. As a simple and fast learning method, the Extreme Learning Machine (ELM) has been widely used in the field of fingerprint-based positioning [29]. There are few works based on extreme values. Ref. [30] only uses the extreme values of RSS to filter out APs. Compared to these previous works, we propose a novel positioning algorithm based on extreme values (maximum and minimum values) to select the SRPs and UAPs. This algorithm achieves higher positioning accuracy in the following extensive experiments.
There are also some works about the AP selection algorithm. It can be divided into two categories, online selection and offline selection. For online selection, a common approach is MaxMean proposed in Ref. [31], which employs the absolute mean RSS value as the importance index and chooses those with the strongest mean RSS values. This type of algorithm uses the specific online RSS to obtain higher positioning accuracy, but it needs to operate for each TP, which is not convenient. For offline selection, it uses offline fingerprint database information. Its positioning accuracy may not be as accurate as that of the online selection algorithm [32], but it eliminates the need for online AP selection. This paper mainly discusses the work related to offline selection algorithms. Refs. [33,34] use information gain and entropy to select APs, respectively. However, they only use the mean RSS value as a single feature, failing to judge the importance of APs. Based on machine learning, SVM [35] is used to evaluate the group importance of selected APs instead of the individual importance. This algorithm needs a long time for data training, which is not suited for a large-scale AP environment. IOD [30] is used to select the APs, achieving higher accuracy than other AP selection algorithms [33,35]. However, IOD hardly considers the RSS distribution and only uses the overlap length of the RSS interval for AP selection. We propose a DIOD algorithm fusing the RSS distribution and IOD based on this work. Experimental results show that the proposed DIOD algorithm can improve the positioning accuracy compared with IOD.
For the issue of dynamic changes in the environment, many researchers have worked on updating the fingerprint map. LEMT [19] uses the RSS relationship between RPs to build a decision tree model for fingerprint map updating. DNCIPS [8] collects RSS at fixed points and uses the log-distance propagation law and GPR to update the fingerprint map. Both of them require additional equipment. WINIPS [36] can record the RSS received online at each AP position. Refs. [37,38] update the fingerprint map online by GPR and partial least-squares regression [39], respectively, by fusing the information of the Inertial Measurement Unit (IMU).
Moreover, more accurate localization results can be obtained through cooperative localization by sensor fusion [40,41], and the fusing motion information has been studied extensively [42,43]. The information of IMU [38,44] is used to improve positioning accuracy. Compared to these works, the positioning algorithm proposed in this paper concentrates on the positioning performance of Wi-Fi itself. It is independent and orthogonal to these fused works and may be combined with the other sensors' information to improve positioning accuracy.

Preliminaries and Framework of Proposed Algorithm
In this section, we first review the classical RSS fingerprinting problem. Next, the preliminary experiment is presented to illustrate the fluctuation of RSS. Finally, the framework of the proposed algorithm is briefly introduced.

Problem Description
The actual location of the TP is expressed as $l_u = (x_u, y_u)$, and the fingerprint map consists of the RPs' locations and the corresponding RSS. Assume that there are $M$ APs and $N$ RPs. The locations of the RPs are stored as $L_N = (l_1, l_2, \cdots, l_N)$, where $l_n = (x_n, y_n)$ represents the location of $RP_n$. The RSS collected at $RP_n$ is expressed as follows:
$$RSS_n = [RSS_n^1, RSS_n^2, \cdots, RSS_n^M]$$
where $RSS_n^m = [RSS_n^m(1), RSS_n^m(2), \cdots, RSS_n^m(T)]$, $RSS_n^m(t)$ represents the RSS of $AP_m$ at $RP_n$ in the $t$th sampling, and $T$ represents the number of samplings.
The fingerprint map pairs each RP location with the RSS collected there:
$$\{(l_n, RSS_n) \mid n = 1, 2, \cdots, N\}$$
For the convenience of expression, assume that only one sampling is received at the TP. The RSS collected at the TP is expressed as follows:
$$RSS_{TP} = [RSS_{TP}^1, RSS_{TP}^2, \cdots, RSS_{TP}^M]$$
The RSS for RPs and the TP with a null reading from an AP is set to −100 dBm to keep the AP dimension the same between RPs and the TP.
The positioning problem is to find a relationship map from the RSS collected at the TP and the fingerprint map to a location; $\hat{l}_u$ denotes the estimated location of the TP obtained with this map.
To facilitate understanding, the major symbols are listed in Table 1.
Table 1. Major symbols.
$C_n$: The circle with $RP_n$ as center and ρ as the radius
$q$: Largest number of unchanged APs
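The fixed-dimension representation described in this subsection can be sketched as follows. This is a minimal illustration, assuming each scan is available as a dictionary from a hypothetical AP identifier to its RSS; the names `NULL_RSS`, `to_vector` and `build_fingerprint_map` are invented for the example.

```python
NULL_RSS = -100.0  # dBm placeholder for an AP with a null reading

def to_vector(scan, ap_ids):
    """Turn one scan {ap_id: rss} into a fixed-length vector over ap_ids."""
    return [float(scan.get(ap, NULL_RSS)) for ap in ap_ids]

def build_fingerprint_map(rp_scans, ap_ids):
    """rp_scans: {(x, y): [scan_1, ..., scan_T]} -> {(x, y): T x M matrix}."""
    return {loc: [to_vector(s, ap_ids) for s in scans]
            for loc, scans in rp_scans.items()}

# Toy data: M = 3 APs, N = 2 RPs, T = 2 samplings per RP
ap_ids = ["ap1", "ap2", "ap3"]
rp_scans = {
    (0.0, 0.0): [{"ap1": -48, "ap2": -71}, {"ap1": -50, "ap2": -69}],
    (1.8, 0.0): [{"ap2": -60, "ap3": -80}, {"ap2": -62, "ap3": -78}],
}
fmap = build_fingerprint_map(rp_scans, ap_ids)
tp_vector = to_vector({"ap1": -52, "ap3": -79}, ap_ids)
```

Filling missing APs with −100 dBm keeps every RP row and the TP vector at length M, so they can be compared element by element.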

Preliminary Experiment
To illustrate the RSS fluctuation, we conduct an experiment that continuously collects 60 samplings from one AP at a fixed RP using a Xiaomi MI-2. Figure 1 shows the histogram of the collected values. Even at the same location, RSS changes over time. A similar experimental result is reported in [45]. This phenomenon shows that a few samplings do not accurately reflect the RSS distribution, so algorithms that use only a single feature of RSS [19,20] lose positioning accuracy.

Framework
Figure 2 illustrates the framework of our algorithms, which is divided into two stages, AP selection and positioning. In the first stage, DIOD, an offline AP selection algorithm, keeps the APs with high importance for positioning. In the positioning stage, the positioning algorithm is divided into three modules: SRP selection, UAP selection and weighted average.
The SRP selection module contains area division and SC selection. Area division constructs circles based on the radius ρ. SC selection compares the RSS value of the TP with the RSS extreme values of each circle, which combine the unknown RSS calculated offline by the GPR algorithm with the RSS collected at RPs, and then selects the SRPs whose RSS is similar to that of the TP. In the UAP selection module, the unchanged APs of each SC are taken as an independent set, and the intersection of all sets gives the UAPs. The weighted average module considers the Wi-Fi propagation characteristics and uses the RSS difference of the UAPs between SRPs and the TP to quantify the similarity between SRPs and the TP; it then obtains the estimation of the TP by weighting the coordinates of SRPs. Figure 3 shows the layout of the RPs, the TP and the APs.
This section briefly reviews the basic IOD algorithm [30]. Assume that $RP_i$ and $RP_j$ can detect $AP_m$. The RSS sampling data of $AP_m$ collected at $RP_i$ and $RP_j$ is expressed as follows:
$$RSS_i^m = [RSS_i^m(1), \cdots, RSS_i^m(T)], \quad RSS_j^m = [RSS_j^m(1), \cdots, RSS_j^m(T)]$$
Thus, the intervals of $RSS_i^m$ and $RSS_j^m$ can be represented as $[\min(RSS_i^m), \max(RSS_i^m)]$ and $[\min(RSS_j^m), \max(RSS_j^m)]$, respectively. The IOD algorithm [30] first calculates the OI between $RSS_i^m$ and $RSS_j^m$, which is the length of the red interval in Figure 4, and then normalizes $OI(RSS_i^m, RSS_j^m)$ using the interval lengths to obtain $IOD(RSS_i^m, RSS_j^m)$, where $length(RSS_i^m)$ and $length(RSS_j^m)$ represent the lengths of the two intervals, i.e.,
$$length(RSS_i^m) = \max(RSS_i^m) - \min(RSS_i^m), \quad length(RSS_j^m) = \max(RSS_j^m) - \min(RSS_j^m)$$
The smaller the value of $IOD(RSS_i^m, RSS_j^m)$ is, the stronger the distinguishing ability of $AP_m$ is.
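The interval quantities used by IOD can be sketched as below. This is a minimal illustration on toy samplings; it computes only the overlap length OI and the interval lengths, and deliberately leaves out the normalization step of [30]. The function names are assumptions for this example.

```python
def rss_interval(samples):
    """Interval [min, max] spanned by the samplings of one AP at one RP."""
    return min(samples), max(samples)

def interval_length(samples):
    lo, hi = rss_interval(samples)
    return hi - lo

def overlap_length(samples_i, samples_j):
    """Length of the overlap (OI) between two RSS intervals, 0 if disjoint."""
    lo_i, hi_i = rss_interval(samples_i)
    lo_j, hi_j = rss_interval(samples_j)
    return max(0.0, min(hi_i, hi_j) - max(lo_i, lo_j))

# Toy samplings (dBm) of one AP at two RPs
rp_i = [-60, -55, -58, -52]   # interval [-60, -52]
rp_j = [-56, -50, -53, -48]   # interval [-56, -48]
print(overlap_length(rp_i, rp_j), interval_length(rp_i), interval_length(rp_j))
```

A small overlap relative to the interval lengths means the two RPs see clearly different RSS ranges for this AP, which is what makes the AP useful for telling them apart.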

Issue Statement of the IOD Algorithm
In an ideal environment, RSS obeys a Gaussian distribution. However, due to diverse interference, it often presents a multi-modal distribution. Four distribution modes are common [46]: Gaussian, bi-modal, left-skewed and right-skewed. Because IOD ignores the RSS distribution, it can misjudge the importance of APs. Figure 5 shows the RSS distribution of $AP_m$ and $AP_d$ at $RP_i$ and $RP_j$. The horizontal axis represents the value of RSS, and the vertical axis represents the probability that each RSS value appears in the total number of samplings. In this case, the importance of $AP_d$ is higher than that of $AP_m$ for positioning because of the different RSS distributions. However, in Figure 5 the two APs have the same overlap length and the same interval lengths at $RP_i$ and $RP_j$.
Therefore, the IOD algorithm yields the following equation.
$$IOD(RSS_i^m, RSS_j^m) = IOD(RSS_i^d, RSS_j^d) \quad (13)$$
Based on the IOD algorithm, Equation (13) means that $AP_m$ and $AP_d$ have the same importance for positioning, which is an incorrect judgment.

The Proposed DIOD Algorithm
The IOD algorithm [30] only considers the one-dimensional RSS interval overlap, which ignores the distribution characteristics. We improve the IOD algorithm by expanding the one-dimensional interval overlap length into a two-dimensional overlap area (OA). As shown in Figure 5, when the OA area is larger, the importance of AP is lower. Otherwise, the importance of AP is higher. The area of the two-dimensional overlapping part is positively correlated with the probability p.
where p m i and p m j are calculated as follows: In Figure 5, p m i > p d i and p m j > p d j . Therefore, based on Equations (14)- (16), we have the following inequality.
Equation (17) shows that the importance of AP d is higher than AP m for distinguishing RP i and RP j . It means that DIOD makes a correct judgment.
Because there are $N$ RPs, $N(N-1)/2$ RP combinations can be obtained in all. Let $DIOD_m$ be the final value that evaluates the importance of $AP_m$; it aggregates the pairwise DIOD values of $AP_m$ over all RP combinations. After calculating the final DIOD value of all APs, the APs with lower values are selected for positioning.
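To make the difference between interval overlap and distribution overlap concrete, the sketch below ranks APs with a distribution-aware score. It is not the DIOD formula of this paper: as a stand-in it uses histogram intersection (the summed minimum of the two empirical probabilities at each shared RSS value) and averages it over all RP pairs. All function names and the toy samplings are assumptions.

```python
from collections import Counter
from itertools import combinations

def rss_pmf(samples):
    """Empirical probability of each RSS value among the samplings."""
    counts = Counter(samples)
    return {v: c / len(samples) for v, c in counts.items()}

def overlap_area(samples_i, samples_j):
    """Histogram intersection, a stand-in for the two-dimensional overlap area."""
    p_i, p_j = rss_pmf(samples_i), rss_pmf(samples_j)
    return sum(min(p, p_j[v]) for v, p in p_i.items() if v in p_j)

def ap_score(samples_per_rp):
    """Average overlap over all N(N-1)/2 RP pairs; lower means more important."""
    pairs = list(combinations(samples_per_rp, 2))
    return sum(overlap_area(a, b) for a, b in pairs) / len(pairs)

def select_aps(fingerprints, n_select):
    """fingerprints: {ap_id: [samplings at RP_1, ..., samplings at RP_N]}."""
    scores = {ap: ap_score(rps) for ap, rps in fingerprints.items()}
    return sorted(scores, key=scores.get)[:n_select]

# Both APs span [-60, -56] at RP_i and [-58, -54] at RP_j (same interval overlap),
# but ap_m concentrates its samplings inside the overlap and ap_d does not.
fps = {
    "ap_m": [[-60, -57, -57, -56], [-58, -57, -57, -54]],
    "ap_d": [[-60, -60, -59, -56], [-58, -55, -54, -54]],
}
print(select_aps(fps, 1))
```

In this toy case an interval-only criterion cannot separate the two APs, while the distribution-aware score prefers `ap_d`, mirroring the situation described for Figure 5.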

SRPs Selection Module
The RSS Extreme Values Collected at RPs in a Circle
The SRP selection module is proposed based on two RSS characteristics. The first is that when two RPs are close in Euclidean space, they often have similar RSS characteristics [19]. The second is that when the RSS samplings at a fixed point reach a certain number T, the RSS obtained can reflect the extreme values received at this point [30].
Based on the first characteristic, area division with distance threshold $\rho$ is used to distinguish the RSS similarity between RPs. Note that an improperly large threshold may group RPs whose spatial distance is less than $\rho$ but whose signal distance varies widely, decreasing the positioning accuracy. According to our experiments, the best value of $\rho$ lies between 1 and $\sqrt{2}$ times the adjacent RP spacing (more details can be found in Section 5.3). As the fingerprints are collected at known locations, the spatial distance $\|l_i - l_j\|$ between $RP_i$ and $RP_j$ can be computed. The RSS characteristics of two RPs are considered similar if the corresponding spatial distance is less than the threshold $\rho$; otherwise, they are not. Figure 6 shows the determination of RPs with similar RSS.
For the second characteristic, an experiment is carried out to verify its correctness. Figure 7 shows the measured RSS of one AP at four RPs. When $T$ is greater than 45, the extreme values obtained remain constant. These results support the characteristic and indicate that $T$ should be greater than 45, so accurate RSS extreme values of RPs can be obtained through sampling. In subsequent experiments, $T$ is set to 60.
Based on the above characteristics, we take $AP_m$ and $C_n$ as an example to illustrate the rules for determining the unchanged APs of each circle. The RSS of $AP_m$ received at the TP in a short time is a random variable whose minimum and maximum values are denoted $\alpha$ and $\beta$, respectively. Therefore, $RSS_{TP}^m \in [\alpha, \beta]$. The RSS interval of $AP_m$ received at the RPs in $C_n$ can be expressed as $[\min(RSS_{C_n}^m), \max(RSS_{C_n}^m)]$, where $RSS_{C_n}^m$ collects the RSS of $AP_m$ at all RPs in the circle:
$$RSS_{C_n}^m = [RSS_1^m, RSS_2^m, \cdots, RSS_{f_n}^m]$$
where $f_n$ represents the total number of RPs in $C_n$. However, the extreme values collected at the RPs in $C_n$ do not accurately reflect all extreme values in $C_n$. Therefore, an improved GPR algorithm [8] is adopted to fit the extreme values in $C_n$. The improved GPR is briefly introduced in the next subsection.
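The area division and the collected extreme values of a circle can be sketched as follows. This is a minimal illustration on a toy one-dimensional RP layout; the function names and the nested-list layout of `samples` are assumptions, and the radius is taken as 1.2 times the adjacent RP spacing purely for the example.

```python
import math

def build_circles(locations, rho):
    """Circle C_n holds the indices of all RPs closer than rho to RP_n (itself included)."""
    return [[j for j, lj in enumerate(locations) if math.dist(ln, lj) < rho]
            for ln in locations]

def collected_extremes(circle, samples, ap):
    """Min and max RSS of one AP over all samplings at the RPs of a circle.

    samples[n][ap] is the list of T samplings of that AP at RP_n.
    """
    values = [v for n in circle for v in samples[n][ap]]
    return min(values), max(values)

spacing = 1.8                                   # adjacent RP spacing in metres
locations = [(0.0, 0.0), (1.8, 0.0), (3.6, 0.0)]
circles = build_circles(locations, rho=1.2 * spacing)

# Toy samplings (dBm): samples[n][ap] with a single AP (index 0) and T = 2
samples = [[[-50, -48]], [[-55, -51]], [[-62, -58]]]
print(circles, collected_extremes(circles[0], samples, ap=0))
```

Each RP is the center of its own circle, so the number of circles equals the number of RPs and neighbouring circles share RPs.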

The RSS Extreme Values Collected at RPs in a Circle
The standard GPR algorithm assumes that the observations follow a joint Gaussian distribution. The mean and variance at the uncollected points are estimated under this joint normal distribution hypothesis by training on the collected RPs. Consider the following RSS observation model of $AP_m$ received at $RP_i$.
$$RSS_i^m = f(l_i) + \eta$$
where $\eta$ denotes the observation noise and satisfies $\eta \sim \mathcal{N}(0, \sigma_n^2)$. The relationship between one observation and another is given by the covariance function, which can be expressed as:
$$k(l_i, l_j) = \sigma_f^2 \exp\left(-\frac{\|l_i - l_j\|^2}{2\mu^2}\right)$$
where $\sigma_f^2$ is the variance and $\mu$ is a length parameter; both are hyper-parameters. $\|l_i - l_j\|$ is the 2-norm, i.e., the Euclidean distance between the two location vectors.
The predicted RSS for an unknown position $l_*$ can be calculated as follows:
$$RSS_*^m = \kappa_*^T (K + \sigma_n^2 I)^{-1} Z$$
where $\kappa_*$ is a vector of covariances between the RPs' locations and the unknown location, $K$ is the covariance matrix of the RPs' locations, $Z$ is the vector of RSS observations, and $I$ is the identity matrix. The improved GPR, which combines the Wi-Fi signal propagation model [41] with standard GPR, can approximate the real RSS more accurately. The unknown RSS of $AP_m$ at $l_*$ can be calculated as follows:
$$RSS_*^m = \psi(l_*) + \kappa_*^T (K + \sigma_n^2 I)^{-1} (Z - \psi(L_N))$$
$$\psi(l_i) = RSS_0^m - 10\,\delta \log_{10}(d_i) \quad (24)$$
where $\psi(L_N) = [\psi(l_1), \psi(l_2), \cdots, \psi(l_N)]^T$ and $d_i$ is the distance from $RP_i$ to $AP_m$.
The parameters $\sigma_f$, $\mu$, $\sigma_n$, $RSS_0^m$, $\delta$, $x_{AP_m}$ and $y_{AP_m}$ are trained by the fireworks algorithm [8] in this paper.
Based on the improved GPR algorithm, the RSS of each uncollected point in the circles can be calculated. The RSS fitted at all uncollected points in circle $C_n$ is denoted $GRSS_{C_n}^m$. Therefore, the approximated RSS interval of $AP_m$ in $C_n$ can be expressed as $[\min(GRSS_{C_n}^m), \max(GRSS_{C_n}^m)]$. Based on the approximated RSS values calculated by the improved GPR and the collected RSS values of the RPs, the overall extreme values of $AP_m$ in $C_n$ can be expressed as $[min_{C_n}^m, max_{C_n}^m]$, where:
$$min_{C_n}^m = \min\left(\min(RSS_{C_n}^m), \min(GRSS_{C_n}^m)\right), \quad max_{C_n}^m = \max\left(\max(RSS_{C_n}^m), \max(GRSS_{C_n}^m)\right)$$
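The idea of fitting RSS at uncollected points with a GPR that has a propagation-model mean can be sketched in plain Python as below. This is an illustration under stated assumptions, not the trained model of this paper: the squared-exponential kernel, the log-distance form of the mean, the distance clamp, and every hyper-parameter value (`sigma_f`, `mu`, `sigma_n`, `rss0`, `delta`) are assumed, and the hyper-parameters are fixed by hand rather than trained by the fireworks algorithm.

```python
import math

def kernel(a, b, sigma_f, mu):
    # Squared-exponential covariance (assumed form)
    return sigma_f ** 2 * math.exp(-math.dist(a, b) ** 2 / (2 * mu ** 2))

def log_distance_mean(loc, ap_loc, rss0, delta):
    # Log-distance path-loss mean (assumed form of the propagation model)
    d = max(math.dist(loc, ap_loc), 0.1)  # clamp to avoid log10(0)
    return rss0 - 10 * delta * math.log10(d)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gpr_predict(rp_locs, rp_rss, query, ap_loc,
                sigma_f=4.0, mu=3.0, sigma_n=1.0, rss0=-40.0, delta=2.5):
    """Predicted mean RSS at `query`: prior mean plus the GPR correction."""
    n = len(rp_locs)
    K = [[kernel(rp_locs[i], rp_locs[j], sigma_f, mu)
          + (sigma_n ** 2 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    resid = [z - log_distance_mean(l, ap_loc, rss0, delta)
             for l, z in zip(rp_locs, rp_rss)]
    alpha = solve(K, resid)  # (K + sigma_n^2 I)^{-1} (Z - psi(L_N))
    k_star = [kernel(query, l, sigma_f, mu) for l in rp_locs]
    return (log_distance_mean(query, ap_loc, rss0, delta)
            + sum(k * a for k, a in zip(k_star, alpha)))

rp_locs = [(0.0, 0.0), (1.8, 0.0), (3.6, 0.0)]
rp_rss = [-58.0, -56.0, -59.0]       # toy mean RSS (dBm) of one AP at the RPs
ap_loc = (1.8, 5.0)                  # hypothetical AP position
print(gpr_predict(rp_locs, rp_rss, (0.9, 0.0), ap_loc))
```

Evaluating `gpr_predict` on a grid of points inside a circle and taking the minimum and maximum would give the fitted extremes that are then merged with the collected ones.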

The SRPs Selection Criterion
Because the RSS interval of $AP_m$ at the TP is $[\alpha, \beta]$, this interval should be contained in the RSS interval of a circle that contains the TP in Euclidean space. Assume this circle is $C_n$, so the relationship of the RSS interval between the TP and $C_n$ satisfies the following equation.
$$[\alpha, \beta] \subseteq [min_{C_n}^m, max_{C_n}^m] \quad (26)$$
Since the values of $\alpha$ and $\beta$ are hard to obtain in advance because of the few samplings at the TP, Equation (26) is replaced with a condition on the RSS value actually received at the TP:
$$min_{C_n}^m \le RSS_{TP}^m \le max_{C_n}^m \quad (27)$$
When Equation (27) holds, $AP_m$ is added as an unchanged AP of $C_n$. Similarly, Equation (27) can be used as a judgment condition to determine the other unchanged APs of $C_n$ and the unchanged APs of other circles.
After $M$ APs and $N$ circles are tested, every circle has its own unchanged APs, and the circles with the largest number of unchanged APs $q$ ($q \le M$) are selected as SCs. The RPs contained in these SCs are all selected as SRPs.
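The selection criterion can be sketched as follows, assuming the extreme values of every circle are already available as a dictionary from a hypothetical AP identifier to a `(min, max)` pair. The function names and the toy values are assumptions for this example.

```python
def unchanged_aps(tp_rss, extremes):
    """APs whose TP reading lies inside the circle's [min, max] RSS interval.

    extremes: {ap_id: (min_rss, max_rss)} for one circle.
    """
    return {ap for ap, (lo, hi) in extremes.items()
            if ap in tp_rss and lo <= tp_rss[ap] <= hi}

def select_similar_circles(tp_rss, extremes_per_circle):
    """Return (indices of the SCs, unchanged-AP set of every circle, q)."""
    sets = [unchanged_aps(tp_rss, ext) for ext in extremes_per_circle]
    q = max(len(s) for s in sets)
    scs = [n for n, s in enumerate(sets) if len(s) == q]
    return scs, sets, q

tp_rss = {"ap1": -52, "ap2": -66, "ap3": -80}
extremes_per_circle = [
    {"ap1": (-55, -48), "ap2": (-70, -60), "ap3": (-75, -70)},
    {"ap1": (-60, -50), "ap2": (-68, -62), "ap3": (-78, -72)},
    {"ap1": (-70, -60), "ap2": (-65, -55), "ap3": (-85, -75)},
]
scs, sets, q = select_similar_circles(tp_rss, extremes_per_circle)
print(scs, q)
```

The RPs belonging to the returned circles would then be collected as SRPs.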

UAPs Selection Module
Due to environmental changes, some of the APs' information may have changed. These changed APs need to be removed to ensure positioning accuracy. In Section 4.2.1, the circles with the largest number of unchanged APs $q$ are selected as SCs. When $q = M$, no AP has changed drastically, and all APs can be selected as UAPs to calculate the similarity between SRPs and the TP. When $q < M$, some of the APs' information has changed, and it is necessary to remove these APs.
Without loss of generality, assume that there are $F$ SCs with the largest number of unchanged APs $q$. The $q$ unchanged APs of each SC are taken as an independent set, and the intersection of the $F$ independent sets gives the UAPs. Denoting the set of unchanged APs of the $f$th SC by $U_f$, the UAPs can be expressed as follows:
$$UAPs = U_1 \cap U_2 \cap \cdots \cap U_F$$
An example illustrates the UAPs selection module. Suppose that $M = 7$, there are 6 SCs ($SC_1$, $SC_2$, $SC_3$, $SC_4$, $SC_5$, $SC_6$) with the largest number of unchanged APs, and each SC has $q = 6$ unchanged APs. Table 2 shows the specific six unchanged APs of each SC. Although these SCs have the same number of unchanged APs, they may have different unchanged APs. To keep only the unchanged APs that belong to all SCs, the unchanged APs contained in each SC are taken as an independent set, and then the intersection of all sets is taken. The APs in the intersection are the UAPs selected.

Weighted Average Module
Traditional algorithms [19,20] use the Euclidean distance of RSS to calculate the similarity between the SRP and the TP and use the reciprocal of the Euclidean distance as the weight of the SRP. It can be expressed as follows:
$$w_i = \frac{1}{\|RSS_{TP} - RSS_{SRP_i}\|} \quad (30)$$
where $RSS_{SRP_i}$ is the RSS at $SRP_i$, the $i$th SRP. The weight assignment in Equation (30) does not adequately consider the propagation characteristics of Wi-Fi signals. According to [47], the relationship between Wi-Fi signal strength propagation and distance can be expressed as Equation (24).
Based on Equation (24), when RSS gradually becomes smaller, the absolute value of its slope with respect to distance also becomes smaller, which shows the asymmetry of ∆RSS and ∆d. More generally speaking, the basic rule is to emphasize closer APs with stronger RSS values. However, the traditional Euclidean distance criterion [20] does not consider this propagation characteristic. Based on the UAPs selected, we propose a new weight algorithm, Equation (31), to solve this issue, where $S$ represents the number of UAPs and $UAP_j$ represents the $j$th UAP.
Compared with Equation (30), Equation (31) allows a larger RSS (one whose absolute value is smaller) to have a higher weight when calculating the RSS similarity between the TP and SRPs. Finally, based on the coordinates of SRPs and their weights, the estimation of the TP can be calculated as follows:
$$\hat{l}_u = \frac{\sum_{i=1}^{R} w_i \, l_{SRP_i}}{\sum_{i=1}^{R} w_i}$$
where $R$ represents the number of SRPs and $l_{SRP_i}$ is the location of $SRP_i$.
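The UAP intersection and the final weighted average can be sketched as below. For the weights the sketch uses the traditional reciprocal-distance rule of Equation (30) restricted to the UAPs, not the proposed Equation (31); the function names, the constant `eps` and the toy values are assumptions for this example.

```python
import math

def useful_aps(unchanged_sets):
    """Intersect the unchanged-AP sets of all SCs to obtain the UAPs."""
    return set.intersection(*unchanged_sets)

def reciprocal_weights(tp_rss, srp_rss, uaps, eps=1e-6):
    """Traditional weights: reciprocal RSS Euclidean distance over the UAPs."""
    aps = sorted(uaps)
    tp_vec = [tp_rss[a] for a in aps]
    return [1.0 / (math.dist(tp_vec, [rss[a] for a in aps]) + eps)
            for rss in srp_rss]

def weighted_location(srp_locs, weights):
    """Weighted average of the SRP coordinates."""
    total = sum(weights)
    return (sum(w * x for w, (x, _) in zip(weights, srp_locs)) / total,
            sum(w * y for w, (_, y) in zip(weights, srp_locs)) / total)

unchanged_sets = [{"ap1", "ap2", "ap3"}, {"ap1", "ap2", "ap4"}]
uaps = useful_aps(unchanged_sets)

tp_rss = {"ap1": -50, "ap2": -60, "ap3": -70, "ap4": -80}
srp_rss = [{"ap1": -53, "ap2": -64}, {"ap1": -56, "ap2": -68}]
srp_locs = [(0.0, 0.0), (3.0, 0.0)]
weights = reciprocal_weights(tp_rss, srp_rss, uaps)
print(uaps, weighted_location(srp_locs, weights))
```

Restricting the distance to the UAPs keeps APs whose information has changed from distorting the similarity between the TP and the SRPs.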

Experimental Settings
Extensive experiments are carried out in two different environments at Beihang University. Figure 8 shows the layout of Experiment Area 1, which measures 70 m × 15 m. Figure 9 shows Experiment Area 2, which measures 72 m × 25 m. In Figures 8 and 9, red circles and green squares represent the RPs and TPs, respectively. A Xiaomi MI-2 is used as the collection device to collect the RSS at RPs and TPs, with a sampling frequency of 1 Hz. In the offline fingerprint map, T = 60 samplings are collected at each RP. Only 1~2 sampling(s) are collected at each TP to simulate the normal walking speed of pedestrians. For Area 1, there are 125 RPs and 79 TPs, and the adjacent RP spacing is 1.8 m. For Area 2, there are 35 RPs and 27 TPs, and the adjacent RP spacing is 3.6 m. Over 100 APs are detected in both areas. The experiments are conducted while students walk around rather than in a relatively static environment. To eliminate the influence of weak and unstable signals, we preprocess the data (i.e., filter out signal strengths below −85 dBm) before the experiments. Our code and datasets are available at https://github.com/dadadaray/FingerPosition (accessed on 15 December 2021).

Comparison Algorithms and Performance Metric
The positioning results of the proposed algorithm are compared with four existing algorithms to verify the effectiveness of the proposed algorithm.
For the AP selection algorithm, IOD [30] is used as a comparison algorithm to verify the effectiveness of the proposed DIOD.
The IOD algorithm [30] is described in detail in Section 4.1. PLGD, NN and ELM are adopted as comparison algorithms for the proposed positioning algorithm.
PLGD [23] uses Logarithmic Gaussian Distance (LGD) with the penalty function instead of Euclidean distance to calculate the RSS similarity between the RP and the TP and selects k RPs with smaller distances for weighted average positioning.
NN [18] uses Euclidean distance to evaluate the RSS similarity between the TP and the RP. The coordinate of RP that is most similar to the TP is returned as the TP's location.
ELM [29] uses the neural network of machine learning to train the positioning model and then predicts the corresponding position according to the signal strength.
Moreover, three performance metrics are adopted to evaluate the experimental results: positioning error (PE), mean positioning error (MPE) and cumulative distribution function (CDF). They are defined as follows:
$$PE_u = \|\hat{l}_u - l_u\| \quad (33)$$
$$MPE = \frac{1}{G} \sum_{u=1}^{G} PE_u \quad (34)$$
$$CDF(PE_u) = P(PE \le PE_u) \quad (35)$$
where $G$ represents the number of TPs, $PE = [PE_1, \cdots, PE_G]$ and $P(*)$ represents the probability.
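The three metrics can be computed as in the following minimal sketch. The function names and the toy coordinates are assumptions; the CDF is evaluated empirically as the fraction of TPs whose error does not exceed a threshold.

```python
import math

def positioning_errors(estimates, truths):
    """PE of every TP: Euclidean distance between estimated and actual location."""
    return [math.dist(e, t) for e, t in zip(estimates, truths)]

def mean_positioning_error(errors):
    """MPE: average PE over all G test points."""
    return sum(errors) / len(errors)

def empirical_cdf(errors, threshold):
    """Fraction of test points whose PE is at most `threshold`."""
    return sum(e <= threshold for e in errors) / len(errors)

estimates = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
truths = [(0.0, 0.0), (0.0, 0.0), (1.0, 2.0)]
errors = positioning_errors(estimates, truths)
print(errors, mean_positioning_error(errors), empirical_cdf(errors, 1.0))
```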

Feasibility Evaluation
Due to the different adjacent RP spacings in different experimental environments, it is difficult to directly give a fixed radius value to adapt to different environments. Based on this reason, the adjacent RP spacing of each area is taken as the unit distance; therefore, the unit distance of Areas 1 and 2 are 1.8 m and 3.6 m, respectively. Radius ρ is set as different multiples of unit distance rather than a fixed value to observe the influence on positioning accuracy. The following four intervals are explored: (1) 1, respectively. The results show that the multiple with 1, √ 2 performs best compared with other radius intervals, no matter how many APs are used. This result is within expectations because when the radius ρ is between 1 and √ 2 times the adjacent RP spacing, the circle considers the RSS similarity of surrounding RPs and avoids the decreasing RSS similarity in the same circle due to the large value of radius. Based on the above experimental results, the following results of our algorithms are obtained by using this optimal multiple.  To determine whether the selected SCs can surround the TP position, Figure 12 shows the correct rate of SCs against the number of APs in two areas. It shows that the probability of the TP's actual location being located in the selected SCs is more than 80%. Therefore, the SCs can be used for positioning. Based on the results in Ref. [30], when the number of APs selected for positioning is 5 to 8, calculating positioning accuracy by using the IOD algorithm is better than using other algorithms such as information theory-based and machine learning-based algorithms [33,35]. When the number of APs selected continues to increase, the positioning results of IOD are almost the same as other algorithms. Therefore, the IOD algorithm [30] can provide high positioning accuracy with a small number of APs while improving the online running time.
To verify the effectiveness of the proposed DIOD algorithm, the IOD [30] and DIOD algorithms are each used to select APs for positioning, and the resulting positioning accuracy is then observed by using our proposed positioning algorithm. Figures 13 and 14 show the MPE against the number of APs selected in Area 1 and Area 2, respectively. It can be seen that when a small number of APs are chosen, DIOD yields a lower MPE than IOD. This is because DIOD considers the RSS distribution. Furthermore, it can be seen that the experimental results in Area 1 are smoother than those in Area 2. The reason is that the corridor of Area 2 is narrow, causing more multipath effects, so the reliability of the RSS distribution obtained in Area 2 is slightly worse than that in Area 1. Overall, compared with the IOD algorithm, DIOD improves the positioning accuracy by up to 5.83% in Area 1 and up to 17.73% in Area 2.

Positioning Accuracy Evaluation
Before comparing the different positioning algorithms, we verify the effectiveness of the weighted average module in the proposed algorithm; Equation (30) and Equation (31) are each used to obtain the results with the same UAPs and SRPs. Figures 15 and 16 show the results, respectively. It can be seen that the proposed weighting algorithm (i.e., Equation (31)) performs better than the traditional algorithm (i.e., Equation (30)). The reason is that DIOD considers the signal propagation characteristics. Then, it is necessary to optimize the parameters of the comparison algorithms, such as the k in PLGD. Figures 17 and 18 show the influence of different k values on the positioning accuracy of PLGD in the two areas, respectively. Accordingly, the values with the best accuracy are selected for PLGD: k = 3 in Area 1 and k = 2 in Area 2.
Firstly, we compare the positioning results of the different algorithms when a small number of APs is used. Note that ELM requires a large number of features and performs poorly with a small number of APs; therefore, the positioning accuracy of ELM is omitted from this experiment, since its performance is not comparable to that of the other algorithms. Figures 19 and 20 show the MPE in the two areas, respectively. It can be seen that the relative performance of NN and PLGD is unstable across the areas: PLGD performs better than the NN algorithm in Area 1, but the opposite holds in Area 2. However, the proposed positioning algorithm maintains the best positioning accuracy in both experimental areas. The reason is that the proposed algorithm considers the relationship between RSS and the location of the RPs.
Secondly, we compare the optimal positioning CDF of the different algorithms. For ELM, all detected APs are selected as features to train the positioning model. Figures 21 and 22 show the results in the two areas, respectively.
It can be seen that the proposed algorithm obtains the best accuracy, with 1.81 m and 2.28 m in the two areas, respectively. These experimental results show that the proposed positioning algorithm can provide higher positioning accuracy. Furthermore, in Area 1, our proposed positioning algorithm improves the accuracy by 16.59%, 34.18% and 54.86% compared with PLGD, NN and ELM, respectively. Similarly, in Area 2, our proposed positioning algorithm improves the accuracy by 44.93%, 12.64% and 69.19%, respectively.
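The improvement percentages above can be read as relative reductions in error. The following Python sketch shows this calculation under the assumption that the improvement is defined as the reduction in MPE relative to the baseline algorithm; this definition, the function name and the input values are hypothetical and are not taken from the experiments.

```python
def relative_improvement(baseline_mpe, proposed_mpe):
    """Percentage reduction in MPE relative to the baseline (assumed definition)."""
    return (baseline_mpe - proposed_mpe) / baseline_mpe * 100.0

# Hypothetical MPE values in metres, chosen only to illustrate the arithmetic.
improvement = relative_improvement(baseline_mpe=2.0, proposed_mpe=1.5)
print(improvement)
```

Under this definition, a larger baseline error with the same proposed error yields a larger improvement percentage, which is why the percentages differ between the compared algorithms.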

Time Cost of Proposed Positioning Algorithm
Locating one TP with the proposed positioning algorithm takes about 1 s in the two areas, but the DIOD algorithm can reduce this time cost. When five APs are selected by DIOD, the time required by our positioning algorithm to locate a single TP in the two areas is about 0.093 s and 0.029 s, respectively. Although this is longer than the time required by simple traditional algorithms, such as the NN algorithm, it is still short enough to allow Wi-Fi to be fused with other sensors to further improve the positioning accuracy. In summary, the proposed algorithm is meaningful because it obtains high-precision positioning results at a negligible additional time cost.
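A per-TP time cost of this kind can be measured with a simple timing harness. The following Python sketch is one possible way to do so and is not the measurement procedure used in the experiments; `dummy_locate` is a hypothetical placeholder for a positioning routine, and the RSS values are invented for illustration.

```python
import time

def mean_time_per_tp(locate, test_points, repeats=5):
    """Average wall-clock time in seconds needed to locate a single TP."""
    start = time.perf_counter()
    for _ in range(repeats):
        for tp in test_points:
            locate(tp)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(test_points))

def dummy_locate(rss_vector):
    """Hypothetical placeholder: returns a fake position from an RSS vector."""
    return (sum(rss_vector) / len(rss_vector), 0.0)

# Hypothetical online RSS vectors (dBm) from five selected APs for 100 TPs.
test_points = [[-60.0, -72.0, -55.0, -80.0, -67.0]] * 100
average_seconds = mean_time_per_tp(dummy_locate, test_points)
print(f"{average_seconds:.6f} s per TP")
```

Averaging over several repetitions reduces the influence of scheduling noise, which matters when the per-TP time is in the range of tens of milliseconds.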

Further Discussions
Limitations. Our proposed algorithm achieves high-precision positioning results. It does not require any information about APs and uses all RSS samples for positioning. However, the proposed algorithm also faces the issue of device heterogeneity. We will further improve it by investigating its potential integration with state-of-the-art calibration-free positioning techniques such as those in [45,48].
Future Directions. Although deep learning requires a large amount of training data and has poor interpretability, it performs well, and it has therefore developed rapidly in various fields. We will explore the possibility of combining the proposed extreme value-based method with deep learning to improve the positioning accuracy.

Conclusions
In this paper, a novel AP selection algorithm (DIOD) that fuses the RSS distribution and IOD is first proposed to evaluate the importance of APs for positioning. Then, a new positioning algorithm including area division, SCs, UAP selection and weighted average is proposed to obtain higher positioning accuracy. For the AP selection algorithm, experimental results show that the positioning accuracy obtained using DIOD is higher than that obtained using the IOD algorithm. For the positioning algorithm, experimental results show that the proposed positioning algorithm performs better than the existing positioning algorithms. The low time cost also allows Wi-Fi to be fused online with information from other sensors to obtain higher positioning accuracy.
In addition, an empirical value of the radius ρ is given. According to the experimental results obtained in the two experimental areas, the radius ρ should be set between 1 and √2 times the adjacent RP spacing. The experimental areas in this paper are general, so the proposed algorithm may be applied on a large scale. Furthermore, fingerprint-based positioning based on Wi-Fi or other networks (for example, Bluetooth) follows the same procedure, so the empirical value of the radius can be scaled for fingerprint-based positioning based on other networks.