A Two-Level WiFi Fingerprint-Based Indoor Localization Method for Dangerous Area Monitoring

Localization technologies play an important role in disaster management and emergence response. In areas where the environment does not change much after an accident or in the case of dangerous areas monitoring, indoor fingerprint-based localization can be used. In such scenarios, a positioning system needs to have both a high accuracy and a rapid response. However, these two requirements are usually conflicting since a fingerprint-based indoor localization system with high accuracy usually has complex algorithms and needs to process a large amount of data, and therefore has a slow response. This problem becomes even worse when both the size of monitoring area and the number of reference nodes increase. To address this challenging problem, this paper proposes a two-level positioning algorithm in order to improve both the accuracy and the response time. In the off-line stage, a fingerprint database is divided into several sub databases by using an affinity propagation clustering (APC) algorithm based on Shepard similarity. The online stage has two steps: (1) a coarse positioning algorithm is adopted to find the most similar sub database by matching the cluster center with the fingerprint of the node tested, which will narrow the search space and consequently save time; (2) in the sub database area, a support vector regression (SVR) algorithm with its parameters being optimized by particle swarm optimization (PSO) is used for fine positioning, thus improving the online positioning accuracy. Both experiment results and actual implementations proved that the proposed two-level localization method is more suitable than other methods in term of algorithm complexity, storage requirements and localization accuracy in dangerous area monitoring.


Introduction
Indoor localization plays an important role in public safety and disaster management and relief [1]. Location data can help managers warn people who make unauthorized entry into areas to enhance public safety. Such situations include the monitoring of dangerous military areas or indoor inflammable stacking areas. Another example is the fire emergence response: when a fire incident occurs in a building, management personnel can locate both the disaster site and people quickly [2]. During disaster relief operations, location information can assist rescuers to quickly acquire tools, get closer to those to be rescued and evacuate them [3,4].
Common indoor positioning technologies include range-based and non-range-based [5,6]. The former is realized by measuring the physical relationship between the measurable parameters and the position of the signal arrival time and signal strength. The latter includes two phases: offline

Review of Related Literature
A wireless fingerprint indoor positioning method usually has two phases: offline and online. The positioning performance of the online phase determines the overall performance of the system and its positioning efficiency. Commonly used matching algorithms include K-nearest neighbor (KNN) [10], kernel-based algorithm [11], support vector machine [12], artificial neural network (ANN) [13,14], and SVR [15].
In terms of performance, a localization system using the above algorithms usually cannot simultaneously meet the requirements of localization accuracy, algorithm complexity and storage reduction. For example, the KNN algorithm is simple but not precise enough, while ANN has high positioning accuracy but its algorithm is complex and needs much storage. Research shows that the (1) RSSI data collection and pre-processing: collecting RSSI fingerprint data multiple times, and filter them to reduce fluctuations resulting from changing indoor temperature, humidity, multipath effect and other factors. (2) Data segmentation: the offline fingerprint data, clustered by the APC algorithm, is segmented into different fingerprint subsets according to the Shepard-similarity between the fingerprint data.
(3) Localization model creation: in different fingerprint subsets, the PSO-SVR algorithm is used for training to obtain a localization model.
The online stage includes the following steps: (1) RSSI data collection and pre-processing: RSSI signals are collected and pre-processed in a similar way as in the offline phase. (2) Coarse localization: according to the cluster center similarity metric, the similarity between the fingerprint of the target point and each cluster center is calculated and the fingerprint subset with the highest similarity is selected for further localization. (1) RSSI data collection and pre-processing: collecting RSSI fingerprint data multiple times, and filter them to reduce fluctuations resulting from changing indoor temperature, humidity, multipath effect and other factors. (2) Data segmentation: the offline fingerprint data, clustered by the APC algorithm, is segmented into different fingerprint subsets according to the Shepard-similarity between the fingerprint data.
(3) Localization model creation: in different fingerprint subsets, the PSO-SVR algorithm is used for training to obtain a localization model.
The online stage includes the following steps: (1) RSSI data collection and pre-processing: RSSI signals are collected and pre-processed in a similar way as in the offline phase. (2) Coarse localization: according to the cluster center similarity metric, the similarity between the fingerprint of the target point and each cluster center is calculated and the fingerprint subset with the highest similarity is selected for further localization. Please note that the focus of this paper was on localization algorithms. Due to the space limitation, four aspects of the problem and the proposed solution are briefly introduced as follows: (1) Filtering during the data pre-processing: as described in [34], filtering is critical to the positioning accuracy. After having compared various methods including median filtering, moving average filtering, and Gaussian filtering, we believe that Gaussian filtering is better than the other two methods in that it has smaller variance and more smoothing results. Therefore, the Gaussian filtering has been chosen as the proposed data pre-processing method. (2) Access point (AP) selection: it is not always good to choose all APs, and whether an AP is selected as a fingerprint depends on two aspects: 1) whether its fingerprint vector is stable, since there are significant differences in RSSI values collected between online and offline Please note that the focus of this paper was on localization algorithms. Due to the space limitation, four aspects of the problem and the proposed solution are briefly introduced as follows: (1) Filtering during the data pre-processing: as described in [34], filtering is critical to the positioning accuracy. After having compared various methods including median filtering, moving average filtering, and Gaussian filtering, we believe that Gaussian filtering is better than the other two methods in that it has smaller variance and more smoothing results. Therefore, the Gaussian filtering has been chosen as the proposed data pre-processing method. (2) Access point (AP) selection: it is not always good to choose all APs, and whether an AP is selected as a fingerprint depends on two aspects: (1) whether its fingerprint vector is stable, since there are significant differences in RSSI values collected between online and offline stages, resulting in large positioning errors; (2) whether it has a high number of occurrences since this determines the total error resulted from the differences between default values and real values. An indicator is defined as follows: where η AP j is the occurrence frequency of AP j ; SD j is the standard deviation of RSSI of AP j ; δ is a minimal positive number set to prevent the zero denominator of Formula (1). SD j is defined as following: where RSSI ij is the ith sampling value of the AP j , and RSSI j is the mean value of the sampling value of the AP j . After calculating indicators of all APs and sorting them, the largest APs were selected as features of the fingerprint vectors.
(3) Matching in the coarse localization stage: there are two kinds of criteria for coarse positioning as reference states [9,21]: (1) cluster center similarity criterion, which calculates the similarity between real-time fingerprint and each cluster center according to the similarity definition, and select the cluster with the highest similarity as the cluster matching result; (2) the average fingerprint similarity criterion, which firstly calculates the average value of the every cluster fingerprint, and then calculates the similarity between the real-time fingerprint and the average value of each cluster according to the similarity definition, and selects the cluster with the highest similarity as the cluster matching result. For the sake of simplicity of the algorithm, this paper adopted the first criterion as the cluster matching criterion in the coarse localization stage. (4) Radio map building: the common method of establishing a fingerprint database is to manually collect RSSI fingerprint data at a fixed reference point (RP). Based on the results from [35], usually, the denser the RPs deployed in the target location area, the higher the localization accuracy. However, such a manual method leads to a complicated position system and a heavier workload in the offline phase. This paper adopted the Gaussian Process Regression (GPR) model to predict the fingerprint of some virtual RP points based on collection of limited real RP points, thus reducing the workload when building the radio map.

The Shepard Similarity Metric
The clustering process classified sample points according to their similarity. It required the similarity within the cluster to be as large as possible and the similarity between clusters to be as small as possible. Therefore, calculating the similarity of clustering objects was one of the most important tasks of the clustering algorithm. There are various similarity measurements: Minkowski distance, cosine similarity, Pearson similarity and Jaccard similarity [9,31]. A Minkowski distance can be denoted as: where s i and s j represent fingerprints of two points in l dimensions and, if p = 1, then that distance is the Manhattan distance and p = 2 is the Euclidean distance. The format of s i is defined as follows: where the fingerprint of the referent point i includes RSSIs received from l APs. But the situation in indoor fingerprint location is a little different from the above. The power attenuation model of the wireless signal is as follows.
where PL(d) is received RSSI at the distance d from the anchor point; PL(d 0 ) represents the received RSSI when the distance is d 0 from the anchor point; ζ is the random variable compensating for fluctuations and n is the path loss coefficient. As can be seen from the above formula, RSSI and distance d satisfy the property of a logarithmic relationship rather than a linear relationship. Therefore in indoor fingerprint localization, the method used Formula (3) is not suitable for RSSI signal similarity measurement. In order to punish the RSSI role of distant node signals, weak signals and noise signals in localization, the node signal similarity is defined as the exponential form of Euclidean distance-the Shepard similarity: The similarity of distant nodes can thus be rapidly reduced, eliminating the superposition of distant signals in the estimation of the node position.

The APC Algorithm Based on Shepard Similarity
The APC algorithm is a very effective clustering algorithm. It differs from other traditional clustering algorithms, where the required number of categories and clustering centers are usually set at the beginning of the iteration of the clustering algorithm. In the APC algorithm, each data object has the same chance to become the clustering center, thus eliminating dependence on initial conditions.
In the APC clustering algorithm, two types of information are exchanged between data points: i and j: responsibility and availability. These two pieces of information are represented by the matrices R (i, j) and A (i, j), respectively. R (i, j) quantifies the confidence level at which the data points j are selected as the cluster center by the data points i; A (i, j) quantifies the appropriateness of the data points i to select the data points j as their cluster centers. In the process of iterative updating, the cluster centers and the corresponding clustering center of each data point are determined by R (i, j) and A (i, j) together.
In localization, responsibility and availability are denoted as R(s i , s j ) and A(s i , s j ), where s i has the fingerprint format as Formula (4) shows, and the calculation formulas of R(s i , s j ) and A(s i , s j ) are as follows: A(s i , s j ) = min{0, R(s j , s j ) + n max{0, R(s n , s j )}} (8) where S Shepard (s i , s j ) represents the Shepard similarity. ∀i, j, n ∈ {1,2, . . . ,N}, and i j, n j in Equation (7), and n i, j in Equation (8), and N is the total number of RP. R(s i , s j ) and A(s i , s j ) are updated as follows during the iteration: where λ is a damping coefficient to prevent numerical oscillation, determining the robustness of the iterative process. When two or more data points can be selected as the clustering center of the same class, the clustering center of this class may oscillate between these data points, leading to the failure of the algorithm to converge. The value is generally set in (0.5, 1). The specific process of the APC is presented as Algorithm 1: Cluster: V = {C 1 , C 2 , . . . , C k }; # Initialization: preference ρ; damping coefficient λ; Shepard similarity matrix S according to Formula (6); Availability matrix A = [0] n × n; Responsibility matrix R = [0] n × n; for k = 1:num_iteration if stop condition is not satisfied calculating the R (s i , s j ) and A (s i , s j ) according to Formula (7), (8); updating the R (s i , s j ) and A (s i , s j ) according to Formula (9), (10); end end # Exemplar identification: Classifying the RP k as cluster center; end end # Cluster assignments: Classifying other RP to corresponding cluster according to similarity; The APC algorithm includes following steps: (1) Initialization: Formula (6) is used to measure the similarity and calculate the similarity matrix S, and the damping coefficient λ is set to 0.9. The responsibility and availability matrices are initialized to zero matrix. The last parameter preference is defined thus: where ρ indicates preference, meaning the possibility of RP i becoming a cluster center. Since all data points can be used as potential clustering centers in the initialization, data points have the same preference values at the beginning, which is usually a ratio of the median or minimum value of a similar matrix. The preference determines the number of clusters: the larger the preference, the more clusters will be generated. The coefficient γ is used to control the value of preference. (2) Iteration process: including the responsibility matrix and availability matrix update and iteration stop condition judgment. (3) Cluster center judgment and classification: determining whether it is a cluster center according to A(s i , s j ) + R(s i , s j ) > 0, and classifying others' RP according to similarity.

The SVR Model and its Main Parameters
Due to the complexity of the indoor environment and the interference from noises such as people and multipath effects, the mapping between the high-dimensional signal strength space and the physical position is highly nonlinear in a localization algorithm based on fingerprints. The SVR algorithm was selected as the localization model in the fine positioning stage, performing well to solve nonlinear and high dimension problems.
The basic idea of SVR is to create a hyperplane, so that the distance from each point to the hyperplane is the shortest. For a given sample set (x i , y i ), i = 1, 2, · · · , m , where x i ∈ R n is the n dimensions input vector and y i ∈ R is the corresponding output, the goal of SVR is to explore the mapping relationship that correctly reflects the existing relationship between input and output: where f (x) is the predictive output expression, Ñ is the adjustable weight vector and b denotes bias. Φ(x i ) : R n → R d is the nonlinear mapping of the input data in the sample set x i from the original space R n to the high-dimensional feature space R d .
In the SVR, the insensitive coefficient ε is used as the loss function, representing the tolerance of the error between the predicted value and the true value. An error "band" must be set up in order to contain all samples; it must be as narrow as possible. The middle line in the band is the regression line. The insensitive coefficient ε is shown in Figure 2. and multipath effects, the mapping between the high-dimensional signal strength space and the physical position is highly nonlinear in a localization algorithm based on fingerprints. The SVR algorithm was selected as the localization model in the fine positioning stage, performing well to solve nonlinear and high dimension problems.
The basic idea of SVR is to create a hyperplane, so that the distance from each point to the hyperplane is the shortest. For a given sample set {( , ), 1,2, , } is the n dimensions input vector and i y R ∈ is the corresponding output, the goal of SVR is to explore the mapping relationship that correctly reflects the existing relationship between input and output: (12) where f (x) is the predictive output expression, ɷ is the adjustable weight vector and b denotes bias.
is the nonlinear mapping of the input data in the sample set xi from the original space n R to the high-dimensional feature space d R . In the SVR, the insensitive coefficient ε is used as the loss function, representing the tolerance of the error between the predicted value and the true value. An error "band" must be set up in order to contain all samples; it must be as narrow as possible. The middle line in the band is the regression line. The insensitive coefficient ε is shown in Figure 2.  Parameter ε plays an important role in performance: it directly determines the complexity and generalizability of the model, as well the number of support vectors used to construct it. In order to avoid over-fitting, parameter ε must be specially selected: the smaller the value, the more robust is the model, although there may be zero tolerance for training error. On the other hand, the bigger the value, the higher the tolerance to training error. To achieve a balance in this regard, it is necessary to use a set of relaxation variables ξ, ξ * for the sample points outside the "band"-that is, which predictive error is greater than the parameter ε, so that the model can tolerate noises or outliers ( Figure 2).
In order to obtain the mapping relationship between physical position and fingerprint vector, the kernel function is introduced to map the original fingerprint data from original space into a higher dimension space, making it linearly separable in this new space. The mathematical expression of the model is then: where C is the penalty factor, a constant that has been pre-set and C > 0; it determines the punishment degree of samples whose predictive error is greater than parameter ε. This is a convex quadratic optimization problem which can be solved by constructing a Lagrange function: where, α k , α * k (k = 1, 2, . . . , n) is the Lagrange multiplier. This is also a convex quadratic optimization problem. According to Karush-Kuhn-Tucker (KKT) conditions, we can derive: After replacing Formula (14) with the value of each parameter variable in Formula (15), this dual optimization problem can be obtained: Linear expressions in high dimensional space can be determined in order to solve the target position coordinates: In Formula (17), the kernel function K(s k , s) = Φ(s k )·Φ(s) is introduced which is the inner product of Φ(s k ), Φ(s). The kernel function can map the linear inseparable data in the input space (low-dimensional) into the linear separable data in the high-dimensional feature space (regenerated kernel Hilbert space) through a feature map. There are already many kinds of kernel functions. This paper adopted a Gaussian kernel function, which has strong anti-interference ability against noises or outliers in samples. The Gaussian kernel function is represented as: where σ is a user-specified parameter which indicates the width of the Gaussian kernel and |·| 2 denotes the L 2 distance between two input RSSI vectors. The regression model of the vector and position information can thus be established through the constant optimization of parameters in the offline stage. In the online stage, the coordinates of the real-time position can be obtained by entering the collected real-time fingerprint data into the trained regression model.
It is thus evident that three parameters-insensitive coefficient ε, penalty factor C and width of Gaussian kernel σ-are critical during the regression process.

The PSO Algorithm
The PSO algorithm is one of the most widely used optimization algorithms. It is characterized by simulating the cluster intelligence of birds and realizes global searching through cooperation and competition among individuals. It is widely used in neural network training, SVM parameter optimization and objective function optimization [36].
The mathematical description of the PSO algorithm is: let in one-dimensional space, a swarm consisting of m particles X = (x 1 , x 2 , · · · , x m ) T , the q th particle position is According to the principle of following the optimal particle, particle x i updates its position and velocity as follows: where c 1 and c 2 are the learning factor or acceleration constant and rand 1 (), rand 2 () ∈ (0, 1) is a random number. The l th dimension position and velocity of the particle q in the k th iteration are x q l (k) and v q l (k) respectively. The individual extreme position of the particle q in the l th dimension is p i l (k). The global extreme position of the swarm in dimension l is p g l (k). The main steps in PSO are shown in Table 1: Table 1. The particle swarm optimization (PSO) algorithm.
Step Content

1
Initializing the swarm, including the population size M, the position x q and velocity v q of each particle. 2 Calculating the fitness value of all particles: fit (q). 3 Comparing the fit (q) in its current position with that in the best position it has passed through. If the new value is greater, replace the individual extreme. 4 Comparing the fit (q) in its current position with that in the best position all particles have passed through. If the new value is greater, take the current position as the global best position. 5 Updating the position and velocity with Formulae (19) and (20). 6 Repeating steps (2)(3)(4)(5) until the stop condition is met or the iteration reaches the maximal number. 7 Obtaining every particle's best position, local optimal solution, global position and global optimal solution.

The SVR Optimized by PSO
Since the SVR performance depends on its kernel function width σ, penalty factor C and insensitive coefficient ε, the selection of these three parameters directly affects the learning and generalizability of SVR. The search rules for PSO are simple and easy to implement and the convergence speed is also fast. Based on these advantages, the PSO was adopted to optimize the parameters of the SVR model in this paper. The optimization process is as follows: Step 1: Determining the fingerprint training sample as the input of this algorithm; Step 2: Using the PSO algorithm to determine the SVR optimal parameters: Step 2.1: Initializing the swarm: the population size is m, the q th particle p q = (C q , ε q , δ q ) and its dimension n = 3, so that the particle dimension equals 3. A Gaussian kernel function was adopted in this paper; Step 2.2: Calculating the fitness value of each particle, expressed as: where f (s k ) is the predicted value of the k th fingerprint sample tested; y k is the actual value of the k th test sample; h is the number of test samples; Step 2.3: In each iteration process, updating the individual extremum p k d and global extremum p g d of each particle according to the fitness value of each particle; Step 2.4: Updating the velocity and position of particles according to Formula (19) and (20); Step 2.5: If the fitness value has met some requirements or the maximum iteration number has been reached, the algorithm stops and global extreme value is output; otherwise, return to Step 2.1.
Step 3: Take the optimal value determined by the algorithm as the optimal parameter of the model.
The flowchart of the PSO-SVR algorithm is shown in Figure 3.
value of the k test sample; h is the number of test samples; Step 2.3: In each iteration process, updating the individual extremum k d p and global extremum g d p of each particle according to the fitness value of each particle; Step 2.4: Updating the velocity and position of particles according to Formula (19) and (20); Step 2.5: If the fitness value has met some requirements or the maximum iteration number has been reached, the algorithm stops and global extreme value is output; otherwise, return to Step 2.1.
Step 3: Take the optimal value determined by the algorithm as the optimal parameter of the model.
The flowchart of the PSO-SVR algorithm is shown in Figure 3.

Verification in a Real-Life Application
The proposed algorithm is applied to a dangerous area monitoring system in a chemical factory area. In order to verify its performance, the traditional positioning algorithm and the positioning algorithm proposed in this paper are experimentally analyzed by using the fingerprint data collected in the real environment.

Factory Dangerous Area Monitoring System
Building a smart factory in a coking company faces many challenges, including complicated processes, a variety of production equipment, an extensive production workshop and numerous

Verification in a Real-Life Application
The proposed algorithm is applied to a dangerous area monitoring system in a chemical factory area. In order to verify its performance, the traditional positioning algorithm and the positioning algorithm proposed in this paper are experimentally analyzed by using the fingerprint data collected in the real environment.

Factory Dangerous Area Monitoring System
Building a smart factory in a coking company faces many challenges, including complicated processes, a variety of production equipment, an extensive production workshop and numerous hazardous sources such as sulfuric acid. A chemical workshop of this coking company as shown in Figure 4a was selected for this study. By deploying a new WiFi system, the company designed a high-precision indoor positioning system to monitor dangerous areas and assist in emergency evacuation.
hazardous sources such as sulfuric acid. A chemical workshop of this coking company as shown in Figure 4(a) was selected for this study. By deploying a new WiFi system, the company designed a high-precision indoor positioning system to monitor dangerous areas and assist in emergency evacuation.  The application scenario is shown in Figure 4(b). An array of wireless routers was arranged above the dangerous area. Each worker was equipped with a specially designed mobile device that contained a WiFi module ESP8266 and thus could read RSSI signals, which is showed in Figure 5. Please note that this mobile device is being redesigned and will be eventually embedded in a badge ID card. In the offline stage, all fingerprint data were sent to a server to be clustered and used to train an SVR prediction model. Then this trained model was downloaded to the mobile devices. In the online stage, the WiFi circuit in the mobile device was receiving RSSI signals, and always localizing itself according to the position prediction model which was running totally in the mobile device. When a person entered the dangerous area, an alarm would be triggered in his/her mobile device and the actual position information would be sent to the server. If the mobile device was not authorized, an alert would be issued.

Indoor Environment
The floor plan of this experimental chemical workshop is depicted in Figure 6. It covered an area of approximately 677.22 m 2 , of which the length was 31.75 m and the width 21.33 m. A rectangular coordinate system was established within the northwest corner of the workshop as the origin of coordinates. One hundred and sixty RPs (blue points in Figure 6 were evenly deployed in the room, with an interval of two meters. There were 11 APs, and the sampling interval was set to one second, thus obtaining a total of 9600 samples within the acquisition time of 60 s. For simulating the actual scenario and comparing different algorithms, a wireless scanning tool inSSIDer and laptop ThinkPad The application scenario is shown in Figure 4b. An array of wireless routers was arranged above the dangerous area. Each worker was equipped with a specially designed mobile device that contained a WiFi module ESP8266 and thus could read RSSI signals, which is showed in Figure 5. Please note that this mobile device is being redesigned and will be eventually embedded in a badge ID card. In the offline stage, all fingerprint data were sent to a server to be clustered and used to train an SVR prediction model. Then this trained model was downloaded to the mobile devices. In the online stage, the WiFi circuit in the mobile device was receiving RSSI signals, and always localizing itself according to the position prediction model which was running totally in the mobile device. When a person entered the dangerous area, an alarm would be triggered in his/her mobile device and the actual position information would be sent to the server. If the mobile device was not authorized, an alert would be issued. hazardous sources such as sulfuric acid. A chemical workshop of this coking company as shown in Figure 4(a) was selected for this study. By deploying a new WiFi system, the company designed a high-precision indoor positioning system to monitor dangerous areas and assist in emergency evacuation.  The application scenario is shown in Figure 4(b). An array of wireless routers was arranged above the dangerous area. Each worker was equipped with a specially designed mobile device that contained a WiFi module ESP8266 and thus could read RSSI signals, which is showed in Figure 5. Please note that this mobile device is being redesigned and will be eventually embedded in a badge ID card. In the offline stage, all fingerprint data were sent to a server to be clustered and used to train an SVR prediction model. Then this trained model was downloaded to the mobile devices. In the online stage, the WiFi circuit in the mobile device was receiving RSSI signals, and always localizing itself according to the position prediction model which was running totally in the mobile device. When a person entered the dangerous area, an alarm would be triggered in his/her mobile device and the actual position information would be sent to the server. If the mobile device was not authorized, an alert would be issued.

Indoor Environment
The floor plan of this experimental chemical workshop is depicted in Figure 6. It covered an area of approximately 677.22 m 2 , of which the length was 31.75 m and the width 21.33 m. A rectangular coordinate system was established within the northwest corner of the workshop as the origin of coordinates. One hundred and sixty RPs (blue points in Figure 6 were evenly deployed in the room, with an interval of two meters. There were 11 APs, and the sampling interval was set to one second, thus obtaining a total of 9600 samples within the acquisition time of 60 s. For simulating the actual scenario and comparing different algorithms, a wireless scanning tool inSSIDer and laptop ThinkPad

Indoor Environment
The floor plan of this experimental chemical workshop is depicted in Figure 6. It covered an area of approximately 677.22 m 2 , of which the length was 31.75 m and the width 21.33 m. A rectangular coordinate system was established within the northwest corner of the workshop as the origin of coordinates. One hundred and sixty RPs (blue points in Figure 6 were evenly deployed in the room, with an interval of two meters. There were 11 APs, and the sampling interval was set to one second, thus obtaining a total of 9600 samples within the acquisition time of 60 s. For simulating the actual scenario and comparing different algorithms, a wireless scanning tool inSSIDer and laptop ThinkPad L440 with an intel Core i5-4300M (2.6 GHz/L3 3 M) processor and a wireless network card were used to collect fingerprints and other information such as physical addresses, corresponding MAC addresses etc. The preprocessed information formed a fingerprint sample after relevant information was extracted. Each location fingerprint sample included its position coordinates and RSSI values collected at this point. The processed fingerprint data were stored in the database for further task and its format is shown in Table 2. L440 with an intel Core i5-4300M (2.6 GHz/L3 3 M) processor and a wireless network card were used to collect fingerprints and other information such as physical addresses, corresponding MAC addresses etc. The preprocessed information formed a fingerprint sample after relevant information was extracted. Each location fingerprint sample included its position coordinates and RSSI values collected at this point. The processed fingerprint data were stored in the database for further task and its format is shown in Table 2.

Evaluation Criteria
The positioning performance of the indoor positioning system was mainly evaluated from the aspects of accuracy and precision. This paper evaluated the advantages and disadvantages of the positioning algorithm from two factors: the average positioning error and the cumulative error distribution function.
(1) Mean absolute error (MAE): MAE is the most commonly used indicator to measure positioning precision. The larger the value, the lower the accuracy of the positioning algorithm: (21) where,  ( )

Evaluation Criteria
The positioning performance of the indoor positioning system was mainly evaluated from the aspects of accuracy and precision. This paper evaluated the advantages and disadvantages of the positioning algorithm from two factors: the average positioning error and the cumulative error distribution function.
(1) Mean absolute error (MAE): MAE is the most commonly used indicator to measure positioning precision. The larger the value, the lower the accuracy of the positioning algorithm: where,p(i) and p(i) are the estimated and actual position of the i th tested point respectively. (2) Cumulative distribution function (CDF): The positioning distance error distribution was measured by CDF, which can intuitively reflect the percentage of positioning times in which the error is under a certain threshold: where N(e i ≤ e) is the number of tested points e i ≤ e and N is the total number tested points for positioning. By comparing the accuracy of the different positions of the algorithm, it can be seen that the higher value of the CDF under a certain position error threshold, the higher the positioning accuracy and the better the effect.

Verification of the Influence of Fingerprint Density on Positioning Accuracy
Since the fingerprint density has an important influence on the positioning accuracy, a special experiment was designed to verify this effect. Fifty additional RPs were uniformly distributed in the target location area ( Figure 6) and used to construct a larger fingerprint database together with the original 160 RPs. These 50 RP fingerprints could be obtained either by measuring actually or by the GPR method mentioned above. It was verified that the difference between the RP fingerprints with these two methods is very small. The algorithm KNN (k = 5) was used to compare the positioning accuracies before and after 50 RPs were added. As shown in Table 3 and Figure 7, it can be seen that the average absolute error of the positioning was 2.954 m and 2.364 m respectively. Although the average positioning error of both scenarios were above 2 m, the positioning error decreased 19.97% given a denser fingerprint database. The CDF curve of position errors in the denser fingerprint scenario was always on the left of that with original fingerprint database. In the case of positioning error less than 2 m, the probability in the denser fingerprint scenario was 53%, and it was about 33% in the original fingerprint scenario. All of the results mean that the denser the RPs, the higher the positioning accuracy. target location area ( Figure 6) and used to construct a larger fingerprint database together with the original 160 RPs. These 50 RP fingerprints could be obtained either by measuring actually or by the GPR method mentioned above. It was verified that the difference between the RP fingerprints with these two methods is very small. The algorithm KNN (k = 5) was used to compare the positioning accuracies before and after 50 RPs were added. As shown in Table 3 and Figure 7, it can be seen that the average absolute error of the positioning was 2.954 m and 2.364 m respectively. Although the average positioning error of both scenarios were above 2 m, the positioning error decreased 19.97% given a denser fingerprint database. The CDF curve of position errors in the denser fingerprint scenario was always on the left of that with original fingerprint database. In the case of positioning error less than 2 m, the probability in the denser fingerprint scenario was 53%, and it was about 33% in the original fingerprint scenario. All of the results mean that the denser the RPs, the higher the positioning accuracy.

The Effect of Clustering Technology on Location Performance
In this paper, the APC algorithm was firstly used to conduct partitioning in RP to narrow the positioning area; PSO-SVR was then used to achieve accurate positioning. The positioning algorithm APC-PSO-SVR is compared with the algorithm without clustering PSO-SVR in Table 4 and Figure 8. Table 4 indicates that the localization accuracy of APC-PSO-SVR was, on average, 1.478, while that of PSO-SVR was 2.174. The APC-PSO-SVR improved the localization accuracy over 30%. Thus, the clustering in the coarse localization step played an important role in localization accuracy. As it can

The Effect of Clustering Technology on Location Performance
In this paper, the APC algorithm was firstly used to conduct partitioning in RP to narrow the positioning area; PSO-SVR was then used to achieve accurate positioning. The positioning algorithm APC-PSO-SVR is compared with the algorithm without clustering PSO-SVR in Table 4 and Figure 8. Table 4 indicates that the localization accuracy of APC-PSO-SVR was, on average, 1.478, while that of PSO-SVR was 2.174. The APC-PSO-SVR improved the localization accuracy over 30%. Thus, the clustering in the coarse localization step played an important role in localization accuracy. As it can be seen from Figure 8, the APC-PSO-SVR curve was on the left side of the PSO-SVR curve, indicating that, if the same localization error (x-axis value) was set in APC-PSO-SVR and PSO-SVR, the ratio of the former was higher (y-axis value), meaning that the overall localization accuracy of APC-PSO-SVR was thus higher.

Effect of Similarity Selection on Positioning Performance
To further evaluate the influence of different fingerprint similarity calculation methods on localization performance, the Euclidean distance similarity, the cosine similarity and the Shepard similarity discussed in Section 3.2.1 were respectively used as the similarity metric, and the localization performance was compared (Table 5, Figure 9).

Effect of Similarity Selection on Positioning Performance
To further evaluate the influence of different fingerprint similarity calculation methods on localization performance, the Euclidean distance similarity, the cosine similarity and the Shepard similarity discussed in Section 3.2.1 were respectively used as the similarity metric, and the localization performance was compared (Table 5, Figure 9).  The average position errors of the indoor localization algorithm based on clustering improvement with Euclidean distance, cosine similarity and Shepard similarity were 1.709 m, 1.503 m and 1.478 m. In Figure 9, the CDF curve from Shepard similarity was on the left side of the curve from the two other similarity metrics. The experimental results showed that the Shepard similarity could better measure the similarity between real-time and stored fingerprints and could improve the localization accuracy compared with the other two similarity metrics.

Analysis of Algorithm in Fine Positioning Phase
In order to analyze the effectiveness of the positioning method proposed in this paper, the performance of system using KNN and back-propagation neural network (BPNN) was compared with that of the system using PSO-SVR in the fine localization stage. The experimental results are shown in Table 6 and Figure 10.  The average position errors of the indoor localization algorithm based on clustering improvement with Euclidean distance, cosine similarity and Shepard similarity were 1.709 m, 1.503 m and 1.478 m. In Figure 9, the CDF curve from Shepard similarity was on the left side of the curve from the two other similarity metrics. The experimental results showed that the Shepard similarity could better measure the similarity between real-time and stored fingerprints and could improve the localization accuracy compared with the other two similarity metrics.

Analysis of Algorithm in Fine Positioning Phase
In order to analyze the effectiveness of the positioning method proposed in this paper, the performance of system using KNN and back-propagation neural network (BPNN) was compared with that of the system using PSO-SVR in the fine localization stage. The experimental results are shown in Table 6 and Figure 10. In order to analyze the effectiveness of the positioning method proposed in this paper, the performance of system using KNN and back-propagation neural network (BPNN) was compared with that of the system using PSO-SVR in the fine localization stage. The experimental results are shown in Table 6 and Figure 10.  As it can be seen from Table 6, the mean localization accuracy of PSO-SVR was improved by 22.4% and 29.3% respectively than that of KNN and BPNN. In Table 6, when the error threshold was set to 50%, the position errors of PSO-SVR, BPNN and KNN were 1. As it can be seen from Table 6, the mean localization accuracy of PSO-SVR was improved by 22.4% and 29.3% respectively than that of KNN and BPNN. In Table 6, when the error threshold was set to 50%, the position errors of PSO-SVR, BPNN and KNN were 1.28 m, 1.732 m and 1.755 m respectively, which also shows that the algorithm of PSO-SVR had better performance than the other two algorithms. In Figure 10, the curve of CDF based on PSO-SVR was on the left side of the curve based on the other two algorithms as well as that shown in Figures 8 and 9. Experimental results of Figure 10 demonstrate that performance based on BPNN and KNN was nearly similar and, compared to BPNN and KNN, the localization algorithm PSO-SVR could significantly reduce the localization error and improve the localization accuracy of the system.

Algorithm Complexity and Storage Requirements
In Reference [34], the algorithm complexity was compared in detail among the SVR algorithm, the ANN algorithm and the probability-based algorithm. It concluded that the SVR algorithm has the advantage of minimal complexity and storage requirements, while the other two algorithms are so complicated and require such large storage that they cannot run in a mobile device. Thus, when transferred to services, they will aggravate the communication loads, reduce the battery life and slow down the localization process. The algorithm complexity of the proposed algorithm in the offline phase was a little higher than the SVR algorithm mentioned above because of the loop computations in PSO. However, because it could be placed in the background server, and its total number of computations was not large, the PSO-SVR algorithm in the offline phase was not the bottleneck of the system. The advantage of two-level positioning was to reduce the computational complexity and storage requirements at the online stage. The computational complexity can be represented with Formula (24) given in Reference [9]: N sim = N c + N i (24) where N c is the number of RP clusters and N i is the number of RPs in the cluster. In our scenario, the RPs were often clustered into six to twelve classes, and the maximal number of RPs in a cluster was around 40, so N sim was almost a constant which is around 52. In the actual operation, the complexity can be further reduced since only a coarse positioning would be run to compare the fingerprint between the cluster center and the mobile terminal. If the value was less than some threshold, the further fine positioning will not be carried out.
In Reference [28], the response time results of a two-level localization algorithm similar to ours are reported; some online prediction algorithms, such as Kernel based method, SVR and ANN are compared. In normal conditions, their running time and the difference among them are very little, which are less than 1 ms. It is enough for a system to respond in real time. Therefore in a two-level localization algorithm, the computation complexity of the online fine positioning algorithm no longer has a big impact on the overall system performance.
In addition, the storage requirement and communication load of this algorithm was very small. In contrast to some algorithms, which compare the fingerprint between the tested point and all RPs, the proposed algorithm only stores the prediction model, and its size was less than 100 K bytes, far smaller than the entire fingerprint database (about 6 M bytes in our application). At the same time, only some warning and request messages were sporadically transmitted between the mobile terminal and the server.

Localization Accuracy
In Table 4 and Figure 8, the positioning error of the algorithm was 1.478 m on average; and 1.28 m under the accuracy threshold at 50%. In Reference [35], some popular algorithms for indoor fingerprint localization are listed. It can be noted that the position error was related to the RP interval and that the error was mostly 1.5 m in the case of 2 m RP interval and 50% accuracy. In Reference [34], some localization algorithms were reviewed and a similar result can be observed, and most position errors with these algorithms are 1~2 m. Even though there is some difference in verification circumstances, it can be concluded that the proposed algorithm was advantageous in accuracy.
In our application scenario, the worker uses a self-made mobile device containing a WiFi module to measure RSSI and localization, in which the battery power was limited. From the perspective of battery life and positioning accuracy, the SVR-based and two-level positioning method was the most suitable for monitoring dangerous areas.

Conclusions and Future Work
One of the core technologies for disaster management is the indoor positioning system, which can help rescue more people and prevent property loss due to the accurate and rapid localization of people and assets at the beginning of a disaster.
However, due to the fluctuation of wireless signals, the change in environments and monitoring areas, and the increase of reference points, indoor fingerprint positioning systems often face the contradiction of positioning accuracy, computational complexity and storage requirements. This paper proposed a two-level positioning method to address the challenge: in the coarse positioning stage, an APC algorithm based on Shepard similarity was used to realize clustering; then, in the fine positioning stage, an SVR with PSO parameter optimization was used for precise positioning. Both experiment results and actual implementations proved that the proposed method was more suitable than other methods in term of algorithm complexity, storage requirements and localization accuracy in dangerous area monitoring Future work will focus on the miniaturization of the mobile device and the integration of fingerprint indoor positioning with pedestrian dead reckoning (PDR) positioning. In fire scenarios, where the environment changes quickly, the fusion of indoor fingerprint positioning and PDR can be used for disaster management. Other sensors, such as magnetometers, gyroscopes and accelerometers can also be considered for integration with indoor RSSI fingerprint positioning to improve positioning performance.
Author Contributions: F.L. was responsible for methodology, validation and original draft writing, and M.L. guided the whole process, and also reviewed the manuscript. Y.Z. worked in software development and experimental data collection, W.S. instructed the validation, reviewed and re-edited the manuscript.