On the Statistical Errors of RADAR Location Sensor Networks with Built-In Wi-Fi Gaussian Linear Fingerprints

The expected errors of RADAR sensor networks with linear probabilistic location fingerprints inside buildings with varying Wi-Fi Gaussian strength are discussed. The statistical errors of equal and unequal-weighted RADAR networks have been suggested as a better way to evaluate the behavior of different system parameters and the deployment of reference points (RPs). However, there is still little related work on the relations between the statistical errors, the system parameters, and the number and interval of the RPs, let alone on the corresponding analytical expressions. Therefore, under a simple linear distribution model, this paper focuses on the mathematical relations among the linear expected errors, the number of neighbors, the number and interval of RPs, the parameters of the logarithmic attenuation model and the variations of the radio signal strength (RSS) at the test point (TP), with the purpose of constructing more practical and reliable RADAR location sensor networks (RLSNs) and guaranteeing the accuracy requirements of location based services in future ubiquitous context-aware environments. Numerical results and real experimental evaluations of the error theories addressed in this paper are also presented to support our future extended analysis.


Introduction
The significantly growing interest in ubiquitous computing and context-awareness applications has required reliable, accurate and real-time localization technologies to locate the users' positions in high-speed, seamless and heterogeneous wireless personal networks (WPNs), especially in in-building environments where the Global Navigation Satellite System (GNSS) and geolocation in cellular systems are not accurate enough [1][2][3]. Ranging from military to public uses, from urban to rural regions, and from outdoor to indoor areas, location based services (LBSs) have been widely favored and popularized in the recent decade [4][5][6]. Typical LBSs mainly involve human navigation in unfamiliar buildings, robot path planning and guidance, health care inside modern hospitals, location-based enhanced sensing, and entity and storage tracking and management. Within GNSS, the Global Positioning System (GPS) can provide 10 m accuracy for the standard positioning service [7]. The Global Navigation Satellite System (GLONASS) can achieve 1.5 m accuracy for civilian use [8]. The Galileo Positioning System is supposed to provide the highest accuracy, 1 m, for civilian applications [9]. The Beidou System is planned to offer 20 m accuracy with only three satellites [10]. However, the accuracy of all these systems is seriously degraded in closed in-building environments by infrastructure barriers, body shadowing, RSS attenuation and multi-path interference [11][12][13].
In the meantime, there is also a large body of in-building localization and navigation systems, which are commonly categorized into fingerprint-based, model-based and measurement-based systems [14][15][16]. With the help of RSS sensing from each visible Wi-Fi access point (AP) or Wi-Fi wireless router, the world's first and most representative fingerprint-aided system, RADAR, was presented by Microsoft Research in 2000 [17]. Cambridge's Active Bat finds users' positions by calculating the time difference of arrival (TDOA) between ultrasound and radio frequency signals based on a multi-lateration algorithm [18]. Carnegie Mellon's CMU-PM and CMU-TMI location systems rely, respectively, on the Manhattan distance and the offset mapping relations of each fingerprint [19]. UCLA's Nibble system can be regarded as the first signal to noise ratio (SNR)-based localization system in Bayesian networks [20]. The Horus system, invented at the University of Maryland, is recognized as a practical and efficient solution to the small-scale attenuation compensation problem with the help of continuous space estimation and location clustering algorithms [21]. MIT's Cricket system has been used to build interactive video games through the interaction between different pervasive computing devices and has also achieved better performance in location privacy, scalability and tracking agility [22]. In addition, Pitt's Voronoi system [23] and RWTH's hidden Markov localizer [24] have provided some preliminary analyses of how to improve the accuracy of fingerprint-aided localization.
Among them, the Wi-Fi fingerprint-based localization system outperforms the other two types of systems for the following three reasons: (1) RSS variations in real in-building areas cannot be easily characterized by a simple attenuation model because of changes in direction or angle. Therefore, the construction of effective and reliable model-based localization systems always involves high labor and time costs [25]; (2) measurement-based systems (e.g., those based on the time of arrival, time difference of arrival and angle of arrival) require special infrastructure and deployment, which results in higher maintenance cost and energy consumption [26]; (3) fingerprint-based systems rely on existing low-priced Wi-Fi devices, the unlicensed 2.4 GHz ISM band and the license-free 802.11 b/g protocols [27].
The Wi-Fi fingerprint-based localization in RLSNs generally consists of the following three steps [28][29][30]: (1) in the off-line (or calibration) phase, the Wi-Fi APs are deployed to provide sufficient and seamless RSS coverage in the target location areas, which means, at any physical position, the user can detect and sense the continuous-time RSS from two or more visible APs; (2) the coordinates of pre-calibrated RPs (or the physical locations for fingerprint matching) and their associated RSS samples will be saved as the fingerprints in the radio map. At this point, the fingerprints can be suggested as the mapping relationships between the physical coordinates and pre-sensed RSS values. For example, the fingerprints in RLSNs can be defined as the mapping relations between the 2-D coordinates and user datagram protocol (UDP) RSS samples; (3) in the on-line (or estimation) phase, by matching the new sensed RSS to the pre-stored fingerprints (fingerprint matching), the users' positions will be estimated by the equal or unequal-weighted sum of the (K) neighbors' coordinates.
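The off-line recording step can be illustrated with a minimal Python sketch. It only shows how a radio map may be built by averaging the RSS samples at each RP; the function name `build_radio_map`, the data layout and the sample values are hypothetical and are not taken from the RADAR system itself.

```python
from statistics import mean

def build_radio_map(samples):
    """Off-line phase: average the RSS samples recorded at each RP.

    `samples` maps an RP coordinate (x, y) to a list of scans; each scan
    is a list with one RSS reading per AP (in dBm).
    """
    radio_map = {}
    for coord, scans in samples.items():
        # zip(*scans) groups the readings of each AP across all scans.
        radio_map[coord] = [mean(ap_readings) for ap_readings in zip(*scans)]
    return radio_map

# Hypothetical calibration data: two RPs, two APs, three scans per RP.
samples = {
    (0.0, 0.0): [[-40, -62], [-42, -60], [-41, -61]],
    (1.0, 0.0): [[-47, -58], [-45, -56], [-46, -57]],
}
radio_map = build_radio_map(samples)
```

Each entry of `radio_map` is one fingerprint, that is, a mapping from a 2-D coordinate to the RSS-mean vector sensed there.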
Therefore, we can observe that the statistical errors in RLSNs depend significantly on the fingerprint recording in the off-line phase and the fingerprint matching in the on-line phase. To the best of our knowledge, three typical models are commonly used for studying the statistical errors in fingerprint-based RLSNs: the experimental model, the node-pair model and the random model. The first model always involves significant labor and time costs, but it is the simplest way to evaluate the performance and satisfy industrial requirements [17]. The second one examines the RSS difference in each pair of RPs. In this case, the bigger the overlap of the RSS distributions, the larger the statistical errors that are likely to be induced [23]. The last one normally relies on computer simulations (e.g., the Monte Carlo method) and resembles practical deployments less closely [31].
This paper is organized as follows: Section 2 provides an overview of the in-building RADAR system in RLSNs and some related work on the statistical errors. In Section 3, starting from the general idea of the simple linear distribution model, the mathematical relations of the expected linear errors in the RLSNs are discussed in detail under the assumption of a logarithmic Gaussian strength-varying model. In Section 4, numerical and experimental results in the equal and unequal-weighted RLSNs are presented. Finally, the conclusions and the challenges for our future extended work are summarized in Section 5.

Architecture of RADAR System in RLSNs
The fingerprint-based RADAR system in Wi-Fi RLSNs is also called K nearest neighbors (KNN) or weighted K nearest neighbors (WKNN) localization, as shown in Figure 1. The RADAR localization system can be regarded as a global matching process between the new sensed RSS and the pre-stored fingerprints in a radio map, which finds the top-ranked RPs with the smallest RSS differences for the coordinate estimation. However, in the KNN or WKNN location algorithm, although the pre-sensed RSS-mean at the RPs can normally be characterized by a distance dependence model, the new RSS recorded on-line always varies considerably. Therefore, if we assume that the Gaussian model is satisfied at the TP, larger standard deviations will result in larger confidence probabilities of selecting physically distant RPs as the neighbors. From Figure 1, we can observe that although the radio map construction involves a high labor cost and cumbersome work for the deployment of RPs and the associated RSS sensing, it is confined to the off-line phase. Further, the fingerprint matching by the KNN and WKNN algorithms also significantly influences the accuracy of the RADAR system in RLSNs. The positions estimated by KNN and WKNN are given by Equation (1), whose terms are: the numbers of APs and RPs, $N_{AP}$ and $N_{RP}$; the expectation of the RSS at the TP and the RSS-mean at each RP; the number of elements of the neighbor set; and the probability of each RP being selected as the most RSS-adjacent RP.
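The matching process described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the RSS difference is taken as the Euclidean distance between RSS vectors, and the unequal weights are taken as normalized inverse RSS distances, a common WKNN choice that stands in for the confidence-probability weights used in this paper. The function names and the radio map values are hypothetical.

```python
import math

def rss_distance(a, b):
    """Euclidean distance between two RSS vectors (one entry per AP)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def locate(radio_map, rss, k, weighted=False):
    """Estimate a 2-D position from the k most RSS-adjacent RPs."""
    # Rank all fingerprints by their RSS distance to the new sensed RSS.
    ranked = sorted(radio_map.items(),
                    key=lambda item: rss_distance(item[1], rss))
    neighbors = ranked[:k]
    if weighted:
        # Assumed weighting: inverse RSS distance (not the paper's weights).
        eps = 1e-9  # avoids division by zero on an exact match
        raw = [1.0 / (rss_distance(fp, rss) + eps) for _, fp in neighbors]
    else:
        raw = [1.0] * len(neighbors)  # equal weights, i.e., plain KNN
    total = sum(raw)
    x = sum(w * coord[0] for w, (coord, _) in zip(raw, neighbors)) / total
    y = sum(w * coord[1] for w, (coord, _) in zip(raw, neighbors)) / total
    return (x, y)

# Hypothetical single-AP radio map along a line (RSS-means in dBm).
radio_map = {
    (0.0, 0.0): [-40.0],
    (1.0, 0.0): [-46.0],
    (2.0, 0.0): [-49.5],
    (3.0, 0.0): [-52.0],
}
knn = locate(radio_map, [-45.0], k=2)
wknn = locate(radio_map, [-45.0], k=2, weighted=True)
```

With these values the two most RSS-adjacent RPs are those at x = 1 m and x = 2 m, so the equal-weighted estimate is their midpoint, while the weighted estimate is pulled toward the RP whose RSS-mean is closer to the new sensed RSS.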

Previous Work on Statistical Errors of RADAR System in RLSNs
As mentioned before, there are three types of models for evaluating the localization errors in the RSS-based RLSNs: (1) Experimental model. This model is the simplest one for wide industrial applications and overall system performance evaluation. However, during the modeling, we need to experimentally examine the performance of each technical parameter under different fingerprint conditions to find the best system architecture. Therefore, this model involves a large amount of labor and time in the off-line phase [12,17,21,29].
(2) Node-pair model. In this model, we always assume that the Wi-Fi APs are located symmetrically in the location area, the RPs are calibrated uniformly as grids, and the logarithmic Gaussian attenuation channel is satisfied. The most representative work on this model can be found in [23]. Although some preliminary analytical results in that paper can be applied to 2-D areas, the accuracy in RLSNs cannot be effectively guaranteed when the number of neighbors is larger than 1, because this model mainly focuses on the RSS relations in each pair of RPs. Meanwhile, the overlapping of the RSS distributions in the Gaussian model is suggested as the reason for localization errors. In this setting, two RPs are considered, each with an RSS-mean vector containing one entry per AP ($N_{AP}$ entries in total), and one of them is taken as the target.
(3) Random model. Under this model, the RPs are assumed to be located randomly (or distributed by Monte Carlo simulations), and the parameter "physical distance to each AP" in the logarithmic channel is normally simplified to some other special measurement [31] (e.g., the log-attenuation property is supposed to be satisfied in every direction from each RP, not just in the direction "from AP to RP").
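The node-pair idea, that overlapping Gaussian RSS distributions of two RPs cause localization errors, can be illustrated numerically. The sketch below assumes a single AP and a common standard deviation for both RPs; in that case a sample drawn around the first RSS-mean is wrongly matched to the second RP whenever it crosses the midpoint of the two RSS-means. The function name is hypothetical, and the expression is the standard Gaussian tail probability rather than a formula quoted from [23].

```python
import math

def pair_error_probability(mu_1, mu_2, sigma):
    """Probability that a Gaussian RSS sample centered at mu_1 (dBm) with
    standard deviation sigma lies closer to mu_2 than to mu_1."""
    # The decision boundary is the midpoint, |mu_1 - mu_2| / 2 away from mu_1.
    z = abs(mu_1 - mu_2) / (2.0 * sigma)
    # Upper tail of the standard normal distribution, Q(z).
    return 0.5 * math.erfc(z / math.sqrt(2.0))

p_far = pair_error_probability(-50.0, -56.0, 3.0)   # 6 dB apart
p_near = pair_error_probability(-50.0, -51.0, 3.0)  # 1 dB apart
```

The smaller the gap between the two RSS-means relative to the standard deviation, the larger the overlap and the closer the mismatch probability gets to 0.5.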
Overall, because the linear distribution of RPs is a general and simple model of in-building straight corridors, an analysis of the statistical errors in this environment is more necessary and important than in other environments, such as offices, washrooms and meeting rooms. Further, the differences from the previous work on statistical errors can be summarized as follows: (1) A better RADAR system in RLSNs with higher expected localization accuracy can be designed with the help of the analytical relations addressed in this paper alone, without cumbersome experimental evaluations. Therefore, labor and time costs can be saved; (2) K (K > 2) neighboring RPs are considered together in the on-line estimation phase, not just one pair of RPs. Meanwhile, the linear model introduced in this paper can also be verified to perform more effectively because the increase of K also improves the localization accuracy; (3) Last but not least, the deployment of RPs with distance interval r is more practical and reasonable than the random models, and it also achieves a lower cost for the fingerprint recording.
As far as we know, the RSS-based in-building localization in RLSNs has been widely favored in applications ranging from the military to public uses. For example, for both travelers and robots in unfamiliar environments, it is necessary to locate their positions in real-time and provide navigation services or ease the complexity of path planning. Compared to the traditional ultra-sound and LADAR location sensor networks in the in-building areas, the Wi-Fi fingerprint-based RLSNs will provide a better alternative way to locate the users in the aspects of infrastructure cost and accuracy performance. Currently, in lots of modern hospitals, schools and health care centers, the elderly, disabled people or children will always need to be located or tracked by their doctors or parents. If there is an emergency or someone is out of his/her permitted area, the doctors or parents will be notified in real-time and acknowledged with the help of the interaction between the service centers, APs and Wi-Fi RSS sensors attached on the people's body.

Notations and Parameters in RLSNs
In the results that follow, the notations and parameters used are listed in Table 1.

Table 1. Notations and parameters.
$R_i$ ($i = 1, \ldots, N_{RP}$): Physical position of the $i$-th RP.
$d_i$: Physical distance between the $i$-th RP and the AP (in meters).
$d$: Physical distance between the TP and the AP (in meters).
$r$: Interval of distance-adjacent RPs (in meters).
$l_i$: Physical distance between the $i$-th RP and the TP (in meters).
$P_t$: Transmit power of the AP (in dBm).
$\mu$: Expectation of the new sensed RSS at TP in Gaussian distribution (in dBm).
$\sigma$: Standard deviation of the new sensed RSS at TP (in dBm).
$\beta$: Path loss in the first meter in logarithmic attenuation channel (in dBm).
$\alpha$: Path loss exponent in logarithmic attenuation channel.
$E_x[f(x)]$: Expectation of the function $f$ with respect to the variable $x$.

General Idea of a Simple Linear Distribution Model
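As shown in Figure 2, there are $N_{RP}$ RPs uniformly calibrated in the linear location area, the $i$-th RP $R_i$ lying at distance $d_i$ from the AP. The pre-sensed RSS-mean at each RP and the new sensed RSS-mean at TP are respectively calculated by the logarithmic channel $P_i = P_t - \beta - 10\alpha \lg d_i$ and $\mu = P_t - \beta - 10\alpha \lg d$, where $P_t$ is the transmit power of the AP, $\beta$ is the path loss in the first meter, $\alpha$ is the path loss exponent and $d$ is the distance between the TP and the AP. The geometrical relations and associated RSS distributions to be discussed are also presented in Figure 2. By the assumption of Gaussian RSS variations at TP, the confidence probability of $R_i$ to be selected as the most RSS-adjacent neighbor is calculated by integrating the Gaussian density of the new sensed RSS, $\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\mathrm{d}x$, over the RSS range in which $R_i$ is the most RSS-adjacent RP.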
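The confidence probabilities of the linear model can be evaluated numerically with the Gaussian cumulative distribution function. The following Python sketch assumes a single AP, RPs sorted by increasing distance to the AP, and decision boundaries at the midpoints between adjacent RSS-means (which follows from choosing the RP whose RSS-mean is nearest to the sensed RSS). The function names and the channel values `p_t = 20` dBm and `beta = 40` are illustrative assumptions, not parameters reported in this paper.

```python
import math

def rss_mean(d, p_t=20.0, beta=40.0, alpha=2.0):
    """Log-distance channel: RSS-mean (dBm) at distance d (m) from the AP."""
    return p_t - beta - 10.0 * alpha * math.log10(d)

def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def selection_probabilities(rp_distances, d_tp, sigma):
    """Probability of each RP being the most RSS-adjacent one.

    rp_distances must be positive and sorted in increasing order, so the
    RSS-means decrease along the list.
    """
    means = [rss_mean(d) for d in rp_distances]
    mu = rss_mean(d_tp)
    # Midpoints between adjacent RSS-means act as decision boundaries.
    bounds = [(a + b) / 2.0 for a, b in zip(means, means[1:])]
    cdf = [gaussian_cdf(b, mu, sigma) for b in bounds]
    # The RP nearest to the AP owns the range up to +infinity (CDF = 1);
    # the farthest RP owns the range down to -infinity (CDF = 0).
    upper = [1.0] + cdf
    lower = cdf + [0.0]
    return [u - l for u, l in zip(upper, lower)]

# Hypothetical setup: 10 RPs with r = 1 m, TP at 3 m, sigma = 3.
probs = selection_probabilities([float(i) for i in range(1, 11)], 3.0, 3.0)
```

The probabilities sum to one, and a larger `sigma` spreads them toward physically distant RPs, which is the effect discussed for the Gaussian model at the TP.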

Figure 2. Linear distribution model in logarithmic Gaussian variation channel.

Errors in Equal-Weighted RLSNs
In the first step, we need to discuss three cases with different neighbor sets. Based on the Gaussian variations of the new sensed RSS at TP, the probability of each case depends significantly on the RSS-means at the RPs (indexed $1, 2, \ldots, N_{RP}$).
(1) Case 1: the index equals 1, with the corresponding neighbor set. In this case, because there is no RP between the AP and the RP nearest to it, the new sensed RSS at TP falls in a range that is unbounded above and extends to $+\infty$, as shown in Figure 3. The confidence probability of this neighbor set is calculated by Equation (2). As shown in Figure 4, the confidence probability equals the cumulative probability between the two bounds of the corresponding RSS range (see Equation (3)) under the Gaussian RSS variations.
Finally, in the last step, the linear expected errors in the equal-weighted RLSNs are given by Equation (6), where the terms carrying a half-valued subscript stand for the values of Prob and of the associated error at that half-valued index.
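As a rough cross-check of such analytical expected errors, the equal-weighted error can also be estimated by simulation. The Python sketch below is a Monte Carlo stand-in for the closed-form result, not a reproduction of Equation (6). It assumes a single AP, RPs placed at distances r, 2r, ..., $N_{RP}$ r from the AP, and illustrative channel values; all function and parameter names are hypothetical.

```python
import math
import random

def rss_mean(d, p_t=20.0, beta=40.0, alpha=2.0):
    """Log-distance channel: RSS-mean (dBm) at distance d (m) from the AP."""
    return p_t - beta - 10.0 * alpha * math.log10(d)

def simulate_knn_error(n_rp, r, k, sigma, d_tp, trials=5000, seed=0):
    """Monte Carlo estimate of the mean absolute linear error of
    equal-weighted KNN for a TP at distance d_tp from the AP."""
    rng = random.Random(seed)
    # Assumed geometry: the i-th RP lies at distance i * r from the AP.
    positions = [r * (i + 1) for i in range(n_rp)]
    means = [rss_mean(d) for d in positions]
    mu = rss_mean(d_tp)
    total = 0.0
    for _ in range(trials):
        rss = rng.gauss(mu, sigma)  # new sensed RSS at the TP
        # Rank the RPs by the difference between their RSS-mean and rss.
        ranked = sorted(range(n_rp), key=lambda i: abs(means[i] - rss))
        estimate = sum(positions[i] for i in ranked[:k]) / k
        total += abs(estimate - d_tp)
    return total / trials

err_sigma_1 = simulate_knn_error(n_rp=25, r=1.0, k=3, sigma=1.0, d_tp=10.0)
err_sigma_3 = simulate_knn_error(n_rp=25, r=1.0, k=3, sigma=3.0, d_tp=10.0)
```

Running the sketch for several values of `k`, `r` and `sigma` gives simulated counterparts of the trends discussed analytically, under the stated geometric assumption.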

Errors in Unequal-Weighted RLSNs
From Equation (1), the difference between the equal and unequal-weighted RLSNs lies in the distribution of the weights in the neighbor set. The estimated position in equal-weighted RLSNs is located at the geometric center of the neighbor set because the equal weight $1/K$ is assigned to each neighbor. However, in the unequal-weighted RLSNs, the weight of each neighbor relies significantly on its confidence probability prob of being selected as the most RSS-adjacent RP. Therefore, we also need to discuss three cases:
(1) Case 1: the index equals 1, with the corresponding neighbor set. In this case, the confidence probability prob of each RP in the neighbor set being selected as the most RSS-adjacent RP can be calculated by Equation (7). Similarly, the confidence probability prob of selecting each RP in the corresponding index range as the most RSS-adjacent RP and the associated errors are respectively calculated by Equations (9) and (10).
Finally, based on Equations (2)-(4) and (7)-(12), the linear expected errors in the unequal-weighted RLSNs are calculated by Equation (13), where the term carrying a half-valued subscript stands for the value of the associated error at that half-valued index.

Numerical and Experimental Results
From the analytical discussions in the previous section, we can find that the linear expected errors in equal and unequal-weighted RLSNs rely significantly on the parameters K, σ and $N_{RP}$. Therefore, in this section, we first present some numerical results on the relations addressed in Section 3, and then discuss the real experimental evaluations that verify the results of this paper. In the numerical results that follow, we let α = 2.

Error Performance with Variations of K
In Figures 6 and 7, we show the linear expected errors of the equal and unequal-weighted RLSNs, respectively, for a given physical distance between the TP and the AP. These are derived from Equations (6) and (13) and plotted as functions of the number of RPs $N_{RP}$. Clearly, as $N_{RP}$ increases, the linear expected errors also increase. However, the errors do not vary much as K increases. Take the equal-weighted RLSNs for example: at $N_{RP}$ = 25 and σ = 3 dBm, the errors decrease by 11.7% from K = 1 to K = 3 and by 7.5% from K = 3 to K = 5.

Error Performance with Variations of r and σ
In this section, the linear expected errors in the equal and unequal-weighted RLSNs are respectively plotted as functions of $N_{RP}$ for r = 0.5 m, 1 m, 2 m and σ = 1 dBm, 3 dBm in Figures 8 and 9. The condition of r = 2 m and σ = 3 dBm has the largest error, and smaller r and σ always give smaller errors, as expected. Furthermore, based on Figures 6-9, we can observe that the parameters differ in their degree of influence on the linear expected errors in the equal and unequal-weighted RLSNs, with K having the weakest influence. Therefore, in the following results (Figures 8-10), we fix the value of K = 3.

Error Comparisons of Equal and Unequal-Weighted RLSNs
The error comparisons between the equal and unequal-weighted RLSNs are shown in Figure 10. It can be observed that the errors in equal-weighted RLSNs are slightly smaller than in the unequal-weighted ones, which is another interesting observation of this paper. This result can be interpreted as follows: (1) With linearly distributed RPs and logarithmic Gaussian RSS variations, the RSS difference between distance-adjacent RPs diminishes significantly as the distance to the AP increases. Then, it cannot be easily guaranteed that the larger weights are assigned to the physically closer RPs.
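The diminishing RSS difference in reason (1) follows directly from the logarithmic channel: for two RPs at distances d and d + r from the AP, the gap between their RSS-means is 10 α lg((d + r)/d), which is independent of the transmit power and of the first-meter path loss. A short Python sketch, using α = 2 as in the numerical results and a hypothetical helper name, makes the trend concrete.

```python
import math

def adjacent_rss_gap(d, r, alpha=2.0):
    """Difference (dB) between the RSS-means of two RPs located at
    distances d and d + r (m) from the AP in a log-distance channel."""
    return 10.0 * alpha * math.log10((d + r) / d)

# Gap between neighboring RPs (r = 1 m) at several distances from the AP.
gaps = {d: round(adjacent_rss_gap(float(d), r=1.0), 2) for d in (1, 5, 10, 30)}
```

With r = 1 m the gap is about 6 dB next to the AP but falls below 1 dB at 10 m, so for σ = 3 dBm the RSS distributions of adjacent RPs overlap heavily at larger distances.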

Experimental Setup
In this section, realistic experimental results on the localization errors in Wi-Fi RLSNs are obtained in a typical straight corridor environment, as shown in Figure 11. The dimensions of this area are 31 m × 2 m and the RPs are linearly distributed along the corridor with the same 1 m interval. As Figure 12 indicates, if we record the RSS-mean and the associated coordinates as the fingerprints, the pre-assumed logarithmic Gaussian channel can be effectively satisfied.
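One simple way to check how well recorded RSS-means follow the logarithmic channel is a least-squares fit of the RSS-mean against lg(d). The Python sketch below is an illustration only: it fits the intercept (transmit power minus first-meter path loss) and the path loss exponent, and it is run on synthetic, noise-free data rather than on the measurements of this paper. The function name and data are hypothetical.

```python
import math

def fit_log_distance(distances, rss_means):
    """Least-squares fit of rss = a - 10 * alpha * lg(d).

    Returns (a, alpha): the intercept in dBm and the path loss exponent.
    """
    xs = [math.log10(d) for d in distances]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(rss_means) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, rss_means))
    slope = sxy / sxx                  # equals -10 * alpha
    intercept = y_bar - slope * x_bar  # RSS-mean at d = 1 m
    return intercept, -slope / 10.0

# Synthetic RSS-means at 1 m spacing, generated with a = -30 dBm, alpha = 2.
distances = [float(d) for d in range(1, 32)]
rss = [-30.0 - 20.0 * math.log10(d) for d in distances]
a, alpha = fit_log_distance(distances, rss)
```

On real measurements, the residuals of this fit give a quick indication of how far the corridor deviates from the assumed channel.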

Experimental Evaluations in Real In-Building Environments
In this section, we will pay significant attention to the following two observations: (1) The error variations with the increase of K, σ and r in the RLSNs; (2) The error comparisons of equal and unequal-weighted RLSNs.
From Figures 16-21, we can conclude that: (1) Although the error variations become irregular for significantly large r (e.g., r = 4 m), the errors generally decrease as K increases and r decreases; (2) The errors are more sensitive to the values of σ than to the values of K; (3) The errors in equal-weighted RLSNs are slightly smaller than in the unequal-weighted ones, which is also in accordance with our previous analytical and numerical results. The relations between the localization errors and the standard deviations in equal and unequal-weighted RLSNs are also presented in Figure 21. Further, three categories of TPs are considered: the 11 TPs with the largest standard deviations from AP1, the 11 TPs with the smallest standard deviations from AP1, and the other 11 TPs, which form the remaining category. The errors are sensitive to the variations of σ, especially when moving from one category to another. Generally, larger localization errors result as the standard deviation increases, as expected.

Conclusions and Challenges
This paper has offered a preliminary analysis of the linear expected errors in the equal and unequal-weighted RLSNs with in-building Wi-Fi Gaussian linear fingerprints, and has introduced the mathematical relations among the linear expected errors, the number of neighbors (K), the number and interval of calibrated RPs ($N_{RP}$ and r) and the standard deviation of the new sensed RSS at TP (σ). The objective of this paper is that the suggested relations can be employed for a better design of highly accurate, low-cost and real-time fingerprint-based in-building RADAR localization systems in RLSNs, either through the judicious recording of the fingerprints or through the optimal deployment of the system architectures and devices.
From the mathematical relations and the numerical and experimental results presented in this paper, three observations can be made: (1) In the equal and unequal-weighted RLSNs, for a given value of $N_{RP}$, the parameters differ in their degree of influence on the linear expected errors, with σ being more influential than K; (2) The equal-weighted RLSNs slightly outperform the unequal-weighted ones in error performance in logarithmic Gaussian attenuation channels; (3) The expected error has a strong linear dependence on the value of $N_{RP}$.
However, the following three challenges form part of our ongoing work: (1) Because the ideal logarithmic Gaussian attenuation channel used in this paper cannot always be satisfied or approximated in real in-building linear environments (e.g., straight corridors), attention will also be paid to other typical models, such as the Rayleigh and Rice distributions in the logarithmic channel with break point(s); (2) If the RPs are not all calibrated on one side of the AP, the mathematical relations of the linear expected errors addressed in this paper will change because the test point will no longer be uniformly distributed in the target location area (e.g., the probability of the TP lying in a distance range starting from 0 will be doubled); (3) If three or more APs are considered, for the WKNN localization algorithm, the confidence probability of each neighbor being selected as the most RSS-adjacent RP should be calculated by a joint probability integral.