WLAN RSS-Based Fingerprinting for Indoor Localization: A Machine Learning Inspired Bag-of-Features Approach

Altaf Khattak, Sohaib Bin; Fawad,; Nasralla, Moustafa M.; Esmail, Maged Abdullah; Mostafa, Hala; Jia, Min

doi:10.3390/s22145236


by Sohaib Bin Altaf Khattak ¹,², Fawad ³, Moustafa M. Nasralla ¹,*, Maged Abdullah Esmail ¹, Hala Mostafa ⁴ and Min Jia ²

¹ Smart Systems Engineering Lab, College of Engineering, Prince Sultan University, Riyadh 11586, Saudi Arabia

² Communications Research Center, School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150001, China

³ ACTSENA Research Group, Telecommunication Engineering Department, University of Engineering and Technology Taxila, Punjab 47050, Pakistan

⁴ Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia

* Author to whom correspondence should be addressed.

Sensors 2022, 22(14), 5236; https://doi.org/10.3390/s22145236

Submission received: 8 June 2022 / Revised: 3 July 2022 / Accepted: 7 July 2022 / Published: 13 July 2022

(This article belongs to the Special Issue Artificial Intelligence (AI) and Machine-Learning-Based Localization)


Abstract:

Location-based services have permeated smart academic institutions, enhancing the quality of higher education. Position information of people and objects can be used to predict potential requirements and provide relevant services to meet those needs. Indoor positioning system (IPS) research aims to provide robust location-based services in complex indoor structures. Unforeseeable propagation loss in complex indoor environments results in poor localization accuracy. Various IPSs have been developed based on fingerprinting to precisely locate an object even in the presence of indoor artifacts such as multipath and unpredictable radio propagation losses. However, such methods are adversely affected by the vulnerability of fingerprint matching frameworks. In this paper, we propose a novel machine learning framework consisting of a Bag-of-Features (BoF) model followed by a k-nearest neighbor (KNN) classifier that categorizes the final features into their respective geographical coordinate data. BoF calculates the vocabulary set using k-means clustering, where the frequency of the vocabulary in the raw fingerprint data represents the robust final features that improve localization accuracy. Experimental results from simulation-based indoor scenarios and real-time experiments demonstrate that the proposed framework outperforms previously developed models.

Keywords:

indoor positioning system; machine learning; WLAN fingerprinting; higher education; learning environment

1. Introduction

Realizing the importance of evolving educational and pedagogical conditions, universities and colleges are aiming to include extensive upgrades and to develop exciting technological services [1,2,3,4]. The concept of Smart Campuses in Smart Cities seeks to integrate beneficial technological interventions such as localization and navigation, asset monitoring, smart attendance, and smart parking systems into the fabric of universities in order to improve learning, research, and managerial and administrative efficiency [5,6,7]. These applications require a robust positioning framework to precisely locate objects in indoor and outdoor environments [8,9]. A significant portion of the campus is disorganized, which makes navigation difficult not only for students but also for faculty, staff, and particularly visitors. Students and visitors can use seamless indoor and outdoor positioning systems to navigate the campus. Outdoor location-based services can be easily achieved by employing the Global Positioning System (GPS). However, in an indoor environment, non-line-of-sight communication, complex building structures, and other obstacles degrade the performance of satellite-based positioning. Thus, there is a strong need for reliable indoor positioning systems (IPSs). Indoor environments are usually highly cluttered, with many obstacles causing signal attenuation and creating blind spots that degrade the localization performance of the IPS [10]. Indoor positioning mechanisms identify the geographical coordinates of objects residing inside complex indoor structures. IPSs can be applied to track children and elderly users with wearable devices inside crowded malls and hospitals [11,12]. These positioning systems can also be used for military applications, mass rapid transit systems, or indoor areas where finding the location of a device user is mandatory.

Numerous approaches have been proposed for indoor positioning without the aid of GPS. Such approaches rely on anchor nodes (ANs), which already know their locations; they are either GPS-equipped or manually deployed at points with known locations [13]. These localization techniques can be broadly classified into two categories: range-based and range-free [14]. Range-based localization estimates distances from radio transmissions and uses these distances for position estimation. In contrast, range-free localization algorithms use network connectivity, previous measurements, or other features that do not depend on distance estimation [15]. These IPS frameworks mainly rely on three location-dependent parameters: angle, distance, and strength of the received anchor node signal [16]. Angle-based methods depend on the angle-of-arrival (AoA) of the incoming signals. Time-based methods, including time-of-arrival (ToA) and time-difference-of-arrival (TDoA), depend on source-to-destination signal propagation time. Strength-based methods depend on received signal strength (RSS).

AoA-based localization relies on the azimuth angle relative to the target nodes. AoA-based localization accuracy is highly vulnerable to non-line-of-sight multipath effects, in which reflected signals are received at a wrong azimuth angle and degrade the precision of the AoA method. Moreover, AoA requires complex hardware with proper calibration and synchronization for precise position estimation. The ToA method depends on the propagation time duration of the packet from the anchor node to the target node [17]. ToA-based indoor localization is less complex than the AoA method but is dependent on synchronization of the source and the destination node. The ToA value is extracted from the packet at the destination from the labeled timestamps [16]. Synchronization requires the integration of more hardware in the indoor positioning system, which increases the cost of the IPS. The TDoA framework’s positioning accuracy is also sensitive to synchronization of the anchor and target nodes. In RSS-based indoor positioning, objects are localized by the strength of the signal at the receiving target node [18]. In our work, we choose RSS for position estimation as it is the simplest method and does not require additional hardware for synchronization. The reason for the wide popularity of RSS is that almost all radio-enabled devices can process and display RSS.
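To illustrate why the synchronization requirement of the time-based methods is demanding, the following worked estimate converts a clock offset into a ranging error. The 10 ns offset is a hypothetical value chosen for illustration; the only physical input is the free-space propagation speed $c \approx 3 \times 10^{8}$ m/s.

$$d = c \cdot t, \qquad \Delta d = c \cdot \Delta t \approx (3 \times 10^{8}\ \text{m/s}) \times (10 \times 10^{-9}\ \text{s}) = 3\ \text{m}$$

Under this assumption, a timing offset of only 10 ns already corresponds to a ranging error of about 3 m, which is comparable to the size of a room.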

When it comes to indoor positioning, range-based techniques have a drawback: they are badly affected by multipath fading and shadowing in cluttered indoor environments. Due to these factors, variability in the measured values can cause large estimation errors. In comparison, range-free techniques do not rely on distance-estimation models and are less prone to such errors. For these reasons, range-free techniques are preferred indoors; examples include cell-identity-based, proximity-based, and fingerprinting techniques. Fingerprinting is a range-free localization technique that can be applied in all wireless communication systems. A fingerprint is a set of location-dependent signal parameters, so each location has a unique fingerprint associated with it. It is the most reliable method for localization in RSS-based indoor positioning systems. Fingerprinting comprises two stages: offline training and online testing. In the offline training stage, the RSS values of the anchor nodes are recorded at each reference point (RP), a geographic coordinate in the coverage area. During the online testing stage, the RSS values at unknown coordinates are passed to a robust classifier that estimates the position's coordinates based on the training data [19]. Although fingerprinting-based IPSs are considered reliable, they face some challenges because of variations in the permittivity and permeability of materials in the signal propagation path, which create nonuniform propagation loss. Moreover, the multipath effect in indoor environments also affects the signal strength at target node positions. Various positioning algorithms using a number of classification models have been adopted to estimate precise 2D coordinates. These models categorize the observed training data and assign the test data to the best-matching set. The most reliable classification models include the multi-class support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), and the ensemble classifier. The classification accuracy of these classifiers depends on the RSS collected during the offline training stage. Moreover, the robustness and distinctiveness of the fingerprint data improve the accuracy of the IPS framework. The data collected during offline training, consisting of RSS from each anchor node, is limited because of cost constraints on the access points (APs) within the finite coverage area. The limited fingerprint training data at each coordinate affects the performance of the overall IPS framework. To handle this issue, we adopt a pre-processing Bag-of-Features (BoF) strategy that improves the overall strength and robustness of the training data.
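The two fingerprinting stages described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the radio map values, the three access points, and the function name `locate` are hypothetical, and the online stage uses plain nearest-fingerprint matching rather than the BoF-based classifier proposed in this paper.

```python
import math

# Offline stage: a toy radio map. Keys are RP coordinates (x, y) in metres,
# values are mean RSS readings in dBm from three hypothetical APs.
radio_map = {
    (0.0, 0.0): (-40.0, -70.0, -65.0),
    (2.0, 0.0): (-48.0, -62.0, -66.0),
    (0.0, 2.0): (-47.0, -71.0, -58.0),
    (2.0, 2.0): (-55.0, -60.0, -57.0),
}


def locate(query_rss, fingerprints):
    """Online stage: return the RP whose stored fingerprint is closest
    (in Euclidean distance) to the RSS vector observed by the user."""
    return min(fingerprints, key=lambda rp: math.dist(fingerprints[rp], query_rss))


# A query observed at an unknown position (hypothetical values).
estimate = locate((-49.0, -63.0, -64.0), radio_map)
print(estimate)
```

Storing the radio map as a mapping from coordinates to RSS tuples keeps the label (the RP) and its fingerprint together, which mirrors how the database is described in the text.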

In this paper, the proposed approach transforms the raw data into a high-dimensional form and makes it more compatible with pre-existing classification models. The proposed approach formulates positioning as a pattern-recognition problem, where a feature vector is obtained for each location using a simplified BoF-based technique. It consists of characterizing several WLAN RSS measurements observed at each RP. The BoF model is applied to the raw RSS fingerprint data by applying k-means clustering and collecting the differential vectorization of the vocabulary set occurrences. This is the first time that such a technique has been used in WLAN RSS fingerprinting-based indoor localization. Previously, researchers employed pre-processing methods such as spectrogram transformation [20] and interpolation [21]. However, such methods increase the complexity of the model by increasing the data size. In this work, the proposed approach is tested in different simulation scenarios, and a real-time experiment is also conducted to validate the proposed strategy. The proposed framework is compared with previously reported work, and the overall results demonstrate the superiority of our model.

The rest of the paper is organized as follows. Section 2 briefly reviews the literature relevant to the considered problem. Section 3 describes the system model, and Section 4 presents the proposed approach. Section 5 presents the simulation results and the real-time experiment, and Section 6 concludes the paper.

2. Related Work

The existing literature focuses on efficient indoor positioning techniques that are low cost and accurate in diverse environments. An ideal IPS would work in numerous indoor scenarios. To identify the best indoor localization framework, a brief comparative investigation of the literature is presented. Existing research shows that machine learning models benefit IPS frameworks in terms of both cost and precision. Extensive research on handcrafted and deep learning models has assisted in achieving noise robustness in indoor positioning systems [22]. With the development of advanced computational devices, the employment of deep neural models and advanced machine learning frameworks has become possible [23,24]. Deep neural networks such as AlexNet, DagNet, GoogleNet, ResNet, InceptionV3, VGG-16, MobileNet, and ZFNet require 2D input data, which is not available in the case of fingerprint-based localization [25]. Indoor positioning systems developed for tracking objects broadly fall into two main classes: wireless signal-based and vision-based frameworks. Vision-based frameworks apply computer vision algorithms to the images captured by cameras mounted in indoor environments. Vision-based methods detect the desired object in non-overlapping camera networks of buildings by classifying the robust visual features extracted from the region of interest [26]. Vision-based methodologies are highly precise but computationally expensive. The high frame rate and high resolution of the images require expensive processors and graphics processing units for the recurrent neural networks to recognize and annotate the desired object in the input camera feeds. Moreover, vision-based frameworks also suffer from geometric and photometric variations, including occlusions and illumination and viewpoint variations. Compared to vision-based methods, wireless signal-based methods are cost-effective and require fewer computational resources [27].

Wireless signal-based indoor positioning frameworks rely on both geometric and fingerprinting methods. The methods developed in [28,29] treat indoor localization as a Gaussian and KNN regression problem. Regression-based models have less complexity, but they focus only on classification of pre-existing fingerprint data and ignore robust feature extraction. The work in [30] considers the importance of features to improve the precision of indoor localization by applying continuous wavelet transforms (CWT) to raw RSS fingerprint data. CWT converts the 1D vector RSS data into 2D image data, with which pre-existing deep neural network models are easily compatible. The CWT transformation is an additional stage before feature extraction that improves precision at the cost of increased computational complexity. Other works in the literature use principal component analysis for feature extraction [31,32].

Deep neural network (DNN) frameworks are much more sensitive to the format of input data, whereas RSS fingerprint data is vectorized data with a limited set of values for each geographical coordinate. Therefore, most IPSs incorporate handcrafted statistical models in their machine learning frameworks for indoor localization. Handcrafted statistical models are less complex and accommodate problem-specific modifications. The authors in [33] integrated a statistical hypothesis test on asymptotic relative efficiency (ARE) to optimize signal distribution at the site coverage area. Another work [34] introduces multi-output least square support vector machine (M-LS-SVM) regression to improve classification of RSS fingerprint data. Localization in [35] is achieved by fusion of grid-independent and grid-dependent least-square classifications. The authors of [36] used a neural network-based algorithm to correct the camera tilt angle, and they used neural networks to establish a relationship between LED images and distance. However, the noise generated by reflection, which affects IPS performance, has not been considered. In this work, we focus on the robustness and distinctiveness of RSS fingerprint data with the support of pre-processing to make pre-existing indoor localization frameworks more efficient and accurate. Previously reported frameworks mainly focused on enhancement of RSS fingerprint data by suppressing noise through both pre-processing and post-processing, whereas our proposed approach transforms the raw fingerprint data into a more meaningful form that is robust and leads to highly accurate localization. BoF and Bag-of-Words (BoW) [37] exist for image and document classification, respectively, whereas here a BoF model is implemented and incorporated for RSS-based indoor positioning for the first time.

3. System Model

Fingerprint-based localization systems are divided into two phases: offline and online. Figure 1 depicts the WLAN fingerprinting-based indoor positioning system. A radio map is created offline by dividing the area of interest into grids or RPs. At these RPs, a survey is conducted to collect RSS readings from the accessible APs, and then a database is produced, as illustrated in Figure 1. This database is the radio map, which contains map-like identification features but is based on the RSS of the radio waves. The signature created at each RP serves as the RP’s fingerprint. On the other hand, during the online phase, the user initiates a query from a specific point inside the area of interest. The system uses different matching algorithms to compare the query with the radio map, and then the most comparable fingerprint is returned as the estimated position. The RPs can be mathematically represented as follows

$$RP_{j} = (x, y), \quad j = 1, \ldots, N, \tag{1}$$

where $(x, y)$ is the coordinate point of the RP in the grid-based area, and $N$ is the total number of RPs. The fingerprint database or radio map can be mathematically described as follows:

$$λ = \begin{bmatrix} RP_{1} & (ψ_{1,1}, \ldots, ψ_{M,1}) \\ \vdots & \vdots \\ RP_{N} & (ψ_{1,N}, \ldots, ψ_{M,N}) \end{bmatrix} \tag{2}$$

where $ψ_{i,j}$ refers to the RSS measured from the $i$th AP at the $j$th RP, and $M$ is the total number of APs.

To simulate signal transmission over the channel between the AP and RP, a log-normal path loss model is employed. This model can be used for a wide range of environments and considers the random shadowing effects caused by different types of obstacles causing signal blockage. Shadowing effects cannot be ignored when modeling real environments. In indoor situations, path loss is affected by a variety of parameters, including distance ($D$), noise ($ζ$), physical barriers ($γ$), and human presence ($ρ$). Each barrier, whether a wall or a human, must have the resulting attenuation represented in the model. As a result, we use the extended log-normal path loss model in our simulations [38].

$$P(d) = P(d_{0}) + 10 \cdot N \cdot \log(d / d_{0}) + ζ + \sum_{ι=1}^{υ} γ_{ι} + \sum_{κ=1}^{Υ} ρ_{κ} \tag{3}$$

where $P(d)$ denotes the RSS at point $d$ in the $(x, y)$ coordinate system, $P(d_{0})$ denotes the RSS at the reference distance (1 m), $N$ denotes the path loss coefficient, $ζ$ denotes the shadowing effect, and $d$ denotes the distance between AP and RP. In the summations for the wall and human attenuation factors, $ι$ indexes the physical barriers (walls in particular), $υ$ represents the total number of barriers, $κ$ indexes the humans in the path, and $Υ$ is the total number of humans through which the signal is attenuated. The RSS values fluctuate over time due to many factors that contribute to signal fading. To counter temporal variations, RSS readings are taken over a period of time, which is defined mathematically by

$$RSS(i, j) = (S_{i,j}(τ_{1}), \ldots, S_{i,j}(τ_{Γ})) \tag{4}$$

$$ψ_{i,j} = \frac{\sum_{τ=1}^{Γ} S_{i,j}(τ)}{Γ} \tag{5}$$

where $i$ indexes the AP, $j$ indexes the RP, $S_{i,j}(τ)$ is the RSS sample collected at time instant $τ$, and $Γ$ is the total number of collected RSS samples. The average value of the RSS samples is used in the fingerprint database.
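The simulation and averaging steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' MATLAB code: the logarithm is taken as base 10, the shadowing term is drawn as zero-mean Gaussian noise in dB, and all numeric values and function names are hypothetical. The terms are added exactly as written in Equation (3), so the returned value grows with distance and with each obstacle; converting it to a received power in dBm would require a sign convention and a transmit power that are not shown here.

```python
import math
import random


def path_loss_sample(d, p_d0, n, sigma, wall_losses, human_losses, rng, d0=1.0):
    """Evaluate the additive form of Equation (3) for one sample.

    d            : AP-to-RP distance in metres
    p_d0         : value at the reference distance d0 (1 m)
    n            : path loss coefficient
    sigma        : standard deviation of the shadowing term in dB (assumption)
    wall_losses  : attenuation of each wall on the path, in dB
    human_losses : attenuation of each person on the path, in dB
    """
    shadowing = rng.gauss(0.0, sigma)
    return (p_d0 + 10.0 * n * math.log10(d / d0) + shadowing
            + sum(wall_losses) + sum(human_losses))


def mean_rss(samples):
    """Equation (5): average the samples collected over time."""
    return sum(samples) / len(samples)


rng = random.Random(42)

# Hypothetical link: 40 dB at 1 m, coefficient 3, two 3 dB walls, one 2 dB person.
noiseless = path_loss_sample(10.0, 40.0, 3.0, 0.0, [3.0, 3.0], [2.0], rng)

# 20 noisy samples at the same RP, then the averaged fingerprint entry.
samples = [path_loss_sample(10.0, 40.0, 3.0, 2.0, [3.0, 3.0], [2.0], rng)
           for _ in range(20)]
psi = mean_rss(samples)
print(noiseless, psi)
```

Averaging over the collected samples reduces the influence of the random shadowing term, which is the motivation the text gives for Equation (5).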

4. Methodology

In the proposed framework, a BoF model is introduced for pre-processing to achieve noise robustness and distinctiveness in the feature accumulation process of indoor positioning based on RSS fingerprinting. The proposed BoF approach employs RSS fingerprint training data for vocabulary generation based on clustering. The vocabulary here is used to create the feature vectors employed both in training and testing of the classification model. Figure 2 shows how the proposed approach is different from conventional fingerprinting algorithms. RSS fingerprints consist of the coordinates of the RP and the raw RSS values corresponding to each AP. The BoF framework considers the geographical coordinates as labels but the RSS from each AP as raw features.

$$X = \begin{bmatrix} RP_{1} \\ \vdots \\ RP_{N} \end{bmatrix} \tag{6}$$

$$Y = \begin{bmatrix} (ψ_{1,1}, \ldots, ψ_{M,1}) \\ \vdots \\ (ψ_{1,N}, \ldots, ψ_{M,N}) \end{bmatrix} \tag{7}$$

where $X$ in Equation (6) represents the location coordinates of the RPs, while $ψ_{i,j}$ in Equation (7) denotes the RSS value of the $i$th AP at the $j$th RP. The BoF generates clusters by employing the raw feature set $Y$ and records the cluster centers as vocabulary features.

$$L = \sum_{v=1}^{k} \sum_{i=1}^{n} a_{\langle i,v \rangle} \left\{ \lVert ψ_{i,j} - μ_{i}^{v} \rVert^{2} \right\}_{i=1}^{M}, \qquad a_{\langle i,v \rangle} = \begin{cases} 1, & \text{if } ψ_{i,j} \text{ belongs to } v \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

where $L$ in Equation (8) denotes the cost function, and $μ_{i}^{v}$ is the $v$th cluster center of the $i$th AP RSS. The variable $μ$ denotes the cluster mean value, which updates with each iteration. The coefficient $a$ in Equation (8) defines the class of each $ψ$ vector associated with each grid point. The variable $v$ denotes the cluster class label in the group of $k$ classes. The k-means clustering minimizes $L$ with respect to $a$ and $μ$, as given below:

(a): Initialize $μ_{i}^{1}$ to $μ_{i}^{k}$ arbitrarily;
(b): Choose the optimal a for a fixed $μ$ ;
(c): Choose the optimal $μ$ for a fixed a;
(d): Repeat steps (b) and (c) until convergence.
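Steps (a) to (d) above can be sketched for the RSS values of a single AP as follows. This is a minimal sketch of the standard alternating (Lloyd-style) k-means procedure, assuming one-dimensional clustering per AP as suggested by the definition of $μ_{i}^{v}$; the function name `kmeans_1d`, the iteration cap, and the sample values are hypothetical.

```python
import random


def kmeans_1d(values, k, iters=100, seed=0):
    """Cluster scalar RSS values of one AP into k groups and return
    the cluster centers (the vocabulary for that AP), sorted ascending."""
    rng = random.Random(seed)
    # Step (a): pick k of the observed values as arbitrary initial centers.
    centers = rng.sample(values, k)
    for _ in range(iters):
        # Step (b): for fixed centers, assign each value to its nearest center.
        labels = [min(range(k), key=lambda v: (x - centers[v]) ** 2)
                  for x in values]
        # Step (c): for fixed assignments, move each center to the mean
        # of its members (an empty cluster keeps its previous center).
        new_centers = []
        for v in range(k):
            members = [x for x, lab in zip(values, labels) if lab == v]
            new_centers.append(sum(members) / len(members) if members else centers[v])
        # Step (d): stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)


# Hypothetical RSS readings (dBm) of one AP observed at several RPs.
rss_ap1 = [-80.0, -79.0, -81.0, -50.0, -51.0, -49.0]
vocabulary_ap1 = kmeans_1d(rss_ap1, k=2)
print(vocabulary_ap1)
```

Each pass through steps (b) and (c) cannot increase the cost $L$, which is why the loop can simply stop when the centers stay unchanged.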

The BoF model vocabulary set $μ_{i}^{v}$, with $v$ ranging over $[1, K]$, has a total of $K$ cluster centers and is used to create a final feature vector associated with each RP of the known training data and the required test data. The final training data $T$, consisting of the final features $\tilde{Y}$ and location labels $X$, is used to estimate the location based on the test RSS values at an unknown location in the serving region.

In Equation (9), the final feature vector $\tilde{Y}$ is calculated from the raw RSS data $Y$ for each reference point based on the available vocabulary set $μ_{i}^{v}$. The mathematical form is given by

$$\tilde{Y}_{j} = \left\{ \left\{ ψ_{i,j} - μ_{i}^{v} \right\}_{v=1}^{K} \right\}_{i=1}^{M}. \tag{9}$$

The training data, consisting of the final feature vectors and their associated labels $\langle \tilde{Y}, X \rangle$, is used to train the classifier to estimate location coordinates based on the test data. The BoF model extracts feature vectors from the raw data that remain robust and distinct in the presence of both spatial and spectral noise. The test feature vectors are classified with the trained classifier, and the estimated location is compared with the true location to calculate the mean indoor positioning error.
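The feature construction of Equation (9) can be sketched as below. The sketch follows the difference form of the equation literally, assuming one list of $K$ cluster centers per AP; the function name `bof_features` and all numeric values are hypothetical.

```python
def bof_features(rss_vector, vocabulary):
    """Build the final feature vector for one RP (or one test sample).

    rss_vector : RSS value of each of the M APs at this point
    vocabulary : for each AP, the list of its K cluster centers
    Returns a flat list of M * K differences (RSS minus cluster center).
    """
    return [rss - center
            for rss, centers in zip(rss_vector, vocabulary)
            for center in centers]


# Hypothetical vocabulary for M = 2 APs with K = 2 centers each (dBm).
vocabulary = [[-80.0, -50.0], [-70.0, -45.0]]

# Raw fingerprint of one RP and the resulting 4-dimensional feature vector.
features = bof_features([-60.0, -50.0], vocabulary)
print(features)
```

The same function is applied to training and test samples, so both end up in the same $M \times K$-dimensional feature space before classification.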

The test features of the BoF model are classified with a distance-based classifier. KNN classification provides better precision when integrated with the proposed BoF model. The KNN method identifies the $k$ training feature vectors in $\tilde{Y}$ with the lowest Euclidean distance $d$ to the test feature. For a test feature $ξ$ in the $R^{M}$ vector space, the Euclidean distance is given in Equation (10).

$$d_{i} = \sqrt{\sum_{j=1}^{M} (ξ_{j} - ψ_{i,j})^{2}} \tag{10}$$

where $d_{i}$ in Equation (10) represents the distance between the test feature $ξ$ and the $i$th feature vector of $\tilde{Y}$ from the training data, with $j$ running over the vector components. The test feature $ξ$ is assigned the label chosen by majority vote among its k-nearest training features $N_{k}(\tilde{Y})$, as shown in Equation (11). Here, $N_{k}(\tilde{Y})$ denotes the k-nearest features of the test feature $ξ$ in the training set $\tilde{Y}$. Different values of $k$ are used for the KNN algorithm in the literature [39,40]. Usually, the value of $k$ is between 3 and 5; for larger values, the accuracy of the system degrades, as outliers are also included as neighbors.

Fractional probability $p$ is used to assign class label $X$ to test feature $ξ$.

$$p(X_{i} / ξ) = \frac{1}{k} \sum_{i \in N_{k}(\tilde{Y})} I(X = i) \tag{11}$$

The class label, i.e., the RP coordinates, is assigned to test feature $ξ$ on the basis of the fractional probability given in Equation (11), where $I(\cdot)$ is the indicator function. The class label for which $p(X_{i} / ξ)$ is maximal is assigned to each test feature during the experiment. The KNN classification model has low complexity; hence, it is integrated with the proposed pre-processing approach. The step-by-step functionality of the proposed approach is shown in Algorithm 1 and Figure 3.
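The KNN decision rule described above can be sketched as follows. This is a generic illustration of Euclidean-distance KNN with majority voting, not a reproduction of Algorithm 1; the function name `knn_predict` and the toy feature vectors and labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(test_feature, train_features, train_labels, k=3):
    """Return the majority label among the k nearest training features
    and the fraction of those neighbours that voted for it."""
    # Rank training samples by Euclidean distance to the test feature.
    order = sorted(range(len(train_features)),
                   key=lambda i: math.dist(test_feature, train_features[i]))
    # Count the labels of the k nearest samples (the neighbourhood N_k).
    votes = Counter(train_labels[i] for i in order[:k])
    label, count = votes.most_common(1)[0]
    return label, count / k


# Hypothetical feature vectors; labels are RP coordinates in metres.
train_features = [(20.0, -10.0), (21.0, -9.0), (19.0, -11.0),
                  (5.0, -25.0), (6.0, -24.0)]
train_labels = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (2.0, 0.0), (2.0, 0.0)]

label, probability = knn_predict((11.0, -19.0), train_features, train_labels, k=3)
print(label, probability)
```

The returned fraction plays the role of the fractional probability in Equation (11): two of the three nearest neighbours in this toy example carry the winning label.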

Algorithm 1: Syntax of the proposed BoF indoor positioning model.

5. Performance Evaluation

In this section, we present a comprehensive performance evaluation of the positioning accuracy of the proposed BoF-based approach. The experimental results of the proposed BoF indoor positioning model are validated on both simulated virtual testbeds and a real-time testbed. We choose two simulation scenarios from our previous work [41,42], and a personal testbed was set up consisting of two apartments in a residential building. All approaches are run in MATLAB R2018a installed on a Dell laptop equipped with an Intel Core i7 processor and 8 GB RAM. Classification of raw RSS data features is carried out through KNN [39,43], probabilistic [44], SVM [45], DT [46], ensemble learning-based [47], and discriminant analysis classifier (DAC) [48] models, and comparisons are made with the proposed BoF-enabled approach. These are the most common machine learning models used in RSS fingerprinting systems [49]. The KNN method estimates labels based on the k neighboring samples in the data. SVM separates test data into two categories based on a separating hyperplane. DAC classifies raw RSS data based on a Gaussian distribution. The probabilistic model classifies test data based on the likelihood of the training data samples. Ensemble learning uses multiple decision tree models and estimates the test sample class based on majority voting. However, the best results are obtained by integrating KNN classification with the proposed BoF feature extraction approach.

To calculate the positioning error, the Euclidean distance is calculated between the estimated position $P_{Est}$ and the actual position $P_{Test}$ of the test points (TPs). The overall localization accuracy of the system can be given by two performance metrics: mean absolute error (MAE) and cumulative distribution function (CDF) [40]. Let $N_{TP}$ be the number of TPs; MAE is defined as

$$MAE = \frac{1}{N_{TP}} \sum_{α=1}^{N_{TP}} \sqrt{(P_{Test} - P_{Est})^{2}} \tag{12}$$

The CDF plot shows the probability that the positioning error (PE) is less than or equal to a given distance. It shows the spread of the positioning errors of the TPs and allows the proposed approach to be compared with the reported methods mentioned above.
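Both metrics can be computed from the per-TP positioning errors, as the following sketch shows. The test point coordinates and function names are hypothetical; the MAE follows Equation (12), and the CDF is the empirical fraction of TPs whose error does not exceed a chosen distance.

```python
import math


def positioning_errors(true_points, estimated_points):
    """Euclidean distance between actual and estimated position of each TP."""
    return [math.dist(t, e) for t, e in zip(true_points, estimated_points)]


def mean_absolute_error(errors):
    """Equation (12): average positioning error over all TPs."""
    return sum(errors) / len(errors)


def empirical_cdf(errors, distance):
    """Fraction of TPs whose positioning error is at most `distance`."""
    return sum(e <= distance for e in errors) / len(errors)


# Hypothetical actual and estimated TP coordinates in metres.
actual = [(0.0, 0.0), (2.0, 2.0), (4.0, 0.0), (6.0, 4.0)]
estimated = [(0.0, 0.0), (2.0, 4.0), (4.0, 0.0), (3.0, 0.0)]

errors = positioning_errors(actual, estimated)
mae = mean_absolute_error(errors)
cdf_at_2m = empirical_cdf(errors, 2.0)
print(errors, mae, cdf_at_2m)
```

Evaluating `empirical_cdf` over a range of distances yields the points of a CDF curve of the kind used to compare the methods.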

5.1. Virtual Environments

The model is validated in a virtual environment including walls and humans. The simulation environments resemble typical application scenes with differently sized rooms and corridors. The first virtual environment has the dimensions of 30 m × 30 m with four APs (red squares), while the second simulation scenario has the dimensions of 50 m × 50 m with five APs (red squares), as shown in Figure 4 and Figure 5, respectively. Simulations are performed to test the proposed algorithm in realistic and complex indoor environments; both scenarios include multiple rooms, corridors, and halls to match a typical university setting. Random human presence has also been included in both scenarios. In the first scenario, each installed WLAN AP covers the complete area of interest; some areas may have low signal quality. The second scenario considers a large indoor environment, where each AP cannot provide coverage to all RPs. Their coverage is restricted to a certain number of RPs where they can provide localization services. Hence, only a set of installed APs participate in localization. Both simulation scenarios have a grid spacing of 2 m. The RSS values to each RP have been assigned by the extended path loss model with human [50] and wall attenuation factors [51], and 20 samples are observed at each RP from all APs. The WLAN AP configurations for both simulation scenarios can be seen in Table 1, and the simulation parameters are listed in Table 2.

In the offline stage, BoF features are generated using a cluster size of 2, i.e., two cluster centers per AP, which leads to a feature dimension of 10 variables for the considered environment of 5 APs. In the online stage, RSS values from the available WLAN APs are first translated through BoF, and then the KNN classification algorithm is applied to estimate the indoor location of the test device. For comparison, we employ simple KNN, probabilistic, SVM, DT, ensemble learning-based classification, and DAC; the proposed BoF-assisted KNN outperforms all of them.
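The stated feature dimension can be checked with a short calculation, assuming that a cluster size of 2 means $K = 2$ cluster centers per AP and that each AP contributes one difference per cluster center, as in Equation (9):

$$\dim(\tilde{Y}) = M \times K = 5 \times 2 = 10$$

Under the same assumption, an environment with $M = 4$ APs would give $4 \times 2 = 8$ features per fingerprint.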

The virtual 30 m × 30 m environment is tested using the BoF-based indoor positioning approach. The proposed BoF approach provides an MAE of 1.702 m, which is lower than that of the other models. The MAEs of KNN, probabilistic, SVM, DT, ensemble learning-based classification, and DAC are shown in Table 3 and are higher than our proposed BoF approach. The CDF plot of this simulated environment is shown in Figure 6.

In addition, the proposed BoF approach is validated on a virtual environment of dimensions 50 m × 50 m, resulting in an MAE of 2.837 m, which is approximately 1.1 m less than the second-best KNN classification model. The MAEs of the other methods are shown in Table 4. Moreover, the CDF plot of this environment is shown in Figure 7. It can be noted that the CDF plot shows our model outperforms previously reported methods.

5.2. Real-Time Testbed Experiment

All experiments are performed inside a residential building in two adjacent apartments. Each apartment has a living room and a bedroom. The floor map of the area can be seen in Figure 8. This area is divided into two regions, namely, Region A and Region B, with dimensions of 3.95 m × 11.1 m and 8.4 m × 9.3 m, respectively. In this environment, we label a total of 30 RPs for the site survey, of which 10 RPs are in Region A and 20 RPs are in Region B, with a grid spacing of two meters, as shown in Figure 9. This experimental environment is not crowded and remains the same throughout the day, with no significant change. A Huawei smartphone (model: KIW-L21) with an Android application installed [52] is used to collect Wi-Fi RSS data from five TPlink A1200 APs (denoted AP1 to AP5) for 25 s at each RP, yielding 20 RSS samples. The details of this environmental setup, along with photos of the hardware devices used for the real-time experiment, can be seen in Figure 10. In a similar way, 10 TPs are also labeled for testing, where 3 TPs belong to Region A, and 7 TPs belong to Region B. The MAE obtained by BoF in the real environment is 1.581 m, which is 0.19 m lower than that of the KNN model. The MAEs of KNN, probabilistic, SVM, DT, ensemble learning-based classification, DAC, and the proposed model are given in Table 5, and the CDF plot can be seen in Figure 11.

6. Conclusions

The indoor positioning framework presented in this article enhances the reliability of positioning accuracy. The proposed BoF model transforms the raw RSS of the access points into robust and distinctive features with reduced localization error. Experimental validation of the proposed BoF integrated with KNN tested on both virtual and real-time testbeds shows promising performance. The proposed approach scores 1.702 m, 2.837 m, and 1.581 m mean absolute error on the simulated 30 m × 30 m and 50 m × 50 m and the real-time residential apartment environment, respectively, indicating lower error than other methods. Moreover, the CDF graphs clearly show the performance of our proposed approach remains more robust and distinct than state-of-the-art models, even in the presence of environmental artifacts. Machine-learning based pre-processing integrated with the simplest of classifiers can outperform conventional classification models and overcome their limitations. In future work, we will explore other ML approaches that can be integrated into the conventional IPS framework to enhance performance in complex indoor scenarios with limited training data.

Author Contributions

Conceptualization, S.B.A.K., M.M.N., M.A.E. and M.J.; Data curation, S.B.A.K., M.M.N. and M.A.E.; Formal analysis, S.B.A.K. and M.M.N.; Funding acquisition, M.M.N. and H.M.; Investigation, F. and M.M.N.; Methodology, S.B.A.K., F. and M.M.N.; Project administration, M.M.N.; Software, S.B.A.K.; Supervision, M.M.N. and M.J.; Validation, M.A.E. and H.M.; Visualization, S.B.A.K. and H.M.; Writing—original draft, S.B.A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R137), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to acknowledge Prince Sultan University and the Smart Systems Engineering Lab for their valuable support. The authors would also like to acknowledge the support of Prince Sultan University in covering the Article Processing Charges (APC) of this publication.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Jabbar, W.A.; Wei, C.W.; Azmi, N.A.; Haironnazli, N.A. An IoT Raspberry Pi-based parking management system for smart campus. Internet Things 2021, 14, 100387.
2. Li, J.; Shen, Y.; Dai, W.; Fan, B. Design of student management system based on smart campus and wearable devices. In Artificial Intelligence in Education: Emerging Technologies, Models and Applications; Springer: Singapore, 2022; pp. 141–150.
3. Haq, I.U.; Anwar, A.; Rehman, I.U.; Asif, W.; Sobnath, D.; Sherazi, H.H.; Nasralla, M.M. Dynamic Group Formation With Intelligent Tutor Collaborative Learning: A Novel Approach for Next Generation Collaboration. IEEE Access 2021, 9, 143406–143422.
4. Nasralla, M.M.; Al-Shattarat, B.; Almakhles, D.J.; Abdelhadi, A.; Abowardah, E.S. Futuristic Trends and Innovations for Examining the Performance of Course Learning Outcomes Using the Rasch Analytical Model. Electronics 2021, 10, 727.
5. Rosy, J.A.; Juliet, S. An enhanced intelligent attendance management system for smart campus. In Proceedings of the 6th International Conference on Computing Methodologies and Communication (ICCMC), Erode, India, 29–31 March 2022; pp. 1587–1591.
6. Martini, M.G.; Hewage, C.T.; Nasralla, M.M.; Ognenoski, O. QoE control, monitoring, and management strategies. In Multimedia Quality of Experience (QoE); Wiley: Hoboken, NJ, USA, 2016; pp. 149–168.
7. Valks, B.; Arkesteijn, M.H.; Koutamanis, A.; den Heijer, A.C. Towards a smart campus: Supporting campus decisions with Internet of Things applications. Build. Res. Inf. 2021, 49, 1–20.
8. Aspilcueta Narvaez, A.; Núñez Fernández, D.; Gamarra Quispe, S.; Lazo Ochoa, D. Smart campus IoT guidance system for visitors based on bayesian filters. In Proceedings of the 5th Brazilian Technology Symposium; Springer: Cham, Switzerland, 2021; pp. 463–472.
9. Petcovici, A.; Stroulia, E. Location-based services on a smart campus: A system and a study. In Proceedings of the 2016 IEEE 3rd World Forum on Internet of Things (WF-IOT), Reston, VA, USA, 12–14 December 2016; pp. 94–99.
10. Liu, H.; Darabi, H.; Banerjee, P.; Liu, J. Survey of wireless indoor positioning techniques and systems. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2007, 37, 1067–1080.
11. Tegou, T.; Kalamaras, I.; Tsipouras, M.; Giannakeas, N.; Votis, K.; Tzovaras, D. A Low-Cost Indoor Activity Monitoring System for Detecting Frailty in Older Adults. Sensors 2019, 19, 452.
12. Sobnath, D.; Rehman, I.U.; Nasralla, M.M. Smart cities to improve mobility and quality of life of the visually impaired. In Technological Trends in Improved Mobility of the Visually Impaired; Springer: Cham, Switzerland, 2020; pp. 3–28.
13. Khattak, S.B.A.; Jia, M.; Marey, M.; Guo, Q.; Gu, X. A Novel Single Anchor Localization Method for Wireless Sensors in 5G Satellite-Terrestrial Network. Alex. Eng. J. 2021, 61, 5595–5606.
14. Shit, R.C.; Sharma, S.; Puthal, D.; Zomaya, A.Y. Location of Things (LoT): A review and taxonomy of sensors localization in IoT infrastructure. IEEE Commun. Surv. Tutor. 2018, 20, 2028–2061.
15. Nagah Amr, M.; ELAttar, H.M.; Abd El Azeem, M.H.; El Badawy, H. An Enhanced Indoor Positioning Technique Based on a Novel Received Signal Strength Indicator Distance Prediction and Correction Model. Sensors 2021, 21, 719.
16. Zafari, F.; Gkelias, A.; Leung, K.K. A survey of indoor localization systems and technologies. IEEE Commun. Surv. Tutor. 2019, 21, 2568–2599.
17. Sheikh, S.M.; Asif, H.M.; Raahemifar, K.; Al-Turjman, F. Time difference of arrival based indoor positioning system using visible light communication. IEEE Access 2021, 9, 52113–52124.
18. Mazuelas, S.; Bahillo, A.; Lorenzo, R.M.; Fernandez, P.; Lago, F.A.; Garcia, E.; Blas, J.; Abril, E.J. Robust indoor positioning provided by real-time RSSI values in unmodified WLAN networks. IEEE J. Sel. Top. Signal Process. 2009, 3, 821–831.
19. Pinto, B.; Barreto, R.; Souto, E.; Oliveira, H. Robust RSSI-Based Indoor Positioning System Using K-Means Clustering and Bayesian Estimation. IEEE Sens. J. 2021, 21, 24462–24470.
20. Alhomayani, F.; Mahoor, M.H. Deep learning methods for fingerprint-based indoor positioning: A review. J. Locat. Based Serv. 2020, 14, 129–200.
21. Zhao, Y. An Improved Indoor Positioning Method Based on Nearest Neighbor Interpolation. Netw. Commun. Technol. 2021, 6, 1.
22. Sinha, R.S.; Hwang, S.-H. Comparison of CNN Applications for RSSI-Based Fingerprint Indoor Localization. Electronics 2019, 8, 989.
23. Adege, A.B.; Lin, H.-P.; Tarekegn, G.B.; Jeng, S.-S. Applying Deep Neural Network (DNN) for Robust Indoor Localization in Multi-Building Environment. Appl. Sci. 2018, 8, 1062.
24. Maduranga, M.W.; Abeysekara, R. Supervised machine learning for RSSI based indoor localization in IoT applications. Int. J. Comput. Appl. 2021, 183, 26–32.
25. Song, J.; Patel, M.; Ghaffari, M. Fusing Convolutional Neural Network and Geometric Constraint for Image-based Indoor Localization. IEEE Robot. Autom. Lett. 2022, 7, 1674–1681.
26. Fawad; Khan, M.J.; Rahman, M. Person Re-Identification by Discriminative Local Features of Overlapping Stripes. Symmetry 2020, 12, 647.
27. Poulose, A.; Han, D.S. Hybrid Deep Learning Model Based Indoor Positioning Using Wi-Fi RSSI Heat Maps for Autonomous Applications. Electronics 2021, 10, 2.
28. Sun, W.; Xue, M.; Yu, H.; Tang, H.; Lin, A. Augmentation of fingerprints for indoor WiFi localization based on Gaussian process regression. IEEE Trans. Veh. Technol. 2018, 67, 10896–10905.
29. Li, D.; Zhang, B.; Li, C. A feature-scaling-based k-nearest neighbor algorithm for indoor positioning systems. IEEE Internet Things J. 2015, 3, 590–597.
30. Ssekidde, P.; Steven Eyobu, O.; Han, D.S.; Oyana, T.J. Augmented CWT Features for Deep Learning-Based Indoor Localization Using WiFi RSSI Data. Appl. Sci. 2021, 11, 1806.
31. Yoo, J.; Park, J. Indoor Localization Based on Wi-Fi Received Signal Strength Indicators: Feature Extraction, Mobile Fingerprinting, and Trajectory Learning. Appl. Sci. 2019, 9, 3930.
32. Jiang, J.-R.; Subakti, H.; Liang, H.-S. Fingerprint Feature Extraction for Indoor Localization. Sensors 2021, 21, 5434.
33. Zhou, M.; Li, Y.; Tahir, M.J.; Geng, X.; Wang, Y.; He, W. Integrated statistical test of signal distributions and access point contributions for Wi-Fi indoor localization. IEEE Trans. Veh. Technol. 2021, 70, 5057–5070.
34. Christy Jeba Malar, A.; Deva Priya, M.; Femila, F.; Peter, S.S.; Ravi, V. Wi-Fi fingerprint localization based on multi-output least square support vector regression. In Intelligent Systems; Springer: Singapore, 2021; pp. 561–572.
35. Guo, X.; Shao, S.; Ansari, N.; Khreishah, A. Indoor localization using visible light via fusion of multiple classifiers. IEEE Photonics J. 2017, 9, 1–6.
36. Yuan, T.; Xu, Y.; Wang, Y.; Han, P.; Chen, J. A tilt receiver correction method for visible light positioning using machine learning method. IEEE Photonics J. 2018, 10, 7909312.
37. Deselaers, T.; Pimenidis, L.; Ney, H. Bag-of-visual-words models for adult image classification and filtering. In Proceedings of the 19th International Conference on Pattern Recognition, Tampa, FL, USA, 8–11 December 2008; pp. 1–4.
38. Alshami, I.H.; Ahmad, N.A.; Sahibuddin, S.; Firdaus, F. Adaptive Indoor Positioning Model Based on WLAN-Fingerprinting for Dynamic and Multi-Floor Environments. Sensors 2017, 17, 1789.
39. Bahl, P.; Padmanabhan, V.N. RADAR: An in-building RF-based user location and tracking system. In Proceedings of the IEEE 19th Annual Joint Conference of the IEEE Computer and Communications Societies, Tel Aviv, Israel, 26–30 March 2000; Volume 2, pp. 775–784.
40. Khalajmehrabadi, A.; Gatsis, N.; Akopian, D. Modern WLAN fingerprinting indoor positioning methods and deployment challenges. IEEE Commun. Surv. Tutor. 2017, 19, 1974–2002.
41. Khattak, S.B.; Jia, M.; Guo, Q.; Gu, X. Improving Positioning Accuracy Using WLAN Optimization for Location Based Services and Cognitive Radio Networks. In Proceedings of the International Conference on Wireless and Satellite Systems, Online, 31 July–2 August 2021; pp. 607–621.
42. Jia, M.; Khattak, S.B.; Guo, Q.; Gu, X.; Lin, Y. Access point optimization for reliable indoor localization systems. IEEE Trans. Reliab. 2019, 69, 1424–1436.
43. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
44. Youssef, M.; Agrawala, A. The Horus WLAN location determination system. In Proceedings of the 3rd International Conference on Mobile Systems, Applications, and Services, Seattle, WA, USA, 6–8 June 2005; pp. 205–218.
45. Cristianini, N.; Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods; Cambridge University Press: Cambridge, UK, 2000.
46. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Routledge: London, UK, 2017.
47. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
48. Huberty, C.J. Discriminant analysis. Rev. Educ. Res. 1975, 45, 543–598.
49. Polak, L.; Rozum, S.; Slanina, M.; Bravenec, T.; Fryza, T.; Pikrakis, A. Received Signal Strength Fingerprinting-Based Indoor Location Estimation Employing Machine Learning. Sensors 2021, 21, 4605.
50. Alshami, I.H.; Ahmad, N.A.; Sahibuddin, S. People’s presence effect on WLAN-based IPS accuracy. J. Teknol. 2015, 77, 173–178.
51. Faria, D.B. Modeling Signal Attenuation in IEEE 802.11 Wireless LANs; Computer Science Department, Stanford University: Stanford, CA, USA, 2005; Volume 1.
52. Wifi Fingerprint. Available online: https://play.google.com/store/apps/details?id=com.elearnna.www.wififingerprint&hl=en&gl=US (accessed on 30 April 2022).

Figure 1. WLAN fingerprinting-based indoor positioning system.

Figure 2. Flowchart of conventional fingerprinting-based IPS and proposed BoF-assisted approach.

Figure 3. Step-by-step functionality of the proposed BoF-assisted approach.

Figure 4. Floor map with 30 m by 30 m dimensions.

Figure 5. Floor map with 50 m by 50 m dimensions.

Figure 6. CDF plot for floor with 30 m × 30 m dimensions.

Figure 7. CDF plot for floor with 50 m by 50 m dimensions.

Figure 8. Experimental area floor map.

Figure 9. Experimental area radio map construction.

Figure 10. Basic elements of the fingerprinting experiment. (a) Grid mark; (b) APs used in the experiment; (c) Fingerprint utility home screen; (d) Fingerprint utility RSS sample collection.

Figure 11. CDF plot for real-time experiment.

Table 1. WLAN access point configurations.

Scenario	Dimensions	APs	Coordinates of APs
Scenario 1	30 m × 30 m	4 APs	(25, 25) (19, 23) (13, 27) (11, 7)
Scenario 2	50 m × 50 m	5 APs	(25, 25) (37, 25) (25, 19) (28, 13) (16, 25)

Table 2. Parameters used in simulation.

Parameter	Value
Path loss exponent n	3
Number of APs	4 and 5
Wall attenuation factor	4 dB
People attenuation factor	3 dB
Reference distance $d_{0}$	1 m
Power at $d_{0}$	−30 dBm
Transmission power	10 dBm
RSS samples collected at RP	20
k in KNN	4
Grid size	2 m × 2 m
No. of position queries in virtual environments	100
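
To show how the Table 2 parameters could combine into a simulated RSS value, the following sketch uses a log-distance path-loss model with per-wall and per-person losses. This is an assumption about the form of the propagation model: the function name, the obstacle counts, and the RP position are hypothetical, and −30 dBm is taken as the received power at the reference distance.

```python
import math

# Values taken from Table 2.
PATH_LOSS_EXPONENT = 3.0
REF_DISTANCE_M = 1.0
POWER_AT_REF_DBM = -30.0
WALL_ATTENUATION_DB = 4.0
PEOPLE_ATTENUATION_DB = 3.0

def expected_rss_dbm(rp_xy, ap_xy, walls=0, people=0):
    """Mean RSS (dBm) at rp_xy from an AP at ap_xy.

    Assumption: obstacle losses are subtracted once per wall and per person;
    the propagation model used for the simulations may differ.
    """
    # Clamp to the reference distance so the log term is never negative.
    d = max(math.dist(rp_xy, ap_xy), REF_DISTANCE_M)
    path_loss = 10.0 * PATH_LOSS_EXPONENT * math.log10(d / REF_DISTANCE_M)
    return (POWER_AT_REF_DBM - path_loss
            - walls * WALL_ATTENUATION_DB
            - people * PEOPLE_ATTENUATION_DB)

# AP at (25, 25) as in Scenario 1 of Table 1; the RP position is hypothetical.
rss_clear = expected_rss_dbm((15.0, 25.0), (25.0, 25.0))
rss_blocked = expected_rss_dbm((15.0, 25.0), (25.0, 25.0), walls=2, people=1)
```

At 10 m with n = 3 the distance term contributes 30 dB, so the unobstructed value is −60 dBm, and two walls plus one person lower it to −71 dBm. Adding zero-mean noise to such values would give the repeated RSS samples per RP.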

Table 3. Mean absolute error (m) for the 30 m by 30 m floor.

KNN	Probabilistic	SVM	Discriminant Analysis	Decision Tree	Ensemble Learning	Bag-of-Features
1.922	1.841	2.617	3.729	3.438	3.005	1.702

Table 4. Mean absolute error (m) for the 50 m by 50 m floor.

KNN	Probabilistic	SVM	Discriminant Analysis	Decision Tree	Ensemble Learning	Bag-of-Features
3.929	4.447	5.480	5.298	6.531	5.820	2.837

Table 5. Mean absolute error (m) in the real environment.

KNN	Probabilistic	SVM	Discriminant Analysis	Decision Tree	Ensemble Learning	Bag-of-Features
1.772	2.450	2.017	2.029	3.053	4.678	1.581
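
The gap between BoF and the plain KNN baseline across Tables 3 to 5 can be made explicit with a short calculation. The MAE values are copied from the tables; the dictionary keys and the helper name are hypothetical.

```python
# MAE values (m) copied from Tables 3, 4, and 5.
mae_knn = {"30 m x 30 m": 1.922, "50 m x 50 m": 3.929, "real": 1.772}
mae_bof = {"30 m x 30 m": 1.702, "50 m x 50 m": 2.837, "real": 1.581}

def reduction(baseline, proposed):
    """Absolute (m) and relative (%) reduction of proposed versus baseline."""
    absolute = baseline - proposed
    return absolute, 100.0 * absolute / baseline

summary = {env: reduction(mae_knn[env], mae_bof[env]) for env in mae_knn}

for env, (abs_m, rel_pct) in summary.items():
    print(f"{env}: {abs_m:.3f} m lower ({rel_pct:.1f}%)")
```

This gives reductions of about 0.220 m (11.4%), 1.092 m (27.8%), and 0.191 m (10.8%) for the 30 m × 30 m, 50 m × 50 m, and real environments, respectively.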

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

