Article

ABC-ANN Based Indoor Position Estimation Using Preprocessed RSSI

by
Muhammed Fahri Unlersen
Department of Electrical and Electronics Engineering, Necmettin Erbakan University, 42090 Konya, Turkey
Electronics 2022, 11(23), 4054; https://doi.org/10.3390/electronics11234054
Submission received: 8 November 2022 / Revised: 28 November 2022 / Accepted: 30 November 2022 / Published: 6 December 2022
(This article belongs to the Section Electrical and Autonomous Vehicles)

Abstract

The widespread use of mobile devices has popularized the idea of indoor navigation. The Wi-Fi fingerprint method is emerging as an important alternative indoor positioning method for situations in which GPS use is difficult. This study utilizes RSSI signals in three preprocessing states (raw, path loss adapted, and exponential transformed) to train and test an artificial neural network (ANN). A systematic approach to determining the number of neurons in the hidden layers and the activation functions of the ANN is provided. The ANN is trained by the artificial bee colony (ABC) algorithm. Five additional ML methods have also been employed for comparison. The best performance, an MAE of 1.01 m, has been achieved by ABC-ANN on the path loss adapted database. Estimations made with preprocessed RSSI values perform better than those made with raw RSSI values. In addition, the proposed method produces 33% less error than the study from which the data set originates.

1. Introduction

In daily life, navigation systems make human life easier regarding location finding. A person who has no information about a city he/she is visiting can easily and quickly reach the point he/she wants to reach thanks to navigation systems. These navigation systems use the Global Positioning System (GPS), which is based on satellite signals. To determine the position of a GPS receiver, it has to receive at least four different satellite signals [1]. Although GPS has sufficient accuracy for outdoor navigation systems, it is not always possible to use GPS receivers to detect indoor positions. Iron and concrete walls surrounding indoor spaces severely attenuate electromagnetic waves from satellites [2]. Although there are some complex studies on this subject, this situation makes it difficult to use GPS in indoor position estimation [3,4].
Nevertheless, indoor location estimation would be a very useful service in many areas: the location of outpatient clinics, blood test rooms, or imaging centers that patients want to reach in hospitals; the locations of counters, customs, boarding gates, and transfer routes for passengers in large international airports; the presence of a specific shop in shopping centers; or the location of a product group sought in large markets. In the literature, there are various approaches to indoor position estimation, such as ultrasonic radars, LIDARs, deep learning-based image recognition, etc. [5,6,7,8,9]. One of the most popular methods for indoor location estimation is Wi-Fi fingerprint-based location estimation. In this approach, the RSSI levels of various stationary Wi-Fi access points are used for position estimation. Owing to its minimal complexity, RSSI-based indoor location estimation is widely preferred [10].
In this study, popular machine learning (ML) methods, such as Gaussian processes (GP), linear regression (LR), support vector machine (SVM), k-nearest neighbors (k-NN), K star (K*), and random forest (RF), and a novel method, an artificial bee colony trained artificial neural network (ABC-ANN), have been employed for indoor position estimation. In cases such as this, where the number of samples is low, a deep neural network cannot be trained sufficiently; this is an important factor in the choice of the artificial neural network structure. Additionally, in each ML method, the RSSI values are used in raw form and in two different preprocessed states.
In general, the procedures performed in this study can be summarized as follows. First, three different databases were created: one with the path loss adaptation operation, one with the exponential transformation operation, and one with no preprocessing. Models were then trained and tested separately on each data set. Finally, the results were compared. The general concept of this study is illustrated in Figure 1.
The rest of the article is organized as follows. First, previous studies are reviewed in the Literature Review section. In the Materials and Methods section, the data set used in this study is explained and the investigated methods are described in detail. Then, the experimental results are presented and discussed. Finally, the entire article is evaluated in the Conclusions section.

2. Literature Review

There are various methods based on RSSI values for indoor location estimation in the literature. Papamanthou et al. created a data set via simulation for indoor location estimation in an area of 50 m × 50 m. The probability distribution method used achieved a minimum estimation error of 3.774 m [11]. Saxena et al. used RSSI values collected by a CC1000 radio chip in a predetermined area with dimensions of 6 m × 3.6 m. The proposed method, which employed k-NN for estimation, had a minimum error of 1.1 m [12]. Paul et al. used multiple sensor data for position estimation in a 9.20 m × 7.62 m area. The RMSE values of the presented models varied between 7.28 m and 1.81 m depending on the estimator model [13]. Kul et al. employed Bayesian classification, k-NN, decision tree, and neural network methods for indoor position estimation in a 30 m × 8 m room. They reported that the accuracy of the system was very sensitive to the data collection strategy, estimation method, and network traffic [14]. Dogan used the XBee S2 module to create an RSSI data set in a 7.6 m × 9.2 m room. Maximum likelihood estimation (MLE), Kalman filter (KF), and serial and parallel extended Kalman filter (EKF) methods were employed to compare their position estimation performances. The estimation accuracy was 2.04 m [15]. Alvarez and Las Heras introduced a system based on RSSI values collected by ZigBee modules. The accuracy of the presented artificial intelligence-based system was about 1.467 m in a room with dimensions of 6.5 m × 10 m [16]. Maduskar and Tapaswi used a cell phone as a Wi-Fi receiver to estimate the indoor position in a 22 m × 17 m room. The measured RSSI and a previously stored RSSI data set were compared for position estimation. The RMSE of the position estimation was presented as 1.47 m [10]. Li et al. proposed a new path loss model instead of a fixed path loss model. A neural network trained using PSO was employed for estimation. It was reported that the presented method had a 1.61 m error in a 9 m × 6 m room [17]. Dari et al. used the location fingerprint technique to estimate indoor position. The proposed k-NN method had a 4.26 m error in a room with dimensions of 12 m × 7 m [18]. Landa et al. combined RF fingerprints, odometry, visual clues, and map constraints by using a particle filter. An Android-based application was constructed and presented in the study. It was reported that the proposed system had an 11.7 m error in an area with dimensions of 200 m × 80 m [19]. Chai et al. employed the Kalman filter to calculate the distance between a beacon and a cell phone. The distances between the three nearest beacons and the cell phone, together with the triangulation method, were used to estimate the position of the cell phone. The accuracy was about 0.3 m in a 2 m × 2 m room [20]. Sadowski et al. inspected k-NN, naïve Bayes, and simple trilateration methods for indoor position estimation. They first created a data set for three different areas by collecting RSSI values of ZigBee, Bluetooth LE, and Wi-Fi. In their results, the best performance belonged to a k-NN method with four neighbors, which had RMS errors of 1.8376 m, 1.4147 m, and 1.3856 m for scenario 1, scenario 2, and scenario 3, respectively [21]. Ravi and Misra located Wi-Fi access points in a manner similar to classical RADAR antenna placement. Two methods, called temporal smoothing and identification and removal of static devices, were employed both separately and together. The median localization error was presented as 4 m for an open area with dimensions of 45 m × 45 m [22]. Ann et al. proposed the Boerdijk–Coxeter helix (BC helix) model geometry for three-dimensional localization. The test area had dimensions of 8 m × 8 m. The MAE of the test data given in the article was determined as 2.15 m [23]. Zheng et al. employed SPSO, k-NN, SVM, linear regression, and random forest methods for location estimation in an indoor area with dimensions of 27 m × 12 m. The minimum error achieved was 2.4583 m [24]. Zhou et al. presented an indoor Wi-Fi fingerprint localization system created by combining a backpropagation neural network and an adaptive genetic algorithm with CSI tensor decomposition. The parallel factor (PARAFAC) analysis model-based tensor decomposition algorithm and iterative alternating least squares were combined. It was reported that the proposed methods had confidence probabilities of 3 m, 3.5 m, and 4 m [25]. Yoo employed k-NN and ANN to estimate indoor location in an area with dimensions of 18 m × 37 m by using RSSI values. It was reported that the minimum error, 2.29 m, was achieved by k-NN with one neighbor [26]. In the study conducted by Hou et al., the Jenks natural breaks algorithm was employed for indoor location estimation. Although no error value was explicitly stated, they reported that all errors were less than 4 m and that 30% of the test points had an error of less than 1 m; from this, it can be inferred that the mean absolute error was around 2 m. Additionally, it was stated that their proposed method achieved higher positioning accuracy with a time cost no greater than that of existing CNN-based indoor location estimation methods [27].

3. Materials and Methods

In this study, a data set belonging to Sadowski et al. [21] and recently shared online is used for indoor position estimation. Although RSSI values of three different wireless communication devices (ZigBee, Bluetooth LE, and Wi-Fi) exist for each record in the used data set, only Wi-Fi RSSI values are employed for position estimation in this study.
Three scenarios are constructed in the data set. In the first scenario, the environment has low interference. The dimensions of the room presented in Figure 2 are 6 m × 5.5 m [21]. There are three stationary sources in the room. RSSI values are recorded at 7 × 7 points, with 0.5 m steps in each direction, to be used in the training of the machine learning methods. There are 49 records for this scenario in the data set. In addition, RSSI values are measured at 10 randomly determined points that differ from those in the data set. These are used in the test process.
In the second scenario, an environment with high interference is used for data collection. Similar to the first scenario, three stationary sources are placed. The dimensions of this room, presented in Figure 3, are 5.8 m × 5.3 m, which is smaller than the first one [21]. The RSSI measurements are made at 4 × 4 regularly spaced points, so there are 16 RSSI measurements for training in the data set. In addition, RSSI measurements are made for testing at 6 randomly determined points that are not included in the training data set.
In the third scenario, a larger room with medium interference is chosen for data collection. This environment is a computer lab in which there are people and wireless equipment. The laboratory presented in Figure 4 has dimensions of 10.8 m × 7.3 m [21]. Similar to the previous measurements, three stationary sources are placed. In this environment, the RSSI measurements are conducted at 40 regularly spaced points. For use in the test process, 16 RSSI measurements are collected at random points.
The coordinates and RSSI data of some of the points given in the scenarios are presented in Table 1.
For each scenario, training and testing points are collected separately. The training data set consists of points determined at regular distances. The testing data set is the samples taken from random points. The number of samples taken for each scenario in the database is given in Table 2.

3.1. Preprocessing of Raw RSSI Values

In this study, raw data are processed in three ways. First, raw data are used as is. Thus, it will be possible to observe the effect of data processing on the result. Hereinafter, this data set will be referred to as the raw data set.
The second method used in data processing is to create a formula based on the path loss data presented in the database. By visualizing the path loss data versus distance, the formula presented in Equation (1) is created. Since the measurement environment is reflective and noisy, the measurements are not expected to fully match the ideal form. However, most of the time, the change in RSSI with distance produces a decreasing concave curve. For this purpose, a formulation that has the potential to create such a curve is adopted. In Equation (1), the component a adjusts the offset of the distance, the component b changes the RSSI offset, the component c scales the contribution of the RSSI value inside the power expression, and the component d determines the power to which this term is raised. The parameters are then optimized.
$r = a + (b + cx)^{d}$,  (1)
where r and x are the distance and RSSI values, respectively. The r value to be calculated here is the distance between the point whose RSSI value is known and the access point that is the relevant signal source. This distance was calculated using the point coordinates in the database and the coordinates of the signal source. The absolute error between the r value calculated in this way and the r value obtained with Equation (1) was used as the objective function. The genetic algorithm is employed for the optimization of the a, b, c, and d parameters. During all these calculations, the training data set is used. As a result of the optimization, the parameters a, b, c, and d are obtained as −0.225, 2.414, 0.03058, and −2, respectively. The root mean square error for the presented parameters is determined as 0.6494 m. This data set will be called the path loss adapted data set.
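As an illustration, the following Python sketch (the paper's own scripts were written in the GNU Octave environment) applies Equation (1) with the parameters reported above to map raw RSSI readings to approximate distances; the function name and the sample RSSI values are illustrative only.

```python
import numpy as np

# Parameters of Equation (1) reported in the text (obtained with a genetic algorithm).
A, B, C, D = -0.225, 2.414, 0.03058, -2

def path_loss_adapted(rssi_dbm):
    """Map RSSI values (dBm) to approximate distances (m) via r = a + (b + c*x)^d."""
    rssi_dbm = np.asarray(rssi_dbm, dtype=float)
    return A + (B + C * rssi_dbm) ** D

# Example: transform the three RSSI readings of one record from Table 1.
print(path_loss_adapted([-58, -57, -62]))
```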
The third method uses the classical electromagnetic wave equations for a lossless environment. In the rest of the text, this method will be referred to as the exponential transformed data set. RSSI is an indicator of the received signal power in milliwatts and is usually presented in dBm (logarithmic scale) form. The RSSI is calculated from the received power via Equation (2), and the received power is recovered from the RSSI via Equation (3) [28,29].
$\mathrm{RSSI} = 10 \log_{10}\left(\dfrac{P_R}{1\,\mathrm{mW}}\right)$,  (2)

$P_R = 0.001 \times 10^{\mathrm{RSSI}/10}$,  (3)
where $P_R$ is the received signal power at the antenna of the receiver. In Equation (4), the well-known Friis transmission equation is presented [30,31].
$P_R = \dfrac{P_T \, G_T \, G_R \, \lambda^2}{(4\pi r)^2}$.  (4)
In Equation (4), $P_T$ and $P_R$ represent the transmitted and received signal powers, and $G_T$ and $G_R$ represent the transmitter and receiver antenna gains. If it is assumed that $\lambda$, $P_T$, $G_T$, and $G_R$ are unchanging values, $P_R$ is inversely proportional to the square of the distance ($r$) between the transmitter and the receiver. By manipulating Equation (4) in terms of $r$ and using the $P_R$ term as presented in Equation (3), the expression in Equation (5) is obtained.
$r = \sqrt{\dfrac{P_T \, G_T \, G_R \, \lambda^2}{(4\pi)^2}} \sqrt{\dfrac{1}{P_R}} = \sqrt{\dfrac{P_T \, G_T \, G_R \, \lambda^2}{(4\pi)^2}} \sqrt{\dfrac{1}{0.001 \times 10^{\mathrm{RSSI}/10}}} = \sqrt{\dfrac{P_T \, G_T \, G_R \, \lambda^2}{0.001 \times (4\pi)^2}} \times 10^{-\mathrm{RSSI}/20}$,  (5)
By combining all the unchanging parameters, $\sqrt{\dfrac{P_T \, G_T \, G_R \, \lambda^2}{0.001 \times (4\pi)^2}}$, into a constant $k$, the final expression that yields a proportional distance value is obtained as in Equation (6).
$r = k \times 10^{-\mathrm{RSSI}/20}$.  (6)
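For comparison, a minimal Python sketch of the exponential transformation of Equation (6) is given below; since the constant k lumps together the unknown transmit power, antenna gains, and wavelength, any positive placeholder value preserves the ordering of the transformed distances, so k = 1 is used here as an assumption.

```python
import numpy as np

def exponential_transformed(rssi_dbm, k=1.0):
    """Proportional distance from Equation (6): r = k * 10^(-RSSI/20)."""
    return k * 10.0 ** (-np.asarray(rssi_dbm, dtype=float) / 20.0)

# Weaker signals (more negative RSSI) map to larger proportional distances.
print(exponential_transformed([-40, -58, -62]))
```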
With the three methods mentioned here, preprocessing of the data set has been completed. The artificial neural network (ANN) structure, which will be determined in the next step, has been trained separately with each data group obtained here.

3.2. ANN Structure Determination Approach

In the literature, ANNs have been employed for the estimation process in many disciplines, such as medicine, electronics, machinery, etc. [32,33,34,35,36,37]. In this study, ANN is used to estimate indoor position. ANNs are a computational model of neural structures in living organisms. The basic structure of ANNs is a neuron model. It is also based on the biological neuron cell. The biological neuron cell consists of roughly three parts. These are dendrites, soma, and axon. Dendrites are responsible for transmitting information from other neuron cells to the soma. The incoming information is processed in the soma and transferred to the other neuron cells via axon [38].
The computational model of a neuron is similar to a biological neuron cell. The inputs arriving at the neuron (the dendrites) are multiplied by certain weights W and summed. This sum is then transferred to the output, usually through a nonlinear function [38]. The computational neuron model is presented in Figure 5. In addition, the mathematical formulation is presented in Equation (7).
$Z = AF\left(\sum_i w_i x_i + B\right)$,  (7)
where $x_i$ is the $i$-th input of the neuron, $w_i$ is the weight applied to that input, and $B$ is the bias of the neuron. Finally, the weighted sum is passed through the activation function (AF). There are various activation functions, such as pure linear, logarithmic sigmoid, tangent sigmoid, the signum function, etc.
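A minimal Python sketch of the neuron model of Equation (7), together with the single-hidden-layer forward pass used later (tangent sigmoid hidden layer, pure linear output), is shown below; the weights here are random placeholders, since in the paper they are obtained by ABC training.

```python
import numpy as np

def tansig(x):
    """Tangent sigmoid activation (equivalent to tanh), used in the hidden layer."""
    return np.tanh(x)

def neuron(x, w, b, activation=tansig):
    """Single computational neuron of Equation (7): z = AF(sum_i w_i * x_i + B)."""
    return activation(np.dot(w, x) + b)

def forward(x, W1, b1, W2, b2):
    """Single-hidden-layer network: tangent sigmoid hidden layer, pure linear output."""
    hidden = tansig(W1 @ x + b1)     # hidden layer outputs
    return W2 @ hidden + b2          # unbounded (pure linear) coordinate estimate

# Toy example: 3 RSSI inputs, 19 hidden neurons (raw data set, X-axis network of Table 3).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(19, 3)), rng.normal(size=19)
W2, b2 = rng.normal(size=(1, 19)), rng.normal(size=1)
print(forward(np.array([-58.0, -57.0, -62.0]), W1, b1, W2, b2))
```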
To create a network structure from these neuron models, neuron models are connected layer by layer. In this structure, there are some parameters, such as the number of layers, the number of neurons in each layer, and the activation functions of the layers, which seriously affect the performance of the network. Because the outputs of our data set (coordinate values) are unbounded, the output layer must use the pure linear activation function. First, a single hidden layer neural network (NN) structure is investigated. For the single hidden layer NN, the parameters that need to be determined are the number of hidden layer neurons and the activation function of the hidden layer. During this search process, the number of neurons in the hidden layer is varied from 1 to 40, and for each neuron count the activation function is changed step by step. The best result is determined for the raw, path loss adapted, and exponential transformed data sets separately.
Similarly, as in a single-layer network structure, the following best structure search process is conducted for each data set separately in a two-layer network structure.
  • Initially, the activation function of both the first and second hidden layers is set to the tangent sigmoid function.
  • Then, the number of neurons in the second hidden layer is fixed to 10.
  • Afterwards, the number of neurons in the first hidden layer is gradually increased from 1 to 40.
  • This search process is repeated by changing the first hidden layer activation function.
  • The parameters determined as the best result of this scanning process are fixed for the first layer.
  • The same search process is repeated for the second hidden layer.
  • When the search process is completed for the second hidden layer, the best parameters are fixed for the second hidden layer again.
  • Next, the process repeats for the first layer.
  • When the parameters for both layers do not change or begin to change around the same numbers, the search process is terminated.
Separate ANNs have been used for the estimation of X and Y coordinates in the created structure. In addition, separate ANN structures have been determined for 3 separate data sets. At the end of the search processes, it has been determined that the single-layered structure has the best performance for each data set. The parameters of the ANNs are presented in Table 3.

3.3. Adaptation of Artificial Bee Colony for ANN Training

The weights and biases of the ANN structure have been optimized to train the network. In this study, due to its high convergence property, a nature-inspired algorithm has been preferred for training the ANN [33]. In the training of the constructed structure, the artificial bee colony (ABC) algorithm is employed. The artificial bee colony algorithm is a heuristic optimization algorithm inspired by the nectar search of honeybee swarms in nature [39]. Food points in ABC represent possible solutions of the optimization problem. The amount of nectar at a food point refers to the quality of that food point. The quality in ABC corresponds to the fitness function in classical optimization methods [40]. The bees in the colony are divided into three groups: employed bees, onlookers, and scouts. In the ABC algorithm, random food points are determined first. Each food point is a D-dimensional vector, where D is the number of parameters to be optimized. Each point has a fitness function that determines the amount (quality) of nectar. In this study, the fitness function is taken as the root mean squared error (RMSE) of the ANN. The employed bees inform the onlooker bees about the quality of the food points in their memory. Onlooker bees search the periphery of food points within a certain radius, memorize them, and come back to the hive. Scout bees are sent to search for random food points. When all bees return to the hive, the food points are sorted according to their nectar amount, and as many food points as the population size are selected. The stopping criteria of the optimization process are checked and the next iteration begins [41].
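The following simplified Python sketch illustrates how a classical ABC loop can be adapted to train a small single-hidden-layer network, with the RMSE on the training set as the fitness. The colony size, abandonment limit, iteration count, search bounds, and the synthetic training data are assumptions for illustration, not the values used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# A tiny single-hidden-layer ANN whose flat parameter vector ABC will optimize.
N_IN, N_HID, N_OUT = 3, 4, 1                       # e.g., the path loss adapted X-axis network
DIM = N_HID * (N_IN + 1) + N_OUT * (N_HID + 1)     # 21 parameters (cf. Table 4)

def decode(theta):
    """Split a flat parameter vector into weight matrices and bias vectors."""
    i = 0
    W1 = theta[i:i + N_HID * N_IN].reshape(N_HID, N_IN); i += N_HID * N_IN
    b1 = theta[i:i + N_HID]; i += N_HID
    W2 = theta[i:i + N_OUT * N_HID].reshape(N_OUT, N_HID); i += N_OUT * N_HID
    b2 = theta[i:i + N_OUT]
    return W1, b1, W2, b2

def rmse(theta, X, y):
    """Fitness: RMSE of the network output on the training set (tansig hidden, linear output)."""
    W1, b1, W2, b2 = decode(theta)
    pred = (W2 @ np.tanh(W1 @ X.T + b1[:, None]) + b2[:, None]).ravel()
    return float(np.sqrt(np.mean((pred - y) ** 2)))

# Synthetic stand-in for one preprocessed training set (3 inputs -> 1 coordinate).
X_train = rng.uniform(-1, 1, size=(50, 3))
y_train = X_train @ np.array([1.0, -0.5, 0.3]) + 0.1

# Minimal classical ABC: employed, onlooker, and scout bee phases with greedy selection.
FOODS, LIMIT, ITERS, BOUND = 20, 30, 200, 2.0
foods = rng.uniform(-BOUND, BOUND, size=(FOODS, DIM))
costs = np.array([rmse(f, X_train, y_train) for f in foods])
trials = np.zeros(FOODS, dtype=int)

def try_neighbour(i):
    """Perturb one dimension of food source i towards a random partner; keep it if it improves."""
    k = rng.integers(FOODS - 1)
    k += (k >= i)                                   # partner index different from i
    d = rng.integers(DIM)
    cand = foods[i].copy()
    cand[d] += rng.uniform(-1, 1) * (foods[i, d] - foods[k, d])
    c = rmse(cand, X_train, y_train)
    if c < costs[i]:
        foods[i], costs[i], trials[i] = cand, c, 0
    else:
        trials[i] += 1

for _ in range(ITERS):
    for i in range(FOODS):                          # employed bees visit every food source
        try_neighbour(i)
    fitness = 1.0 / (1.0 + costs)                   # nectar amounts
    for i in rng.choice(FOODS, size=FOODS, p=fitness / fitness.sum()):
        try_neighbour(i)                            # onlooker bees favour better sources
    worst = int(np.argmax(trials))                  # scout bee replaces an exhausted source
    if trials[worst] > LIMIT:
        foods[worst] = rng.uniform(-BOUND, BOUND, size=DIM)
        costs[worst] = rmse(foods[worst], X_train, y_train)
        trials[worst] = 0

print(f"best training RMSE: {costs.min():.4f}")
```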
In this study, the number of parameters (D) to be optimized varies with the structure of ANN. All weights and biases in an ANN can be summarized as follows.
  i. Number of weights between the input layer and the hidden layer: $N_{W_{ih}} = N_{\mathrm{input}} \times N_{\mathrm{hidden}}$
  ii. Number of biases in the hidden layer: $N_{B_{h}} = N_{\mathrm{hidden}}$
  iii. Number of weights between the hidden layer and the output layer: $N_{W_{ho}} = N_{\mathrm{hidden}} \times N_{\mathrm{output}}$
  iv. Number of biases in the output layer: $N_{B_{o}} = N_{\mathrm{output}}$
In total:
$N = N_{W_{ih}} + N_{B_{h}} + N_{W_{ho}} + N_{B_{o}} = N_{\mathrm{hidden}}(N_{\mathrm{input}} + 1) + N_{\mathrm{output}}(N_{\mathrm{hidden}} + 1)$
In this study, ANNs have 3 inputs and 1 output. In this case, the number of parameters that should be optimized for ANN structures given in Table 3 are presented in Table 4.
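A two-line check of this count for the networks of Table 3 (3 inputs, 1 output) reproduces the values in Table 4; the helper name below is illustrative.

```python
def n_parameters(n_input, n_hidden, n_output=1):
    """N = N_hidden*(N_input + 1) + N_output*(N_hidden + 1) for a single-hidden-layer ANN."""
    return n_hidden * (n_input + 1) + n_output * (n_hidden + 1)

# X-axis networks of Table 3: raw (19 neurons), path loss adapted (4), exponential transformed (24).
print([n_parameters(3, h) for h in (19, 4, 24)])   # -> [96, 21, 121]
```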
Complexity analysis of the ABC optimization algorithm has been carried out in many studies. Nagy et al. reported the complexity of the ABC algorithm as O(n + log n) in their study [42]. Since the ABC algorithm used here is the classical ABC algorithm, the complexity value presented in previous studies is also valid here. In addition, the complexity of using an artificial neural network is given in many studies in the literature, where it is reported that the complexity of a single hidden layer artificial neural network is $O(n^5)$ [43,44,45,46].

3.4. Other Machine Learning Methods

The same data sets are also evaluated with multiple machine learning methods to make the results comparable to other popular machine learning methods and to detect possible biases in the database. Five of the investigated ML methods (Gaussian processes regression (GPR), linear regression (LR), support vector machine (SVM), k-NN, and random forest) are employed on the WEKA machine learning platform. WEKA (Waikato Environment for Knowledge Analysis) is a GNU-licensed, Java-based platform that collects various machine learning algorithms. It was developed at the University of Waikato, New Zealand. It has a modular structure and can perform operations such as visualization, clustering, regression, classification, data mining, etc. on data sets [47,48,49].
The first method employed, Gaussian processes regression, is a probabilistic supervised method that is a generalization of the Gaussian probability distribution. It has both classification and regression abilities [50]. Its main known disadvantages are that its computational complexity and memory consumption increase with the data dimension [51]. When the value to be estimated and all attributes are numerical, the first method that comes to mind is to establish a linear relationship. This method is called linear regression. The estimate is the sum of the attributes weighted with individual coefficients that are obtained during the training stage. The most important advantage of the linear regression method is that it is simple and fast. However, this method is insufficient in cases where there is a nonlinear relationship [49]. SVM is a popular machine learning model that can be used for classification and regression analysis. It is known as one of the most robust methods due to its statistical learning framework [52,53]. One of the oldest methods, k-nearest neighbors is a supervised machine learning method used for classification and regression. This widely analyzed method was developed in the early 1950s by Evelyn Fix and Joseph Hodges. The k-NN method is based on the logic that similar things exist in proximity, so to find the closest records, the distance between them needs to be determined. The distance function used for this purpose in this study is the well-known Euclidean distance [30]. The random forest method involves numerous decision trees that work together. These relatively uncorrelated tree groups are created during the training stage. Random forest mitigates the decision tree method's tendency to overfit the training set [49].
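The paper runs these five regressors in WEKA; purely as an illustration, a rough Python analog with scikit-learn is sketched below. The estimators' hyperparameters, the k of k-NN, and the placeholder arrays are assumptions and are not matched to the WEKA defaults used in the study; separate models would be fitted for the x and y coordinates.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor

models = {
    "Gaussian processes": GaussianProcessRegressor(),
    "Linear regression": LinearRegression(),
    "SVM": SVR(),
    "k-NN": KNeighborsRegressor(n_neighbors=4),     # illustrative neighbour count
    "Random forest": RandomForestRegressor(random_state=0),
}

# Placeholder arrays standing in for one preprocessed data set (3 RSSI features -> x coordinate).
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(105, 3)), rng.normal(size=105)
X_test, y_test = rng.normal(size=(32, 3)), rng.normal(size=32)

for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    print(name, float(np.mean(np.abs(pred - y_test))))   # per-axis absolute error
```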
The file format for data sets used in WEKA is a special format called ARFF (attribute relationship file format). To investigate the raw, path loss adapted, and exponential transformed data sets used in this study on the WEKA platform, each of the data sets is converted to the ARFF file format separately. Then, estimation is made with the mentioned machine learning methods using each database. The results are presented and discussed in the next section.
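A minimal sketch of such a conversion is shown below; the attribute names, file names, and sample rows are illustrative, not the exact layout of the files used in the study.

```python
def to_arff(path, relation, rows):
    """Write records [(rssi_a, rssi_b, rssi_c, x, y), ...] to a minimal ARFF file."""
    header = [f"@RELATION {relation}", ""]
    for name in ("rssi_a", "rssi_b", "rssi_c", "x", "y"):
        header.append(f"@ATTRIBUTE {name} NUMERIC")
    lines = header + ["", "@DATA"] + [",".join(str(v) for v in row) for row in rows]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

# Example: two records of scenario 1 in raw form.
to_arff("scenario1_raw.arff", "scenario1_raw",
        [(-58, -57, -62, 1.0, 0.5), (-52, -60, -61, 1.5, 2.0)])
```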

4. Results and Discussion

In this study, indoor position estimation has been performed by using the RSSI information of Wi-Fi signals. A database containing records for three different scenarios has been used. The scenarios mentioned here differ from each other in terms of both size and electromagnetic environment. In addition, the training and test data are different in each scenario and form separate training and testing data sets. There are 105 records in the training data set and 32 records in the testing data set.
The data have been subjected to three different preprocessing states in each scenario: raw, path loss adapted, and exponential transformed. Additionally, experimental results for path loss are given in the database. Using these results, a formula has been created for the relationship between RSSI and distance for use in the path loss adapted method. In the exponential transformed method, it is assumed that the received signal power is inversely proportional to the square of the distance, and a formula has been created under this assumption.
The ANN has been employed for position estimation using the preprocessed data. During the determination of the hidden layer neuron number of the single-layered ANN, 120 ANNs with different structures have been trained and tested (3 activation functions × 1 to 40 hidden layer neurons) for each data set. In the double hidden layer ANN structures, reaching the termination criteria took an average of two rounds; in the meantime, 480 networks (120 tests for each hidden layer over 2 rounds) have been trained and tested. Additionally, these processes are conducted for each of the X and Y axes. With 3 data sets, a total of 3600 ANN structures have been trained and tested with only the training data sets to determine the best ANN structure.
In order to avoid overfitting, selection bias, and the effects of the unbalanced data set, the k-fold cross-validation method is employed. In this study, fivefold cross-validation is used. The training set was divided into five parts and four of them were used alternately for training and one was used for validation. A visual representation of the fivefold cross-validation process is presented in Figure 6.
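As a sketch, the split can be reproduced with scikit-learn's KFold on placeholder arrays standing in for the 105 training records; whether the folds were shuffled in the original study is not stated, so the shuffling below is an assumption.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.zeros((105, 3))            # placeholder for the 105 training records (3 RSSI features)
y = np.zeros(105)                 # placeholder for one coordinate

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(kf.split(X), start=1):
    # train on four sections, validate on the remaining one
    print(f"fold {fold}: {len(train_idx)} training samples, {len(val_idx)} validation samples")
```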
The criterion for termination of training is determined as reaching the maximum number of epochs, and the epoch limit is chosen as 750 epochs. In addition, the network at the point where the validation error starts to increase during training is accepted as the training result. The loss graph of the training process is presented in Figure 7.
To train ANN structures, the ABC optimization algorithm has been employed due to its high convergence property. Finally, parameters of ANNs that have the best performance in train data sets have been tested with test data sets. To measure the performance of the proposed methods, mean absolute error (MAE) is used. The MAE is calculated with Equation (8).
$\mathrm{MAE} = \dfrac{1}{N} \sum_{i=1}^{N} \sqrt{(x_r - x_e)^2 + (y_r - y_e)^2}$,  (8)
where xr is the real x coordinate value of a point and xe is the estimated x coordinate value. Similarly, yr is the real y coordinate value of a point and ye is the estimated y coordinate value. The Euclidean distance between (xr, yr) and (xe, ye) is used as an absolute error.
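A direct Python implementation of Equation (8), computing the mean Euclidean distance between real and estimated positions, is given below; the sample coordinates are arbitrary.

```python
import numpy as np

def mae_euclidean(xy_true, xy_est):
    """Equation (8): mean Euclidean distance (m) between real and estimated positions."""
    xy_true = np.asarray(xy_true, dtype=float)
    xy_est = np.asarray(xy_est, dtype=float)
    return float(np.mean(np.linalg.norm(xy_true - xy_est, axis=1)))

print(mae_euclidean([[1.0, 0.5], [1.5, 2.0]], [[1.2, 0.4], [1.0, 2.3]]))   # ~0.40 m
```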
As can be seen in Table 5, in the first, second, and third scenarios, the lowest errors have been found as 1.0528 m, 0.8035 m, and 1.1865 m, respectively. The best performance in all scenarios has been achieved when preprocessing is performed using the path loss adapted method. In addition, the best performance of all methods has been achieved in the second scenario. In this scenario, errors of 0.8153 m, 0.8035 m, and 1.1245 m occur in the raw, path loss adapted, and exponential transformed data sets, respectively. The scripts for all the operations mentioned above, such as the formulae and optimization processes used during the preprocessing, the ANN structure used, and the ABC algorithm used in ANN training, have been implemented in the GNU Octave environment.
In addition, models have been trained separately for the x and y coordinates on the three databases mentioned, using various machine learning methods. For five of these methods, the trained structures are tested with the test database and the MAE value is obtained. The estimation errors of the five machine learning methods and the proposed ABC-ANN method are listed in Table 6.
As can be seen in the results given in Table 6, the lowest error performance is in the ABC-ANN method. The second most successful method is SVM in all databases for scenario 1, all databases for scenario 3, and the exponential transformed database for scenario 2. In the raw and path loss adapted databases of scenario 2, the second-most successful method is determined as k-NN. In the average of the errors of the three scenarios, ABC-ANN is the most successful method, and the second-most successful method is SVM.

5. Conclusions

It is a well-known fact that the accuracy and speed of GPS modules play an important role in the high performance of navigation devices, which have become an indispensable element of our daily life. However, serious problems are encountered in the use of GPS signals indoors. At the same time, the growing place of the Internet of Things (IoT) in our lives offers potential services that create increased interest in indoor location estimation. Various approaches are presented in the literature to meet this need. Among these, the RSSI fingerprint method, which stands out due to its low hardware cost and easy applicability, has been examined in this study. We have examined how the use of preprocessed rather than raw RSSI data affects performance. We apply path loss adaptation and exponential transformation to the RSSI values and compare the results with the predictions made with raw RSSI values. In addition, this study presents a systematic approach to determining the number of neurons in the hidden layer of the ANN. Finally, for ANN training, we used an adapted optimization algorithm called ABC, whose fast convergence has been presented in the literature. In order to present a clear comparison, numerous machine learning methods have been investigated and the results of five machine learning methods are presented. While ABC-ANN has the best performance in all cases, the second-best performance belongs to SVM in most cases. Another interesting outcome is that, for both ABC-ANN and SVM, which are the two best methods, the lowest average error over the three scenarios occurs with the path loss adapted database. The strength of the signal received by an RF receiver in an indoor environment is affected not only by the objects between the transmitter and the receiver but also by the surrounding walls, metal objects (such as tables, cabinets, etc.), and even the vehicle or person carrying the receiver. Considering this situation, the 1.07 m error obtained here is 33% lower than the 1.60 m error obtained in the data set source study. The most important outcome of this study is to show that proper preprocessing has an important effect on estimation performance. Many studies also use CSI data together with RSSI data; the presented approach is also very suitable for indoor position estimation with CSI-RSSI data, provided that a suitable preprocessing equation is created. The author believes that precise indoor location estimation has significant benefits in many areas, from health to trade, and hopes that this study will shed light on future studies.

Funding

This research received no external funding.

Data Availability Statement

In this study, the RSSI-Data set-for-Indoor-Localization-Fingerprinting database was used. This data set is available on GitHub as open source. https://github.com/pspachos/RSSI-Dataset-for-Indoor-Localization-Fingerprinting, access date: 21 February 2022.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Grimes, J.G. Global Positioning System Standard Positioning Service Performance Standard; 2008. Available online: https://rosap.ntl.bts.gov/view/dot/16930 (accessed on 5 November 2021).
  2. Chen, J.; Song, S.; Yu, H. An indoor multi-source fusion positioning approach based on PDR/MM/WiFi. AEU-Int. J. Electron. Commun. 2021, 135, 153733.
  3. Khudhuragha, M. Indoor Location Estimation Through Redundant Lateration for Indoor Positioning System. Master of Science Thesis, Kadir Has University, Istanbul, Turkey, 2017. Available online: https://acikbilim.yok.gov.tr/handle/20.500.12812/638391 (accessed on 6 November 2022).
  4. Park, M.; Han, J.H.; Kim, O.J.; Kim, J.; Kee, C. One-way deep indoor positioning system for conventional GNSS receiver using paired transmitters. Navig. J. Inst. Navig. 2021, 68, 601–619.
  5. Ijaz, F.; Yang, H.K.; Ahmad, A.W.; Lee, C. Indoor positioning: A review of indoor ultrasonic positioning systems. In Proceedings of the International Conference on Advanced Communication Technology, ICACT, PyeongChang, Korea, 27–30 January 2013; pp. 1146–1150.
  6. Chen, Y.; Liu, J.; Jaakkola, A.; Hyyppä, J.; Chen, L.; Hyyppä, H.; Jian, T.; Chen, R. Knowledge-based indoor positioning based on LiDAR aided multiple sensors system for UGVs. In Proceedings of the 2014 IEEE/ION Position, Location and Navigation Symposium—PLANS 2014, Monterey, CA, USA, 5–8 May 2014; pp. 109–114.
  7. Weyand, T.; Kostrikov, I.; Philbin, J. PlaNet—Photo geolocation with convolutional neural networks. In Proceedings of the European Conference on Computer Vision; Springer Verlag: Amsterdam, The Netherlands, 2016; pp. 37–55.
  8. Kahe, G.; Masoumi Ganjgah, F. MAKAN: A low-cost low-complexity local positioning system. Navig. J. Inst. Navig. 2019, 66, 401–415.
  9. Zhang, Y.; Lu, L.; Wang, Y.; Chen, C. WLAN indoor localization method using angle estimation. AEU-Int. J. Electron. Commun. 2017, 76, 11–17.
  10. Maduskar, D.; Tapaswi, S. RSSI based adaptive indoor location tracker. Sci. Phone Apps Mob. Devices 2017, 3, 1–8.
  11. Papamanthou, C.; Preparata, F.P.; Tamassia, R. Algorithms for location estimation based on RSSI sampling. In Proceedings of the International Symposium on Algorithms and Experiments for Sensor Systems, Wireless Networks and Distributed Robotics, Reykjavik, Iceland, 12 July 2008; pp. 72–86.
  12. Saxena, M.; Gupta, P.; Jain, B.N. Experimental analysis of RSSI-based location estimation in wireless sensor networks. In Proceedings of the 3rd IEEE/Create-Net International Conference on Communication System Software and Middleware, COMSWARE, Bangalore, India, 6–10 January 2008; pp. 503–510.
  13. Paul, A.S.; Wan, E.A. RSSI-based indoor localization and tracking using sigma-point Kalman smoothers. IEEE J. Sel. Top. Signal Process. 2009, 3, 860–873.
  14. Kul, G.; Özyer, T.; Tavli, B. IEEE 802.11 WLAN based real time indoor positioning: Literature survey and experimental investigations. Procedia Comput. Sci. 2014, 34, 157–164.
  15. Dogan, M. Indoor Localization and Tracking Based on RSSI and Accelerometer Measurements. Master of Science Thesis, Middle East Technical University, Ankara, Turkey, 2015. Available online: https://open.metu.edu.tr/handle/11511/25283 (accessed on 2 November 2022).
  16. Alvarez, Y.; Las Heras, F. ZigBee-based Sensor Network for Indoor Location and Tracking Applications. IEEE Lat. Am. Trans. 2016, 14, 3208–3214.
  17. Li, G.; Geng, E.; Ye, Z.; Xu, Y.; Lin, J.; Pang, Y. Indoor positioning algorithm based on the improved RSSI distance model. Sensors 2018, 18, 2820.
  18. Dari, Y.E.; Suyoto, S.; Pranowo, P. CAPTURE: A mobile based indoor positioning system using wireless indoor positioning system. Int. J. Interact. Mob. Technol. 2018, 12, 61–72.
  19. Landa, V.; Ben-Moshe, B.; Hacohen, S.; Shvalb, N. GoIn—An accurate 3D indoor navigation framework based on light landmarks. Navig. J. Inst. Navig. 2019, 66, 633–642.
  20. Chai, S.; An, R.; Du, Z. An Indoor Positioning Algorithm using Bluetooth Low Energy RSSI. In Proceedings of the 2016 International Conference on Advanced Materials Science and Environmental Engineering; Atlantis Press: Chiang Mai, Thailand, 2016; pp. 276–278.
  21. Sadowski, S.; Spachos, P.; Plataniotis, K.N. Memoryless Techniques and Wireless Technologies for Indoor Localization with the Internet of Things. IEEE Internet Things J. 2020, 7, 10996–11005.
  22. Ravi, A.; Misra, A. Practical server-side WiFi-based indoor localization: Addressing cardinality & outlier challenges for improved occupancy estimation. Ad Hoc Netw. 2021, 115, 102443.
  23. Ann, A.G.C.; Foong, S.; Ahmed, A.; Soh, G.S. Adapting the Boerdijk–Coxeter helix as node configuration for GPS-denied localization in three dimensions. Navig. J. Inst. Navig. 2021, 68, 485–506.
  24. Zheng, J.; Li, K.; Zhang, X. Wi-Fi Fingerprint-Based Indoor Localization Method via Standard Particle Swarm Optimization. Sensors 2022, 22, 5051.
  25. Zhou, M.; Long, Y.; Zhang, W.; Pu, Q.; Wang, Y.; Nie, W.; He, W. Adaptive Genetic Algorithm-Aided Neural Network With Channel State Information Tensor Decomposition for Indoor Localization. IEEE Trans. Evol. Comput. 2021, 25, 913–927.
  26. Angrisani, L.; Fotiou, N.; Butun, I.; Yoo, J. Multiple Fingerprinting Localization by an Artificial Neural Network. Sensors 2022, 22, 7505.
  27. Hou, C.; Xie, Y.; Zhang, Z. An improved convolutional neural network based indoor localization by using Jenks natural breaks algorithm. China Commun. 2022, 19, 291–301.
  28. Sauter, M. From GSM to LTE: An Introduction to Mobile Networks and Mobile Broadband; Wiley: Ravensburg, Germany, 2011.
  29. Srinivasan, K.; Levis, P. RSSI is Under Appreciated. In Proceedings of the Third Workshop on Embedded Networked Sensors (EmNets), Cambridge, MA, USA, 30–31 May 2006; pp. 239–242.
  30. Shi, W.; Du, J.; Cao, X.; Yu, Y.; Cao, Y.; Yan, S.; Ni, C. IKULDAS: An Improved kNN-Based UHF RFID Indoor Localization Algorithm for Directional Radiation Scenario. Sensors 2019, 19, 968.
  31. Jiang, J.R.; Liao, J.H. Efficient Wireless Charger Deployment for Wireless Rechargeable Sensor Networks. Energies 2016, 9, 696.
  32. Kandiri, A.; Mohammadi Golafshani, E.; Behnood, A. Estimation of the compressive strength of concretes containing ground granulated blast furnace slag using hybridized multi-objective ANN and salp swarm algorithm. Constr. Build. Mater. 2020, 248, 118676.
  33. Ustun, D.; Balci, S.; Sabanci, K. A parametric simulation of the wireless power transfer with inductive coupling for electric vehicles, and modelling with artificial bee colony algorithm. Meas. J. Int. Meas. Confed. 2020, 150, 107082.
  34. Gultekin, S.S.; Uzer, D.; Dundar, O. Calculation of circular microstrip antenna parameters with a single artificial neural network model. In Proceedings of the Progress in Electromagnetics Research Symposium, Kuala Lumpur, Malaysia, 27–30 March 2012; pp. 545–548.
  35. Singh, R.; Kainthola, A.; Singh, T.N. Estimation of elastic constant of rocks using an ANFIS approach. Appl. Soft Comput. 2012, 12, 40–45.
  36. Yelken, E.; Uzer, D. Artificial Neural Network Model with Firefly Algorithm for Seljuk Star Shaped Microstrip Antenna. Eur. J. Sci. Technol. 2020, 251–256.
  37. Sabanci, K.; Aydin, N.; Sayaslan, A.; Sonmez, M.E.; Fatih Aslan, M.; Demir, L.; Sermet, C. Wheat Flour Milling Yield Estimation Based on Wheat Kernel Physical Properties Using Artificial Neural Networks. Int. J. Intell. Syst. Appl. Eng. 2020, 8, 78–83.
  38. Graupe, D. Principles of Artificial Neural Networks, 2nd ed.; World Scientific Publishing Company: Chicago, IL, USA, 2006; ISBN 9789812706249.
  39. Karaboga, D.; Akay, B. A comparative study of Artificial Bee Colony algorithm. Appl. Math. Comput. 2009, 214, 108–132.
  40. Ozturk, C.; Karaboga, D. Hybrid Artificial Bee Colony algorithm for neural network training. In Proceedings of the 2011 IEEE Congress of Evolutionary Computation, CEC, New Orleans, LA, USA, 5–8 June 2011; pp. 84–88.
  41. Unlersen, M.F. FPGA Kullanılarak Dizi Anten Performansının Iyileştirilmesi (Improving Array Antenna Performance Using FPGA); Institute of Science and Technology, Electrical and Electronics Engineering Department, Selcuk University: Konya, Turkey, 2015.
  42. Nagy, Z.; Werner-Stark, Á.; Dulai, T. An Artificial Bee Colony Algorithm for Static and Dynamic Capacitated Arc Routing Problems. Mathematics 2022, 10, 2205.
  43. Yu, H. Network complexity analysis of multilayer feedforward artificial neural networks. In Applications of Neural Networks in High Assurance Systems; Schumann, J.; Liu, Y., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 41–55.
  44. Russell, S.J.; Norvig, P.; Davis, E. Artificial Intelligence: A Modern Approach; Prentice Hall: Hoboken, NJ, USA, 2010; ISBN 0136042597.
  45. Lux Luna. Computational Complexity of Neural Networks. Available online: https://lunalux.io/computational-complexity-of-neural-networks/ (accessed on 5 November 2022).
  46. Kon, M.A.; Plaskota, L. Complexity of Predictive Neural Networks. In Proceedings of the International Conference on Complex Systems, New England Complex Systems Institute, Nashua, NH, USA, 21–26 May 2000; pp. 1–8.
  47. Weka 3: Data Mining Software in Java; 2018. Available online: https://www.cs.waikato.ac.nz/ml/weka/ (accessed on 23 March 2022).
  48. Koklu, M.; Sabanci, K.; Unlersen, M.F. Classification of Heuristic Information by Using Machine Learning Algorithms. Int. J. Intell. Syst. Appl. Eng. 2016, 4, 252–254.
  49. Witten, I.H.; Frank, E.; Hall, M.A.; Pal, C.J. Data Mining: Practical Machine Learning Tools and Techniques; Elsevier Inc.: Amsterdam, The Netherlands, 2016; pp. 1–621. ISBN 9780128042915.
  50. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; The MIT Press: Cambridge, MA, USA; London, UK, 2006; pp. 715–719. ISBN 026218253X.
  51. Wang, J. An Intuitive Tutorial to Gaussian Processes Regression; Queen's University: Kingston, ON, Canada, 2021.
  52. Huang, C.-L.; Liao, H.-C.; Chen, M.-C. Prediction model building and feature selection with support vector machines in breast cancer diagnosis. Expert Syst. Appl. 2008, 34, 578–587.
  53. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
Figure 1. The general concept of the study.
Figure 2. The environment of scenario 1. (a) Training data points. (b) Testing data points. The three black dots are AP locations, and the red dots are measuring points.
Figure 3. The environment of scenario 2. (a) Training data points. (b) Testing data points. The three black dots are AP locations, and the red dots are measuring points.
Figure 4. The environment of scenario 3. (a) Training data points. (b) Testing data points. The three black dots are AP locations, and the red dots are measuring points.
Figure 5. Computational neuron model.
Figure 6. Fivefold cross-validation method: in fold k (k = 1, ..., 5), the k-th section is reserved for validation and the remaining sections are used for training.
Figure 7. Loss function graph.
Table 1. Sample records from the RSSI data set.

# | Scenario   | X      | Y      | RSSI A  | RSSI B  | RSSI C
1 | Scenario 1 | 1.00 m | 0.50 m | −58 dBm | −57 dBm | −62 dBm
2 | Scenario 1 | 1.50 m | 2.00 m | −52 dBm | −60 dBm | −61 dBm
3 | Scenario 2 | 0.65 m | 0.78 m | −40 dBm | −42 dBm | −45 dBm
4 | Scenario 2 | 3.27 m | 2.56 m | −43 dBm | −45 dBm | −37 dBm
5 | Scenario 3 | 0.60 m | 0.62 m | −31 dBm | −43 dBm | −48 dBm
6 | Scenario 3 | 3.00 m | 1.87 m | −32 dBm | −38 dBm | −39 dBm
Table 2. Number of records in the data sets.

           | Training Data Set (pcs) | Testing Data Set (pcs)
Scenario 1 | 49                      | 10
Scenario 2 | 16                      | 6
Scenario 3 | 40                      | 16
Total      | 105                     | 32
Table 3. ANN parameters.

       |                     | Raw Data Set    | Path Loss Adapted Data Set | Exponential Transformed Data Set
X-axis | Activation function | Tangent Sigmoid | Tangent Sigmoid            | Tangent Sigmoid
       | Number of neurons   | 19              | 4                          | 24
Y-axis | Activation function | Tangent Sigmoid | Tangent Sigmoid            | Tangent Sigmoid
       | Number of neurons   | 19              | 5                          | 26
Table 4. Number of parameters in each ANN to be optimized.

       | Raw Data Set (pcs) | Path Loss Adapted Data Set (pcs) | Exponential Transformed Data Set (pcs)
X-axis | 96                 | 21                               | 121
Y-axis | 96                 | 26                               | 131
Table 5. Estimation accuracy of ANN in terms of MAE.

           | The Results in Reference [21] | Raw Data Set | Path Loss Adapted Data Set | Exponential Transformed Data Set
Scenario 1 | 1.8303 m                      | 1.5844 m     | 1.0528 m                   | 1.4405 m
Scenario 2 | 1.4147 m                      | 0.8153 m     | 0.8035 m                   | 1.1245 m
Scenario 3 | 1.3856 m                      | 1.3064 m     | 1.1865 m                   | 1.3869 m
Average    | 1.6020 m                      | 1.2824 m     | 1.0729 m                   | 1.3544 m
Table 6. Estimation accuracy of all methods in terms of MAE.

           | Preprocess              | Gaussian Processes (m) | Linear Regression (m) | SVM (m) | k-NN (m) | Random Forest (m) | ABC-ANN (m)
Scenario 1 | Raw                     | 1.8936                 | 1.8134                | 1.7655  | 2.0828   | 1.8640            | 1.5844
           | Path loss adapted       | 1.7947                 | 1.8070                | 1.6733  | 1.8752   | 1.8319            | 1.0528
           | Exponential transformed | 1.7866                 | 1.8770                | 1.7705  | 1.8599   | 1.8519            | 1.4405
Scenario 2 | Raw                     | 1.5128                 | 1.7137                | 1.5416  | 1.3029   | 1.7277            | 0.8153
           | Path loss adapted       | 1.4888                 | 1.8907                | 1.6481  | 1.4000   | 1.7180            | 0.8035
           | Exponential transformed | 1.5705                 | 1.8969                | 1.5049  | 1.6879   | 1.7131            | 1.1245
Scenario 3 | Raw                     | 1.6845                 | 2.9436                | 1.5928  | 2.7219   | 2.5790            | 1.3064
           | Path loss adapted       | 2.6020                 | 3.8453                | 1.5560  | 2.5702   | 2.5417            | 1.1865
           | Exponential transformed | 2.8341                 | 2.4684                | 1.5083  | 2.5831   | 2.6597            | 1.3869
Average    | Raw                     | 1.6970                 | 2.1569                | 1.6333  | 2.0358   | 2.0569            | 1.2354
           | Path loss adapted       | 1.9618                 | 2.5144                | 1.5925  | 1.9485   | 2.0305            | 1.0143
           | Exponential transformed | 2.0638                 | 2.0808                | 1.5946  | 2.0436   | 2.0749            | 1.3173
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
