
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

The paper presents an indoor navigation solution that combines physical motion recognition with wireless positioning. Twenty-seven simple features are extracted from the built-in accelerometers and magnetometers in a smartphone. Eight common motion states used during indoor navigation are detected by a Least Square-Support Vector Machines (LS-SVM) classification algorithm: static, standing with hand swinging, normal walking while holding the phone in hand, normal walking with hand swinging, fast walking, U-turning, going up stairs, and going down stairs. The results indicate that the motion states are recognized with an accuracy of up to 95.53% for the test cases employed in this study. A motion recognition assisted wireless positioning approach is applied to determine the position of a mobile user. Field tests show a mean error of 1.22 m in “Static Tests” and 3.53 m in “Stop-Go Tests”.

Nowadays, with the explosive growth of the capabilities in handheld devices, various components are embedded into smartphones, such as GPS, WLAN (a.k.a. Wi-Fi), Bluetooth, accelerometers, magnetometers, cameras,

To address positioning and navigation in GNSS-degraded or denied areas, various technologies are broadly researched [

Benefiting from the existing infrastructure, RF-based technologies, such as WLAN, Bluetooth, cellular networks, and RFID, are among the most promising alternatives. RADAR [

Meanwhile, human physical activity recognition using MEMS sensors has been extensively applied for health monitoring, emergency services, athletic training, navigation,

Since mobile devices are becoming smarter and smarter nowadays, the smartphone already contains the potential for indoor navigation and positioning within the existing infrastructures [

Related research indicates that utilizing opportunistic signals, e.g., WLAN, is an efficient locating alternative in GPS-denied environments. However, in order to minimize a smartphone's battery drain, the WLAN scanning interval is always limited. For instance, most Nokia mobile phones refresh the scanned WLAN information approximately every 8–10 s. The default scanning interval of most Android devices is 15 s. On the other hand, other built-in sensors such as accelerometers are always turned on, so that the physical orientation of the smartphone is always known to the system. These sensors provide an alternative for positioning while WLAN positioning is unavailable.

During the gaps where no wireless signal is updated, the most essential elements for navigation are the movement speed and orientation (

Human motion has been widely studied for decades, especially in recent years using computer vision technology. Poppe gives an overview of vision based human motion analysis in [

Unlike the solution with sensors fixed on the body, a smartphone in hand has more degrees of freedom (DOF) during the navigation process. Even if we only consider the case where the user holds the phone in hand, the motion behaviour is still complicated. For this reason, we define the eight most common motion states during pedestrian navigation in this paper. In order to classify the motion states, twenty-seven features are investigated in this section.

The motion states, as defined in

S-series motion states (

W-series is relevant to walking. After observing the walking behaviour of the user when navigating, three types of walking motion states have been defined. As shown in the left image of

T-series is related to turning motions. UT represents so-called U-turning, which is a spot turn without any horizontal displacement. As shown in

V-series concerns motions in the vertical dimension. In

When using tri-axis accelerometer sensors, the sensor orientation determines the local coordinate system of each (x, y, z) reading. Most previous research work on motion recognition has used body-worn accelerometer sensors,

To avoid this orientation problem, the magnitude of the accelerometer signal (see the ‘Acc Filter’ line of

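The gravity vector is denoted as:
$$\mathbf{g} = [g_x, g_y, g_z]$$

The acceleration vector can be expressed as:
$$\mathbf{a} = [a_x, a_y, a_z]$$

The projection of $\mathbf{a}$ onto the gravity direction, $\mathbf{a}_p$, is:
$$\mathbf{a}_p = \frac{\mathbf{a} \cdot \mathbf{g}}{\mathbf{g} \cdot \mathbf{g}}\,\mathbf{g}$$

Then the horizontal component $\mathbf{a}_h$ is:
$$\mathbf{a}_h = \mathbf{a} - \mathbf{a}_p$$

The direction of $\mathbf{a}_h$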
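The decomposition into a vertical and a horizontal component can be illustrated with a minimal Python sketch. It assumes that a gravity estimate is already available (for example, from averaging recent accelerometer samples, which is an assumption here); the function name `split_acceleration` and the sample values are hypothetical.

```python
def split_acceleration(a, g):
    """Split acceleration a into its projection onto gravity g and the remainder.

    a, g: sequences of three floats (x, y, z) in the phone's local frame.
    Returns (a_p, a_h): the component along gravity and the horizontal component.
    """
    dot_ag = sum(ai * gi for ai, gi in zip(a, g))
    dot_gg = sum(gi * gi for gi in g)
    if dot_gg == 0.0:
        raise ValueError("gravity estimate must be non-zero")
    scale = dot_ag / dot_gg
    a_p = [scale * gi for gi in g]              # projection onto gravity
    a_h = [ai - pi for ai, pi in zip(a, a_p)]   # what is left is horizontal
    return a_p, a_h


# Hypothetical reading with gravity aligned to the z-axis of the phone.
a_p, a_h = split_acceleration([1.0, 2.0, 9.0], [0.0, 0.0, 9.81])
```

Because only the projection onto the gravity estimate is used, the result does not depend on how the phone is oriented in the hand, as long as the gravity estimate is expressed in the same local frame.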

In addition to the accelerometer sensors, a magnetometer, also known as a digital compass, is another data source that can be utilized for motion recognition in a smartphone. The magnetometer, however, has some significant drawbacks. Indeed, magnetic disturbances are numerous, particularly in indoor environments.

After analyzing the physical characteristics of the motion behavior, twenty-seven features are defined for the motion state estimation, including time-domain features of acceleration (Features 1–18) and heading (Features 19–21) and frequency domain features of acceleration (Features 22–27). Note that in
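As an illustration of the kind of features involved, the following sketch computes the mean and variance of the acceleration magnitude and the dominant frequency of a window of samples. The window length, the sampling rate, and the function names are assumptions made for this example, and a naive discrete Fourier transform stands in for whatever spectral method was actually used.

```python
import cmath
import math


def magnitude(samples):
    """Magnitude of each (x, y, z) accelerometer sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance of the window."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def dominant_frequency(values, sample_rate_hz):
    """Return (frequency_hz, amplitude) of the strongest non-DC component."""
    n = len(values)
    m = mean(values)
    centered = [v - m for v in values]  # remove the DC (gravity) offset
    best_k, best_amp = 0, 0.0
    for k in range(1, (n + 1) // 2):    # bins below the Nyquist frequency
        coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                    for i, c in enumerate(centered))
        amp = 2.0 * abs(coeff) / n
        if amp > best_amp:
            best_k, best_amp = k, amp
    return best_k * sample_rate_hz / n, best_amp


# Synthetic 2 s window at an assumed 50 Hz rate: gravity plus a 2 Hz step-like motion.
RATE_HZ = 50.0
window = [(0.0, 0.0, 9.81 + 1.5 * math.sin(2 * math.pi * 2.0 * i / RATE_HZ))
          for i in range(100)]
acc = magnitude(window)
mean_acc = mean(acc)
var_acc = variance(acc)
freq, amp = dominant_frequency(acc, RATE_HZ)
```

In this synthetic window the dominant frequency corresponds to the 2 Hz oscillation that was injected, which is the kind of periodicity a walking pattern would produce.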

The motion recognition method presented in this paper aims at determining which of the eight motions has caused the above twenty-seven simple features. The possible classification algorithms include k-Nearest Neighbour (kNN), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Naïve Bayesian Classifier (NBC), Bayesian Network (BN), Decision Tree (DT), Artificial Neural Networks (ANN), Support Vector Machines (SVMs) and so forth. Thanks to its efficient pattern recognition performance in non-linear multi-class scenarios, in this study we adopt the Least Square-Support Vector Machines (LS-SVM) [

The concept of SVMs, which was originally developed for binary classification problems, is the use of hyperplanes to define decision boundaries separating data points of different classes. SVMs are able to handle both simple, linear classification tasks, as well as more complex,

When the data are linearly separable, the separating hyperplane can be defined in many ways. SVMs are based on the maximal margin principle, where the aim is to construct a hyperplane with maximal distance between the two classes. In most real-life applications, however, the data of both classes overlap, which makes a perfect linear separation impossible. Therefore, a restricted number of misclassifications should be tolerated around the margins. The resulting optimization problem for SVMs, where violation of the constraints is penalized, is written as:
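$$\min_{\omega, b, \xi} J_1(\omega, \xi) = \frac{1}{2}\omega^T\omega + C\sum_{i=1}^{N}\xi_i$$

subject to:

$$y_i\left[\omega^T\varphi(x_i) + b\right] \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, N$$

where $x_i$ is the feature vector of training sample $i$, $y_i \in \{-1, +1\}$ is its class label, $\varphi(\cdot)$ maps the input into the feature space, $\xi_i$ is the slack variable that measures the violation of the margin, and $C > 0$ is the regularization constant.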

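Typically, the problem formulation above is referred to as the primal optimization problem. The equivalent dual problem is expressed in terms of the Lagrange multipliers $\alpha_i$.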

The solution for the Lagrange multipliers is obtained by solving a quadratic programming problem. The SVM classifier takes the form:
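$$y(x) = \operatorname{sign}\left[\sum_{i=1}^{N}\alpha_i y_i K(x, x_i) + b\right]$$

where $K(x, x_i) = \varphi(x_i)^T\varphi(x)$ is the kernel function. Kernel functions may carry tuning constants; for example, the multilayer perceptron kernel $K(x, x_i) = \tanh(\kappa_1 x_i^T x + \kappa_2)$ has the constants $\kappa_1$ and $\kappa_2$.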

The classification technique used in this work is the LS-SVM. LS-SVM tackles linear systems rather than solving convex optimization problems, typically quadratic programs, as in standard support vector machines (SVM) [

These parameters can be found by solving the following optimization problem having a quadratic cost function and equality constraints:
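$$\min_{\omega, b, e} J(\omega, b, e) = \frac{1}{2}\omega^T\omega + \frac{\gamma}{2}\sum_{i=1}^{N}e_i^2$$

subject to:

$$y_i\left[\omega^T\varphi(x_i) + b\right] = 1 - e_i, \quad i = 1, \ldots, N$$

where $e = [e_1, \ldots, e_N]^T$ is the vector of error variables, $\varphi: \mathbb{R}^d \rightarrow \mathbb{R}^{d_h}$ maps the input space to a feature space of dimension $d_h$, $\omega \in \mathbb{R}^{d_h}$ is the weight vector, and $\gamma$ is the regularization constant.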

Taking the conditions for optimality, we set:
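A sketch of these conditions in the standard LS-SVM form is given below. It assumes the cost function $J = \frac{1}{2}\omega^T\omega + \frac{\gamma}{2}\sum_i e_i^2$ with the equality constraints $y_i[\omega^T\varphi(x_i) + b] = 1 - e_i$ and Lagrange multipliers $\alpha_i$; the symbols are assumptions wherever the text does not define them.

```latex
\mathcal{L}(\omega, b, e; \alpha) = \frac{1}{2}\omega^T\omega + \frac{\gamma}{2}\sum_{i=1}^{N} e_i^2
  - \sum_{i=1}^{N} \alpha_i \left\{ y_i\left[\omega^T\varphi(x_i) + b\right] - 1 + e_i \right\}

\frac{\partial \mathcal{L}}{\partial \omega} = 0 \;\Rightarrow\; \omega = \sum_{i=1}^{N} \alpha_i y_i \varphi(x_i)

\frac{\partial \mathcal{L}}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{N} \alpha_i y_i = 0

\frac{\partial \mathcal{L}}{\partial e_i} = 0 \;\Rightarrow\; \alpha_i = \gamma e_i, \quad i = 1, \ldots, N

\frac{\partial \mathcal{L}}{\partial \alpha_i} = 0 \;\Rightarrow\; y_i\left[\omega^T\varphi(x_i) + b\right] - 1 + e_i = 0, \quad i = 1, \ldots, N
```

Eliminating $\omega$ and $e$ from these four conditions leaves a set of linear equations in $b$ and $\alpha$ only, which is why LS-SVM training reduces to solving a linear system.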

Whereas the primal problem is expressed in terms of the feature map, the linear optimization problem in the dual space is expressed in terms of the kernel function:
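$$\begin{bmatrix} 0 & y^T \\ y & \Omega + I_N/\gamma \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ 1_N \end{bmatrix}$$

where $y = [y_1, \ldots, y_N]^T$, $\alpha = [\alpha_1, \ldots, \alpha_N]^T$, $1_N = [1 \ldots 1]^T$ is the vector of $N$ ones, $I_N$ is the $N \times N$ identity matrix, and $\Omega \in \mathbb{R}^{N \times N}$ has the entries $\Omega_{ij} = y_i y_j \varphi(x_i)^T\varphi(x_j) = y_i y_j K(x_i, x_j)$, with the kernel function $K(x, x_i) = \varphi(x)^T\varphi(x_i)$.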
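To make the dual formulation concrete, the following is a minimal sketch of a binary LS-SVM trained by solving one linear system. It is not the implementation used in this study: the RBF kernel, the values of `gamma` and `sigma2`, the toy data, and all function names are assumptions, and the eight-class problem would additionally need a multi-class coding scheme that is not shown.

```python
import math


def rbf_kernel(u, v, sigma2):
    """Gaussian RBF kernel exp(-||u - v||^2 / sigma2) (assumed kernel choice)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / sigma2)


def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def train_lssvm(X, y, gamma, sigma2):
    """Build [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1] and solve it."""
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        A[0][i + 1] = y[i]
        A[i + 1][0] = y[i]
        for j in range(n):
            A[i + 1][j + 1] = y[i] * y[j] * rbf_kernel(X[i], X[j], sigma2)
        A[i + 1][i + 1] += 1.0 / gamma
    solution = solve_linear_system(A, [0.0] + [1.0] * n)
    return solution[0], solution[1:]  # bias b, support values alpha


def classify(x, X, y, b, alpha, sigma2):
    """Sign of sum_i alpha_i y_i K(x, x_i) + b."""
    score = sum(a * yi * rbf_kernel(x, xi, sigma2)
                for a, yi, xi in zip(alpha, y, X)) + b
    return 1 if score >= 0 else -1


# Toy two-class data with hypothetical two-dimensional feature vectors.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]]
y = [-1, -1, -1, 1, 1, 1]
b, alpha = train_lssvm(X, y, gamma=10.0, sigma2=2.0)
label_near_first_cluster = classify([0.5, 0.5], X, y, b, alpha, sigma2=2.0)
label_near_second_cluster = classify([3.5, 3.5], X, y, b, alpha, sigma2=2.0)
```

The sketch uses plain Gaussian elimination to stay dependency-free; a numerical library would normally be preferred for larger training sets, since the system has one row per training sample.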

As shown in ^{2}

In this paper, the motion recognition assisted indoor navigation solution interpolates the locations calculated by wireless positioning which uses the fingerprinting approach described below. Provided with the discrete locations from wireless positioning and recognized motion states, a grid-based filter based on the hidden Markov model is applied to compute the continuous positions of a smartphone user.

Received signal strength indicators (RSSIs) are the basic observables in this approach. The process consists of a training phase and a positioning phase. During the training phase, a radio map of probability distributions of the received signal strength is constructed for the targeted area. The targeted area is divided into a grid, and the central point of each cell in the grid is referred to as a reference point. The probability distribution of the received signal strength at each reference point is represented by a Weibull function [

During the positioning phase, the current location is determined using the measured RSSI observations in real time and the constructed radio map. The Bayesian theorem and Histogram Maximum Likelihood algorithm are used for positioning [

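Given the RSSI measurement vector $O = \{O_1, O_2, \ldots, O_k\}$, where $O_n$ is the RSSI observed from the $n$-th access point, the location $l$ is estimated as the reference point that maximizes the posterior probability, which by the Bayesian theorem is:
$$\hat{l} = \arg\max_{l} P(l \mid O) = \arg\max_{l} \frac{P(O \mid l)\,P(l)}{P(O)}$$

We assume that the mobile device has equal probability to access each reference point, thus $P(l)$ is a constant, and $P(O)$ does not depend on $l$.

Now it becomes a problem of finding the maximum conditional probability of:
$$P(O \mid l) = \prod_{n=1}^{k} P(O_n \mid l)$$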
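The maximum likelihood search over reference points can be sketched as follows. The radio map layout (a dictionary of per-access-point RSSI histograms), the floor probability for unseen values, and the names `locate`, `RP1`, `ap_a`, and so on are all hypothetical; plain histograms are used here in place of the fitted distributions of the radio map, and independence between access points is assumed.

```python
import math


def locate(radio_map, observation, floor_prob=1e-6):
    """Return the reference point that maximizes prod_n P(O_n | l).

    radio_map: {reference_point: {ap_id: {rssi_dbm: probability}}}
    observation: {ap_id: rssi_dbm}
    The product is evaluated in log space to avoid numerical underflow.
    """
    best_point, best_log_like = None, -math.inf
    for point, ap_histograms in radio_map.items():
        log_like = 0.0
        for ap, rssi in observation.items():
            p = ap_histograms.get(ap, {}).get(rssi, floor_prob)
            log_like += math.log(max(p, floor_prob))
        if log_like > best_log_like:
            best_point, best_log_like = point, log_like
    return best_point


# Hypothetical radio map with two reference points and two access points.
radio_map = {
    "RP1": {"ap_a": {-50: 0.7, -60: 0.3}, "ap_b": {-80: 0.6, -70: 0.4}},
    "RP2": {"ap_a": {-70: 0.8, -60: 0.2}, "ap_b": {-60: 0.9, -70: 0.1}},
}
estimate_1 = locate(radio_map, {"ap_a": -50, "ap_b": -80})
estimate_2 = locate(radio_map, {"ap_a": -70, "ap_b": -60})
estimate_3 = locate(radio_map, {"ap_a": -60, "ap_b": -70})
```

The small floor probability keeps a single unseen RSSI value from ruling out a reference point entirely, which is a design choice of this sketch rather than something stated in the text.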

The grid-based filter of the hidden Markov model (HMM) is implemented to produce an optimal estimation based on the previous state. The transition probability matrix of the HMM is computed according to the travelled distance, which can be estimated from knowledge of the motion over time. For instance, the travelled distance is zero if the current motion mode is static. The distance travelled by the user while navigating can be calculated each second as:
$$d_t = v_t\,\Delta t, \quad \Delta t = 1\ \mathrm{s}$$

where $v_t$ is the speed estimated for the motion state recognized at epoch $t$.

The velocity estimation models vary in different motion states. These estimations are out of the scope of this paper. More details about velocity estimation can be found in [
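As a rough illustration of how recognized motion states could drive the distance computation, the sketch below accumulates one distance increment per second. The speed values are placeholders chosen for this example only and are not the velocity models of the study; the zero speeds for ST, SS, and UT follow from the definitions of those states as motions without horizontal displacement.

```python
# Hypothetical per-state speeds in m/s (placeholders, not measured values).
ASSUMED_SPEED_M_PER_S = {
    "ST": 0.0,  # static
    "SS": 0.0,  # standing with hand swinging
    "WH": 1.0,  # walking, phone held in hand
    "WS": 1.2,  # walking with hand swinging
    "WF": 1.8,  # fast walking
    "UT": 0.0,  # U-turn on the spot
    "US": 0.5,  # going up stairs
    "DS": 0.6,  # going down stairs
}


def travelled_distance(states, dt_s=1.0):
    """Sum speed * dt over a sequence of motion states, one state per epoch."""
    return sum(ASSUMED_SPEED_M_PER_S[state] * dt_s for state in states)


# Four seconds between two WLAN updates: one static, two walking, one fast walking.
distance_m = travelled_distance(["ST", "WH", "WH", "WF"])
```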

The grid-based filter produces an optimal estimation if the state space is discrete and consists of a finite number of states. If a numerical approximation is employed to obtain a discrete and finite state space, the grid-based filter produces a suboptimal estimation [

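Given measurements up to epoch $t-1$, the conditional probability of each state $s_i$ is denoted by $w^i_{t-1|t-1}$, that is, $w^i_{t-1|t-1} = P(s_{t-1} = s_i \mid o_1, \ldots, o_{t-1})$, where $s_i$ is the $i$-th state of the grid and $o_t$ is the measurement at epoch $t$.

The grid-based filter consists of prediction and update stages as follows, similar to those used in other recursive Bayesian filters.

Prediction stage:
$$w^i_{t|t-1} = \sum_{j} w^j_{t-1|t-1}\,P(s_t = s_i \mid s_{t-1} = s_j)$$

Update stage:
$$w^i_{t|t} = \frac{w^i_{t|t-1}\,P(o_t \mid s_t = s_i)}{\sum_{j} w^j_{t|t-1}\,P(o_t \mid s_t = s_j)}$$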

Once the posterior probabilities of all states are estimated, the filter solution is given by the state with the maximum probability.
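The two stages and the final maximum-probability selection can be sketched for a small one-dimensional grid. The standard grid-based filter recursion is assumed; the distance-based transition model, its `spread_m` parameter, the cell layout, and the likelihood values are hypothetical choices made for this example.

```python
import math


def transition_from_distance(cell_positions, distance_m, spread_m=1.0):
    """Hypothetical transition model: favour cells whose separation from the
    previous cell is close to the travelled distance. Each row sums to 1."""
    rows = []
    for p_from in cell_positions:
        weights = [math.exp(-((abs(p_to - p_from) - distance_m) ** 2)
                            / (2.0 * spread_m ** 2))
                   for p_to in cell_positions]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def predict_stage(prior, transition):
    """w_{t|t-1}^i = sum_j w_{t-1|t-1}^j * P(s_i | s_j)."""
    n = len(prior)
    return [sum(prior[j] * transition[j][i] for j in range(n)) for i in range(n)]


def update_stage(predicted, likelihood):
    """w_{t|t}^i = w_{t|t-1}^i * P(o_t | s_i), normalized over all cells."""
    unnormalized = [p * l for p, l in zip(predicted, likelihood)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero likelihood in every cell")
    return [u / total for u in unnormalized]


# Four cells along a corridor, 2 m apart; the user starts in the first cell.
cells = [0.0, 2.0, 4.0, 6.0]
prior = [1.0, 0.0, 0.0, 0.0]
transition = transition_from_distance(cells, distance_m=4.0)
predicted = predict_stage(prior, transition)
wlan_likelihood = [0.1, 0.2, 0.5, 0.2]  # hypothetical P(o_t | s_i) per cell
posterior = update_stage(predicted, wlan_likelihood)
best_cell = max(range(len(posterior)), key=posterior.__getitem__)
```

Here the travelled distance enters through the transition matrix; feeding it in as an additional observation would change only the likelihood term of the update stage.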

In the tests described in the following section, we applied magnetometer readings via map-matching instead of using the heading directly obtained from a magnetometer since the magnetic disturbances are numerous, particularly in indoor environments. With the heading input from the magnetometer and current position estimate, matched direction is derived from the segment vector in the topological network of the fingerprint database. In addition, the cumulative travel distance over the duration without WLAN positioning is used as an observation in the HMM grid-based filter for determining the position.
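The idea of replacing a disturbed magnetometer heading with the direction of a network segment can be sketched as follows. The sketch assumes that the nearest segment has already been selected from the current position estimate and that coordinates are given as (east, north) pairs in metres; the function names are hypothetical.

```python
import math


def segment_bearing_deg(start, end):
    """Bearing of the segment start -> end, clockwise from north, in [0, 360)."""
    d_east = end[0] - start[0]
    d_north = end[1] - start[1]
    return math.degrees(math.atan2(d_east, d_north)) % 360.0


def angular_difference_deg(a, b):
    """Smallest absolute difference between two headings, in [0, 180]."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def match_heading(heading_deg, segment):
    """Snap a noisy heading to the closer of the two directions of a segment."""
    start, end = segment
    forward = segment_bearing_deg(start, end)
    backward = (forward + 180.0) % 360.0
    return min((forward, backward),
               key=lambda candidate: angular_difference_deg(heading_deg, candidate))


# Hypothetical corridor segment running due east.
corridor = ((0.0, 0.0), (10.0, 0.0))
matched_a = match_heading(70.0, corridor)   # disturbed heading, roughly eastward
matched_b = match_heading(250.0, corridor)  # disturbed heading, roughly westward
```

Snapping to one of only two candidate directions makes the result insensitive to moderate magnetic disturbances, at the cost of assuming the user actually walks along the segment.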

To verify the solution proposed in this paper, some field tests were carried out in the Finnish Geodetic Institute (FGI) office building, which has three floors. A smartphone application was developed for collecting sensor data, labeling the motion state, and locating the smartphone position. Five persons collected the data for motion recognition in one day. Each person performed eight motion states respectively. Testers made marks at the beginning and end of the motion to separate the samples of motion states. All the collected data were divided into two groups. One was selected as a training data set. The other was utilized as a testing data set. The training data sets were used for learning the parameters of the classification algorithm. The testing data sets were used for validating the recognition rate of a classifier.

In order to evaluate the performance of LS-SVM classifier for motion recognition, the same data sets are also applied in four other classification algorithms for comparison: Bayesian Network using the Gaussian Mixture Model (BN-GMM), Decision Tree (DT), Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA).

The test results indicate that:

LS-SVM classifier has the best performance in different feature combinations.

Including all features does not help the recognition rate.

Features 4 and 5 are the most efficient features for the tested motion recognition.

Another test was carried out by a tester who travelled around in the FGI office building for 20 min. As shown in

To demonstrate the advantage of wireless positioning combined with motion recognition, two positioning tests were conducted in the FGI building. A WLAN fingerprint database covering the three floors of the FGI building was generated beforehand and used in the following tests. The first test, called a “Static Test,” was carried out in a static state: a user stood on a reference point while holding the phone in hand (ST) for ten minutes. The results are summarized in

The second test is called the “Stop-Go Test”. In the FGI office building, a tester stopped at each reference point to obtain the wireless positioning estimation, then moved to another reference point while randomly performing varying motion states between two stops.

In this paper, the motion recognition assisted wireless positioning method is presented. The raw data from the accelerometer and magnetometer on a smartphone are processed into twenty-seven features. Then eight motion states are predicted by separately applying the LS-SVM, BN-GMM, DT, LDA and QDA classifiers. The test results indicate that the LS-SVM classifier achieves a higher motion recognition rate than the other four classifiers. The recognition rates of T-series and V-series motions are lower than those of S-series and W-series motions. Furthermore, both positioning accuracy and floor detection rate are significantly improved by applying motion recognition in the wireless positioning algorithms.

Despite the fact that the motion recognition solution proposed in this paper provides correct motion recognition for up to 95.53% of the test cases, the motion behavior varies from person to person. In the future we will involve more persons for testing the motion recognition algorithms and determine the most useful features for classification. In addition, more motion states will be considered for indoor navigation. For instance, we currently only consider the “using-stairs” motion in the V-series motions. Other V-series motions such as “using-elevator” will be studied in the future. Lastly, the T-series motions introduce much more confusion because it is possible to combine them with the other motions simultaneously. Therefore, more efforts will be concentrated on the T-series motions in the future. For instance, we are currently studying the use of gyroscopes inside smartphones, which provide heading change rate.

This work is a part of the Indoor Outdoor Seamless Navigation for Sensing Human Behavior (INOSENSE) project, funded by the Academy of Finland.

S-series (left and middle: ST, right: SS).

W-series (left: WH, middle: WS, and right: WF).

T-series (UT).

V-series (left: US and right: DS).

Accelerometer readings.

Magnetometer readings.

Mapping of the data from the input space to a high-dimensional feature space.

The projections of LS-SVM hyperplanes in the original feature space. (Class 1: ST, 2: SS, 3: WH, 4: WS, 5: WF, 6: UT, 7: US, 8: DS).

LS-SVM Motion state predictions (Motion state 1: ST, 2: SS, 3: WH, 4: WS, 5: WF, 6: UT, 7: US, 8: DS).

Motion state definition.

Series | State | Definition
---|---|---
S | ST | A state where the user keeps the phone in hand without any movement.
S | SS | The user's location does not change but the phone is swinging in the hand.
W | WS | Walking with a small arm swing.
W | WH | Using the navigation application on the handset while walking.
W | WF | Fast walking with a significant arm swing.
T | UT | Making a U-turn.
V | US | Going up stairs.
V | DS | Going down stairs.

Feature definition.

No. | Feature | Definition
---|---|---
1 | MeanAccX | Mean value of the acceleration along the x-axis.
2 | MeanAccY | Mean value of the acceleration along the y-axis.
3 | MeanAccZ | Mean value of the acceleration along the z-axis.
4 | MeanAcc | Mean value of the acceleration.
5 | MeanDynAccV | Mean value of the dynamic acceleration in the vertical plane.
6 | MeanDynAccH | Mean value of the dynamic acceleration in the horizontal plane.
7 | MeanAccH | Mean value of the horizontal acceleration.
8 | MeanAccV | Mean value of the vertical acceleration minus gravity acceleration.
9 | MeanDynAcc | Mean value of the dynamic acceleration.
10 | VarAccX | Variance of the acceleration along the x-axis.
11 | VarAccY | Variance of the acceleration along the y-axis.
12 | VarAccZ | Variance of the acceleration along the z-axis.
13 | VarAcc | Variance of the acceleration.
14 | VarDynAccV | Variance of the dynamic acceleration in the vertical plane.
15 | VarDynAccH | Variance of the dynamic acceleration in the horizontal plane.
16 | VarAccH | Variance of the horizontal acceleration.
17 | VarAccV | Variance of the vertical acceleration.
18 | VarDynAcc | Variance of the dynamic acceleration.
19 | MeanMag | Mean value of the heading.
20 | DiffMag | Heading change.
21 | VarMag | Variance of the heading.
22 | 1stFreqAcc | 1st dominant frequency of the acceleration.
23 | Amp1stFreqAcc | Amplitude of the 1st dominant frequency of the acceleration.
24 | 2ndFreqAcc | 2nd dominant frequency of the acceleration.
25 | Amp2ndFreqAcc | Amplitude of the 2nd dominant frequency of the acceleration.
26 | FreqDiffAcc | Difference between the two dominant frequencies.
27 | AmpScaleAcc | Amplitude scale of the two dominant frequencies.

Classifier

| |||||
---|---|---|---|---|---|

^{1} |
67.04 | 77.66 | 75.98 | 74.30 | |

53.07 | 56.43 | 62.01 | 63.13 | ||

73.74 | 83.80 | 86.59 | 86.03 | ||

79.33 | ^{2} |
86.03 | |||

88.83 | 84.36 | 83.24 | |||

64.80 | 85.48 | 73.74 | null^{3} | ||

75.98 | 88.83 | 74.86 | null | ||

73.18 | 85.48 | 73.18 | 87.71 | ||

78.21 | 83.80 | 83.24 | |||

77.10 | 85.48 | 77.10 | null | ||

68.72 | 86.03 | 83.24 | null | ||

74.86 | 82.12 | 84.92 | null | ||

77.10 | 83.24 | ^{4} |
80.45 | ||

81.01 | 84.92 | 86.03 | null | ||

67.60 | 65.36 | null | null | ||

76.54 | 82.68 | null | null | ||

72.07 | 82.12 | null | null | ||

47.49 | 76.54 | 64.25 | 64.80 | ||

48.04 | 70.39 | 59.78 | 68.72 | ||

80.45 | 51.40 | null | null | ||

42.46 | 53.07 | 52.51 | 58.10 | ||

53.63 | 64.25 | null | null |

^{1} The bold and italic number indicates the best recognition rate in each feature combination.

^{2} The bold and underlined number indicates the best recognition rate in each classifier.

^{3} The null value is caused by features that do not satisfy the requirements of the classifier.

^{4} The recognition rates of the feature combinations 4, 5, 21 and 4, 5, 22 are equally the best for the LDA classifier.

Confusion Matrix for the motion recognition from LS-SVM classifier (Unit: %).

 | ST | SS | WH | WS | WF | UT | US | DS
---|---|---|---|---|---|---|---|---
ST | 81.25 | 0 | 0 | 0 | 0 | 18.75 | 0 | 0
SS | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0
WH | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0
WS | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0
WF | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0
UT | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0
US | 0 | 0 | 0 | 0 | 0 | 22.22 | 77.78 | 0
DS | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100

Static Test (Unit: m).

3.43 | 1.22 | |

5.98 | 2.55 | |

21 | 9 | |

0 | 0 |

Stop-Go Test (Unit: m).

4.38 | 3.53 | |

6.02 | 4.55 | |

18 | 9 | |

0 | 0 |

Confusion matrix for floor detection using ML wireless positioning (Unit: %).

 | 1st | 2nd | 3rd
---|---|---|---
1st | 93.94 | 6.06 | 0
2nd | 4.00 | 92.00 | 4.00
3rd | 0 | 17.95 | 82.05

Confusion matrix for floor detection using motion-assisted HMM wireless positioning (Unit: %).

 | 1st | 2nd | 3rd
---|---|---|---
1st | 96.97 | 3.03 | 0
2nd | 4.00 | 96.00 | 0
3rd | 0 | 5.13 | 94.87