Proceeding Paper

Smartphone Mode Recognition During Stairs Motion †

1 Department of Electrical Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel
2 Department of Marine Technology, University of Haifa, Haifa 32000, Israel
* Author to whom correspondence should be addressed.
Presented at the 6th International Electronic Conference on Sensors and Applications, 15–30 November 2019; Available online: https://ecsa-6.sciforum.net/.
Proceedings 2020, 42(1), 65; https://doi.org/10.3390/ecsa-6-06572
Published: 14 November 2019

Abstract

Smartphone mode classification is essential to many applications, such as daily life monitoring, healthcare, and indoor positioning. In the latter, it has been shown that knowledge of the smartphone location on the pedestrian can improve the positioning accuracy. Most of the research conducted in this field focuses on pedestrian motion in the horizontal plane. In this research, we use supervised machine learning techniques to recognize and classify the smartphone mode (text, talk, pocket, and swing) while accounting for movement up and down stairs. We distinguish between the upward and the downward motion, each with four different smartphone modes, making eight states in total. The classification is based on an optimal set of sensors, which varies according to battery life and the energy consumption of each sensor. The classifier was trained and tested on a dataset constructed from measurements of multiple users (94 min in total) to achieve robustness. This provided an accuracy of more than 90% with the cross validation method, and 91.5% if the texting mode is excluded. When considering only stairs motion, regardless of the direction, the accuracy improves to 97%. These results may assist many algorithms, mainly in pedestrian dead reckoning, in addressing challenges such as speed and step length estimation and cumulative error reduction.

1. Introduction

The need for identifying the “smartphone mode”, i.e., the way a person is holding the smartphone, is becoming increasingly significant in many applications, such as healthcare services, commercial usages, and emergency and safety applications [1]. One of the main usages of smartphone mode recognition is to improve the capabilities of indoor navigation algorithms, particularly pedestrian dead reckoning (PDR) approaches. Such approaches are based on step length estimation and heading calculation. The former relies on empirical or biomechanical models that are highly affected by the smartphone mode [2]. Hence, a fast and accurate recognition algorithm that can classify between multiple possible smartphone modes is of great importance.
The smartphone mode is characterized by the relative location of the phone, the phone movement over periods of time, the relative angles of the phone (yaw, pitch, roll), sound levels, luminous intensity, and more. These quantities are measured by the phone's physical sensors, e.g., accelerometer, gyroscope, and magnetometer [3], and can be presented to the user and processed on the device by a variety of applications. In this paper we aim to classify four smartphone modes (texting, talking, swing, and pocket) while the pedestrian is going up or down the stairs.
Each of the smartphone modes above was divided into two groups, walking up the stairs and walking down the stairs, as presented in Figure 1, resulting in eight different classification modes overall. While the recognition of the four common modes has been explored in the past, the separation of ascent and descent is a relatively unexplored field. Results show an accuracy of 90.25% for the eight states and 96.75% for the four main modes.
The rest of the paper is organized as follows: Section 2 presents the problem and our approach to solving it. Section 3 describes the data collection process, experimental setup and results, while Section 4 provides the conclusions.

2. Methodology

2.1. Problem Formulation

Let $\mathbf{x}_t \in \mathbb{R}^d$ be a $d$-dimensional vector representing the data collected from the phone sensors at time $t$. Corresponding to $\mathbf{x}_t$ is the label $y_t$, determined by the smartphone mode at that time. We define a time window of size $n$ as the data gathered from the sensors from time $t-n+1$ to time $t$; the window label $y^n$ must be identical for all data samples composing the window. Thus, we have the following matrix:
$$
W^n = \begin{bmatrix} \mathbf{x}_{t-n+1} \\ \mathbf{x}_{t-n+2} \\ \vdots \\ \mathbf{x}_t \end{bmatrix} \in \mathbb{R}^{n \times d}, \qquad y^n = y_{t-n+1} = \dots = y_t \qquad (1)
$$
The number of time windows extracted from the data is a function of the window size $n$, the number of data samples $T$, and the overlap between consecutive time windows. We note that the overlap between time windows can only extend backwards in time, since we observe a causal system and cannot use future samples in real-life applications. Our goal is to design an algorithm that classifies a given time window to the correct smartphone mode label.
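For illustration, the following minimal sketch builds the window matrices of Equation (1). It assumes the raw samples and labels are already stored as NumPy arrays; the function and variable names are ours and not part of the original implementation.

```python
import numpy as np

def extract_windows(X, y, n, stride=1):
    """Build the window matrices W^n of Equation (1).

    X      : (T, d) array, one row of sensor data per time step.
    y      : (T,)  array of smartphone-mode labels, one per sample.
    n      : window size in samples.
    stride : step between window starts (stride=1 gives an overlap
             of n-1 samples; stride=n gives no overlap, as in Table 1).
    Only windows whose samples all carry the same label are kept,
    and every window only looks backwards in time (causal).
    """
    windows, labels = [], []
    for start in range(0, len(X) - n + 1, stride):
        w_y = y[start:start + n]
        if np.all(w_y == w_y[0]):                 # single label per window
            windows.append(X[start:start + n])
            labels.append(w_y[0])
    return np.asarray(windows), np.asarray(labels)
```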

2.2. Feature Extraction

The time window $W^n$ is composed of $d$ time vectors $\{u_i\}_{i=1}^{d}$ of length $n$, one for each of the $d$ data points in $\mathbf{x}_t$. On each of these vectors we calculate a collection of features to gain more information on the data and improve the classification process. We distinguish between the following groups of features [4]:
  • Statistical features: Calculated by performing statistical analysis on each vector. Examples: mean, standard deviation, median, maximum, minimum, bias, etc.
  • Time features: Calculated by counting and searching for specific conditions on the data points in the vector. Examples: peak count, mean/median crossings, number of recurring values, zero count, etc.
  • Frequency features: The statistical and time features above, calculated on the absolute value and the phase of the Fourier transform of each time window.
  • Cross-measurement features: The statistical and time features above, calculated on the magnitude $\sqrt{f_x^2 + f_y^2 + f_z^2}$ of three-axis measurements, i.e., acceleration, gyroscope, and magnetic field measurements.
Let $\sigma$ denote the number of statistical features and $\tau$ the number of time features. Thus, from each time window $W^n_i$ we produce a feature vector $\chi_i$ with the corresponding label $y_i$ of that time window. Eventually, the full feature matrix is obtained as shown in Equation (2) for $T$ data samples, a time window of size $n$, and an overlap of $n-1$ samples between consecutive windows:
$$
\hat{X} = \begin{bmatrix} \chi_1 \\ \chi_2 \\ \vdots \\ \chi_{T-n} \end{bmatrix} \in \mathbb{R}^{(T-n) \times d\cdot(\sigma+\tau)}, \qquad \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{T-n} \end{bmatrix} \qquad (2)
$$
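As a simplified illustration of the feature groups above, the sketch below computes a handful of statistical and time features per channel, in both the time and the frequency domain. It is a stand-in for the full feature set, not the paper's exact feature list, and all names are ours.

```python
import numpy as np

def window_features(W):
    """Map one time window W (n x d) to a feature vector chi_i.

    For each of the d sensor channels we compute statistical features
    (mean, std, median, max, min) and time features (peak count,
    mean-crossing count), both on the raw signal and on the magnitude
    of its FFT, mirroring the feature groups listed above.
    """
    feats = []
    for col in W.T:                                   # one channel u_i at a time
        for sig in (col, np.abs(np.fft.rfft(col))):   # time and frequency domain
            feats += [sig.mean(), sig.std(), np.median(sig), sig.max(), sig.min()]
            # time features: number of local peaks and mean crossings
            feats.append(int(np.sum((sig[1:-1] > sig[:-2]) & (sig[1:-1] > sig[2:]))))
            feats.append(int(np.sum(np.diff(np.sign(sig - sig.mean())) != 0)))
    return np.asarray(feats)
```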

2.3. Classification

In our research, we compare a wide variety of machine learning algorithms to determine which classifier best fits this problem. Each classifier was trained and tested on our dataset, which contains sensor outputs from six different pedestrians using six different smartphones, and the accuracy was calculated using the cross validation method, as shown in Section 3. After the initial classification test, we improve the results and the robustness of the algorithm using the following methods.
First, we optimize the hyper-parameters of each classifier to best suit the nature of our data. The next step is a feature selection process that extracts the main features from the feature matrix $\hat{X}$. This step is used to avoid over-fitting and spurious correlations, shorten training and testing time, and improve the results. All physical sensors except the thermometer, barometer, and sound level meter generate cross-sensor measurements; for example, the combination of the accelerometer and the gyroscope generates linear acceleration and gravity. Because of the large number of features, we first applied a feature selection method that, after testing all combinations, retains the best subset of measurements for each physical sensor. Afterwards, the dedicated “tsfresh” module was used, which proposes a p-value based approach that inspects the significance of each feature individually [5].
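A minimal sketch of how the p-value based selection can be invoked with tsfresh is shown below, assuming the feature matrix has already been assembled into a pandas DataFrame; the variable names X_hat and y are ours.

```python
import pandas as pd
from tsfresh import select_features

# X_hat: DataFrame of the feature matrix (one row per window, one column
#        per feature); y: Series with the eight mode labels.
# select_features runs per-feature hypothesis tests and keeps only the
# features whose p-values pass the Benjamini-Hochberg procedure.
X_selected = select_features(X_hat, y)
print(f"kept {X_selected.shape[1]} of {X_hat.shape[1]} features")
```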
The optimal subset of sensors and measurements to use in the classification process is also calculated. Filtering the sensor subset is important for achieving optimal results for several reasons. First, some sensor outputs may not be relevant to our type of classification and might even impair the results due to over-fitting. Second, our target application, indoor navigation using a smartphone, requires the classification to be executed quickly and accurately while minimizing energy consumption, i.e., using as few sensors as possible while preserving the best results. A sketch of such an exhaustive subset search is given below.
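The paper does not specify the exact search procedure, so the following sketch is one straightforward reading of "once all combinations have been tested": every sensor subset is scored by cross-validated accuracy. The function and variable names are ours.

```python
import numpy as np
from itertools import combinations
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def best_sensor_subset(features_by_sensor, y, cv=5):
    """Score every possible sensor subset and return the most accurate one.

    features_by_sensor maps a sensor name (e.g. 'accelerometer') to the
    2-D array of features derived from that sensor; subsets are ranked by
    cross-validated accuracy of a Random Forest classifier.
    """
    sensors = list(features_by_sensor)
    best_acc, best_subset = 0.0, ()
    for k in range(1, len(sensors) + 1):
        for subset in combinations(sensors, k):
            X = np.hstack([features_by_sensor[s] for s in subset])
            acc = cross_val_score(RandomForestClassifier(), X, y, cv=cv).mean()
            if acc > best_acc:
                best_acc, best_subset = acc, subset
    return best_subset, best_acc
```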

3. Setup and Results

3.1. Data Collection and Processing

As mentioned in Section 2, the data from the smartphone sensors was collected by six users, covering all eight modes. Each user used a different smartphone with an Android operating system; the application for reading the sensor outputs was identical for all measurements. The sensor output sampling rate was set to 10 Hz; if a device had a faster sampling rate, the data was re-sampled accordingly. Windows of different lengths with no overlap were applied to the datasets in order to execute the feature extraction process. The total amount of data collected and the number of windows extracted for each mode are presented in Table 1.
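The paper does not state how the re-sampling was implemented; the sketch below is one simple option using pandas, assuming each raw log has a timestamp column given in seconds.

```python
import pandas as pd

def resample_to_10hz(df):
    """Re-sample a raw sensor log to the common 10 Hz rate.

    df is assumed to contain a 'timestamp' column in seconds plus one
    column per sensor channel; devices logging faster than 10 Hz are
    averaged into 100 ms bins, and any resulting gaps are interpolated.
    """
    df = df.copy()
    df.index = pd.to_datetime(df.pop("timestamp"), unit="s")
    return df.resample("100ms").mean().interpolate()
```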

3.2. Classification Process

To perform the classification process, feature extraction (Section 2.2) is applied on the time windows (Section 2.1) constructed from the raw sensor measurements. The performance of the proposed algorithm is evaluated by a series of classification tests and comparisons. We compare four types of machine-learning classification algorithms: K Nearest Neighbours (K-NN) [6], Decision Tree [7], Random Forest [8], and XGBoost [9].
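A minimal sketch of such a comparison is given below; the exact hyper-parameters and fold count used in the paper are not stated, and X_sel and y are assumed to be the selected feature matrix and the integer window labels.

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# X_sel: selected feature matrix, y: integer labels (0..7 for the eight
# modes of Table 1); both are assumed to exist already.
classifiers = {
    "K-NN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(),
    "XGBoost": XGBClassifier(),
}

# Score each classifier by cross-validated accuracy, as in Table 2.
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X_sel, y, cv=5, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```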
The accuracy of each classifier was determined by the proportion of correctly predicted labels out of all labels in the test set of each fold of the cross validation process. For validating the proposed algorithm, the accuracy for the main modes without the stairs partition was also tested. Table 2 shows the best results of each classifier as produced with the parameters of the experiment. We notice that the Random Forest classifier showed superior results relative to the other classifiers (90.25%). Furthermore, the accuracy for the four main modes classification was over 96%, i.e., most of the false classifications occurred in the upstairs vs. downstairs labelling and not between the main smartphone modes. This statement is verified in Figure 2, which shows the confusion matrices of the Random Forest classifier. Moreover, the texting mode is the most inaccurate with respect to the ascent and descent division; if we consider only the other three modes, we achieve an accuracy of 91.5%. A possible reason is that in texting mode the user keeps the phone in a relatively static position, which makes the classification more challenging than in the other modes.
The accuracy of the classifiers depends significantly on the features used in the classification process. Figure 3 presents the feature importance attribute of the Random Forest classifier. It shows that the most important features in the classification are related to the specific force measurements produced by the accelerometer. Other dominant features were related to the acceleration calculated from the accelerometer and to the atmospheric pressure measured by the barometer. The influence of the barometer was expected, since it is often used for measuring elevation in many devices, hence its impact on the stairs movement classification process. Figure 4 shows the influence of the number of sensors used in the classification process on its performance. The best results were produced using 8 out of the 10 sensors in the phone. It also shows that only a small degradation in accuracy is obtained, with lower battery consumption, when a smaller sensor subset is used.
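The feature importance ranking of Figure 3 can be reproduced from a fitted Random Forest as sketched below; X_sel, y, and feature_names are assumed to exist, and n_estimators is an arbitrary choice rather than the paper's setting.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Fit on the selected features and list the ten most important ones,
# analogous to Figure 3.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_sel, y)
top = np.argsort(rf.feature_importances_)[::-1][:10]
for idx in top:
    print(f"{feature_names[idx]}: {rf.feature_importances_[idx]:.3f}")
```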

4. Conclusions

In this research we addressed the problem of smartphone mode classification for four modes (texting, talking, swing, and pocket) during stairs movement. A classification accuracy of 90.25% was obtained when classifying the four main smartphone modes while also dividing them into ascending and descending stairs, a relatively unexplored aspect of smartphone mode recognition. In future research we aim to use neural-network based approaches to improve the classification accuracy.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Klein, I.; Solaz, Y.; Ohayon, G. Pedestrian Dead Reckoning with Smartphone Mode Recognition. IEEE Sensors J. 2018, 18, 7577–7584.
  2. Klein, I.; Solaz, Y.; Ohayon, G. Smartphone Motion Mode Recognition. Proceedings 2017, 2, 145.
  3. Rinaldi, S.; Depari, A.; Flammini, A.; Vezzoli, A. Integrating Remote Sensors in a Smartphone: The project 'Sensors for ANDROID in Embedded systems'. In Proceedings of the 2016 IEEE Sensors Applications Symposium, Catania, Italy, 20–22 April 2016; pp. 1–6.
  4. Christ, M.; Braun, N.; Neuffer, J.; Kempa-Liehr, A.W. Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh—A Python package). Neurocomputing 2018, 307, 72–77.
  5. Guyon, I.; Elisseeff, A. An Introduction to Variable and Feature Selection. J. Mach. Learn. Res. 2003, 3, 1157–1182.
  6. Zeng, Y.; Wang, B.; Zhao, L.; Yang, Y. The extended nearest neighbor classification. In Proceedings of the 27th Chinese Control Conference (CCC), Kunming, China, 16–18 July 2008; pp. 559–563.
  7. Kamiński, B.; Jakubczyk, M.; Szufel, P. A framework for sensitivity analysis of decision trees. Cent. Eur. J. Oper. Res. 2018, 26, 135–159.
  8. Liaw, A.; Wiener, M. Classification and Regression by randomForest. R News 2002, 2, 18–22.
  9. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System; ACM: New York, NY, USA, 2016; p. 10.
  10. Khan, I.; Khusro, S.; Ali, S.; Ahmad, J. Sensors are Power Hungry: An Investigation of Smartphone Sensors Impact on Battery Power from Lifelogging Perspective. Bahria Univ. J. Inf. Commun. Technol. 2016, 9.
  11. Horvath, Z.; Jenak, H. Battery consumption of smartphone sensors. In Proceedings of the 11th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Bangkok, Thailand, 23–27 November 2015; pp. 48–52.
Figure 1. Smartphone modes illustrations. A total of eight different smartphone modes divided into four main groups: (a) phone in hand, (b) phone in pocket, (c) walking while talking on the phone, and (d) walking while texting.
Figure 2. Random Forest confusion matrices. (a) Accuracy of each state for the 8 labels with the stairs division, according to the labels in Table 1; (b) accuracy for the 4 labels of the main smartphone modes only (1—swing, 2—pocket, 3—talking, 4—texting).
Figure 3. Feature importance. Ten most important features for the classification process.
Figure 4. Sensors and measurement reduction. The accuracy of the classifiers as a function of number of sensors and phone measurements used. The color represents the battery usage percentage of the specific sensor subset used in the classification process. The power consumption was evaluated with the number of sensors and their average power consumption [10,11].
Table 1. Data collection and the division into time windows of 1 s with no overlap.

                                     Upstairs                          Downstairs
Description             Label   Minutes   Time Windows     Label   Minutes   Time Windows
Phone in hand             1       14.98        896            2      11.19        669
Phone in pocket           3       12.30        736            4      10.91        652
Talking on the phone      5        9.70        580            6      10.94        653
Texting                   7       12.85        768            8      11.16        668
All labels                -       49.83       2980            -      44.20       2642
Table 2. Best results of each classifier.

Classifier        Accuracy [%] with Up/Down Division (8 labels)    Accuracy [%] Main Modes (4 labels)
K-NN                              74.83                                        92.16
Decision Tree                     78.61                                        90.90
Random Forest                     90.25                                        96.75
XGBoost                           90.22                                        95.74
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
