Activity Recognition Using Fusion of Low-Cost Sensors on a Smartphone for Mobile Navigation Application

Abstract: Low-cost inertial and motion sensors embedded on smartphones have provided a new platform for dynamic activity pattern inference. In this research, a comparison has been conducted on different sensor data, feature spaces and feature selection methods to increase the efficiency and reduce the computation cost of activity recognition on the smartphones. We evaluated a variety of feature spaces and a number of classification algorithms from the area of Machine Learning, including Naive Bayes, Decision Trees, Artificial Neural Networks and Support Vector Machine classifiers. A smartphone app that performs activity recognition is being developed to collect data and send them to a server for activity recognition. Using extensive experiments, the performance of various feature spaces has been evaluated. The results showed that the Bayesian Network classifier yields recognition accuracy of 96.21% using four features while requiring fewer computations.


Introduction
Human activity recognition aims to recognize the motion of a person from a series of observations from the user's body and environment. With the advances in wireless communications and Micro-Electro-Mechanical System (MEMS) sensor technologies on mobile devices (e.g., accelerometer, gyroscope, magnetometer), collecting a vast amount of information about the user is feasible in an automatic way; however, it is still difficult to organize and aggregate such information into a coherent, expressive and semantically-rich representation of the user's physical activity [1][2][3][4]. In other words, there is a gap between low-level sensor readings and their high-level activity descriptions. Since the 1980s, this research field has captured the attention of several computer science communities due to its strength in providing personalized support for many different applications. The topic of activity recognition has been motivated by a number of important applications including healthcare, sports, security agencies and context-aware services (e.g., navigation and location-based services). One of the important applications of activity recognition is healthcare. Pollack et al. [5] used an activity recognition system to help the elderly deal with cognitive decline associated with sickness by sending personalized activity reminders. In one work [6], motion recognition was applied to detect symptoms of Parkinson's disease. Chen et al. [7] implemented a cellphone based system for multiple vital signs monitoring. Another application of activity recognition is in detecting abnormal human activity for security monitoring [8]. In ubiquitous computing, with the help of accurate activity recognition, researchers are now capable of providing various personalized support for many real-world applications [1]. Activity recognition has been employed to predict transportation modes [9][10][11]. Lamming and Flynn [12] utilized physical context information such as location in the interaction between different Personal Digital Assistants (PDAs) as retrieval keys. Motion sensors offer an emerging means of capturing body movement in different sport applications which are an alternative to traditional methods [13]. In this research, activity recognition information has been investigated in the context of navigation services. Several opportunities can arise from the availability of such information to flexibly adapt the services to different circumstances [14]. For example, a smartphone navigation application is capable of switching between different navigation solutions once it recognizes that the user is changing his mode from stationary to walking or driving [15].
The recent popularity of mobile devices such as smartphones has resulted in considerable research directed towards the recognition and monitoring of dynamic activity patterns using low-cost sensors [3]. In this research, different sensor data and pattern recognition methods have been compared to identify the most efficient components of an activity recognition system for smartphones. The main objective of this paper is to provide an experimental guideline for selecting the most meaningful features of motion sensor data and eliminating the redundant information. To do so, various feature extraction techniques have been explored, including time-domain, frequency-domain, and time-frequency-domain features. A large number of features are thus extracted, some of which not only provide irrelevant information for activity recognition but also increase the computation cost and training time. Therefore, feature selection algorithms have been used to find the optimum and independent set of features for each activity. The feature selection methods attempt to identify the minimal set of features that still allows the activity classifier to produce a correct response. Finally, an effective activity recognition method is proposed that uses a useful set of sensors to identify the placement of the device with respect to the user as well as the user's activities. These activities are further used as context information to demonstrate the capabilities of context-aware ubiquitous pedestrian navigation services [16].

Background
Conventionally, physical activity recognition research focused on using different kinds of ambient sensors including cameras, RFID tags and infra-red (IR) motion detectors [17]; however, most works used acceleration sensors because they are small, cheap, light-weight, and consume little power [18]. Randell and Muller [19] used a single biaxial accelerometer for classifying six activities (i.e., walking, running, sitting, walking upstairs, walking downstairs, and standing) by utilizing an artificial neural network (ANN). Mäntyjärvi et al. [20] also applied ANNs for human motion recognition using a pair of tri-axial sensors attached to the left and right hips. In their research, PCA (Principal Component Analysis) and ICA (Independent Component Analysis) were applied to extract the feature vector. Lee and Mase [21] developed an activity and location recognition system using a combination of a biaxial accelerometer, compass, and gyroscope. Their classification technique was based on a fuzzy-logic reasoning system. Ward et al. [22] investigated the use of wrist worn accelerometers and microphones in a wood workshop to detect activities such as hammering or cutting wood. The aforementioned studies relied upon wired sensors, which can be uncomfortable and unrealistic and make outdoor or long-term experiments difficult to perform. With the recent development of wireless technology, wireless accelerometers have become available, enabling measurements under more realistic conditions. Bao and Intille [23] conducted an extensive study to detect twenty activities (such as cycling, walking and scrubbing the floor) using five body-worn wireless biaxial accelerometers under real-world conditions. They used the Fast Fourier Transform (FFT) to extract means, energy, frequency-domain entropy, and correlation features. Then, activity recognition was performed using decision tables, instance-based learning, decision trees, and naïve Bayes classifiers; decision tree classifiers delivered the best results on the derived feature vector. Ravi et al. [24] used only a single tri-axial accelerometer worn in the pelvic region. They investigated the base-level classifier algorithms and the meta-classifiers including voting, stacking, and cascading frameworks using the WEKA toolkit (Waikato Environment for Knowledge Analysis, free and popular machine learning software written in Java and developed at the University of Waikato, New Zealand; http://www.cs.waikato.ac.nz/ml/weka/) [24]. In another set of research, multiple sensors have been used to improve activity recognition. For example, accelerometer data and meta-information from audio were employed for contextual cues in live life recording using a laptop and a wired sensor-based system [18]. Yi et al. [25] conducted a context awareness study to detect changes in mobility under various lighting conditions using a single tri-axial accelerometer attached to a handheld computer. To support continuous recording of activity, a real-time recognition system was developed by Györbíró et al. [26]. They attached the devices to the wrist, hip, and ankle to recognize activities including resting, typing, gesticulating, walking, running, and cycling. A feed-forward neural network was chosen for offline supervised learning of the activities.
For activity recognition, fusing information from multiple sensors not only improves the results but is rather mandatory, as noted by Kern et al. [18]. Activity recognition systems that employ the fusion of different sensors typically follow a hierarchical approach [13]. Figure 1 illustrates the activity recognition steps. These steps are discussed in detail in the following sections.
First, the sensor providers collect and track useful data and information about the user's motions. The next step is to extract features and characteristics of the raw measurements using statistical techniques. Finally, a classification or pattern recognition algorithm is used to recognize the user's activity based on the comparison of the extracted features with those that are already extracted for each mode [27]. Although there are several studies on this topic, there are still two open questions: (a) What are the optimum features for recognition of activities using the experimental tests in this research? There are no theoretical guidelines that suggest the appropriate features to use in a specific classification situation. Therefore, a careful investigation of the available features and transformations can significantly improve the performance of the recognition method. A good feature space can often yield a simple and easily understood classification technique; a poor feature space may yield a complex classification technique whose true structure is difficult or impossible to discern. Table 1 provides a comprehensive list of features commonly found in the literature for analyzing the sensor signals [13]; however, there is no single set of features that would work well for all activities.
(b) What is the most accurate classification method for recognition of activities? Classification is a process of grouping data items based on a measure of similarity, so it is a subjective process; the same set of data items often needs to be partitioned differently for different applications. Because a single algorithm or approach is not adequate to solve every classification problem, this subjectivity makes the process of classification difficult. A possible solution lies in reflecting this subjectivity in the form of knowledge. This knowledge is used either implicitly or explicitly in one or more phases of classification. With advances in machine learning and pattern recognition, a variety of algorithms have been explored for classifying different movements [28]. A summary of the research adopted in the activity recognition literature is presented in Table 2.
Micromachines 2015, 6 1103

Table 1. The most widely used features [13].

Literature Review of Activity Recognition Utilizing Smartphones
Smartphones have recently been used in monitoring human activities because of their portability (small size and light weight), substantial computing power, variety of embedded sensors, ability to send and receive data, and their nearly ubiquitous use in today's life. Although there is a wide variety of research in activity recognition using wearable sensors, a limited number of studies use a smartphone to collect data for activity recognition. Miluzzo et al. [29] explored the use of various sensors (such as a microphone, accelerometer, GPS, and camera) available on smartphones for activity recognition and mobile social networking applications. They developed a phone-centric sensing system, CenceMe, which is body-fixed (e.g., in a pocket). The training stage can be carried out according to the body position to address the activity recognition. They collected accelerometer data from ten users to train the activity recognition model for walking, running, sitting, and standing. This model had difficulty distinguishing between the sitting and standing activities. Yang [30] also developed an activity recognition system using the Nokia N95 phone to distinguish between sitting, standing, walking, running, driving, and bicycling. Although this study achieved relatively high accuracies, just a few activities were investigated and the training and test data were collected from only four subjects. Brezmes et al. [2] also used the Nokia N95 phone to develop a real-time system for recognizing six user activities. In their system, an activity recognition model is trained for each user, meaning that there is no universal model that can be applied to new users. Kwapisz et al. [4] recognized some of the daily activities using a tri-axial accelerometer sensor on the Nokia N95 phone, by keeping it in a fixed position. They achieved accurate results with some activities; however, they did not consider all activities and the impact of carrying the smartphone in different locations. In another work [31], the authors argued that if only an accelerometer is available, the best possible result is to identify the segments of the signal dominated by the gravity component and make the recognition based on the vertical component. In research by Pei et al. [32], physical motion recognition was used in an indoor navigation solution (combining wireless local area network and pedestrian dead-reckoning positioning) on a smartphone. A set of simple time-domain features was used to recognize the pattern of six common motions during indoor navigation (e.g., static, standing with hand swinging, normal walking with holding the phone in hand, normal walking with hand swinging, walking, and U-turning). An accuracy of 95% was achieved in this study by using a decision tree classifier [32]. In another recent research study, a similar feature-based classification method was used to distinguish between walking, running, cycling and land-based vehicle modes [33] using smartphones. Table 3 summarizes the most significant research on activity recognition using smartphone low-cost sensors.
Most of the work listed in Table 3 used accelerometer sensors to identify basic movements. These studies employed different features for activity classification, varying from raw data to time- and frequency-domain features. The set of activities included in most of the previous work comprises standing, sitting, walking, lying and running, together with more complex activities such as driving, bicycling, or ascending or descending stairs. With respect to the positions where the device may be placed, some of the analyses let the user choose a predetermined position for all the experiments, while others use more than one fixed position to gather training data [34]. Yang et al. [27] computed the vertical and horizontal projections of the acceleration with respect to gravity to reduce the effect of the position on the signals gathered from the accelerometer. However, none of these works attempts to infer from the signal where the user is carrying the mobile device. The algorithms based on DT or BN classifiers divide the processing between on-device computation and an external server. In other words, the more complex classification methods require systems that gather and process data outside the mobile device. In this research, the above challenges, which play a key role in pattern recognition of sensor data using low-cost MEMS sensors, have been investigated in depth.

Developing an Activity Recognition Using MEMS Sensors
Information gathered by a single source can be limited and may not be fully reliable, accurate and complete; therefore, in this research, a feature-level multi-sensor integration scheme [69] is used to improve the accuracy and robustness of the activity recognition system by using features instead of raw sensor measurements. This method, which has also been used in other studies (e.g., [27,34]), increases the robustness of activity recognition by using various features that are less sensitive to sensor characteristics such as noise and alignment. Activity recognition modules follow a hierarchical approach for fusing various sensors (shown in Figure 1).

Preprocessing and Sensor Calibration
Accelerometer and gyroscope sensors commonly embedded on smartphones have a drift and offset on every axis. The calibration procedure is to accurately determine the scaling factor and offset parameters of the three independent, orthogonal axes. From a practical point of view in most mobile applications, calibration is needed to assure sensor data quality and get accurate readings. In this research, the axes' misalignments (non-orthogonalities) have not been considered, as scale factor and drift calibration is already good enough for context detection application [15]. The six-position static method is one of the most commonly used calibration methods [70]. The six-position method requires the inertial system to be mounted on a levelled table with each sensitive axis of every sensor pointing alternately up and down. For example, let $\vec{a} = (a_x, a_y, a_z)$ be a vector of raw accelerometer readings from a mobile device, and let $g = 9.81\ \mathrm{m/s^2}$ be the Earth's gravity, which can be used as a reference signal when the device is in static mode. Therefore, as shown in Figure 2, the user gets the samples along the positive and negative direction for each of the three axes. To do this, the user has to hold the accelerometer sensor in six different orientations and make the corresponding axis strictly point in the $\vec{g}$ direction. Bias and scale factor can be estimated by summing and differencing combinations of the sensor measurements. For example, to estimate bias and scale factor of the Z-accelerometer, the measurements are:

$$a_{Zup} = b_a - (1 + S_a)\, g \tag{1}$$

$$a_{Zdown} = b_a + (1 + S_a)\, g \tag{2}$$

where $a_{Zup}$ and $a_{Zdown}$ are accelerometer measurements when the device is located in the positions shown in Figures 2a,c, and $b_a$, $S_a$ and $g$ represent bias, scale factor and gravity, respectively. The bias and scale factor can then be calculated using the following equations:

$$b_a = \frac{a_{Zup} + a_{Zdown}}{2} \tag{3}$$

$$S_a = \frac{a_{Zdown} - a_{Zup} - 2g}{2g} \tag{4}$$

Similarly for the gyroscopes, the device is oriented in static mode with the axis pointing vertically upward and downward. For example, in the case of calibrating the Z-axis gyroscope, the device is located in the position shown in Figure 2a. Then, the average of 10-15 min of measurements $\omega_{Zup}$ is taken [70]:

$$\omega_{Zup} = b_\omega + (1 + S_\omega)\, \omega_e \sin\varphi \tag{5}$$

where $b_\omega$, $S_\omega$, $\varphi$ and $\omega_e$ represent bias, scale factor, latitude of the gyroscope location and Earth's rotation rate, respectively. Next, the sensor is rotated by 180° (Figure 2c) such that the same axis is pointing vertically downward and another average measurement $\omega_{Zdown}$ is obtained:

$$\omega_{Zdown} = b_\omega - (1 + S_\omega)\, \omega_e \sin\varphi \tag{6}$$

The bias and scale factor of the gyroscope can then be calculated using the following equations:

$$b_\omega = \frac{\omega_{Zup} + \omega_{Zdown}}{2} \tag{7}$$

$$S_\omega = \frac{\omega_{Zup} - \omega_{Zdown} - 2\omega_e \sin\varphi}{2\omega_e \sin\varphi} \tag{8}$$

However, in low grade gyros such as MEMS sensors, which suffer from bias instability and bigger noise levels, the Earth's reference signal is not observable. Moreover, it is difficult for the end user to determine the exact direction of the gravity and to hold the sensor accordingly. Usually, the procedure is carefully performed several times and the average values need to be used. Therefore, multi-position calibration methods have been used to combine the three-axis effect of the local gravity and Earth rotation as references for calibration [71]. Using a redundant number of placements, the IMU errors can then be estimated using a least squares adjustment. Since calibration is a one-time operation, computation is not a big concern here.
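The accelerometer part of the six-position procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this research: the function names, the averaging of static samples and the `correct` helper are assumptions introduced here, and the sign convention follows Equation (1), in which the axis-up reading is negative.

```python
G = 9.81  # reference gravity in m/s^2, as used in the text


def accel_bias_scale(up_samples, down_samples, g=G):
    """Estimate bias and scale factor of one accelerometer axis.

    up_samples / down_samples are static readings taken with the axis
    pointing up and down (hypothetical inputs for this sketch).
    """
    a_up = sum(up_samples) / len(up_samples)
    a_down = sum(down_samples) / len(down_samples)
    bias = (a_up + a_down) / 2.0                   # summing cancels gravity
    scale = (a_down - a_up - 2.0 * g) / (2.0 * g)  # differencing cancels bias
    return bias, scale


def correct(raw, bias, scale):
    """Remove the estimated bias and scale-factor error from a raw reading."""
    return (raw - bias) / (1.0 + scale)
```

The same summing and differencing pattern applies to the gyroscope axes, with the vertical component of the Earth rotation rate taking the place of gravity as the reference.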
After sensor calibration, accelerometer and gyroscope signals are preprocessed for noise reduction using a low-pass filter [17]. Low-pass filters provide a smoother form of the signal by removing the short-term oscillations, leaving only the long-term trend. After noise reduction, the signals are normalized. In some cases, gravitational acceleration has to be separated from the accelerometer data in order to analyze only the useful dynamic acceleration. For this purpose, high-pass filters can be used to distinguish body acceleration from gravitational acceleration [72]. In this research, a mean filter was applied to each patch to remove the impact of gravity.
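The two preprocessing steps can be illustrated with a short Python sketch. It assumes a simple trailing moving average as the low-pass filter and treats the mean filter as subtraction of the mean of each non-overlapping patch; both choices, as well as the function names and the default width, are assumptions made for illustration only.

```python
def moving_average(signal, width=5):
    """Trailing moving-average low-pass filter; output has the input length."""
    smoothed = []
    for i in range(len(signal)):
        start = max(0, i - width + 1)
        window = signal[start:i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def remove_window_mean(signal, window_len):
    """Subtract the mean of each non-overlapping patch.

    On an accelerometer axis this removes the slowly varying gravity
    component and leaves the dynamic (body) acceleration.
    """
    result = []
    for start in range(0, len(signal), window_len):
        patch = signal[start:start + window_len]
        patch_mean = sum(patch) / len(patch)
        result.extend(value - patch_mean for value in patch)
    return result
```

A wider averaging window gives stronger smoothing at the cost of a longer delay, which matters if recognition is to run in real time.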

Feature Extraction
In general, features can be defined as abstractions of raw data, and the purpose of feature extraction is to find the main characteristics of a data segment that accurately represent the original data [62]. In other words, this process is defined as a process of identifying valid, useful, and understandable patterns in data. The main outcome is to keep the most meaningful features of the data and eliminate the redundant features. There are no theoretical guidelines that suggest the appropriate feature set to be used for specific classification situations. A good feature space can often yield a simple and easy to understand classification technique; on the other hand, a poor feature space may yield a complex classification technique whose true structure is difficult or impossible to discern. The following features have been applied for activity detection. The features are divided into time-domain, frequency-domain and time-frequency-domain features [69].
Time-Domain Features: These features include basic waveform characteristics and signal statistics, and they are derived directly from the data. Some examples of time-domain features are given here.

Mean: The mean value of the signal $x$ over a window segment of $N$ samples is considered as a feature:

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$

Root Mean Square Error (RMSE): The RMSE of a signal is computed over a defined window.

Mean of Absolute Error (MAE): It is the average of the absolute value of the residuals.

Inter-axis Correlation: It is especially useful for discriminating the activities that involve translation in just one dimension. Bao et al. [23] used the correlation between axes and achieved good results for distinguishing cycling from running. The correlation coefficient between two vectors $x$ and $y$ is:

$$\rho_{x,y} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$$

Zero or Mean Crossing Ratio (ZCR): ZCR is defined as the number of times that a signal changes sign in a frame. This feature has been used heavily in both speech recognition and music information retrieval [73]:

$$ZCR = \frac{1}{N-1} \sum_{t=1}^{N-1} I\{ s_t\, s_{t-1} < 0 \}$$

where $s$ is a signal of length $N$ and the indicator function $I\{A\}$ is 1 if its argument $A$ is true and 0 otherwise.

Quartile feature: It is a measure of the distribution of the signal values. Quartiles are computed by partitioning the signal over a given window into four quarters of the data (Q1 = 25%, Q2 = 50% and Q3 = 75%). In this research, the upper quartile, which is equal to the 75th percentile (it splits off the highest 25% of the data from the lowest 75%), is considered as a feature. This is achieved by sorting the signal values and finding the value at 75% of the window length.
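Several of the time-domain features described above can be computed per window with plain Python, as sketched below. The function names are hypothetical, and the nearest-rank rule used for the upper quartile is one common convention assumed here; the source only states that the sorted value at 75% of the window length is taken.

```python
import math


def window_mean(x):
    """Mean of the samples in one window."""
    return sum(x) / len(x)


def zero_crossing_ratio(s):
    """Fraction of consecutive sample pairs whose product is negative."""
    crossings = sum(1 for t in range(1, len(s)) if s[t] * s[t - 1] < 0)
    return crossings / (len(s) - 1)


def upper_quartile(x):
    """75th percentile by the nearest-rank rule (assumed convention)."""
    ordered = sorted(x)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def correlation(x, y):
    """Pearson correlation between two axes; undefined for a constant axis."""
    mx, my = window_mean(x), window_mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

For a mean crossing ratio, the window mean would be subtracted from the signal before calling `zero_crossing_ratio`.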
Orientation-invariant feature: For a tri-axial accelerometer, the raw readings are measured according to the current sensor orientation. Orientation-invariant features have been used in some recent research to avoid the effect of accelerometer orientation [27]. Using the concept of gravity, three features are defined: (1) the summation of the three axial accelerometers, (2) the approximate vertical projection and (3) the approximate horizontal projection of the accelerometer signal vector. The second and third features can be used to roughly translate the local coordinate system to the global vertical and horizontal plane based on a method described by Mizell [74]. In this method, over a window of 256 samples, the gravity vector $\vec{g} = (g_x, g_y, g_z)$ is roughly estimated using the average of all the measurements:

$$\vec{g} = (g_x, g_y, g_z) = \left( \frac{\sum_i a_{xi}}{N},\ \frac{\sum_i a_{yi}}{N},\ \frac{\sum_i a_{zi}}{N} \right)$$

where $\vec{a}_i = (a_{xi}, a_{yi}, a_{zi})$, $i = 1, 2, \ldots, N$ are acceleration measurements. Then, the vertical component $\vec{a}^{\,Vertical}_i$ of each accelerometer sample is computed as the projection of the sample onto the gravity vector:

$$\vec{a}^{\,Vertical}_i = \left( \frac{\vec{a}_i \cdot \vec{g}}{\vec{g} \cdot \vec{g}} \right) \vec{g}$$

Then, the horizontal component is obtained as the remainder ($\vec{a}^{\,Horizontal}_i = \vec{a}_i - \vec{a}^{\,Vertical}_i$). The extracted vertical and horizontal components provide two orientation-invariant features, which are considered as an approximation of the horizontal and vertical movements.
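A minimal Python sketch of this decomposition is given below. It assumes that the vertical component is the orthogonal projection of each sample onto the averaged gravity vector and that the horizontal component is the remainder; the function names are hypothetical.

```python
def gravity_estimate(samples):
    """Average the (x, y, z) samples of one window to approximate gravity."""
    n = len(samples)
    return tuple(sum(sample[axis] for sample in samples) / n for axis in range(3))


def split_vertical_horizontal(sample, g):
    """Split one sample into parts parallel and perpendicular to gravity."""
    dot_ag = sum(a * b for a, b in zip(sample, g))
    dot_gg = sum(b * b for b in g)
    vertical = tuple(dot_ag / dot_gg * b for b in g)
    horizontal = tuple(a - v for a, v in zip(sample, vertical))
    return vertical, horizontal
```

Because only the direction of the averaged vector is used, the result does not depend on how the phone is rotated in the pocket or hand, provided the orientation stays roughly constant within the window.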
Frequency-Domain Features: These features focus on the periodic structure of the signal, such as coefficients derived from Fourier transforms.
Frequency Range Power: This feature is based on computing the power of the discrete FFT components for a given frequency band. An FFT computes the discrete Fourier transform (DFT) and produces exactly the same result as evaluating the DFT definition directly; the only difference is that an FFT is much faster:

$$X(k) = \sum_{j=1}^{N} x(j)\, \omega_N^{(j-1)(k-1)} \tag{16}$$

where $\omega_N = e^{(-2\pi i)/N}$ is the N-th primitive root of unity and $k = 1, \ldots, N$. For example, the frequency of human walking can be considered in the range of 2-5 Hz [71]; therefore, this frequency band can efficiently separate different activities (such as walking, running, driving, and so on).
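As an illustration, the band power can be computed by evaluating the DFT definition directly and summing the squared magnitudes of the bins that fall inside the band. The sketch below uses zero-based indices instead of the one-based indices of Equation (16); the sampling rate argument `fs`, the division by the window length and the function names are assumptions of this example. A real implementation would use an FFT routine for speed.

```python
import cmath
import math


def dft(x):
    """Direct evaluation of the DFT definition (O(N^2), for illustration)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def band_power(x, fs, f_lo, f_hi):
    """Power of the DFT components with f_lo <= frequency <= f_hi (in Hz)."""
    n = len(x)
    spectrum = dft(x)
    power = 0.0
    for k in range(n // 2 + 1):       # non-negative frequencies only
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            power += abs(spectrum[k]) ** 2
    return power / n                  # normalized by window length (assumed)
```

For instance, a 3 Hz sinusoid sampled at 32 Hz places essentially all of its power in the 2-5 Hz band and almost none in a 6-10 Hz band.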
Spectral Energy: Spectral energy density describes how the energy of a signal is distributed over the different frequencies. Energy features can be used to capture the periodicity of the data in the frequency domain, and they can be used to distinguish inactive activities from dynamic activities. The energy feature was calculated as the sum of the squared discrete FFT component magnitudes of the signal. For the signal in discrete form, energy can be calculated using the following equation:

$$Energy = \sum_{\omega} \left| x(\omega) \right|^2 \tag{17}$$

where $\omega$ is the angular frequency and $x(\omega)$ is the Fourier transform of the signal. When the feature is computed over a window, the sum of the above equation over the window is divided by the window length for normalization.

Spectral Entropy: This feature is used to discriminate between activities with similar energy values. The frequency entropy is calculated according to the following formula:

$$Entropy = -\sum_{i} P(x_i) \log\left(P(x_i)\right) \tag{18}$$

where $x_i$ are the frequency components of the signal for a given frequency band and $P(x_i)$ is the probability of $x_i$. This feature is a measure of the distribution of the frequency components in the frequency band.

Time-Frequency-Domain Features: These features are used to investigate both time and frequency characteristics of the signal, and they generally use the wavelet transform [13].
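Both spectral features can be sketched from the DFT magnitudes of a window. In the example below the probability P(x_i) is taken to be the share of each component in the total spectral power, and the whole spectrum is used instead of a selected band; both are assumptions made for illustration, as are the function names.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the DFT components, evaluated from the definition."""
    n = len(x)
    return [
        abs(sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n)))
        for k in range(n)
    ]


def spectral_energy(x):
    """Sum of squared DFT magnitudes, divided by the window length."""
    return sum(m * m for m in dft_magnitudes(x)) / len(x)


def spectral_entropy(x):
    """Entropy of the normalized power spectrum (natural logarithm)."""
    power = [m * m for m in dft_magnitudes(x)]
    total = sum(power)
    if total == 0:
        return 0.0
    probabilities = [p / total for p in power]
    return -sum(p * math.log(p) for p in probabilities if p > 0)
```

A signal whose power sits in a single component has entropy close to zero, while a flat spectrum gives the maximum value, which is why two activities with similar energy can still be separated by this feature.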
Discrete Wavelet Transform Coefficients: While the Fourier transform shows the frequency content of a stationary signal, wavelet analysis provides spectral information for non-stationary signals, where the frequency content changes over time. In the discrete wavelet transform, a signal $y(t)$ is split into an approximation signal ($a_{2^j}$) and a detail signal ($d_{2^j}$) using the coefficients of a discrete low-pass filter and a high-pass filter. These base filters are called the scaling function ($\phi_{j,k}$) and the mother wavelet function ($\psi_{j,k}$). This is an iterative procedure that uses the approximation signal for the next decomposition. In fact, variation in the scale levels ($j$) of the base functions enables frequency resolution, and the shifting of the scale ($k$) in the base function provides the time information [15]. The original signal $y(n)$ can be reconstructed from the wavelet coefficients. The choice of mother wavelet is crucial in any application. In this research, the Daubechies mother wavelet of order 8 is used for extracting the detail signals [75]. The Daubechies wavelet is asymmetric and its scaling filters are minimum-phase filters. The first order Daubechies wavelet is also known as the Haar wavelet, in which the wavelet function resembles a step function. The order of the wavelet functions can be compared to the order of a linear filter. These wavelets are compactly supported orthogonal wavelets.
The ratio between the power of the detail signals in specific scales and the total power of the details of the acceleration was calculated as the wavelet coefficient feature. In the definition of this measure, α = 3, β = 4 and j = 8, based on our experimental results.
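The detail power ratio can be illustrated with a small self-contained sketch. Two simplifications are made and should be read as assumptions: the Haar wavelet (the first order Daubechies wavelet) replaces the order 8 wavelet used in this research so that no external library is needed, and α and β are interpreted as the first and last decomposition levels in the numerator, with the denominator summing over all computed levels. The function names are hypothetical.

```python
import math


def haar_details(x, levels):
    """Detail coefficients of an iterated Haar decomposition, level 1 first."""
    approx = list(x)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        pairs = range(0, len(approx) - 1, 2)
        detail = [(approx[i] - approx[i + 1]) / math.sqrt(2) for i in pairs]
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2) for i in pairs]
        details.append(detail)
    return details


def detail_power_ratio(x, alpha, beta, levels):
    """Power of detail levels alpha..beta over the power of all detail levels."""
    power = [sum(d * d for d in level) for level in haar_details(x, levels)]
    total = sum(power)
    if total == 0:
        return 0.0
    return sum(power[alpha - 1:beta]) / total
```

With a 256-sample window, eight decomposition levels are available, so a call such as `detail_power_ratio(window, 3, 4, 8)` would correspond to the parameter values quoted above under this interpretation.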

Feature Selection
In order to increase the robustness of activity recognition and reduce computations, a feature selection method is applied. In other words, if the dimensionality of a feature set is too high, some features might be irrelevant and provide no useful information for classification, and therefore the computation is slow and training is difficult. The feature selection approach consists of detecting and discarding, from the features described above, those that contribute little, so that correct classification results are obtained with a minimum number of features. This reduces the dimensionality of the feature space and can result in faster and more efficient learning algorithms. Feature subset selection has long been a research area within statistics and pattern recognition [76].
A feature selection algorithm can be seen as the combination of a search technique for proposing new feature subsets, along with an evaluation measure which scores the different feature subsets.Two main categories of feature selection algorithms include wrappers and filters methods described in the literature.The wrapper is tuned to the specific interaction between an induction algorithm and its training data.However, filter methods are much faster as they do not involve repeatedly invoking a learning algorithm.In some cases, a subset of features is not selected explicitly; instead, features are ranked with the final choice left to the user.In general, feature weighting does not reduce the dimensionality of the original data.Other algorithms require features to be transformed in such a way that actually increases the initial number of features and hence the search space.In general, feature weighting does not reduce the dimensionality of the original data.In this paper, four different feature evaluation methods have been used including Correlation Feature Selection (CFS), Principal Component Analysis (PCA), Support Vector Machine (SVM) and gain ratio.Table 4 summarized these methods.
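As an example of an evaluation measure that scores a feature subset, the correlation-based merit used by CFS can be sketched as follows. This is a minimal illustration assuming the standard CFS heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where k is the number of features in the subset, r_cf the mean feature-class correlation and r_ff the mean feature-feature intercorrelation. The function name `cfs_merit` and its input layout are hypothetical.

```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Score a feature subset with the CFS merit heuristic.

    feature_class_corr: one correlation with the class per feature.
    feature_feature_corr: square matrix of pairwise feature correlations.
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    r_cf = sum(abs(r) for r in feature_class_corr) / k
    # Average the absolute correlations over the upper triangle (distinct pairs).
    pairs = [abs(feature_feature_corr[i][j]) for i in range(k) for j in range(i + 1, k)]
    r_ff = sum(pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)
```

Two features that each correlate 0.5 with the class score higher when they are uncorrelated with each other than when they are fully redundant, which is the behaviour a filter method relies on.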

Classification Algorithms
The selected features are used as inputs for classification methods. A number of features from the pre-selected feature set are used to train and test different classifiers [15]. Given a set of objects, each of which belongs to a known class and has a known vector of features, the aim is to construct a set of rules that assign future objects to a class, given only the feature vectors describing those objects. Problems of this kind are called problems of supervised classification. Supervised classification can be formally defined as follows. Given a data set $Z = \{z_1, z_2, \ldots, z_p, \ldots, z_{N_p}\}$, where $z_p$ is a pattern in the N-dimensional feature space and $N_p$ is the number of patterns in Z, the classification of Z is to partition it into K classes, $C = \{C_1, \ldots, C_K\}$, satisfying the following conditions:
• Each pattern should be assigned to a class, i.e., $\bigcup_{j=1}^{K} C_j = Z$.
• Each pattern is assigned to one and only one class (in the case of hard classification only), i.e., $C_k \cap C_j = \emptyset$, where $k \neq j$.
In practice, however, various classification techniques may consider different factors, which means that the assumption is not as detrimental as it might seem. Different methods have been developed for supervised classification. Some of the popular classification methods in the activity recognition research community that have been used in this study are listed in Table 5.
The details of each classification method are described in the next section. In order to select the most accurate technique, these classifiers have been evaluated on various data sets using the WEKA toolbox [77].
Table 4. Feature selection methods used in this paper.

CFS
A filter algorithm that ranks feature subsets (f) in the search space of all possible feature subsets according to a correlation-based heuristic evaluation measure ($M_f$).
Subsets are ranked higher when their features are correlated with the class and uncorrelated with each other: $M_f = \frac{k\,\bar{R}_{fc}}{\sqrt{k + k(k-1)\,\bar{R}_{ff}}}$, where k is the number of features, f is the feature subset, $\bar{R}_{fc}$ is the mean feature-class correlation and $\bar{R}_{ff}$ is the average feature-feature intercorrelation.

PCA
A transformation that converts a set of features into a set of linearly uncorrelated variables called principal components. This can result in a loss of meaning relative to the original feature representation and make the induced models harder to interpret.
The feature space is normalized and its covariance matrix is calculated; the eigenvectors and eigenvalues are found, and the eigenvectors corresponding to the m largest eigenvalues are taken as the new feature space. The BN classification accuracy is then compared for the original and the transformed sets of features.

SVM
A wrapper technique for feature selection using a greedy algorithm (start with no features or with all features, and add or remove features until the error no longer improves). An SVM with a linear kernel is used as the model, and the features that reduce the error returned by the SVM classifier are selected.
For each feature p with $E^{(-p)}(\alpha, \sigma) = 1$, a criterion is computed over the validation subset to determine the irrelevant features, where VAL is the validation subset and $x^v_l$ and $y^v_l$ are the objects and labels of this subset, respectively.
The superscript (−p) on a training object i (validation object l) means that feature p has been removed. $E^{(-p)}(\alpha, \sigma)$ is the number of errors in the validation subset when feature p is removed, using the currently selected features as indicated by σ.

Gain Ratio
The algorithm is applied recursively to form sub-trees, terminating when a given subset contains instances of only one class. C4.5 uses the gain ratio, which applies normalization to the information gain [78].
The gain ratio is defined as GainRatio(Attribute) = Gain(Attribute) / SplitInfo(Attribute). The gain ratio is used as a disparity measure: a high gain ratio for a feature implies that it will be selected as the splitting attribute and that it is useful for classification.

Table 5. Classification methods used in this study.

Naive Bayes (NB)
NB is a simple probabilistic classifier which uses Bayes' theorem with naive independence assumptions. This assumption simplifies the estimation of P(ActivityClass|feature) from the training data.

Bayesian Network (BN)
BN is a probabilistic graphical model that encodes probabilistic dependencies among the variables of interest by using a training dataset. BN is used to learn relationships between activity classes and the feature space in order to predict the class labels for a new sample [79,80].

Decision Tree (DT)
DT is a classifier that predicts the activity class (dependent variable) of a new sample based on feature values. The internal nodes of a decision tree denote the different features; the branches between the nodes give the possible values that these features can have in the observed samples, while the terminal nodes give the final value (classification) of the dependent variable. The decision tree is generated using an information-entropy criterion [81].

Artificial Neural Network (ANN)
ANNs are capable of "learning" patterns from a number of known training patterns. In this research, the ANN has three layers: an input layer, a hidden layer and an output layer. A simple back-propagation algorithm (using RMSE) is used as the learning process.

Support Vector Machine (SVM)
SVMs are binary classifiers derived from statistical learning theory and kernel-based methods. In this research, a non-linear learning model using a (Gaussian) radial basis function is adapted for the different activities [82].
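To make the naive independence assumption mentioned for NB concrete, the following is a minimal Naive Bayes sketch. It is an illustration, not the WEKA implementation used in the study: it assumes continuous features modelled with one Gaussian per class and feature, and the names `train_gaussian_nb` and `predict_gaussian_nb` are hypothetical.

```python
import math
from collections import defaultdict


def train_gaussian_nb(samples, labels):
    """Estimate a prior plus per-feature mean and variance for every class."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        stats = []
        for column in zip(*rows):  # iterate feature by feature
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-9))  # small floor avoids division by zero
        model[y] = (len(rows) / len(samples), stats)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    best_label, best_score = None, -math.inf
    for y, (prior, stats) in model.items():
        # Independence assumption: per-feature log likelihoods simply add up.
        score = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = y, score
    return best_label
```

Because each feature contributes an independent term, training only requires per-class means and variances, which is why the estimation of P(ActivityClass|feature) stays cheap.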

Experiment and Results
This experiment aims to find the useful set of sensors and features that contribute to an accurate activity recognition module. The activity recognition module is designed to detect different user modes and motions. The performance of the activity recognition module has been evaluated using extensive experiments.

Training and Test Data Collection
A Samsung Galaxy Note smartphone was used for data collection in this study. This smartphone has a built-in tri-axial accelerometer (K3DH Sensor: 0.25 mA by STMicroelectronics, Geneva, Switzerland), a tri-axial gyroscope (K3G Sensor: 6.1 mA by STMicroelectronics) and a tri-axial magnetometer (AK8975 Sensor: 6.0 mA by Asahi Kasei Microdevices, Tokyo, Japan) that can record the user's motions. Since most of the computations are performed on the server, the sensor data are sent to a DB on the server, where other services such as data mining and navigation can access them. As shown in Figure 3, an application was developed to capture the smartphone data and send them to the server. This application can be used in real time to collect data with a timestamp. To send the sensor data to the server automatically, another application was used to update the database on the server.
Activity data were collected from four subjects, two males and two females, aged from 26 to 40. Each activity with a different device placement mode was performed for one minute, except for the elevator mode, which was carried out three times for each subject to capture enough data in a four-story building (Figure 4). In total, 30 min of data were collected for each subject and stored in a database (DB) on the server. To build the reference data, subjects were asked to annotate the main activities with start and finish times. The reference data, i.e., the true user activity, is therefore based on the users' own input. Five seconds were removed from the beginning and end of most activities to ensure that the data truly corresponded to the pure activity being recorded. Then, a cross-folding method was used in which 70% of the data were used for training the classification algorithm and the remaining 30% of the sample data were used for testing.
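The trimming and 70/30 partition described above can be sketched as follows. This is an illustration under assumptions: the text does not say whether the split was randomized, so the shuffle and the seed below are choices made for the example, and the names `trim_edges` and `split_train_test` are hypothetical.

```python
import random


def trim_edges(samples, rate_hz, trim_s=5.0):
    """Drop trim_s seconds of samples from both ends of one recording."""
    n = int(round(trim_s * rate_hz))
    return samples[n:len(samples) - n] if len(samples) > 2 * n else []


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle with a fixed seed and split into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

For a one-minute recording sampled at 10 Hz (600 samples), `trim_edges` keeps the middle 500 samples and `split_train_test` then yields 350 training and 150 test items.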

Micromachines 2015, 6 1116
In order to collect the test data, the smartphone was loosely placed in specific orientations, including in a bag, in a jacket pocket, on the belt, in the hand close to the ear for talking, and down at one's side while the arm is swinging. No special requirement was imposed on how to wear the smartphone except for its location on the body. The activities and device location contexts considered in this research were chosen based on the pedestrian navigation application.

Preprocessing and Calibration
Preprocessing of the inertial sensor data is the first step before using any activity recognition algorithm [83]. Each sensor reading consists of three components along the X-axis, Y-axis and Z-axis according to the current phone orientation. Preprocessing includes calibration, signal normalization, low-pass filtering and resampling to a required sampling rate. After calibration of the accelerometer and gyroscope sensors, all the signals are low-pass filtered for noise reduction. High-frequency noise almost always needs to be removed from the data; therefore, non-linear, low-pass Gaussian filters [71] can be employed for this purpose. After that, the noise-reduced signals are normalized. In some cases, gravitational acceleration has to be removed from the accelerometer data so that only the useful dynamic acceleration is analyzed. For this purpose, a mean filter can be used to remove the impact of gravity. For the segmentation of the signal, a sliding window algorithm is applied, which is popular in activity recognition as a simple online algorithm [84]. Figure 5 presents an example of the inertial sensors' output in different placement scenarios after sensor calibration and low-pass filtering. Some modes are easy to identify, such as the dangling mode, in which one axis of the gyroscope has a significantly large magnitude due to the arm swing. However, other modes are quite similar to each other and require pattern recognition algorithms for classification.
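The noise reduction and gravity removal steps can be sketched for a single sensor axis as below. This is a minimal illustration, not the filter of reference [71]: it assumes an ordinary Gaussian smoothing kernel as a stand-in, removes gravity by subtracting a moving-average baseline, and clamps indices at the signal edges. The names `gaussian_kernel`, `smooth` and `remove_gravity` are hypothetical.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, kernel):
    """Filter a signal with an odd-length kernel, clamping at the edges."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(n + k - radius, 0), last)  # repeat the edge samples
            acc += w * signal[idx]
        out.append(acc)
    return out


def remove_gravity(signal, window):
    """Subtract a moving-average baseline (window should be odd)."""
    baseline = smooth(signal, [1.0 / window] * window)
    return [s - b for s, b in zip(signal, baseline)]
```

A phone lying still reports a near-constant 9.81 m/s² on the vertical axis; after `remove_gravity` that axis is close to zero, leaving only the dynamic acceleration.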

What Is the Best Sampling Frequency?
The gravity component is usually found below 0.5 Hz, while human body movements have frequencies below 20 Hz (99% of the signal energy is below 15 Hz). In fast walking, the step time is about 0.35 s/step at the fastest [16,83], so even with a data rate of 6.25 Hz steps can be detected with at least two samples of each step. There is a trade-off between sampling frequency on one side and sampling precision (defined as the percentage of uneven sampling periods compared to the average sampling period) and battery consumption on the other. Moreover, a higher data rate means that more samples are gathered in each window and the calculation of the features becomes more demanding. Therefore, selecting a sampling frequency that provides sampling accuracy while keeping within the battery budget is an important issue. One of the settings of the sensor_reading application developed for gathering data is the sampling rate (using the Android SensorManager, which includes listing sensors, sampling and some processing functions). The sampling rate is available through the Android API (Application Programming Interface) and has four options: normal, UI (user interface), game and fastest. The Android API provides the sampling frequency only as symbolic categories, and no specific sampling frequency is stated in the sensors' specifications. Therefore, the measurements gathered in the different experiments have been analyzed to characterize this parameter. The sampling frequency is not constant on smartphones: measurements are not perfectly periodic because of device multitasking, since sampling is not the phone's most important activity and there are always interruptions from other applications. The lower the sampling frequency, the smaller the variation in the sampling period.
Table 6 summarizes the investigation of the different sampling frequencies of the smartphone. The variations in sampling period are on the order of milliseconds on the Samsung Galaxy Note 7000 (Table 6). It has to be considered that sampling frequency values differ significantly between phone models. High sampling rates are associated with significant battery consumption, and the NORMAL sampling frequency has significantly lower battery consumption than the others. Consequently, normal sampling is sufficient to detect the changes in orientation and movement needed to recognize activity. To measure sampling rates precisely, the time stamp that comes with each sensor event is used and interpolated. The sample rate used in data collection was the normal option of the Android sensor event. The sampling rate of the data can be set using the time stamp in the preprocessing GUI (Graphical User Interface) of the activity recognition module in MATLAB. Table 6 also describes the effect of sampling frequency on battery consumption.
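Because the Android API only exposes symbolic rate categories, the effective rate has to be characterized from the recorded time stamps. The sketch below shows one way to do that. It is an illustration under assumptions: the period variation is reported as the standard deviation of the sampling period in percent of the mean period, which is only one possible reading of "sampling precision", and the names `sampling_stats` and `samples_per_step` are hypothetical.

```python
import math


def sampling_stats(timestamps_s):
    """Mean sampling rate and relative period variation from time stamps (seconds)."""
    periods = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    mean_period = sum(periods) / len(periods)
    variance = sum((p - mean_period) ** 2 for p in periods) / len(periods)
    return {
        "rate_hz": 1.0 / mean_period,
        "jitter_percent": 100.0 * math.sqrt(variance) / mean_period,
    }


def samples_per_step(rate_hz, step_time_s):
    """Number of samples that fall within one step at a given data rate."""
    return rate_hz * step_time_s
```

For the figures quoted in the text, `samples_per_step(6.25, 0.35)` gives about 2.2 samples per step.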
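Since the sensor events do not arrive at perfectly regular intervals, resampling onto a uniform time grid is needed before windowing. The sketch below interpolates linearly between the recorded time stamps. It is an illustration rather than the MATLAB preprocessing module: the function name `resample_linear` is hypothetical, and strictly increasing time stamps are assumed.

```python
def resample_linear(timestamps, values, rate_hz):
    """Resample an unevenly sampled signal onto a uniform grid by linear interpolation."""
    if len(timestamps) < 2:
        return [], []
    period = 1.0 / rate_hz
    # Number of grid points that fit between the first and last time stamp.
    count = int((timestamps[-1] - timestamps[0]) / period + 1e-9) + 1
    out_t, out_v = [], []
    j = 0
    for i in range(count):
        t = timestamps[0] + i * period
        # Advance to the recorded interval [t_j, t_j+1] that contains t.
        while j < len(timestamps) - 2 and timestamps[j + 1] < t:
            j += 1
        t0, t1 = timestamps[j], timestamps[j + 1]
        frac = (t - t0) / (t1 - t0)
        out_t.append(t)
        out_v.append(values[j] + frac * (values[j + 1] - values[j]))
    return out_t, out_v
```

Linear interpolation is the simplest choice; it is adequate when the target rate is close to the native rate, while a large rate reduction would also call for low-pass filtering first.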

What Is the Useful Sensor Information?
When approaching activity recognition with mobile sensors, it is important to determine which set of sensors provides the best accuracy and sufficient quality. In this work, a Samsung Galaxy Note smartphone, equipped with accelerometer, gyroscope, magnetometer, proximity and light sensors, is used. Accelerometers have been widely used for motion detection [23] and step detection (for detection of the walking mode). Gyroscopes are another useful sensor for activity recognition, capturing the user's motion and changes in device orientation. Orientation determination is a significant feature for distinguishing among sets of on-body device positions: vertical (pocket), horizontal (in hand while reading the phone) or random (bag, backpack). The magnetometer also helps determine the orientation as well as absolute heading information (deviation from the Earth's magnetic field). In addition to such physical hard sensors, the orientation soft sensor provided by the Android API can be used to estimate the device orientation by fusing the accelerometer, gyroscope and magnetometer signals. The orientation information comprises the angles (roll, pitch and yaw) that describe the orientation of the device coordinate system with respect to the local navigation reference frame. The output of the orientation soft sensor can be used either as an independent sensor or as a means to project other sensor data from the device's coordinate system to the reference navigation system. Another signal which can be used is the projection of the gravity vector onto the coordinate axes. This signal approximately measures the orientation of the device, so it is used for the same purpose.
The useful set of sensors is the one that has the highest correlation with the activity classes. A processing stage analyzes which sensor signals carry the most useful information for activity recognition. The classification accuracy and time efficiency of the different sensors have been investigated using all the activity and device orientation classes, which are listed in Figure 6.
In this research, different motion sensors were used, including the tri-axial accelerometer, gyroscope, magnetometer and orientation sensor, as well as the projection of the accelerometer, gyroscope and magnetometer signals using the orientation angles. In this investigation, all the features were extracted from the datasets and a BN classifier was applied for classification using all of the features [85]. Figure 7 gives the overall accuracies for the recognition of the user's physical activity, the device placement, and both activity and device placement modes when using the different sets of sensors.
Time efficiency is a critical issue when using smartphones. Figure 8 shows the time efficiency obtained from different sets of sensors for the DB of all users and all activities. Although this figure shows the time consumption for a specific computer and DB, it is useful for comparing the time efficiency achieved with the various sensors. Three recognition scenarios are also considered in this figure: the user's physical activity, the device placement, and both activity and device placement modes.
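Projecting a device-frame reading into the navigation frame with the orientation angles can be sketched as follows. The rotation convention is an assumption made for illustration: the matrix below is R = Rz(yaw)·Ry(pitch)·Rx(roll) with angles in radians, whereas the Android orientation sensor uses its own axis and sign conventions, which would have to be checked before applying this to real data. The function name `device_to_navigation` is hypothetical.

```python
import math


def device_to_navigation(vector, roll, pitch, yaw):
    """Rotate a 3-component device-frame vector into the navigation frame."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    # Rows of R = Rz(yaw) * Ry(pitch) * Rx(roll).
    rotation = [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]
    return [sum(r * v for r, v in zip(row, vector)) for row in rotation]
```

After this projection, features computed on the vertical and horizontal components no longer depend on how the phone happens to be oriented in a pocket or bag.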
By comparing Figures 7 and 8, it is obvious that although applying all the sensor information leads to the highest accuracy, using accelerometer and orientation information has a better balance between accuracy and battery consumption.

What Is the Optimum Set of Features?
After selecting the appropriate signals, the window size can be defined by clicking and dragging the mouse on the displayed signals or by entering the starting and ending points. For feature extraction, the data are divided into two-second segments and features are extracted from the 80 readings within each two-second segment. The two-second duration was chosen because the experiments show that it provides sufficient time to capture meaningful features of the different activities. The signal windows have a 50% overlap. To investigate the results of the feature extraction step, various combinations of sensors have been considered to find the useful set of sensors for discerning each set of activity and device placement. As shown in Figure 9, a GUI has been developed for feature extraction. All the sensors and signals from the previous step (preprocessing) can be used and displayed in this GUI.
In the left panel, there is a list of the different features considered in this research. After selecting a window of the signals (tri-axial accelerometer in Figure 9), various features can be extracted from the selected data. By pressing the "Feature Extraction" button (Figure 9), the selected feature can be shown in the "Show Feature" panel. The computational complexity of the feature extraction techniques differs.
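The segmentation described above (windows of 80 readings with 50% overlap) can be sketched as follows, together with a few simple time-domain features. The features shown are illustrative only and are not claimed to be the features used in the study; the names `sliding_windows` and `time_domain_features` are hypothetical.

```python
def sliding_windows(samples, size, overlap=0.5):
    """Split a signal into fixed-size windows that overlap by the given fraction."""
    step = max(1, int(size * (1.0 - overlap)))
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, step)]


def time_domain_features(window):
    """Mean, variance and number of mean crossings of one window."""
    n = len(window)
    mean = sum(window) / n
    variance = sum((v - mean) ** 2 for v in window) / n
    crossings = sum(
        1 for a, b in zip(window, window[1:]) if (a - mean) * (b - mean) < 0
    )
    return {"mean": mean, "variance": variance, "mean_crossings": crossings}
```

With a window of 80 readings and 50% overlap, a new feature vector is produced every 40 readings, i.e., once per second for two-second windows.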
Figure 10 shows the time consumption of the feature extraction techniques in MATLAB on a computer with a Core i7 CPU @ 2.7 GHz. The histogram, wavelet and frequency-domain features require noticeably more computation than the simple time-domain features. Although Figure 10 shows the time efficiency obtained for a specific sensor and a specific number of samples, it is useful for comparing the time efficiency achieved with the various features. Time efficiency is a critical issue when processing different sensors and signals. However, there is a trade-off between recognition accuracy and the time efficiency of the algorithm. There are no theoretical guidelines that suggest the appropriate features to use in a specific classification situation. Therefore, a careful investigation of the available features is necessary to improve the performance of the recognition method. A good feature space can often yield simple and easily understood classification techniques; a poor feature space may yield complex classification techniques whose true structures are difficult or impossible to discern. In the following section, different feature selection methods and their accuracies for activity recognition are discussed.

What Is the Optimum Feature Selection Method?
In order to increase the efficiency of activity recognition and reduce computation, a feature selection method is applied. Optimum features are those with maximum correlation with the class attributes and minimum intercorrelation with the other features. If the dimensionality of a feature set is too high, some features may be irrelevant and provide no useful information for classification, which slows computation and makes training difficult. The feature selection approach consists of detecting and discarding the features that contribute least to a correct response by the classifier. In this research, four different feature evaluation methods have been used: CFS (Correlation Feature Selection), PCA (Principal Component Analysis), SVM (Support Vector Machine) and gain ratio. In the first case, all 46 features were considered and an overall accuracy of 99% was reached. Then, different feature selection methods were explored. CFS uses 28 uncorrelated features, which results in 88% overall activity recognition accuracy. PCA uses 12 independent linear combinations of features and results in 87% recognition accuracy. The two other methods, SVM and gain ratio, use only four of the features to classify the data and provide recognition accuracies of 96.2% and 94.4%, respectively, in the best case. Figure 11 illustrates the overall classification accuracies obtained with the different sets of features for the recognition of physical activities, device placement, and both activity and device placement. Table 7 lists the most efficient features in each case for activity recognition using the SVM and gain ratio feature evaluators. As can be inferred from Table 7, SVM has the best recognition rate among the feature selection methods, 96.2%, using only four features.
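The gain ratio evaluator mentioned above ranks each feature by its information gain normalized by the split information. A minimal sketch for an already discretized feature is given below. It is an illustration rather than the WEKA evaluator used in the study; discretization of continuous features is assumed to have been done beforehand, and the names `entropy` and `gain_ratio` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(feature_values, labels):
    """Information gain of a discrete feature divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected class entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return gain / split_info if split_info > 0 else 0.0
```

A feature whose values separate the classes perfectly scores 1.0, while a feature that is independent of the class scores 0.0, so sorting features by this value and keeping the top four mirrors the ranking step described in the text.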

Micromachines 2015, 6 1123
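As an illustration of how a gain ratio evaluator can rank features, the following is a minimal sketch and not the WEKA implementation used in this research. The equal-width discretization, the number of bins and the function names (`discretize`, `gain_ratio`, `top_k_features`) are assumptions introduced for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def discretize(values, bins=4):
    """Map continuous values to equal-width bin indices (an illustrative choice)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def gain_ratio(values, labels, bins=4):
    """Information gain of the binned feature divided by its split information."""
    binned = discretize(values, bins)
    total = len(labels)
    groups = {}
    for b, y in zip(binned, labels):
        groups.setdefault(b, []).append(y)
    # Class entropy that remains once the bin of the feature is known.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(binned)
    if split_info == 0:
        return 0.0
    return (entropy(labels) - conditional) / split_info


def top_k_features(table, labels, k=4):
    """Rank named feature columns by gain ratio and keep the best k."""
    ranked = sorted(table, key=lambda name: gain_ratio(table[name], labels), reverse=True)
    return ranked[:k]
```

With `k=4`, `top_k_features` mirrors the idea of keeping only four features; which four are chosen depends entirely on the data and on the discretization assumed here.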

What Is the Best Classification Algorithm?
The selected features are used as inputs for the classification and recognition methods. In this stage, a number of features from the pre-selected feature set were used to train and test different classifiers. Several classifiers provided by WEKA [77], namely BN, NB and ANN, are evaluated and compared. Comparative studies on classification algorithms are difficult due to the lack of universally accepted quantitative performance evaluation measures. Many researchers use the classification error as the final quality measurement; therefore, this research adopts a similar approach [61]. An error or confusion matrix, which compares the true labels with the labels returned by the classification algorithm, is often used as the quality assessment measure. Table 8 shows the confusion matrix for the 12 classes of activity and device location when the BN classifier is used.
Regarding the confusion matrices, it can be observed that some of the activities (such as walking and using stairs) and some of the device placements (such as on-belt and trousers front pocket positions) were misclassified or cross-classified. This can be improved by using new features such as walking patterns. In addition, directions for further work include collecting data in more natural environments without researchers' intervention and using a larger number of people to test the reliability of the trained classifier in its recognition of new and unseen activity patterns.
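To make the role of the confusion matrix concrete, the sketch below builds one from true and predicted labels and lists the most frequently confused class pairs. It is an illustrative example only; the function names and the toy labels in the usage note are hypothetical and are not taken from Table 8.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) label pairs; rows are true classes, columns are predictions."""
    classes = sorted(set(true_labels) | set(predicted_labels))
    counts = Counter(zip(true_labels, predicted_labels))
    matrix = [[counts[(t, p)] for p in classes] for t in classes]
    return classes, matrix


def most_confused(classes, matrix, top=3):
    """Return the largest off-diagonal cells as (true, predicted, count) tuples."""
    errors = [
        (classes[i], classes[j], matrix[i][j])
        for i in range(len(classes))
        for j in range(len(classes))
        if i != j and matrix[i][j] > 0
    ]
    return sorted(errors, key=lambda e: e[2], reverse=True)[:top]
```

For example, calling `confusion_matrix` on a handful of "walking" and "stairs" windows and passing the result to `most_confused` would surface the walking/stairs confusion described above as the largest off-diagonal entries.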
For evaluating the classifiers, the F-measure accuracy (overall accuracy) of the test data has been used in this research to evaluate recognition performance, using the following formulas:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

$$F\text{-measure} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

where TP indicates the number of true positive or correctly classified results, FP is the number of false positive or unexpected results, and FN is the number of false negative or missed results. Ten-fold cross-validation is used to evaluate the classification models. With this procedure, the database of the test data is randomly divided into 10 equally sized folds. Each time, one fold is chosen as the test data set and the rest are used as training data sets. After training the classifier, an evaluation is made using the test data set to obtain the precision, recall and F-measure for each activity. After each fold is tested, the average F-measure over all the folds is computed as the overall result for the activities (Figure 12). Performance degrades especially for three modes: belt, pocket and backpack. More specifically:
• "Belt" is often misclassified as "Pocket";
• "Pocket" is often misclassified as "Backpack";
• "Backpack" is sometimes misclassified as "Belt" and "Reading".
This is expected, since the ways users carry their navigation devices in pockets and bags are quite ambiguous. In this case, it might be practical to merge the three confusing modes together and consider a universal classification. However, talking, dangling and reading are well distinguished, even for different users. By investigating each activity's recognition rate, it can be inferred that user activities such as driving, walking, running, taking stairs and using an elevator are recognized with an accuracy of 95%. In contrast, the classification models do not perform as well in device placement recognition. Table 9 shows the recognition rate for some of the activity modes using four features selected by the SVM feature evaluator and applying various classifiers such as BN, NB [86] and ANN. Overall, the classification models distinguish between the device placements and user activities with an accuracy of 95%. Although ANN requires more computational capabilities than the Bayesian Network and Naïve Bayes methods, the accuracies obtained from the three classifiers are close to each other (Table 9). This could be because the four extracted features already discriminate the activities with high accuracy.
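The evaluation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions and not the WEKA procedure used in this research: `train_and_predict` is a hypothetical callback standing in for any classifier, the fold assignment uses a fixed random seed, and the F-measure is computed for a single target class.

```python
import random


def f_measure(tp, fp, fn):
    """Precision, recall and F-measure from TP, FP and FN counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, train_and_predict, target, k=10):
    """Average the F-measure of one target class over k train/test splits.

    train_and_predict is a hypothetical callback: it receives the training
    samples, the training labels and the test samples, and returns one
    predicted label per test sample.
    """
    scores = []
    for fold in k_fold_indices(len(samples), k):
        held_out = set(fold)
        train = [i for i in range(len(samples)) if i not in held_out]
        predicted = train_and_predict(
            [samples[i] for i in train],
            [labels[i] for i in train],
            [samples[i] for i in fold],
        )
        actual = [labels[i] for i in fold]
        tp = sum(p == target and a == target for p, a in zip(predicted, actual))
        fp = sum(p == target and a != target for p, a in zip(predicted, actual))
        fn = sum(p != target and a == target for p, a in zip(predicted, actual))
        scores.append(f_measure(tp, fp, fn)[2])
    return sum(scores) / len(scores)
```

Repeating `cross_validate` for each activity class and averaging the results gives a per-classifier summary in the spirit of the averaged F-measure described above. Note that a fold containing no sample of the target class contributes an F-measure of zero in this sketch.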

Conclusions
This paper presents an activity recognition system that employs a smartphone's sensors (accelerometer, gyroscope, and magnetometer) to monitor a user's physical movements for navigation applications. Different activities which are important in a navigation application, such as walking, running, descending/ascending stairs and using an elevator, have been explored, as well as transitions between these activities. One unique aspect of this research is that no particular constraint on device placement was imposed. Various common device positions were used, such as being held to the ear for talking, held for texting, in a pocket, on a belt, in a bag and hanging naturally at one's side while walking.
The activity recognition algorithm relies on learning from sample datasets [87]. Three recognition scenarios were investigated: identifying the user's activity, identifying the device placement (where the mobile device is placed on the user's body), and identifying both the user's activity and the device placement simultaneously. Using the experimental tests conducted in this research, this paper investigated the best set of sensors to use in smartphones and the optimum features needed for a simple and efficient classification solution that establishes a good balance between accuracy and computational cost.
In the first step, the number of input sensor signals was investigated. Results showed that accelerometer sensors are effective for recognizing user motions but not sufficient for recognizing device location and orientation; therefore, the orientation soft-sensor (based on the fusion of accelerometer, magnetometer and gyroscope) was added as a new sensor reading, which can mitigate the effect of orientation changes on the performance of activity classification.
The second step was to select the best set of extracted features instead of using all of the time-domain and frequency-domain features to train the classifier. Experiments demonstrated that the feature selection methods were successful in removing redundancy in the features and thus in reducing computational complexity. For activity recognition, four features have been explored to reduce the computational load without compromising accuracy. The set of four essential features selected by the SVM feature evaluator method includes Frequency Range Power, Spectral Entropy, Zero Crossing and Quartile.
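For illustration, the four selected features could be computed on a single signal window as sketched below. The exact definitions used in this research are not restated here, so each formula is an assumption: zero crossings are counted on the mean-removed signal, the quartile is the first quartile by linear interpolation, the spectrum is a plain discrete Fourier transform without windowing, and the frequency band limits are left as parameters.

```python
import cmath
import math


def zero_crossings(window):
    """Count sign changes of the mean-removed signal."""
    mean = sum(window) / len(window)
    centered = [x - mean for x in window]
    return sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0)


def quartile(window, q=0.25):
    """Quantile by linear interpolation between sorted samples (q=0.25 is the first quartile)."""
    ordered = sorted(window)
    pos = q * (len(ordered) - 1)
    low = int(math.floor(pos))
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (pos - low)


def power_spectrum(window):
    """Squared DFT magnitudes for bins 1..n//2 (the DC bin is excluded)."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(window))) ** 2
        for k in range(1, n // 2 + 1)
    ]


def spectral_entropy(window):
    """Shannon entropy (in bits) of the normalized power spectrum."""
    spectrum = power_spectrum(window)
    total = sum(spectrum)
    if total == 0:
        return 0.0
    return -sum((p / total) * math.log2(p / total) for p in spectrum if p > 0)


def frequency_range_power(window, sample_rate, f_low, f_high):
    """Sum of spectral power in bins whose frequency lies in [f_low, f_high] Hz."""
    n = len(window)
    spectrum = power_spectrum(window)
    return sum(
        p for k, p in enumerate(spectrum, start=1)
        if f_low <= k * sample_rate / n <= f_high
    )
```

A periodic motion such as walking concentrates power in a narrow band, which gives a low spectral entropy and a high band power around the step frequency; this is one plausible reason why such features discriminate activities well.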
Compared to more complex classifiers such as ANN, the results showed that the Bayesian Network classifier yielded a similar performance while having a more extensible algorithm structure and requiring fewer computations. The Bayesian Network classifier provides an overall recognition accuracy of 96.2% on a variety of six activities and six device positions using only the four features provided by the SVM feature selection method.
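To give a sense of why Bayesian classifiers are computationally light, the following sketch implements a Gaussian Naive Bayes classifier, the simplest relative of the Bayesian Network and one of the classifiers compared in this research. It is not the Bayesian Network used here: it assumes that features are conditionally independent given the class and normally distributed, and the class name and variance floor are choices made for this example.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per (class, feature) pair."""

    def fit(self, samples, labels):
        grouped = defaultdict(list)
        for x, y in zip(samples, labels):
            grouped[y].append(x)
        self.stats = {}
        for y, rows in grouped.items():
            columns = list(zip(*rows))
            means = [sum(c) / len(c) for c in columns]
            # A small floor on the variance avoids division by zero for constant features.
            variances = [
                max(sum((v - m) ** 2 for v in c) / len(c), 1e-9)
                for c, m in zip(columns, means)
            ]
            log_prior = math.log(len(rows) / len(samples))
            self.stats[y] = (log_prior, means, variances)
        return self

    def predict(self, x):
        """Return the class with the highest log-posterior for one feature vector."""
        def log_posterior(y):
            log_prior, means, variances = self.stats[y]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(x, means, variances)
            )
        return max(self.stats, key=log_posterior)
```

Training reduces to computing per-class means and variances, and prediction to a handful of additions per feature, so with four features the cost per window is very small.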
Inspired by multi-sensor activity recognition research [4,30,32], three MEMS sensors on the smartphone (accelerometer, gyroscope and magnetometer) are used in this research to consider both the motion and the orientation of the device. As an improvement on previous work, the accelerometer and gyroscope as well as other sensors such as the magnetometer are integrated to recognize activity context more reliably. Moreover, in most of the research in this area, specialized accelerometers are fixed to the user's body or have a known orientation; this assumption usually does not hold for the common case of carrying the phone in the hand or pocket. In this study, by contrast, no assumption is made about how users carry their mobile phones. This study contributes to the intelligent navigation computation domain by focusing on three issues: (i) an evaluation of classifier accuracies that provides reliable results for selecting the best set of sensors and features to optimize the performance of activity-logging applications on smartphones, (ii) an extensive analysis of the effect of estimating user activity and device placement separately versus considering both of them together, and (iii) consideration of different placements of the mobile device without any assumptions on fixing the device orientation.

Figure 1. The steps involved in activity recognition using feature-level sensor fusion.


Figure 2. Six different positions for calibration of the accelerometer and gyroscope (each sensitive axis pointing alternately up and down). (a) Z-axis aligned with g but in the opposite direction; (b) X-axis aligned with g but in the opposite direction; (c) Z-axis aligned with g and in the same direction; (d) X-axis aligned with g and in the same direction; (e) Y-axis aligned with g but in the opposite direction; (f) Y-axis aligned with g and in the same direction.


$\vec{Y}$, commonly used in linear regression. In the equation below, $n$ denotes the number of elements of $\vec{X}$ and $\vec{Y}$. The vector $\vec{a}_i$ can be computed by vector subtraction (i.e.,

Figure 3. Schematic diagram of the data collection process.

Figure 4. Collecting training datasets for different activities and device placements.


Figure 5. Calibrated accelerometer and gyroscope outputs in different placement modes.


Figure 6. Different activity contexts assumed for personal navigation services.


Figure 7. Recognition accuracy using different sets of sensors for different activity modes (Classifier: Bayesian Network, Number of features: 46).

Figure 8. Time consumption of using different sensors for different activity modes (Classifier: BN, Number of features: 46).


within the two-second segments. The two-second duration has been chosen because the experiments show that it provides sufficient time to capture meaningful features involved in different activities. The signal windows have a 50% overlap. To investigate the results of the feature extraction step, various combinations of sensors have been considered to find the useful set of sensors for discerning each set of activity and device placement. As shown in Figure 9, a GUI has been developed for feature extraction. All the sensors and signals from the previous step (preprocessing) can be used and shown in this GUI.
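The windowing scheme can be sketched as follows. This is a minimal example, assuming the signal is a plain sequence of samples; the function name `sliding_windows` is hypothetical, and the window length in samples depends on the sampling rate (an 80-sample window covering two seconds would correspond to 40 Hz, which is an inference from the Figure 10 caption rather than a stated value).

```python
def sliding_windows(signal, window_size, overlap=0.5):
    """Split a sequence of samples into fixed-size windows with fractional overlap."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # With 50% overlap, each window starts half a window after the previous one.
    step = max(1, int(window_size * (1 - overlap)))
    return [
        signal[start:start + window_size]
        for start in range(0, len(signal) - window_size + 1, step)
    ]
```

Trailing samples that do not fill a complete window are dropped in this sketch; each returned window would then be passed to the feature extraction step.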

Figure 9. Feature extraction GUI from the activity recognition module (from left to right and top to bottom, this GUI includes the Data Selection, Window Size, Features, Show Feature and Save Dataset panels).


Figure 10. Time efficiency of feature extraction techniques on a window of 80 samples.


Figure 11. Recognition accuracy using different numbers of features for different activity modes (Classifier: Bayesian Network, Number of features: 46).


Figure 12. Recognition accuracy using different classifiers for different activity modes using four essential features selected by the SVM method.


Table 2. Summary of past work on activity recognition using accelerometers (recognition accuracy values for references without explicit statements are marked NA).

Table 3. Summary of past work on activity recognition using smartphones.

Table 5. Categorization of the classification methods. kNN is based on the closest training samples in the feature space. The most popular similarity measure to find the closest samples is the Euclidean distance. k denotes the number of classes.

Table 6. Investigation of different sampling frequencies of Samsung Galaxy Note 7000.

Table 7. Selected features using the SVM and gain ratio feature evaluators and their corresponding recognition accuracy (Classifier: BN).


Table 8. Confusion matrix for 12 classes of activity and device location for the BN classifier.


Table 9. Comparison of different classifiers in activity recognition on the database (DB) of 120 min of data using four essential features selected by the support vector machine (SVM) method.