A Smartphone Step Counter Using IMU and Magnetometer for Navigation and Health Monitoring Applications
Sensors 2017, 17, 2573

The growing market of smart devices makes them appealing for various applications. Motion tracking can be achieved using such devices and is important for applications such as navigation, search and rescue, health monitoring, and quality of lifestyle assessment. Step detection is a crucial task that affects the accuracy and quality of such applications. In this paper, a new step detection technique is proposed, which can be used for step counting and activity monitoring in health applications as well as part of a Pedestrian Dead Reckoning (PDR) system. Inertial and magnetic sensor measurements are analyzed and fused to detect steps under varying step modes and device pose combinations using a free-moving handheld device (smartphone). Unlike most state-of-the-art research in the field, the proposed technique does not require a classifier, and it adaptively tunes the filters and thresholds used without the need for presets while operating in real time. Testing shows that the proposed technique successfully detects steps under varying motion speeds and device use cases with an average performance of 99.6%, and outperforms some state-of-the-art techniques that rely on classifiers as well as commercial wristband products.


Introduction
Recent advances in Micro-Electro-Mechanical Systems (MEMS) technology have made it feasible to manufacture Inertial Measurement Unit (IMU) sensors that are low cost, low in on-chip power consumption, and lightweight [1,2]. Most smart devices that pedestrians use today are equipped with such technology. Having those sensors at the user's disposal makes them very appealing for applications such as activity and health monitoring [3], gaming and personal navigation [4], and emergency services [5].
An IMU is a group of sensors that sense the motion of the user, represented as accelerations and angular rates. MEMS IMUs are low-grade sensors, also referred to as commercial-grade. In [6], a comparison of the different types and grades of IMU is presented. The comparison shows the relatively high errors of commercial-grade IMUs compared to their higher-grade, more expensive counterparts. In [7], VectorNav, one of the leading developers of embedded navigation solutions, presents a comparison of the performance of the different grades of IMUs and the expected deterministic errors of each. In addition, Reference [8] provides a performance comparison between the different underlying technologies used in IMU sensors. The performance

Background
This section presents some of the existing step detection and counting techniques, which fall into two categories: strapdown systems and handheld devices. In some of the literature, the step detection device is referred to as a pedometer.

Strapdown Systems
Multiple approaches rely on an IMU strapped down to a segment of the body. The advantage of such systems is that the sensor platform does not move independently of the body, so nearly all of the measurements captured by the system represent the motion of the body segment it is attached to. Developed systems include foot-mounted, waist-belt-mounted, and wrist-mounted devices. The most exhaustively tested approach for navigation purposes is the foot-mounted one.
The foot-mounted approach benefits from closely capturing the characteristics of the gait cycle (GC) and its underlying kinetics. For example, during the stance phase the IMU is expected to be stationary, so the accelerometer measurements tend to the gravity value while the gyroscope measurements tend to zero. Exploiting this information, the stance phase can be easily detected. Furthermore, accurate detection of the stance phase enables the use of correction methods that compensate for drift in the measurements. For foot-mounted sensors, the detection process in most cases relies on thresholds for detecting the stance phase, or otherwise on GC state transition modeling.
A threshold-based approach that relies on gyroscope measurements for the detection of the stance phase is proposed in [23]. The stance is detected when the gyroscope measurements fall within a predefined threshold. A more reliable approach is presented in [24], where multiple threshold-based constraints are defined for both accelerometer and gyroscope measurements; three threshold checks are carried out, and a step is identified when all three conditions are met. Another threshold-based approach is found in [25]. It uses thresholds on the acceleration measurements to identify a stance, but it also incorporates a validation step by defining a minimum time span for the stance.
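The threshold checks described above can be sketched in a few lines. The following is a minimal illustration only, not the method of [23], [24], or [25]: the function name, the tolerance values, and the minimum duration are assumptions chosen for the example.

```python
import math

GRAVITY = 9.81  # m/s^2, nominal gravity magnitude (assumed)


def detect_stance(accel, gyro, dt, acc_tol=0.4, gyro_tol=0.2, min_duration=0.1):
    """Return (start, end) sample index pairs of detected stance phases.

    accel, gyro: lists of (x, y, z) tuples in m/s^2 and rad/s.
    A sample is "still" when the accelerometer norm is close to gravity and
    the gyroscope norm is close to zero. A run of still samples counts as a
    stance only if it lasts at least min_duration seconds.
    All tolerances are illustrative values, not taken from the cited works.
    """
    stances = []
    run_start = None
    for i, (a, w) in enumerate(zip(accel, gyro)):
        acc_norm = math.sqrt(sum(c * c for c in a))
        gyro_norm = math.sqrt(sum(c * c for c in w))
        still = abs(acc_norm - GRAVITY) < acc_tol and gyro_norm < gyro_tol
        if still and run_start is None:
            run_start = i
        elif not still and run_start is not None:
            # Validate the finished run against the minimum time span.
            if (i - run_start) * dt >= min_duration:
                stances.append((run_start, i - 1))
            run_start = None
    if run_start is not None and (len(accel) - run_start) * dt >= min_duration:
        stances.append((run_start, len(accel) - 1))
    return stances
```

The minimum-duration check plays the role of the validation step: short intervals that merely resemble a stationary foot are discarded.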
Threshold-based detection can suffer from degraded performance, especially in fast walking and running modes, where the stance phase is shortened or eliminated. To overcome this limitation, some researchers apply a gait phase classifier whose goal is to detect all GC phases, so that steps are still identified when the stance phase goes undetected.
The authors in [26] define a Finite State Machine (FSM) with a probability transition matrix to identify the four main phases of gait. The classification is based on the accelerations from a tri-axial accelerometer and a single-axis gyroscope. Similarly, in [27], a state transition is defined; the update in this approach is the use of a tri-axial gyroscope instead of a single-axis gyroscope. In [28], an FSM that models the GC as a Hidden Markov Model (HMM) is proposed. The measurements from the gyroscope and from force resistors installed on the sole of the shoe are used for the phase classification along with the probability transition matrix. This work was modified in [29] by replacing the force resistors with an accelerometer and redefining the probability transition matrix. A Bayesian Network (BN) was proposed in [30]. The aim of the network is to distinguish only the stance phase, using a set of three threshold-based constraints on the accelerometer and gyroscope measurements and a predefined GC phase threshold defined by kinesiology. In [31], a study of the effect of footwear on gait features is presented. The study uses Artificial Neural Networks (ANN) for gait feature detection for the same subject wearing different footwear types. It is shown that different types of footwear, namely bare foot, sneakers, and high heels, have different effects on the accelerations generated during walking. This study suggests that some external conditions might affect the performance of non-adaptive step detection techniques. More research based on ANN was presented in [32], where an ankle-foot orthosis was used. The orthosis was equipped with an IMU, two Force Sensitive Resistors (FSRs) mounted on the sole, and an angle sensor mounted vertical to the ankle. The purpose of this research was to detect steps and classify the action being taken, such as stair ascent/descent or level-ground walking. By detecting the type of motion, the actuators of the orthosis can be adjusted accordingly to facilitate the motion.
A different approach was presented in [33] that proposes the use of a magnetometer on one shoe and a permanent magnet on the other. The algorithm detects the steps by processing the magnetometer measurements, which no longer reflect magnetic north but rather the proximity of the magnet on the other shoe.
Due to recent advances, wearable devices, namely smart watches [34] and fitness bands, have attracted a lot of research in health monitoring applications. Those devices are equipped with an IMU along with a variety of sensors such as heart rate and temperature sensors. Wrist strapdown systems use the same techniques as foot-mounted ones but suffer from the free motion of the arm, which in some cases undergoes motion that is not related to walking behavior, and hence they require more analysis. On the other hand, the wrist placement can be more beneficial for health applications because of its direct contact with the skin; for example, it can sample heart rate and skin temperature along with other useful information. Two examples of recent patents for step detection using a wrist-placed device can be found in [35,36]. Different kinds of pedometer implementations and placements have been compared in [37], where the results show that the most desirable place is the waist. It is also worth noting that the paper did not include foot-mounted systems and only evaluated smartphone performance in a pocket placement.
For threshold-based detection and state transitions to work, a predictable pattern should exist. This hypothesis holds for tethered sensors in most cases, such as the foot-mounted case. In the case of a free-moving handheld device, however, there are no predictable outcomes at a given time, and hence researchers resort to other methods, as presented in the next subsection.

Handheld Devices
In the case of untethered handheld devices, there is no detectable zero-velocity region for the stance phase of the GC, since the upper half of the human body is in continuous motion, unlike the foot. The device might also exhibit motion unrelated to walking, such as arm motion that causes orientation changes and accelerations that do not represent the walking behavior. Hence, methods for step detection rely on peak extraction instead of zero-acceleration periods. Regular peak detection techniques usually rely on classifiers that determine how the handheld device (smartphone) is being used in order to adaptively adjust the step detection thresholds.
Peak detection has been exploited in many studies. In [3], a technique based on peak detection is proposed. It requires a training phase to estimate the user-dependent thresholds for step detection before it can be used for navigational purposes. This is inconvenient, as the estimated parameters work for one user but are not guaranteed to work for others. Another limitation is that it only works in compassing mode.
A classifier is developed in [38] to identify the type of motion the device is experiencing. Once the device use case is identified, a decision to use either the accelerometer or the gyroscope measurements is made based on the class of motion identified. The steps are then detected through a peak extraction technique. The proposed classifier is supervised, which means that a training phase was required to obtain the thresholds for the decision making in the classifier.
Another approach that uses a classifier was presented in [39]. It classifies the motion type into two classes through the use of a periodicity detection algorithm. Peaks are then extracted to represent candidate steps and validated to remove false steps. The candidate steps are validated by integrating the measurements over the step duration to check whether significant displacement occurs. In [40], a classifier for three use cases is used, namely holding, swinging, and pocket placement. Based on the classification, the acceleration component to be used is selected; it can be the z-component, the y-component, or the vertical acceleration from the leveled measurements. A feedforward ANN with pattern recognition was proposed in [41]. The network has only directed connections and requires reference data for training. The approach uses the ANN for both step detection and step length estimation.
In some use cases of the phone, fake signals that look like human motion can be simulated leading to false step counts. The authors in [42] propose an adaptive filtering method for eliminating false peaks. Unlike most approaches, the presented work uses the norm of the acceleration measurements from the accelerometer and not only vertical acceleration analysis for the step detection. The process starts by extracting pairs of peak/valley. Each peak and valley being detected is a candidate until verified through magnitude and temporal filtering. Upon the verification of a peak/valley pair, a step is identified.
A different approach to classification is developed in [43] by employing a fuzzy-logic classifier. Features from a band-pass-filtered acceleration norm are extracted and evaluated with a membership function. The output of the classifier is then processed through a set of rules for defuzzification to identify the step.
While there are other approaches that use different techniques for step detection, peak detection and threshold-based techniques are usually employed due to their simplicity and low overhead. A summary of different approaches that rely on the analysis of IMU sensor measurements is found in [44]. The different approaches are evaluated on the basis of complexity, computational overhead, and real-time applicability.
Other methods for step detection that rely on other sensors have been suggested, such as using the camera for visual odometry. A camera-based step detection example can be found in [45]. The limitation of such an approach is that the smartphone motion is constrained: to count the steps, the smartphone must be held in a position that captures the foot motion. Similarly, in [46], visual odometry is used but with a different camera usage hypothesis, namely that the camera orientation captures the pedestrian's first-person perspective. The proposed methodology uses the Speeded Up Robust Features (SURF) algorithm for feature extraction from the captured frames. This approach requires holding the device in a certain way, limiting the usability of the device.
Step detection and counting is of great importance for many applications. Although tethered approaches show high accuracy, it would be more desirable to use an unconstrained, multipurpose device such as a smartphone. The dynamics of the smartphone, along with the unpredictable changes in pedestrians' walking behavior, make this a challenging task. The authors of this paper propose an algorithm for step detection and counting that is unified across all smartphone use cases and pedestrian step modes by using features that are invariant to both.

Methodology
Using a handheld free-moving device such as a smartphone for motion tracking yields a different motion pattern from a strapdown system, such as an IMU placed on the shoe or waist belt. One of the main differences is the absence of the static period, which is usually exploited in foot-mounted systems for step detection. The usual pattern expected for a handheld device in a static pose is a sequence of alternating peaks and valleys, where each peak/valley pair represents a single step.
In this section, a novel step detection algorithm is proposed that is independent of the smartphone use case and does not require a classifier to adaptively tune the parameters for the step detection. The algorithm is based on conclusions drawn from extensive analysis of the three signals used for the step detection, which are the acceleration norm, the angular rates vector, and the magnetic vector.
First, the acceleration norm is used without gravity compensation to avoid errors introduced by the separation process and by the transformation of measurements into the Local Level Frame (LLF). From studying the norm of the accelerations with a fixed device pose (compassing mode) to obtain the pattern of accelerations produced by the walking motion, it was concluded that the acceleration norm has the following properties:

• The acceleration norm computed from Equation (1) has a sinusoidal pattern where each peak/valley pair represents a step. Figure 1 shows an example of a walking pattern with a smartphone in compassing mode, with peaks and valleys detected by the proposed algorithm.
• The magnitude difference between the peak/valley pair is inversely proportional to the step duration and proportional to the motion pace. Figures 2 and 3 illustrate the change in magnitude in correspondence with pace variation over time.
• The use of the net force acceleration norm (uncompensated for the gravity component), as shown in Figure 4, magnifies the pattern in the signal around the peaks and valleys while also smoothing it around the gravity shift component. This is due to the combined contribution of linear acceleration and gravity, as presented in Equation (2).
• Although many studies assume that the measured norm is the root of the sum of squares of gravity and linear acceleration, from physics, the resultant of both vectors is computed as in Equation (2). Simply subtracting the gravity value from the resultant does not yield the linear acceleration, because residuals from gravity remain due to the component that depends on the angle between the vectors.
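The last property can be checked numerically. The sketch below assumes that the resultant of gravity and linear acceleration follows the standard law-of-cosines form for the sum of two vectors separated by an angle; the function name `net_norm`, the nominal gravity value, and the 2 m/s^2 linear acceleration are illustrative assumptions.

```python
import math

G = 9.81  # gravity norm in m/s^2 (nominal value, assumed)


def net_norm(f, phi):
    """Norm of the sum of gravity and a linear acceleration of magnitude f,
    separated by the angle phi (radians), via the law of cosines."""
    return math.sqrt(f * f + G * G + 2.0 * f * G * math.cos(phi))


f = 2.0  # linear acceleration magnitude in m/s^2 (illustrative)
for phi_deg in (0, 60, 90, 180):
    acc = net_norm(f, math.radians(phi_deg))
    # norm - g equals f only when the two vectors are aligned (phi = 0).
    print(f"phi={phi_deg:3d} deg  norm={acc:6.3f}  norm-g={acc - G:6.3f}")
```

Only when the linear acceleration is aligned with gravity does subtracting g recover its magnitude; at any other angle a gravity-dependent residual remains, which is the argument for working with the uncompensated norm.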
$$\mathrm{Acc} = \sqrt{a_x^2 + a_y^2 + a_z^2} \qquad (1)$$
$$\mathrm{Acc} = \sqrt{\mathrm{SpecificForce}^2 + g^2 + 2\,\mathrm{SpecificForce}\,g\cos\phi} \qquad (2)$$
where $\mathrm{Acc}$ is the net acceleration magnitude; $\mathrm{SpecificForce}$ is the linear acceleration vector norm; $g$ is the gravity norm; $\phi$ is the angle between gravity and linear acceleration; and $a_x$, $a_y$, $a_z$ are the accelerations in the body frame.
As for the angular rates measured by the gyroscope and the magnetic vector measured by the magnetometer, the resulting signals are useful in the phone dangling use case. In a phone dangling state, the user holds the phone in one hand while swinging the arms naturally, as when walking while holding nothing or something of minimal weight that does not affect the motion. In this case, the patterns generated also resemble a sinusoidal wave, but each half-cycle of the signal represents a step.
Based on these findings, the proposed algorithm, shown in Figure 5 as a block diagram, starts by filtering the measurements from the sensors using an adaptive low-pass filter that is discussed in Section 3.1. After that, it applies peak/valley pair detection to the acceleration norm, with time filtering based on the peak-to-valley magnitude and peak-to-valley delay, as per Sections 3.2 and 3.3. Verification of peaks and valleys for cases of high device motion is also applied by further investigating the dominant axis of angular rotation and magnetic change rate, as discussed in Sections 3.4 and 3.5, where peak/valley extraction is also applied conditionally to the gyroscope and magnetometer measurements in the case of repetitive, high-variance patterns. The peaks and valleys of the angular velocity and magnetic field should coincide, within a threshold, with the peaks of the acceleration. Finally, the step is verified through the integration of the acceleration measurements within the step window, as will be shown in Section 3.6. The conditional blocks are executed only when a dominant repetitive signal is detected in the gyroscope measurements, the magnetometer measurements, or both.

Figure 5. Step detection block diagram.

Adaptive Filter
Sensor signals are corrupted by different errors and inaccuracies due to many factors. In the case of IMU measurements, the signals can be analyzed to compensate for deterministic errors. White noise and process noise can be hard to determine and model, hence the need for digital filtering. For a signal to be successfully filtered, the frequency of the desired signal needs to be estimated.
The proposed system makes use of an Infinite Impulse Response (IIR) Butterworth low-pass filter for its simplicity and low computational overhead. An adaptive cut-off frequency is continuously tuned and updated based on the most recently detected walking speed.
Selection of the filter order is important, as it affects two main aspects of the filter: the latency and the roll-off. In a Butterworth filter, the higher the order, the steeper the transition between the pass and stop bands, yielding a fast roll-off that brings the filter closer to an ideal one. On the other hand, the group delay increases, which can make real-time processing unachievable. With a low order, the latency is low but the roll-off is slower, and hence frequencies from the stop band still exist in the filtered signal. Because real-time processing is desired in this application, a low filter order is needed to minimize the latency. To overcome the slow roll-off of the filter, the cut-off frequency is slightly reduced to compensate for the effect of undesired frequency residuals.
Using a static cut-off frequency for filtering a signal with varying frequency can lead to either loss of information or leftover residual noise that affects the system. The effects of overestimating and underestimating the cut-off frequency are shown in Figures 6 and 7 and are compared to the case of adaptively tuning the cut-off frequency. Figure 8 elaborates on the specific effect of under-filtering in comparison to the adaptive filter, as residual fluctuations from motion noise remain in the signal.
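A low-order Butterworth filter with a retunable cut-off can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the second order is picked here as one example of a "low" order, the class and method names are hypothetical, and the `retune` rule (cut-off as a fixed margin times the estimated step frequency) is a placeholder for the paper's own tuning equation.

```python
import math


class AdaptiveLowPass:
    """Second-order Butterworth low-pass whose cut-off can be retuned online."""

    def __init__(self, sample_rate_hz, cutoff_hz):
        self.fs = sample_rate_hz
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0
        self.set_cutoff(cutoff_hz)

    def set_cutoff(self, cutoff_hz):
        # Bilinear-transform design of a 2nd-order Butterworth section.
        # cutoff_hz must stay below half the sampling rate.
        k = math.tan(math.pi * cutoff_hz / self.fs)
        norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
        self.b0 = k * k * norm
        self.b1 = 2.0 * self.b0
        self.b2 = self.b0
        self.a1 = 2.0 * (k * k - 1.0) * norm
        self.a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm

    def retune(self, step_freq_hz, margin=1.5):
        # Hypothetical rule: keep the cut-off a fixed margin above the
        # most recently estimated step frequency.
        self.set_cutoff(margin * step_freq_hz)

    def filter(self, x):
        # Direct-form difference equation; the state survives retuning.
        y = (self.b0 * x + self.b1 * self.x1 + self.b2 * self.x2
             - self.a1 * self.y1 - self.a2 * self.y2)
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y
```

In use, each new acceleration-norm sample would be passed through `filter`, and `retune` would be called whenever a new step frequency estimate becomes available. Keeping the filter state across retuning avoids restarting the filter at every pace change.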


Temporal Filtering
Due to the noise in the signal, two consecutive peaks or two consecutive valleys can occur in some cases. The adaptive filter helps reduce the chances of this happening, yet a fail-safe is needed to eliminate the undesired residual spikes. The temporal filtering works by adaptively tuning a time threshold within which a detected peak/valley can be replaced by another one of higher/lower magnitude. A peak/valley occurring outside the defined replacement zone is neglected unless it is of the opposing type and has a high magnitude difference from the most recently detected peak.
Initially, a peak or valley is detected if the value in the middle position of a window is the greatest or the lowest, respectively, as shown in Equation (3). The detected peak/valley is a candidate that is only verified if it is not replaced by another within a time threshold. Figure 9 shows the temporal thresholds for update and rejection, where the blue region is the update region and the red region is the rejection region. The update region starts at a time threshold from the last detected peak/valley, which marks the starting point for searching for a peak and replacing it if the conditions are met. The update region ends at a time threshold measured from the first peak/valley detected within this update region.
The rejection region is the intermediate time in the transition from peak to valley and vice versa, where no peaks are supposed to exist in regular motion. The thresholds defining the start and end of the regions are adaptively tuned based on the estimated motion speed, as shown in Equations (4)-(8).

$$\text{peak}: a_{n-1} < a_n > a_{n+1}, \qquad \text{valley}: a_{n-1} > a_n < a_{n+1} \qquad (3)$$
$$th_s = 0.5\,\Delta t_1 \qquad (5)$$
$$th_e = 0.3\,\Delta t_2 \qquad (7)$$
where $a$ is the acceleration norm; $t_{(p|v)}$ is the time of peak or valley; $\Delta t$ is the time interval between peak and valley; $th_s$ is the time threshold for the start of the search/update region; $th_e$ is the time threshold for the end of the update region; and $t_{(p|v)_{n/r}}$ is the time of the replacement candidate for the nth peak/valley.
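The candidate detection of Equation (3) and a simplified reading of the update/rejection logic can be sketched as follows. The function names are hypothetical, `dt1` and `dt2` stand in for the time intervals used in Equations (5) and (7) (their estimation is not reproduced here), and the replacement rule is a deliberate simplification of the regions described above.

```python
def find_extrema(a):
    """Candidate peaks/valleys per Equation (3): a strict local maximum or
    minimum over a three-sample window. Returns (index, kind) tuples."""
    out = []
    for n in range(1, len(a) - 1):
        if a[n - 1] < a[n] > a[n + 1]:
            out.append((n, "peak"))
        elif a[n - 1] > a[n] < a[n + 1]:
            out.append((n, "valley"))
    return out


def temporal_filter(a, t, dt1, dt2):
    """Simplified update/rejection rule over the candidates.

    a: acceleration norm samples, t: matching time stamps in seconds.
    dt1, dt2: assumed inputs standing in for the paper's delta-t terms.
    """
    th_s = 0.5 * dt1  # Equation (5): start of the search/update region
    th_e = 0.3 * dt2  # Equation (7): end of the update region
    accepted = []
    for n, kind in find_extrema(a):
        if not accepted:
            accepted.append((n, kind))
            continue
        m, last_kind = accepted[-1]
        gap = t[n] - t[m]
        if kind == last_kind:
            # Same type: replace only if more extreme and still in the window.
            better = a[n] > a[m] if kind == "peak" else a[n] < a[m]
            if gap <= th_e and better:
                accepted[-1] = (n, kind)
        elif gap >= th_s:
            # Opposite type: accept only after the rejection gap has passed.
            accepted.append((n, kind))
    return accepted
```

With this rule, a small dip shortly after a peak is rejected, while a slightly later and higher peak replaces the earlier candidate, so that peaks and valleys keep alternating.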

Peak-to-Peak and Pseudo Zero Crossing
For each detected sequence of peak/valley or valley/peak, the difference of the magnitudes reflects the speed of motion during the step, while their average represents the pseudo zero crossing at which a step starts or ends. The difference in magnitude, referred to as the peak-to-peak value, is used along with the time difference between the pair to adaptively tune the cut-off frequency for the next segment of the signal.
During regular motion, when a pedestrian is speeding up or slowing down, the change of speed does not occur instantaneously but rather increases or decreases gradually. A change in the peak/valley magnitude difference over time represents a change in walking speed, and the time difference represents half a step duration. Both can be used to tune the cut-off frequency and the time thresholds for the expected upcoming peak/valley. Equation (9) represents the magnitude at which a step is declared and the next step starts. Equation (11) represents the criteria for considering a change of walking pace, which leads to the application of Equation (12).
$$\Delta a_n = a_{p_n} - a_{v_n}, \qquad \Delta a_{n-1} = a_{p_{n-1}} - a_{v_{n-1}} \tag{10}$$
where: $p_{zc}$ is the pseudo zero crossing; $a_{p_n}$ is the nth peak acceleration magnitude; $a_{v_n}$ is the nth valley acceleration magnitude; $\Delta a_n$ is the nth peak/valley magnitude difference; $su_{th}$ is the speed up threshold; $sd_{th}$ is the speed down threshold and is equal to $-su_{th}$; $f_s$ is the estimated step frequency; and $F$ is the sampling frequency.
The peak-to-peak magnitude and the pseudo zero crossing are also used for detecting sudden changes of motion. When a peak/valley is detected in the rejection zone, a sufficient magnitude difference from the pseudo zero crossing combined with a high peak-to-peak magnitude indicates a sudden change in pace or a rapid change in the device motion independent of the user. In this case, although the peak/valley lies in the rejection region, it is accepted as a candidate.
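As a hedged illustration of how these quantities relate, the sketch below computes the pseudo zero crossing as the average of a peak/valley pair and the peak-to-peak difference of Equation (10). The pace-change rule (comparing the change in peak-to-peak value against the speed up threshold and its negative) is an assumption that only paraphrases the criterion described in the text; all class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeakValleyPair:
    peak: float    # a_p: peak acceleration magnitude
    valley: float  # a_v: valley acceleration magnitude


def pseudo_zero_crossing(pair: PeakValleyPair) -> float:
    """Average of the peak and valley magnitudes (level where a step starts or ends)."""
    return 0.5 * (pair.peak + pair.valley)


def peak_to_peak(pair: PeakValleyPair) -> float:
    """Peak/valley magnitude difference, as in Equation (10)."""
    return pair.peak - pair.valley


def pace_change(prev: PeakValleyPair, curr: PeakValleyPair, su_th: float) -> int:
    """Return +1 for speeding up, -1 for slowing down, 0 for a steady pace.

    Assumed rule for illustration: the change in peak-to-peak value is
    compared with su_th and with sd_th = -su_th.
    """
    change = peak_to_peak(curr) - peak_to_peak(prev)
    if change > su_th:
        return 1
    if change < -su_th:
        return -1
    return 0
```

For example, a pair with peak 13.0 and valley 8.0 has a pseudo zero crossing of 10.5 and a peak-to-peak value of 5.0; if the previous pair had a peak-to-peak value of 2.0 and the threshold is 1.0, the sketch reports a speed-up.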

Gyroscope Fusion
In some use cases, the angular rates generated have a dominant repetitive pattern in one of the gyroscope axes. The signal pattern is similar to that of the acceleration norm but at half its frequency. The acceleration norm represents the full motion of the platform (the pedestrian in this case), hence capturing the patterns from both the right and left legs. The gyroscope behaves differently: if the device is placed in a shirt pocket by the torso, it captures the sway motion of the torso, while, if it is placed in a pants pocket, whether a side pocket or a rear pocket, it senses the motion of the leg it is attached to. In both cases, the generated signal repeats once per full stride, which is equivalent to two steps. If the phone is handheld, under regular motion conditions the arm swings forward when the opposite leg moves forward and swings backward when the leg on the same side moves forward. Hence, one cycle of arm motion represents two steps being taken. In all of these cases, the generated signal has a frequency that is half that of the acceleration.
First, a dominant axis of rotational motion is determined using the variance of the signal. Then, a periodicity check is applied. If a periodic signal is found, it is used for step detection along with the acceleration: for each peak or valley detected in the gyroscope signal, there should exist a peak in the acceleration norm. If high angular rates exist without periodicity, the device is undergoing irregular motion that does not match walking behavior. In such a case, the gyroscope measurements are discarded and not used for detection.
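A minimal sketch of these two checks follows. The dominant axis is chosen by variance as described above; the periodicity test, however, is not specified in the text, so a normalized autocorrelation test with a hypothetical threshold of 0.6 is assumed here purely for illustration. Function names and parameters are also hypothetical.

```python
from statistics import pvariance


def dominant_axis(window):
    """Index (0, 1, 2) of the axis with the largest variance.

    window: list of (x, y, z) angular-rate samples.
    """
    variances = [pvariance([sample[k] for sample in window]) for k in range(3)]
    return max(range(3), key=lambda k: variances[k])


def is_periodic(signal, min_lag, max_lag, threshold=0.6):
    """Assumed periodicity test: the normalized autocorrelation must reach
    `threshold` at some lag between min_lag and max_lag (in samples)."""
    n = len(signal)
    mean = sum(signal) / n
    x = [v - mean for v in signal]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return False  # a constant signal carries no periodic pattern
    best = max(
        sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        for lag in range(min_lag, max_lag + 1)
    )
    return best >= threshold
```

The lag range would normally be derived from the plausible stride durations at the sensor sampling rate; a signal that fails the test would be ignored for detection, in line with the rule described above.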
When the periodicity condition holds, peak detection is applied to the dominant axis of motion in the gyroscope measurements, where, for each detected peak and valley pair in the gyroscope, there are two corresponding peaks in the acceleration norm with a valley in between. The peak matching uses a time window threshold to verify the validity of the peak. Figure 10 shows the angular rates of a regular swinging motion while walking, where it can be seen that the z-axis measurements are periodic with distinguishable peaks and valleys. The matching process is then elaborated in Figure 11, where, for each peak in the acceleration norm, there exists a peak/valley match in the extracted dominant angular rate signal.
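The time-window matching can be sketched as below. This is an illustrative reading of the rule, assuming a symmetric window around each gyroscope extremum and nearest-neighbor selection; the function name and the window value used in the example are hypothetical.

```python
def match_peaks(accel_peak_times, gyro_extrema_times, window):
    """Pair each gyroscope peak/valley with the nearest acceleration-norm peak
    that lies within +/- `window` seconds; unmatched extrema are dropped."""
    matches = []
    for t_gyro in gyro_extrema_times:
        candidates = [t for t in accel_peak_times if abs(t - t_gyro) <= window]
        if candidates:
            nearest = min(candidates, key=lambda t: abs(t - t_gyro))
            matches.append((t_gyro, nearest))
    return matches
```

With acceleration peaks at 0.50, 1.02, 1.49 and 2.30 s, gyroscope extrema at 0.52, 1.00, 1.50 and 1.95 s, and a 0.1 s window, the first three extrema are matched and the last one is left unmatched.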

Magnetometer Fusion
The surrounding magnetic field sensed by the magnetometer over a period is expected to be consistent if no interference occurs. The magnetic vector has previously been used for heading estimation and for positioning using magnetic map matching. In this paper, the magnetic vector sensed by the magnetometer is used for step detection.
The hypothesis is that the magnetic field does not change abruptly within a step. Hence, any change in the magnetic intensity measured by the magnetometer represents a change in the orientation of the device with respect to the surrounding magnetic field.
In a dangling use case of the phone, the orientation of the magnetometer axes changes with respect to the surrounding magnetic field, resulting in a periodic signal in one or more axes of the magnetometer. This signal is similar to that generated in the gyroscope measurements and has the same property: its frequency is half that of the acceleration norm during walking. For this approach to be beneficial for step detection, the magnetic change rate induced by the change of the device pose must be of a higher order than the interference from surrounding sources.
As illustrated in Figure 12, the magnetic field norm is computed based on Equation (13). As the magnetic field remains nearly constant based on Equation (14), the changes in the measured components are due to changes in the orientation of the sensor frame with respect to the vector, as shown in Equation (15). Each axis of the sensor measures a component of the magnetic field vector that depends on the angle between the vector and that axis through its cosine. As the frame orientation changes during the motion of the user, each component varies in a repetitive form. Figure 13 shows the variations in the magnetic components in 3-D magnetometer measurements while the net magnitude remains nearly constant. Figure 14 shows how the variations in the magnetic measurements coincide with the acceleration norm induced by the walking motion.

$$m_{x_k} = M_k \cos(\theta_{x_k}), \qquad m_{y_k} = M_k \cos(\theta_{y_k}), \qquad m_{z_k} = M_k \cos(\theta_{z_k}) \tag{15}$$
where: $\theta_{x_k}$, $\theta_{y_k}$, $\theta_{z_k}$ are the angles from the magnetic field vector to each of the body frame axes at time $k$; $m_x$, $m_y$, $m_z$ are the magnetic intensities in the body frame; and $M_k$ is the magnetic norm at time $k$.
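The relation in Equation (15) can be checked numerically with a short sketch. It assumes the magnetic norm is the Euclidean norm of the three body-frame components and models a device swing as a pure rotation about the body x-axis; both the helper names and the field values in the example are hypothetical.

```python
import math


def magnetic_norm(m):
    """Euclidean norm of the body-frame magnetic components (assumed form of the norm)."""
    return math.sqrt(sum(c * c for c in m))


def axis_angles(m):
    """Angles between the field vector and each body axis, from m_i = M cos(theta_i)."""
    norm = magnetic_norm(m)
    # Clamp to [-1, 1] to guard acos against floating-point rounding.
    return tuple(math.acos(max(-1.0, min(1.0, c / norm))) for c in m)


def rotate_about_x(m, angle):
    """Components of the same field seen after the sensor frame rotates about its x-axis."""
    x, y, z = m
    c, s = math.cos(angle), math.sin(angle)
    return (x, c * y - s * z, s * y + c * z)
```

Rotating a field of (20.0, 0.0, 45.0) by 30 degrees changes the y and z components while the norm stays the same, which is the behavior the step detector exploits.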
The peak matching for the peaks detected in the magnetometer follows the same rules as for the angular rates. A magnetic peak should fall in the same window as that of the acceleration norm and angular rates. The peak matching is also used to indicate magnetic perturbations: if an angular rate peak is found while the magnetic changes are high but no coinciding magnetic peak exists, the change is recognized as magnetic interference.

Step Validation
A step validation method is needed to verify whether a detected sequence is an actual step or merely mimics one: in some use-cases, sequences of peak/valley pairs can be induced in the smartphone acceleration signals even though the platform is not in motion. The validation in this algorithm is a simple double integration of the acceleration signals, after the removal of the gravity vector, to obtain the displacement corresponding to the detected pattern. Gravity can be compensated for by transforming the measurements from the sensor frame to the Local Level Frame (LLF). The algorithm proposed in [47] was used for tracking the orientation of the device in order to obtain the measurements in the LLF, where gravity acts entirely along the vertical axis, so that it can be separated and only the linear accelerations remain. The resulting displacement is then compared to a significant motion threshold that is adaptively tuned based on the user's previous steps. If the displacement is greater than the threshold, the step is valid and is counted; otherwise, it is considered a false positive and removed. Equations (16)-(18) represent the displacement of the current step, the step threshold computation, and the condition for accepting a step. The step displacement is the double integration of the linear acceleration within the detected step period, denoted by start (s) and end (e). The step threshold is computed as 0.6 of the average of the previous k steps.
The step window k is set to 3 to keep information about recent step sizes while remaining able to adapt to changes in walking speed. As the window size increases, the capability of the algorithm to cope with changing walking speed degrades at transition points with fast speed changes. On the other hand, if the window is too small, there is not enough information to represent the motion speed when the user is walking at a nearly steady pace.
The 0.6 factor compensates for the sudden drop in step length when transitioning from running to walking, while still rejecting the displacement obtained by integrating accelerations from arm motion in static mode.
where: $d_n$ is the nth step displacement; $la_{norm}$ is the linear acceleration norm; and $th_s$ is the step displacement threshold.
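The validation rule described above can be sketched as follows. This is not the original formulation of Equations (16)-(18); it assumes rectangular-rule integration with zero initial velocity at the start of each step, an initial threshold of zero before any step has been accepted, and a history that stores only accepted steps. The class and function names are hypothetical.

```python
from collections import deque


def step_displacement(linear_acc, dt):
    """Double-integrate linear acceleration samples (gravity already removed)
    over one detected step, assuming zero velocity at the step start."""
    velocity = 0.0
    displacement = 0.0
    for a in linear_acc:
        velocity += a * dt
        displacement += velocity * dt
    return abs(displacement)


class StepValidator:
    """Accept a step when its displacement exceeds factor * mean of the last k accepted steps."""

    def __init__(self, k=3, factor=0.6, initial_threshold=0.0):
        self.history = deque(maxlen=k)  # displacements of the last k accepted steps
        self.factor = factor
        self.initial_threshold = initial_threshold  # assumption: used before any step is accepted

    def threshold(self):
        if not self.history:
            return self.initial_threshold
        return self.factor * sum(self.history) / len(self.history)

    def validate(self, displacement):
        accepted = displacement > self.threshold()
        if accepted:
            self.history.append(displacement)
        return accepted
```

After three accepted steps of 0.70, 0.72 and 0.68 m, the threshold is about 0.42 m, so a candidate of 0.30 m is rejected while one of 0.50 m is accepted.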

Proposed Step Detection Algorithm
The proposed step detection and counting technique applies a sequence of algorithms to detect the steps taken by a user and verify them. The algorithm processes the sensor data in real time with only a one-epoch delay for detecting the peaks and valleys in the signal. Table 1 shows the notations of the variables used in the algorithm. Algorithms 1-6 show the main modules of the proposed methodology, while Algorithm 7 shows the main operation framework of the system.

Table 1. Notations of the variables used in the algorithm.
$\omega(n)$: A window of size n of the angular rates vector
$m(n)$: A window of size n of the magnetic intensity vector
$\sigma_{\omega_k}$: Variance for angular rates of axis k
$\sigma_{m_k}$: Variance for magnetic intensities of axis k
$dr$: Dominance rank
$t_{a_{peak}}$: Time of latest acceleration peak
$t_{a_{valley}}$: Time of latest acceleration valley
$t_{\omega_{peak}}$: Time of latest angular rate peak/valley
$t_{m_{peak}}$: Time of latest magnetic intensity peak/valley
$t_{th_{peak}}$: Time threshold for peak matching
$t_{a/r}$: Time threshold for end of rejection zone and beginning of candidate detection
$th_u$: Time threshold for updating acceleration peak/valley
$th_s$: Significant displacement threshold for step validation

If S_{n−1} == S_n
    If S_n = 1 && (t_apeak_k − t_apeak_{k−1} < th_u) && a_peak_k > a_peak_{k−1}
        Return(update-peak)
    Elseif S_n = −1 && (t_avalley_k − t_avalley_{k−1} < th_u) && a_valley_k < a_valley_{k−1}
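The peak update condition in the pseudocode fragment above can be expressed as a small predicate. The sketch mirrors the two visible conditions; treating the valley branch as symmetric to the peak branch is an assumption, since the fragment ends before its result is stated. The function name and argument order are hypothetical.

```python
def should_update_extremum(sign_prev, sign_curr, t_prev, t_curr, a_prev, a_curr, th_u):
    """True when a new candidate of the same type should replace the stored one.

    sign is +1 for a peak candidate and -1 for a valley candidate.
    """
    if sign_prev != sign_curr:
        return False  # alternating candidates are handled elsewhere
    if t_curr - t_prev >= th_u:
        return False  # too far apart in time to be the same extremum
    if sign_curr == 1:
        return a_curr > a_prev  # a higher peak replaces the stored peak
    if sign_curr == -1:
        return a_curr < a_prev  # assumed: a lower valley replaces the stored valley
    return False
```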
Step detection framework
Initiate adaptive filter frequencies and coefficients
Repeat for each sample:
    --- peak extraction and update section ---
    If in motion
        Filter measurements
        S ← Detect candidate
        If S = 1
            If S_{k−1} = −1 && time_since_valley > t_a/r
                Valid peak detected
            Elseif S_{k−1} = S
                ∆t = t_S_k − t_S_{k−1}
                a_peak_k = peak update(th_u, S_{k−1}, S_k)
                If ∆t ≅ step duration && significant motion detected
                    Account for valley miss
                Endif
            Endif
        Elseif S = −1
            If S_{k−1} = 1 && time_since_peak > t_a/r
                Valid valley detected
            Elseif S_{k−1} = S
                ∆t = t_S_k − t_S_{k−1}
                a_valley_k = peak update(th_u, S_{k−1}, S_k)
                If ∆t ≅ step duration && significant motion detected
                    Account for peak miss
                Endif
            Endif
        Endif
    --- peak validation from gyroscope and magnetometer ---
    dominant axis extraction(ω(n), m(n))
    If dr > 0
        S_ω, S_m ← Detect candidate(ω(n), m(n))
        If S_ω = 1 || S_ω = −1
            peak matching(t_apeak, t_ωpeak, t_mpeak)
            validate matching peaks
        Endif
    Endif
    --- step validation ---
    step validation(A_sk)
    adaptive filter frequency selector (LPD)

Testing
To test the proposed algorithm, datasets were collected with two smartphones, namely the HTC m9 and the iPhone 6. The test scenarios explained in the following subsection were carried out on both devices by multiple users. The algorithm was implemented on the Android device, which uses a Qualcomm Snapdragon 810 [48] with a clock speed of up to 2.0 GHz. The SensorLog app [49] was used on the iPhone to log data, which were then processed sequentially to emulate the real-time scenario. The Android version operated in the background without degrading the user experience. As can be seen from the presented pseudocode, the algorithm does not involve extensive computations that would require high resources.

Experimental Setup
A group of ten users contributed to the data collection for the algorithm testing. The group was composed of five males and five females aged 21-34. Each test subject carried out a walking test of 100 steps for each of six phone use-cases and four walking modes, for a total of 24 tests per user. Table 2 shows the tests carried out by the test subjects.

In addition to the data presented in the previous table, six tests were carried out, where, in each test, the user walked 383-594 steps while changing the phone orientation and use-case, varying the walking speed, switching between walking and running, and climbing stairs. In those three tests, step counts provided by two wrist fitness bands were sampled to be compared with the proposed implementation. The wristbands used were the Fitbit Flex2 and the Xiaomi Mi Band 2. The bands were mounted on the right wrist while the smartphone was held in the right hand, hence all of the devices experienced nearly the same signals, except when the phone was placed in the pocket.
Table 3 shows the average detection success of the proposed algorithm for each combination of step mode and device pose carried out by the test subjects. The walking pace ranged between 1.5 and 2.3 steps per second for regular walking, between 0.7 and 1.3 for slow walking, and between 2.5 and 4 for running. As seen, the lowest performance recorded across step modes was in the slow walking case; this occurs because motion noise, such as hand shaking and imbalance in motion, becomes more dominant in slow step modes. As the speed of motion increases, the ratio of the motion noise to the motion signal itself decreases and the noise is filtered out easily. The effect of the motion noise was highest in texting mode, where the patterns generated by tapping on the screen were dominant and caused many fake peaks. The best reported performances were in the case of regular walking with compassing, pocket and phoning, which was due to the relatively static pose of the phone, which made the motion signal dominant in comparison to the device motion.
Table 3. Average accuracy of the proposed algorithm.

Overall, the performance of the algorithm can be judged based on the final use-case criterion, which is free motion. In this test, the test subjects were asked to use the phone in a combination of different poses while walking non-stop. The most important test case is the combination of mixed step mode with a free-moving device. The reported accuracy is 99.6%, which means that, out of a total of 2000 steps taken by all the test subjects, only 8 steps were missed.
The remaining six tests were not pre-planned. For each test, the user walked around for three to four minutes, switching between device poses and step modes. Table 4 shows the performance evaluation of the proposed algorithm in comparison to the Fitbit Flex2 and Xiaomi Mi Band 2. Two of the environments where the tests were carried out are shown in Figures 15 and 16. As shown, the tests took place in both indoor and outdoor environments to test the stability of the algorithm.
For elaboration, the signals from Test 1 are shown in Figures 17 and 18. The filtering successfully adapted to the speed of motion, where at some point the user was running at a pace of 5 Hz. If the signal had been filtered with a constant cut-off frequency, it would have been either under-filtered, suffering from undesired motion noise, or over-filtered, in which case the magnitudes of the acceleration during the running period would have been drastically reduced, causing the detection to fail.
The extraction of the dominant axis of motion for both the magnetometer and the gyroscope after filtering was also successful. In the case of the magnetometer, there were two dominant axes of motion and the algorithm switched between them based on the variance of the signal. It can be deduced from the figure that the user switched between dangling the phone at times and holding it nearly steady at others. The change of the device orientation and of the step mode taken by the user affected neither the extraction nor the peak detection, hence leading to step detection with high accuracy.
Results shown in Table 4 indicate a higher performance by the proposed algorithm in comparison to the two fitness bands used. A minimum accuracy of 99.38% and a maximum of 99.66% are reported, with an average of 99.47%. In all tests, the proposed algorithm using the smartphone outperformed both bands in accuracy.
The reported results in Tables 3 and 4 are compensated for false positives. In some cases in the tests, false steps were counted that should have been filtered. For each false positive detected through manual analysis of the datasets, a step was subtracted from the resulting count for that test to obtain the actual accuracy of detection. The case with the most reported false positives was texting in the slow walking step mode, with 13 false positives. The number of false positives detected was 19 steps out of 111 misdetections through the 50,836 total steps taken in the experiment. Thus, 17.12% of the miscounts were false steps, and 11.71% of the miscounts occurred in the specific combination of the texting use-case and the slow walking step mode.
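The percentages quoted for the results can be reproduced with a few lines of arithmetic. The compensation function below is one plausible reading of the procedure described in the text (false positives are subtracted from the counted steps before dividing by the true step count); its name and signature are hypothetical.

```python
def percentage(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


def compensated_accuracy(true_steps, counted_steps, false_positives):
    """Accuracy after removing false positives from the reported count (assumed reading)."""
    correctly_detected = counted_steps - false_positives
    return percentage(correctly_detected, true_steps)


# 8 missed steps out of 2000 in the free-motion test
free_motion = round(compensated_accuracy(2000, 2000 - 8, 0), 1)
# 19 false positives among 111 misdetections
false_step_share = round(percentage(19, 111), 2)
# 13 of the 111 misdetections came from texting while walking slowly
texting_share = round(percentage(13, 111), 2)
```

These evaluate to 99.6, 17.12 and 11.71 respectively, matching the reported figures.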

Conclusions
A novel step detection algorithm using smartphones was proposed for detecting steps while being invariant to the device pose, use-case, and step mode of the user. An adaptive low-pass filter that continuously tunes the cut-off frequency was proposed; the filter successfully selects the appropriate cut-off frequency, reducing the motion noise in the signal while preserving crucial walking information. Fusion of information from the angular rates and magnetic intensity with the acceleration norm was proposed for peak detection verification in cases of high angular rates with periodicity. All the thresholds used for detection are adaptively computed during operation to cope with step mode variations and are independent of user-specific information and behavior. The proposed algorithm shows high versatility, with a maximum accuracy of 100% in some cases of fixed device pose and a minimum of 99.1% in the case of slow walking while texting, due to noise induced by screen tapping. The reported average accuracy is 99.6% for combined step modes with low dynamic phone pose change over time, and 99.47% for high dynamic changes. The proposed algorithm outperformed two fitness bands available on the market in accuracy while not requiring any extra hardware, such as a wrist-mounted fitness band. This algorithm provides a convenient means of self-assessing activity levels while only requiring the installation of an app on a user's smartphone, and can also improve the accuracy of a PDR system by accurately detecting the steps taken.