Article

Learning the Orientation of a Loosely-Fixed Wearable IMU Relative to the Body Improves the Recognition Rate of Human Postures and Activities

by Michael B. Del Rosario 1, Nigel H. Lovell 1 and Stephen J. Redmond 1,2,3,*
1 Graduate School of Biomedical Engineering, UNSW, Sydney, NSW 2052, Australia
2 UCD School of Electrical and Electronic Engineering, University College Dublin, Belfield, Dublin 4, Ireland
3 UCD Centre for Biomedical Engineering, University College Dublin, Belfield, Dublin 4, Ireland
* Author to whom correspondence should be addressed.
Sensors 2019, 19(13), 2845; https://doi.org/10.3390/s19132845
Submission received: 14 May 2019 / Revised: 18 June 2019 / Accepted: 21 June 2019 / Published: 26 June 2019
(This article belongs to the Special Issue Wearable and Nearable Biosensors and Systems for Healthcare)

Abstract:
Features were developed which accounted for the changing orientation of the inertial measurement unit (IMU) relative to the body, and demonstrably improved the performance of models for human activity recognition (HAR). The method is proficient at separating periods of standing and sedentary activity (i.e., sitting and/or lying) using only one IMU, even if it is arbitrarily oriented or subsequently re-oriented relative to the body; since the body is upright during walking, learning the IMU orientation during walking provides a reference orientation against which sitting and/or lying can be inferred. Thus, the two activities can be identified (irrespective of the cohort) by analyzing the magnitude of the angle of shortest rotation which would be required to bring the upright direction into coincidence with the average orientation from the most recent 2.5 s of IMU data. Models for HAR were trained using data obtained from a cohort of 37 older adults (83.9 ± 3.4 years) or 20 younger adults (21.9 ± 1.7 years). Test data were generated from the training data by virtually re-orienting the IMU so that it is representative of carrying the phone in five different orientations (relative to the thigh). The overall performance of the model for HAR was consistent whether the model was trained with the data from the younger cohort, and tested with the data from the older cohort after it had been virtually re-oriented (Cohen’s Kappa 95% confidence interval [0.782, 0.793]; total class sensitivity 95% confidence interval [84.9%, 85.6%]), or the reciprocal scenario in which the model was trained with the data from the older cohort, and tested with the data from the younger cohort after it had been virtually re-oriented (Cohen’s Kappa 95% confidence interval [0.765, 0.784]; total class sensitivity 95% confidence interval [82.3%, 83.7%]).

1. Introduction

Wearable movement sensors, i.e., sensors that incorporate inertial measurement units (IMUs) and barometric altimeters, have been championed as tools that will positively impact health care [1]. These technologies have demonstrated their utility in the remote monitoring of patient rehabilitation [2], as well as the clinical analysis of gait [3], from which parameters can be extracted to predict falls in the elderly [4,5]. They have come to prominence in the management of Parkinson’s disease [6], objectively quantifying patient tremor [7], and tracking the impact of the disease on their gait [8]. Moreover, wearables have been adopted for the longitudinal monitoring of physical activity, which can be used to identify those at risk of developing type-2 diabetes [9] and obesity [10].

1.1. Multiple Sensors or a Single Sensor?

The number of sensors that an individual needs to wear for adequate human activity recognition (HAR) is dependent on three factors: (i) the number of activities to be recognized, (ii) the location of the sensor(s) on the body, and (iii) the nature of the sensors (i.e., some arbitrary combination of accelerometers, gyroscopes, barometric pressure sensors, magnetometers, etc.). If the activities to be recognized involve the movement of each of the body’s limbs (e.g., lunges, push ups, hand stands, etc.), multiple sensors may need to be worn on the body at specific locations to obtain measurements that allow the movement to be accurately identified. Wearing a single sensor on the body is sufficient if the model for HAR is only identifying gross body movements (e.g., standing, sitting, walking, running); in this case, placing the sensor near the body’s center of mass is ideal (i.e., on the thigh) [11].
In either case, wearing sensors at different locations on the body will increase the performance of a model for HAR [12], but at the expense of user compliance and burden [13], particularly if the sensor(s) must be placed somewhere uncomfortable or unsightly [14]. Consequently, wearable sensor systems for population-based studies are predominantly of the single-sensor variety [15,16], ideally integrating seamlessly into the daily lives of users (e.g., embedded within a watch, necklace, sock or belt) [17].

1.2. Smartphone-Based Human Activity Recognition

The dramatic recent increase in smartphone ownership [18,19] coupled with society’s dependence on smartphones [20,21] has changed the paradigm. If individuals carry their device with them, the measurements from the smartphone’s IMU and barometric altimeter can be analyzed to identify the users’ gross body movements throughout the day. As a result, smartphones can be used as tools for the purposes of physical rehabilitation, weight loss, etc., in which the ability to recognize human activity is essential [22]. Finally, the greater penetration of smartphones amongst those of a lower socioeconomic status [23] would enable population-based interventions to be conducted at a reduced cost and with a wider reach. There are different models for HAR which can be adopted [24].

1.2.1. Fixed-to-the-Body

In this scenario, the IMUs embedded within smartphones are used in place of dedicated IMU devices to: (i) detect falls [25], (ii) monitor activities of daily living [26], (iii) monitor the performance of soccer and field hockey athletes [27], and (iv) swimmers [28]. These models assume that the smartphone will be worn at one location on the body and that the device’s orientation relative to the body segment on which it is worn is known a priori and does not change during the monitoring period because it is held in place with a strap or similar apparatus.

1.2.2. Body-Position-Dependent

Conversely, models can be designed under the assumption that the smartphone is not strapped to the body and will be placed in either the user’s pants/chest pocket [29], or hand or bag [30]. These models do not require the user’s smartphone habits to change (as with those in Section 1.2.1) to accommodate the device being fixed to the body; however, this makes it challenging to infer the user’s postural orientation due to the variability with which the sensor can be oriented in the pocket with respect to the body (i.e., the initial orientation of the device relative to the body segment on which it is worn cannot be controlled, and the orientation of the device can vary over time since it is not firmly fixed to the body).

1.2.3. Body-Position-Independent

The final variant relaxes all constraints with respect to the smartphone’s position and orientation on the body. Models are robust to device transitions from hand to pants/chest pocket [31], or bag, at any moment [32,33]. A trade-off for this robustness (compared to those discussed in Section 1.2.1 or Section 1.2.2) is that it can be difficult to determine body posture due to variability in both the device’s location and its relative orientation to the body.

1.3. Extracting Information for Activity Recognition

There are two broad supervised learning approaches which have emerged to process sensor measurements for HAR: (i) feature engineering and classification, or (ii) deep learning.

1.3.1. Accounting for Variability in Device Orientation and Position

A number of pre-processing techniques have been proposed to reduce the variability in sensor measurements due to the inconsistency of the location and/or orientation with which the device is placed on the body. Khan et al. demonstrated that linear discriminant analysis (LDA) can be used to improve a classification algorithm’s ability to distinguish between transitions from sitting to standing (and vice versa), and standing to lying (and vice versa) [34]. They also illustrated how kernel discriminant analysis (KDA) can estimate both the interclass and intraclass variance of features used to separate periods of walking, running, walking upstairs, and walking downstairs [35]. Henpraserttae et al. applied eigenvalue decomposition to tri-axial accelerometer measurements to infer the device’s orientation with respect to the body by assuming that most of the acceleration due to body movement is in the forward direction, and that the vertical axis can be inferred from the low-pass filtered acceleration [36].
Yurtman and Barshan proposed another transformation based on singular value decomposition (SVD). They first pre-processed the data from a tri-axial accelerometer, tri-axial gyroscope, and tri-axial magnetometer so that it had unit variance, before SVD was applied to the entire time sequence to make the sensor measurements agnostic to the device’s orientation [37]. Yurtman et al. followed their seminal work with another method which combined the measurements from the accelerometer, gyroscope, and magnetometer to estimate the sensor’s orientation within the global frame of reference when it is firmly fixed to the body. The differential quaternions they generated, which estimated the relative change in the device’s orientation between time intervals, enabled the raw sensor measurements to be expressed in a reference frame invariant to the sensor’s orientation [38].

1.3.2. Feature Engineering and Classification

Feature engineering involves the application of domain knowledge to design hand-crafted features [39,40] which describe the changing characteristics of the data with respect to time. These features and labels (i.e., the human activities to be recognized which are temporally aligned with the feature values) are input to a classification algorithm (e.g., decision tree, support vector machine, Naïve Bayes, artificial neural networks, etc.) which tries to derive the best mathematical model that separates the labeled observations, based on the statistical distributions of those features.

1.3.3. Deep Learning/Deep Neural Networks

Alternatively, domain knowledge can be replaced with a standalone artificial intelligence solution that abstracts the entire feature extraction and classification process. Deep learning approaches are a natural extension of artificial neural networks, comprised of numerous neurons and layers, which attempt to learn both the best features and the model for HAR by using the training labels to determine the value of the neurons’ weights at each layer in the network [41]. The performance of these approaches is dependent on the network’s architecture and the quality and quantity of the training data. While convolutional neural networks [42], short-time Fourier transforms combined with temporal convolution layers [43], long/short term memory (LSTM) networks [44], or a combination of convolution, recurrent, and LSTM network layers [45], have all been shown to perform exceptionally well, they incur a considerable energy cost when running on a smartphone due to the demands of real-time processing [46].

1.4. Contribution

This paper addresses the limitations associated with methods for inferring postural orientation that are dependent on the sensor’s precise anatomical placement [47,48] by presenting a novel method (i.e., a hand-crafted feature) for identifying sedentary periods of activity that is robust to variability in the sensor’s orientation. The sensor’s orientation during walking periods is learned on-line and used as a reference for the upright body orientation (represented by the quaternion q upright ). Comparing the sensor’s recent average orientation (over a sliding window) to q upright enables standing and sedentary periods to be distinguished, regardless of the IMU’s orientation. It is important to distinguish between standing and sedentary activities due to their differing energy expenditure profiles [49,50]. Furthermore, there are definitive relationships between total sedentary time per day and: type-2 diabetes [51,52]; cardiovascular mortality; all-cause mortality [53,54]; and even cancer [55].

2. Materials

Wearable sensor data from our previous work [56], in which a cohort of 20 younger adults (15 male and 5 female) aged 21.9 ± 1.7 years (Human Research Ethics Advisory, reference number 08/2012/48) and 37 older adults (25 male and 12 female) aged 83.9 ± 3.4 years (Human Research Ethics Committee, reference number HC12316) performed nine human activities whilst a smartphone was placed in their pants pocket, were used to evaluate the method proposed herein. The younger adults were able-bodied university students recruited from the University of New South Wales, Sydney, Australia. The older adults were recruited from a cohort of participants enrolled in an existing study on memory and aging at Neuroscience Research Australia (NeuRA), Sydney, Australia. These participants were community-dwelling and retirement village residents living in inner and eastern Sydney; aged 65+ years; English-speaking; with a mini-mental state examination (MMSE) score of 24 or above; no acute psychiatric condition with psychosis or unstable medical condition; and not currently participating in a fall prevention trial.
Sensor data from the IMU and barometric altimeter were originally sampled at f_IMU = 100 Hz and f_bar = 16 Hz, respectively. The measurements from the IMU and altimeter were also re-sampled at 40 Hz and 20 Hz, respectively, to demonstrate the method’s ability to be adapted for a reduced sampling rate, thereby reducing the prospective power consumption of the algorithm. This is important because the usability of wearable sensors increases if they can operate continuously throughout the waking day [24,57].
Periods of human activity, originally labeled as elevator up/down, were relabeled as standing to focus on the clinical relevance of the activity rather than the wider context of the person being in an elevator; this naturally increased the classification performance by reducing the range of activities being classified. Additionally, sitting and lying were collectively re-labeled as sedentary. Consequently, the nine activities described in [56] were reduced to six: sedentary, standing, walking, walking upstairs (WU), walking downstairs (WD) and postural transitions (PT).

3. Methods

Note in the sections that follow: (i) quaternion multiplication (⊗) and conjugation (*) are defined in [58]; (ii) vectors are bold-faced (e.g., b); (iii) quaternions are bold-faced, italicized and normalized unless explicitly stated (i.e., q = q/‖q‖); (iv) vectors expressed in the sensor frame, or estimated in the global frame of reference, are denoted with the superscripts s and g (e.g., ^s b and ^g b), respectively; (v) a function is denoted as f(…) with its arguments inside the brackets.

3.1. Generating Data Representative of Different Orientations

Each quaternion in Figure 1a–f was used to transform the accelerometer and gyroscope data ( r acc and r gyr , respectively) collected in our previous work [56], into new accelerometer and gyroscope data ( v acc and v gyr , respectively) that would have been obtained if the smartphone were re-oriented in the pants pocket (Equation (1)). Note: (i) { r acc , r gyr , v acc , v gyr } ∈ R 3 ; (ii) data from the barometric altimeter were not transformed as these scalar measurements are orientation invariant.
\[ \begin{bmatrix} 0 \\ v_x \\ v_y \\ v_z \end{bmatrix} = \boldsymbol{q} \otimes \begin{bmatrix} 0 \\ r_x \\ r_y \\ r_z \end{bmatrix} \otimes \boldsymbol{q}^{*} \quad (1) \]
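For illustration, a minimal Python sketch of the virtual re-orientation in Equation (1) is given below. It assumes quaternions in [w, x, y, z] order; the helper names (quat_multiply, quat_conjugate, reorient) and the example re-orientation quaternion are illustrative and are not part of the original implementation.

```python
import numpy as np

def quat_multiply(p, q):
    """Hamilton product of two quaternions in [w, x, y, z] order."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    ])

def quat_conjugate(q):
    """Conjugate of a quaternion [w, x, y, z]."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def reorient(r, q):
    """Rotate a 3-vector r by quaternion q, i.e., v = q (0, r) q* (Equation (1))."""
    v = quat_multiply(quat_multiply(q, np.concatenate(([0.0], r))), quat_conjugate(q))
    return v[1:]

# Example: virtually rotate one accelerometer sample by 90 degrees about the device x-axis
# (a hypothetical re-orientation, not necessarily one of the five used in the paper).
q_virtual = np.array([np.cos(np.pi/4), np.sin(np.pi/4), 0.0, 0.0])
r_acc = np.array([0.1, 9.7, 0.3])      # raw accelerometer sample (m/s^2)
v_acc = reorient(r_acc, q_virtual)     # the sample the re-oriented IMU would have measured
```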

3.2. Estimating the Orientation of the IMU

The data generated in Section 3.1 were fused using the adaptive error-state Kalman filter (AESKF) for orientation estimation, developed in our previous work [59], to estimate the device’s orientation. The tuning parameters of the AESKF algorithm are listed in Table 1. Note, whilst there are many algorithms that can be used to estimate the IMU’s orientation, the AESKF was chosen for its computational efficiency relative to other algorithms [59].

3.2.1. Removing the Heading from the Estimated Orientation

The estimated orientation, q AESKF , k , had an arbitrary heading that did not contain any information about the orientation of the IMU on the individual’s body, since the person can face in any direction and perform the same activity. Consequently, this was removed by aligning the orientation, q AESKF , k , with the north-facing component of the standard basis, e_x = [1 0 0]^T. First, the x basis vector of the quaternion, q AESKF , k , was identified and projected to the xy-plane (Equation (2)). Once x x y , k is determined, the quaternion, q north , k , that rotates the device orientation, q AESKF , k , northward can be calculated (Equations (4)–(6)). The resultant quaternion, q k , had a fixed heading, i.e., the yaw angle, ψ = 0 (see Figure 2b), which ensured that the shortest rotation between two quaternions (Section 4.2.2) did not contain any information about changes in the device’s heading which normally occur due to turning the body.
\[ \mathbf{x}_{xy,k} = \begin{bmatrix} q_0^2 + q_1^2 - q_2^2 - q_3^2 \\ 2(q_1 q_2 + q_0 q_3) \\ 0 \end{bmatrix}_k \quad (2) \]
\[ \theta_k = \cos^{-1}\!\left( \frac{\mathbf{x}_{xy,k} \cdot \mathbf{e}_x}{\lVert \mathbf{x}_{xy,k} \rVert} \right) \quad (3) \]
\[ \mathbf{n}_k = \mathbf{x}_{xy,k} \times \mathbf{e}_x \quad (4) \]
\[ \boldsymbol{q}_{\mathrm{north},k} = \begin{bmatrix} \cos(\theta/2) \\ \mathbf{n}\sin(\theta/2) \end{bmatrix}_k \quad (5) \]
\[ \boldsymbol{q}_k = \boldsymbol{q}_{\mathrm{north},k} \otimes \boldsymbol{q}_{\mathrm{AESKF},k} \quad (6) \]
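A sketch of this heading-removal step (Equations (2)–(6)) follows, reusing quat_multiply from the sketch in Section 3.1. The clipping and the normalization of the rotation axis are added here for numerical safety; they are assumptions of the sketch rather than steps stated in the text.

```python
import numpy as np

def remove_heading(q_aeskf):
    """Yaw-align an orientation quaternion [w, x, y, z] with north (Equations (2)-(6))."""
    q0, q1, q2, q3 = q_aeskf
    # Device x basis vector in the global frame, projected onto the x-y plane (Eq. (2))
    x_xy = np.array([q0**2 + q1**2 - q2**2 - q3**2, 2.0*(q1*q2 + q0*q3), 0.0])
    e_x = np.array([1.0, 0.0, 0.0])
    # Angle and axis of the rotation bringing x_xy onto north (Eqs. (3)-(4))
    theta = np.arccos(np.clip(np.dot(x_xy, e_x) / np.linalg.norm(x_xy), -1.0, 1.0))
    n = np.cross(x_xy, e_x)
    if np.linalg.norm(n) > 0.0:
        n = n / np.linalg.norm(n)
    # Heading-removal quaternion (Eq. (5)) applied to the AESKF estimate (Eq. (6))
    q_north = np.concatenate(([np.cos(theta/2.0)], np.sin(theta/2.0) * n))
    q_k = quat_multiply(q_north, q_aeskf)
    return q_k / np.linalg.norm(q_k)
```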

3.2.2. Smoothing the Estimated Orientation

The effects of the IMU shifting/re-orienting within the individual’s pants pocket as they move through the world were minimized by time-averaging the quaternion, q k , using a computationally-efficient one-pass method [60], to create a moving average (window size N) of the device orientation (Equation (7)) from the last 2.5 s worth of data, q ¯ k .
\[ \bar{\boldsymbol{q}}_k = f_{q,\mathrm{avg}}\left(\boldsymbol{q}_{k-N+1}, \ldots, \boldsymbol{q}_k\right), \quad (7) \]
see Appendix A Equation (A1).
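A minimal sketch of the quaternion averaging used for q̄_k (and later for q upright,k) is shown below. It is a direct, batch implementation of the component-wise mean of Appendix A, Equation (A1); the paper itself uses a computationally-efficient one-pass sliding-window version [60], so this sketch is illustrative only.

```python
import numpy as np

def quat_average(quats):
    """Component-wise quaternion mean (Appendix A, Eq. (A1)); quaternions in [w, x, y, z] order."""
    quats = np.asarray(quats, dtype=float).copy()
    # Negate quaternions whose scalar part is negative so that all lie in the same half of S^3
    quats[quats[:, 0] < 0] *= -1.0
    q_bar = quats.mean(axis=0)
    return q_bar / np.linalg.norm(q_bar)

# A 2.5 s moving average q_bar_k would call quat_average on the most recent
# N = round(2.5 * f_IMU) heading-free orientation estimates q_k.
```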

4. Feature Extraction

The features in Table 2 were aggregated (using a sliding window with 50% overlap) using sensor data from the most recent 2.5 s. Features (1)–(4) were obtained by processing the sensor measurements with finite impulse response (FIR) linear phase filters, as described in our previous work [56]. The novel features (5)–(8) are described in Sections 4.1–4.3.

4.1. Squared Magnitude of Pitch/Roll Angular Velocity

In our previous work [56], the three orthogonal gyroscope measurements were each band-pass filtered (between 1 and 20 Hz) to isolate the frequency components predominantly due to walking [61]. The squared magnitude of these three band-pass filtered signals at each time sample, ω bpf , k 2 , was used to distinguish between active/inactive periods of activity. Alternatively, the measurements can be expressed in the estimated global frame of reference (GFR) using the device orientation, q k . The squared magnitude of the pitch/roll rotation, g ω x y , k 2 , can be isolated using Equations (8) and (9) since rotations about the vertical axis are primarily due to turning.
\[ \begin{bmatrix} 0 \\ {}^{g}\omega_x \\ {}^{g}\omega_y \\ {}^{g}\omega_z \end{bmatrix}_k = \boldsymbol{q}_k \otimes \begin{bmatrix} 0 \\ {}^{s}\omega_x \\ {}^{s}\omega_y \\ {}^{s}\omega_z \end{bmatrix}_k \otimes (\boldsymbol{q}_k)^{*} \quad (8) \]
\[ {}^{g}\omega_{xy,k}^{2} = {}^{g}\omega_{x,k}^{2} + {}^{g}\omega_{y,k}^{2} \quad (9) \]
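The feature in Equations (8) and (9) can be sketched as follows, reusing the reorient helper from the Section 3.1 sketch; this is an illustrative per-sample implementation, not the authors’ code.

```python
def pitch_roll_rate_sq(omega_sensor, q_k):
    """Squared magnitude of the pitch/roll angular velocity in the estimated GFR (Eqs. (8)-(9))."""
    g_omega = reorient(omega_sensor, q_k)   # gyroscope sample (rad/s) rotated into the global frame
    return g_omega[0]**2 + g_omega[1]**2    # discard the yaw (z) component
```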

4.2. Detecting Sedentary Periods

The tilt angle, Θ tilt , k , was previously used [56] to identify the postural orientation of the body relative to the global frame of reference (GFR) [47,48]. In this previous formulation, the magnitude of Θ tilt , k is dependent on one of the axes of the IMU (the y-axis in Figure 3a) remaining in coincidence with the long axis of the thigh. The limitation of this constraint becomes apparent when the IMU is oriented such that another axis is aligned with the long axis of the thigh (Figure 3b), and it is also a problem if the orientation of the IMU shifts in the pocket. A new approach is proposed in Section 4.2.2 which compares the average recent orientation with the average orientation during walking periods (i.e., the sensor’s orientation during walking periods is continuously learned and used to define the ‘upright’ orientation, against which all other orientations are compared).

4.2.1. Estimate the Upright Orientation using the Orientation during Walking Periods

Walking periods were identified using the method proposed by Jiménez et al. [62] which analyzed the squared magnitude of the raw tri-axial gyroscope signal, s ω k 2 (see Equation (10)), and the magnitude of the unbiased sample variance, ς acc , k 2 (see Equation (12)), of the squared magnitude of the raw tri-axial accelerometer signal, s a k 2 (see Equation (11)). When both signals are greater than pre-determined thresholds, the individual carrying the IMU was presumed to be walking (see Equation (13)). Note, j = k − N + 1 in Equation (12), and ς acc , k 2 was calculated using a computationally-efficient method [60] with a sliding window of 0.25 s; i.e., N = ⌊0.25 × f_IMU⌋.
\[ {}^{s}\omega_k^{2} = \left({}^{s}\omega_x^{2} + {}^{s}\omega_y^{2} + {}^{s}\omega_z^{2}\right)_k \quad (10) \]
\[ {}^{s}a_k^{2} = \left({}^{s}a_x^{2} + {}^{s}a_y^{2} + {}^{s}a_z^{2}\right)_k \quad (11) \]
\[ \varsigma_{\mathrm{acc},k}^{2} = \frac{1}{N-1}\left[\, \sum_{i=j}^{k} \left({}^{s}a_i^{2}\right)^{2} - \frac{1}{N}\left( \sum_{i=j}^{k} {}^{s}a_i^{2} \right)^{2} \right] \quad (12) \]
\[ b_{\mathrm{walk},k} = \begin{cases} 1, & \left({}^{s}\omega_k^{2} > 5\ \mathrm{rad^2/s^2}\right) \wedge \left(\varsigma_{\mathrm{acc},k}^{2} > 10\ \mathrm{m^2/s^4}\right) \\ 0, & \text{otherwise} \end{cases} \quad (13) \]
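A sketch of the walking detector (Equations (10)–(13)) is given below, using the thresholds stated in the text; the loop-based sliding window is illustrative rather than the computationally-efficient one-pass implementation cited in the paper [60].

```python
import numpy as np

def detect_walking(gyr, acc, f_imu=100.0, gyr_thresh=5.0, var_thresh=10.0):
    """Per-sample walking flags b_walk (Eqs. (10)-(13)).

    gyr, acc : (n_samples, 3) arrays of raw gyroscope (rad/s) and accelerometer (m/s^2) data.
    """
    omega_sq = np.sum(gyr**2, axis=1)              # Eq. (10)
    a_sq = np.sum(acc**2, axis=1)                  # Eq. (11)
    N = int(np.floor(0.25 * f_imu))                # 0.25 s sliding window
    b_walk = np.zeros(len(a_sq), dtype=bool)
    for k in range(N - 1, len(a_sq)):
        var_acc = np.var(a_sq[k - N + 1:k + 1], ddof=1)   # unbiased sample variance (Eq. (12))
        b_walk[k] = (omega_sq[k] > gyr_thresh) and (var_acc > var_thresh)  # Eq. (13)
    return b_walk
```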
The scalar and vector components of each orientation, q k , which correspond to these walking periods are stored and used to calculate the ‘upright’ orientation, q upright , k (see Equation (14) and Appendix A). Note: the set of N sample indices used in Equation (14) corresponds to the most recent 2.5 s of data for which b walk , k = 1, and need not be a contiguous set of sample indices.
\[ \boldsymbol{q}_{\mathrm{upright},k} = f_{q,\mathrm{avg}}\left(\boldsymbol{q}_{(k-N+1)}, \ldots, \boldsymbol{q}_{k}\right), \quad (14) \]
see Appendix A Equation (A1).
The approach presented herein improves upon the method proposed by Elvira et al. because: (i) an orientation algorithm is utilized [63] whose estimated inclination angle is immune to magnetic interference [64,65]; (ii) an adaptive error-state Kalman filter (AESKF) is used which enables the estimated orientation to be quickly corrected during ‘quasi-static’ periods [59]; (iii) the estimated heading (i.e., the yaw component, ψ) is removed from the estimated orientation (the importance of which is demonstrated in Section 3.2.1); and (iv) most importantly, this method is believed to be the first to demonstrate the utility of the shortest rotation between two quaternions as a feature for HAR.

4.2.2. Calculate the Shortest Rotation between the Upright Orientation and the Average Recent Orientation

Once the average recent orientation, q ¯ k , and the upright orientation, q upright , k , are known, the magnitude of the shortest rotation between them (see Equation (15)) can be calculated (see derivation in Appendix B) and used to distinguish standing and sedentary (seated/lying) periods, regardless of the IMU’s orientation relative to the thigh.
\[ \vartheta_{\mathrm{tilt},k} = f_{\mathrm{angle}}\left(\boldsymbol{q}_{\mathrm{upright},k},\, \bar{\boldsymbol{q}}_k\right), \quad (15) \]
see Appendix B Equation (A2).
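Combining the pieces above, the tilt feature of Equations (14) and (15) can be sketched as follows; quat_multiply/quat_conjugate come from the Section 3.1 sketch and quat_average from the Section 3.2.2 sketch, and the function names are illustrative only.

```python
import numpy as np

def shortest_angle(q_a, q_b):
    """Magnitude of the shortest rotation between two unit quaternions (Appendix B, Eq. (A2))."""
    q_ab = quat_multiply(q_a, quat_conjugate(q_b))
    return 2.0 * np.arccos(np.clip(abs(q_ab[0]), 0.0, 1.0))

def tilt_feature(walking_quats, recent_quats):
    """theta_tilt,k: angle between the learned upright orientation and the recent
    average orientation (Eqs. (14)-(15))."""
    q_upright = quat_average(walking_quats)   # orientations stored while b_walk = 1
    q_bar = quat_average(recent_quats)        # most recent 2.5 s of heading-free orientations
    return shortest_angle(q_upright, q_bar)
```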

4.3. Estimating Velocity in the Vertical Direction of the GFR

The inertial acceleration in the sensor frame was obtained by measuring the magnitude of the Earth’s gravitational acceleration, y a , 0 , (i.e., the accelerometer measurement during a quasi-static period, where the accelerometer is not moving) and expressing this measurement in the sensor frame of reference as s z k using the accelerometer-corrected attitude, q k (see Equations (16) and (17)). The acceleration due to gravity, as measured in the sensor frame of reference, s g ref , k , can then be subtracted from the raw accelerometer measurement, s y a , k (Equation (18)), to obtain the inertial acceleration in the sensor frame, s d a , k .
This acceleration can be expressed in the estimated GFR, g d a , k = [ g d a , x , g d a , y , g d a , z ] k , using Equation (19). At this point, the sensor’s velocity in the vertical direction of the GFR can be estimated by fusing the vertical component of the acceleration, g d a , z , k , with the barometric pressure sensor measurements, p k , using a complementary filter [66] or Kalman filter [67]. Assuming the external acceleration, z ¨ k = g d a , z , k , remains constant over the sampling interval, T = 1/f_IMU, and the bandwidth of z ¨ k is less than f_IMU/2, the time-propagation of the altitude, z, and velocity, z ˙ , can be modeled [68] according to Equation (20):
\[ {}^{s}\mathbf{z}_k = \begin{bmatrix} 2(q_1 q_3 - q_0 q_2) \\ 2(q_2 q_3 + q_0 q_1) \\ 2(q_0)^2 - 1 + 2(q_3)^2 \end{bmatrix}_k \quad (16) \]
\[ {}^{s}\breve{\mathbf{g}}_{\mathrm{ref},k} = \lVert {}^{s}\mathbf{y}_{a,0} \rVert \cdot {}^{s}\mathbf{z}_k \quad (17) \]
\[ {}^{s}\mathbf{d}_{a,k} = {}^{s}\mathbf{y}_{a,k} - {}^{s}\breve{\mathbf{g}}_{\mathrm{ref},k} \quad (18) \]
\[ \begin{bmatrix} 0 \\ {}^{g}d_{a,x} \\ {}^{g}d_{a,y} \\ {}^{g}d_{a,z} \end{bmatrix}_k = \boldsymbol{q} \otimes \begin{bmatrix} 0 \\ {}^{s}d_{a,x} \\ {}^{s}d_{a,y} \\ {}^{s}d_{a,z} \end{bmatrix}_k \otimes \boldsymbol{q}^{*} \quad (19) \]
\[ \begin{bmatrix} z \\ \dot{z} \end{bmatrix}_k = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix}_k \begin{bmatrix} z \\ \dot{z} \end{bmatrix}_{k-1} + \begin{bmatrix} \tfrac{T^2}{2} \\ T \end{bmatrix}_k \ddot{z}_k \;\;\Longleftrightarrow\;\; \mathbf{x}_k = \mathbf{A}_k \mathbf{x}_{k-1} + \mathbf{G}_k u_k + \mathbf{w}_k \quad (20) \]

4.3.1. Process Model

Imperfections in Equation (20), i.e., acceleration not being constant during the sampling interval, and noise in the acceleration input to the system, u k , prevent the system’s true state, x , from being observed. Consequently, the system’s state can only be estimated as x ˘ k , by combining the process model with measurements obtained directly from the system. The ‘prediction step’ (i.e., Equations (22) and (23)) produces an a priori estimate of the system’s state, x ˘ k − , and covariance, P k − . Note: (i) Q k is the process noise covariance matrix; (ii) a_m/2 ≤ σ_acc ≤ a_m, where a_m is the magnitude of the maximum acceleration the system will experience [68].
\[ \breve{\mathbf{x}}_k^{-} = \mathbf{A}_k \breve{\mathbf{x}}_{k-1}^{+} + \mathbf{G}_k u_k \quad (21) \]
\[ \mathbf{P}_k^{-} = \mathbf{A}_k \mathbf{P}_{k-1}^{+} \mathbf{A}_k^{T} + \mathbf{Q}_k \quad (22) \]
\[ \mathbf{Q}_k = \mathbf{G}_k \mathbf{G}_k^{T} \sigma_{\mathrm{acc}}^{2} = \begin{bmatrix} \tfrac{1}{4}T^4 & \tfrac{1}{2}T^3 \\ \tfrac{1}{2}T^3 & T^2 \end{bmatrix} \sigma_{\mathrm{acc}}^{2} \quad (23) \]

4.3.2. Observation Model

The observation model (Equation (24)) transforms the state estimate, x ˘ k − , to the domain of the barometric pressure sensor p k (i.e., it converts altitude (in m) to air pressure (hPa) [69]), and enables the measurement residual, y ˜ k , to be calculated (Equation (26)). The measurement residual has a covariance S k that combines the covariance of the a priori state estimate, P k − , and the variance in the measurement from the barometric pressure sensor, R k = σ²_bar (Equation (27)); i.e., the variance in the barometric pressure when the device remains stationary. The gain in the filter, K k , can be determined by consolidating the covariances of the a priori state estimate and measurement residual (Equation (28)), thereby enabling the a posteriori state estimate, x ˘ k + , and covariance, P k + , to be determined as described in Equations (29) and (30). Note: (i) H k is the Jacobian of h ( x k ) , that is, derivatives with respect to the elements of the state vector x k , evaluated at the estimate x k = x ˘ k − ; (ii) I 2 is a 2 × 2 identity matrix.
\[ h(\mathbf{x}_k) = p_0 \left(1 - \frac{z_k}{44330.77}\right)^{5.26} \quad (24) \]
\[ \mathbf{H}_k = \begin{bmatrix} \dfrac{\partial h}{\partial z} & \dfrac{\partial h}{\partial \dot{z}} \end{bmatrix} \bigg|_{\mathbf{x} = \breve{\mathbf{x}}_k^{-}} \quad (25) \]
\[ \tilde{y}_k = p_k - h\!\left(\breve{\mathbf{x}}_k^{-}\right) \quad (26) \]
\[ \mathbf{S}_k = \mathbf{H}_k \mathbf{P}_k^{-} \mathbf{H}_k^{T} + R_k \quad (27) \]
\[ \mathbf{K}_k = \mathbf{P}_k^{-} \mathbf{H}_k^{T} \mathbf{S}_k^{-1} \quad (28) \]
\[ \breve{\mathbf{x}}_k^{+} = \breve{\mathbf{x}}_k^{-} + \mathbf{K}_k \tilde{y}_k \quad (29) \]
\[ \mathbf{P}_k^{+} = \left(\mathbf{I}_2 - \mathbf{K}_k \mathbf{H}_k\right) \mathbf{P}_k^{-} \quad (30) \]
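For illustration, a compact sketch of the altitude/velocity Kalman filter (Equations (20)–(30)) is given below. The reference pressure p0, the noise parameters, and the assumption that the barometer samples have been aligned to the IMU ticks are assumptions made for the sketch, not values reported in the paper.

```python
import numpy as np

def kf_vertical_velocity(acc_z_g, pressure, f_imu=100.0, p0=1013.25,
                         sigma_acc=1.0, sigma_bar=0.05):
    """Vertical velocity estimate from GFR vertical acceleration (m/s^2) fused with
    barometric pressure (hPa), per Eqs. (20)-(30)."""
    T = 1.0 / f_imu
    A = np.array([[1.0, T], [0.0, 1.0]])
    G = np.array([[T**2 / 2.0], [T]])
    Q = G @ G.T * sigma_acc**2
    R = sigma_bar**2
    x = np.zeros((2, 1))            # state: [altitude z; vertical velocity z_dot]
    P = np.eye(2)
    v_z = np.zeros(len(acc_z_g))
    for k in range(len(acc_z_g)):
        # Prediction step
        x = A @ x + G * acc_z_g[k]
        P = A @ P @ A.T + Q
        # Observation model: altitude -> pressure (Eq. (24)) and its Jacobian (Eq. (25))
        h = p0 * (1.0 - x[0, 0] / 44330.77) ** 5.26
        H = np.array([[-p0 * 5.26 / 44330.77 * (1.0 - x[0, 0] / 44330.77) ** 4.26, 0.0]])
        # Update step (Eqs. (26)-(30))
        y = pressure[k] - h
        S = H @ P @ H.T + R
        K = P @ H.T / S
        x = x + K * y
        P = (np.eye(2) - K @ H) @ P
        v_z[k] = x[1, 0]
    return v_z
```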
It was hoped that the Kalman-filtered velocity estimate, z ˙ k , would be able to distinguish between periods of walking upstairs ( z ˙ k ≫ 0 ), walking downstairs ( z ˙ k ≪ 0 ), and walking on a level surface ( z ˙ k ≈ 0 ). This would extend the utility of the Kalman-filtered velocity estimate beyond applications in fall detection [70], for example.

5. Hierarchical Description of Human Activity

Rather than use one supervised machine learning algorithm to perform HAR, a hierarchical model of human activity (HMHA) [71,72] was devised and translated into a feature-based model (see Figure 4). A decision tree based on the classification and regression tree (CART) algorithm developed by Breiman [73] was trained for each node of the model and pruned so that there is only one leaf node for each activity class (see an example in Figure 5b). This approach minimized over-fitting [74], ensured that the model was easily interpreted, and made the process of HAR tractable in the event of misclassification [75]. In addition, the weights of each class were balanced when the decision tree was trained to ensure that the thresholds selected accounted for any class imbalances [76].
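As an illustration of one node of the hierarchical model, the sketch below fits a single CART stump on synthetic one-feature data using scikit-learn; max_leaf_nodes=2 approximates pruning to one leaf per class at a binary node, and class_weight='balanced' re-weights the classes as described above. The data and parameter values are invented for the example and are not those of the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic, deliberately imbalanced data: one feature value per 2.5 s window
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0.2, 0.1, 500), rng.normal(2.0, 0.5, 100)]).reshape(-1, 1)
y = np.array([0] * 500 + [1] * 100)    # 0 = inactive, 1 = active

# One binary node of the HMHA: a CART stump with balanced class weights
node_clf = DecisionTreeClassifier(max_leaf_nodes=2, class_weight="balanced", random_state=0)
node_clf.fit(X, y)
print("learned threshold:", node_clf.tree_.threshold[0])
```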

6. Models and Performance Metrics

6.1. Performance at a High Sampling Rate

A number of models for recognizing human activity were developed, each trained using all of the sensor data collected from the younger and/or older cohorts with either: (i) the Original Features (i.e., features (1)–(4) in Table 2); (ii) the New Features (i.e., features (5)–(8) in Table 2); or (iii) the Best Features, in which four pairs of features (features 1 and 5; features 2 and 6; features 3 and 7; features 4 and 8 from Table 2) were provided to four separate instances of the CART algorithm to select the four features that best separate the human activities into distinct classes according to the structure of the HMHA described in Figure 4a. These pairings represent features which are similar in terms of the information they capture, e.g., features 1 and 5 capture angular velocity information in subtly different ways.
The robustness of each model for HAR was evaluated by virtually re-orienting the device (as described in Section 3.1) to obtain data from the younger and/or older cohorts that are representative of five different device orientations (see Figure 1b–f). Each model’s performance was evaluated by training the model with the younger and/or older cohorts’ data and testing the model with the younger and/or older cohorts’ data after it had been virtually re-oriented, using 10-fold cross-validation. Ninety-five percent confidence intervals (95% CIs) were calculated for the Cohen’s kappa ( κ ) and total classification sensitivity (%), as well as the sensitivity (%) and specificity (%) of each activity class. This process was repeated for the ‘Best Features’ (determined in the above search procedure).

6.2. Translating Performance to Different Sampling Rates

Finally, the HMHA was evaluated by training the model with data from the younger and/or older cohort at either (i) the original sampling rate (i.e., the IMU sampled at 100 Hz, and the barometric altimeter data sampled at 16 Hz), or (ii) a reduced sampling rate (i.e., the IMU resampled at 40 Hz, and the barometric altimeter data resampled at 20 Hz), and testing the model with the virtually rotated data from the younger and/or older cohort at the reciprocal sampling rate to determine if the performance and thresholds of the model are consistent. The metrics reported in Section 6.1 were also used to evaluate the model’s performance.

7. Results and Discussion

7.1. Comparing Features Using Shannon Entropy

When the Shannon entropy [77] of the training datasets (i.e., Figure 6a,c) or testing datasets (i.e., Figure 6b,d) was compared for the features g ω ¯ x y , k 2 and ω ¯ bpf , k 2 , two things become evident. Firstly, both features appear to be orientation invariant because the Shannon entropy is constant whether it is calculated from the training data or the test data (i.e., here the test data were the virtually re-oriented training data). Secondly, the Shannon entropy was reduced by 0.085 bits when the quaternion-derived feature was used in place of the original feature proposed in our previous work, showing an improvement in the separation of the class distributions [56].
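The entropy values quoted here can, in principle, be reproduced from a normalized histogram of each feature; a minimal sketch is shown below, noting that the bin count is an assumption since the binning used by the authors is not stated.

```python
import numpy as np

def shannon_entropy_bits(feature_values, bins=64):
    """Shannon entropy (bits) of a feature's normalized frequency distribution."""
    counts, _ = np.histogram(feature_values, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]                      # empty bins contribute 0 (0*log2(0) := 0)
    return float(-np.sum(p * np.log2(p)))
```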
The features Θ ¯ tilt , k and ϑ ¯ tilt , k can be used to further separate the inactive class into standing and sedentary (i.e., sitting or lying) classes. When the Shannon entropy was calculated for both tilt angle features using the training data (see Figure 7a,c, respectively) the Shannon entropy [77] dropped by 0.142 bits when ϑ ¯ tilt , k was used in place of Θ ¯ tilt , k . A more pronounced difference of 0.669 bits was observed between Θ ¯ tilt , k and ϑ ¯ tilt , k when the Shannon entropy was calculated using the test data (see Figure 7b,d, respectively). Whilst the Shannon entropy of ϑ ¯ tilt , k increases by 0.291 bits when the test data are used in place of the training data, the shape of the normalized frequency distribution is more consistent for all device re-orientations when compared with Θ ¯ tilt , k which increased by 0.818 bits for the re-oriented (test) data. This suggests that the quaternion-derived feature, ϑ ¯ tilt , k is more robust to how a smartphone is initially placed in the pocket.
Whilst the new feature, ϑ ¯ tilt , k , appears to improve the recognition rate of both standing and sedentary periods of activity, using the change in the shortest rotation between the upward and average orientations, Δ ϑ ¯ tilt , k , to distinguish between postural transitions and periods of walking (i.e., walking upstairs, walking downstairs, or walking on a level surface) does not. This is evidenced by the increase in Shannon entropy (when ϑ ¯ tilt , k is compared to Δ ϑ ¯ tilt , k ) whether the feature values are generated from the training or testing data, i.e., from 0.463 bits to 0.866 bits, or from 0.463 bits to 1.049 bits, respectively (see Figure 8). Additionally, since the Shannon entropy of a ¯ lpfdif , k 2 remains constant at 0.463 bits irrespective of the dataset used, it confirms that the feature previously described [56] is orientation invariant, as expected.
Conversely, both the average differential pressure, Δ P k , and the velocity in the vertical direction of the estimated GFR, v ¯ z , k , are orientation invariant, as evidenced by the Shannon entropy which remains constant whether the training or testing data are used, for both the original feature (Figure 9a,b) and the quaternion-derived feature (Figure 9c,d). The Shannon entropy of Δ P k (1.795 bits) is substantially smaller than that of v ¯ z , k (4.070 bits), which suggests that the estimated velocity in the vertical direction (obtained by fusing vertical acceleration and barometric pressure using an extended Kalman filter) of the estimated GFR is not as useful in distinguishing between walking on flat or inclined surfaces when compared to using the rate of change of pressure measured by the barometric altimeter alone.
Although speculative, it is likely that the large-amplitude accelerations measured by the IMU in the pants pocket during walking are masking the subtle pattern changes in vertical acceleration associated with ascending/descending stairs. It is plausible that if the accelerometer had been placed in a chest pocket, an improved estimate of vertical acceleration may have been obtained by the Kalman filter.

7.2. Comparing the Overall Performance of Models for HAR

When the HMHA is trained and tested with data from the cohort of younger adults (at the original sampling rate: f_IMU = 100 Hz; f_BAR = 16 Hz), or trained with the older cohort and tested on the data from the younger cohort after it has been re-oriented, the performance improvement of the models (i.e., the 95% confidence interval of the Cohen’s kappa, κ_CI95, and total class sensitivity, ϱ_CI95) is negligible. For two out of the three remaining scenarios, there are substantial improvements in the model’s performance when the quaternion-derived features developed herein (i.e., g ω ¯ x y , k 2 and ϑ ¯ tilt , k ) are incorporated into the process of human activity recognition. This can be observed in Table 3 when the model is trained with data from the older cohort and tested with the data from the older cohort after it has been re-oriented (i.e., κ_CI95 increases from [0.685, 0.697] to [0.721, 0.733]; ϱ_CI95 increases from [77.6%, 78.5%] to [79.9%, 80.7%]), as well as when the model is trained with data from both cohorts and tested with the data from both cohorts after it has been re-oriented (i.e., κ_CI95 increases from [0.702, 0.713] to [0.732, 0.742]; ϱ_CI95 increases from [78.4%, 79.2%] to [80.3%, 81.1%]).
It is particularly noteworthy that the performance of the model trained with the ‘Best Features’ using the data collected from the younger cohort and tested with the ‘Best Features’ using the data collected from the older adults after it has been re-oriented (i.e., κ_CI95 = [0.782, 0.793]; ϱ_CI95 = [84.9%, 85.6%]) is comparable to the performance of the model trained with the data collected from the older cohort and tested with the data from the younger cohort (i.e., κ_CI95 = [0.765, 0.784]; ϱ_CI95 = [82.3%, 83.7%]). This contradicts the finding of our previous work [56] in which the performance of a model for HAR trained on younger cohorts degraded substantially when tested on older cohorts (due to the use of the tilt angle feature, Θ ¯ tilt , k , which was not orientation invariant (see Table 3, the column labeled ‘Original Features’)), compared to the opposite scenario in which the model is trained with the older cohort’s data and tested with the data from the younger cohort, which gives the better performance.
The improvements in total classification sensitivity and Cohen’s kappa gained by incorporating the quaternion-derived features (see Table 3 column labeled ‘Best Features’) persist when the data from the IMU are re-sampled at a reduced rate (see the column labeled ‘Best Features’ in Table 3). This demonstrates the robustness of both the features and the HMHA to a decrease in the sample rate, which is an important design consideration given the limited battery life of wearable sensors (i.e., smartphones, smartwatches, etc.). Interestingly, there are marginal improvements in Cohen’s kappa (i.e., from κ = [0.732, 0.742] when f_IMU = 100 Hz to κ = [0.778, 0.787] when f_IMU = 40 Hz) and total class sensitivity (i.e., from ϱ_CI95 = [80.3%, 81.1%] when f_IMU = 100 Hz to ϱ_CI95 = [84.1%, 84.8%] when f_IMU = 40 Hz) when the model is trained with the data from the younger and older cohorts and tested with the data from the younger and older cohorts after it has been re-oriented. Upon analyzing the class sensitivity of these two hierarchical models of human activity, it is evident that this is primarily due to an increase in the sensitivity of detecting the walking class, from ~72% to ~82% (see Figure 10xv,xx). This improvement can be attributed to the use of the quaternion-derived feature, g ω ¯ x y , k 2 , which measures the amount of pitch/roll rotation in the estimated GFR, a more consistent frame of reference than the local sensor frame.

7.3. Identifying Which Features Drive Model Performance

When Figure 10i,xi are compared (i.e., a HMHA trained with the Original Features extracted from the younger cohort and a HMHA trained with the Best Features extracted from the younger cohort), the differences in the model’s performance become apparent. Most notably, the sensitivity for the postural transition class increased from 70.05% to 86.64%. This improved performance is a by-product of modest increases in the model’s ability to identify standing (i.e., the standing class sensitivity increased from 88.29% to 89.98%) and sedentary periods of activity (i.e., the class sensitivity increased from 93.67% to 94.42%). These pieces of evidence support the argument that the quaternion-derived feature, g ω ¯ x y , k 2 , is better at distinguishing periods of activity (i.e., walking, or postural transitions) from periods of inactivity (i.e., standing or sedentary), a trend which is consistent across each of the five training and testing scenarios proposed in Section 6. However, it is likely that g ω ¯ x y , k 2 would not be as effective in this task if the smartphone is placed in the user’s chest pocket, which pitches and rolls less when compared to the thigh (i.e., it rotates less about the x and y axis of the estimated GFR) during walking.
The underlying causes for the improvement in the HMHA become clear after analyzing the sensitivity of the activity classes listed in Figure 10. When the columns corresponding to the HMHAs built using the Original Features and New Features (e.g., when Figure 10i is compared to Figure 10vi, Figure 10ii to Figure 10vii, and so on) were compared, it is evident that the model’s sensitivity to periods of walking upstairs decreased dramatically (e.g., in the case of (i) and (vi), from 84.21% to 58.16%) when the differential pressure, Δ P k , was replaced with the moving average velocity in the vertical direction of the estimated GFR, v ¯ z , k . This persisted whether the data from the younger or older cohort were used (i.e., when the columns entitled ‘Original Features’ and ‘New Features’ of Figure 10 are compared, the model’s sensitivity to periods of walking either upstairs or downstairs is reduced). In short, Δ P k is superior to v ¯ z , k for estimating vertical velocity, and hence for detecting walking on stairs.
On the other hand, when Θ ¯ tilt , k was substituted with ϑ ¯ tilt , k , the sensitivity of the model to standing classes increases (from 60–70% to >80%) when the HMHA was trained with: (a) the older cohort’s data and tested with the older cohort’s data after it had been re-oriented (the second row in Figure 10); (b) the younger cohort’s data and tested with the older cohort’s data after it had been re-oriented (the fourth row in Figure 10); (c) the data from both cohorts and tested with the data from both cohorts after it had been re-oriented. This improvement underscores the utility of learning the orientation of the device when the body is definitely upright (i.e., when walking), demonstrating how this method can intuitively account for the variability in sensor measurements which may arise due to inconsistent device orientation when the IMU is placed on the body.
In addition, the rate of misclassification of sedentary and stationary periods of activity as postural transitions decreases markedly. This phenomenon is consistent across the five scenarios evaluated (recall Section 6). When the columns labeled ‘Original Features’ and ‘Best Features’ are compared row by row, periods of standing that were originally classified as postural transitions are all but eliminated (e.g., compare Figure 10i and Figure 10xi), whilst the misclassification rate of sedentary activity as postural transitions decreased from ~16% to ~9% (compare Figure 10i and Figure 10xi); ~32% to ~5% (compare Figure 10ii and Figure 10xii); ~9% to ~5% (compare Figure 10iii and Figure 10xiii); ~33% to ~10% (compare Figure 10iv and Figure 10xiv); ~29% to ~7% (compare Figure 10v and Figure 10xv).

7.4. Comparing Model Performance at Different Sampling Rates

Due to the limited battery life of smartphones, it is becoming increasingly important that algorithms for human activity recognition are able to operate at a reduced sampling rate without suffering a degradation in classification accuracy. Consequently, the robustness of the models developed with the ‘Best Features’ was evaluated by training the model with the data collected from the younger and/or older cohort at 100 Hz, and testing the model’s performance with data from the younger and/or older cohort at 40 Hz after it had been virtually re-oriented (and vice versa). From Table 4 it is evident that both the Cohen’s kappa and total class sensitivity of the HMHA proposed in Figure 4d remain consistent (i.e., the 95% confidence intervals overlap for almost all of the training and testing combinations evaluated) whether the HMHA is trained with the data at 100 Hz (i.e., the higher sampling rate) and tested with the re-oriented data at 40 Hz (i.e., the reduced sampling rate), or the reciprocal scenario in which the HMHA is trained with the data at 40 Hz and tested with the re-oriented data at 100 Hz.
The sole exception to this trend is the scenario in which the data re-sampled at 40 Hz from both the younger and older cohorts are used to train the HMHA, whilst the data sampled at 100 Hz from the younger and older cohorts after it has been re-oriented are used to test the HMHA. In this particular scenario, the ninety-five percent confidence interval of the Cohen’s kappa, κ , increased by ~0.04 from κ_CI95 = [0.732, 0.742] to κ_CI95 = [0.776, 0.786]. Similarly, the ninety-five percent confidence interval of the total class sensitivity increased by ~3% from ϱ_CI95 = [80.3%, 81.1%] to ϱ_CI95 = [84.0%, 84.7%] (see the bottom row of Table 4).
After analyzing Figure 11 it is evident that the model’s sensitivity to each activity remains relatively consistent as long as only one cohort’s data is used to train the model, and the other cohort’s data is used to test the model, irrespective of the sampling rate (i.e., when Figure 11iii,viii,xiii are compared; Figure 11iv,ix,xiv are compared, and so on). When both cohorts’ data are used (i.e., when Figure 11v,x,xv are compared), the sensitivity of the model to the sedentary, standing, and postural transition classes is remarkably consistent whilst the sensitivity of the model to the three different walking classes varies (whether the HMHA is trained with the data re-sampled at 40 Hz or the data sampled at 100 Hz). From Table 5 it is easy to see that this robustness in performance can be attributed to the relatively constant threshold values of g ω ¯ x y , k 2 and ϑ ¯ tilt , k , which change by <0.1 (rad²·s⁻² and radians, respectively), suggesting that these features are robust to both the variation in sampling rate and the cohort from which the threshold is extracted (i.e., the threshold changes little whether trained on the younger and/or older cohort’s data).
Interestingly, the recognition rate of the postural transition class also remains fairly consistent (i.e., between ~89–91%), irrespective of the data which are used to train and test the HMHA. This suggests that a ¯ lpfdif , k 2 is also robust to variations in the sampling rate of the IMU and the cohort from which the threshold is determined (see Table 5).
In the case of the walking, walking upstairs, and walking downstairs classes, the differences were negligible when the HMHA was trained with the data sampled at 100 Hz and tested with the same data after it had been re-oriented; or trained with the data at 100 Hz and tested with the data re-sampled at 40 Hz after it had been re-oriented. However, when the model was trained with the data re-sampled at 40 Hz and tested with the data at 100 Hz there were slight changes in the class sensitivity when compared to either of the two previously mentioned scenarios.
In particular, the model’s sensitivity to the walking class increased from ~71–72% to ~82% (trained with data at 40 Hz, tested with data at 100 Hz) due to marked reductions in periods of walking upstairs and walking downstairs being incorrectly identified as walking on a level surface (i.e., from 7.95% to 4.5% and 15.32% to 7.82%, respectively). Similarly, the sensitivity of the walking upstairs class decreased from ~76% to ~68% (see Appendix C, the row labeled ‘Train Y&O Test (Y&O) * ’ in Table A2) due to the increased misclassification of periods of walking upstairs as periods of walking on a level surface (i.e., from 4–5% to ~10%; see Figure 11 and compare panels (x) and (xv)). This suggests that the smaller threshold of Δ P k = 0.092 hPa·s⁻¹ (see Table 5) is better (when compared to Δ P k = 0.119 hPa·s⁻¹) at distinguishing between periods of walking on a level surface versus walking upstairs.
This trend was mirrored in the reduction of the hierarchical model’s sensitivity to the walking downstairs class; i.e., decreasing from ~91–92% to ~87% (see Appendix C, the row labeled ‘Train Y&O Test (Y&O) * ’ in Table A2) due to the increased misclassification of periods of walking downstairs as periods of walking on a level surface (i.e., from 3–4% to ~8%; see Figure 11 and compare panels (x) and (xv)). Again, this suggests that the threshold of Δ P k = −0.062 hPa·s⁻¹ (see Table 5) is better than the threshold of Δ P k = −0.094 hPa·s⁻¹ (a change of ~51% in the threshold value) at distinguishing between periods of walking on a level surface versus walking downstairs.

7.5. Comparison to the State-of-the-Art

In order to draw a fair comparison with other published work that is representative of state-of-the-art methods, the scope of these comparisons is limited to reports which only utilized the smartphone’s internal sensing components to classify human activity. With this in mind, the state-of-the-art deep learning methods (recall Section 1.3.3) proposed by Ordoñez et al. [45] and Li et al. [78] are excluded because they utilize measurements from multiple IMUs that are placed at different anatomical locations on the body, whilst the works of Ravi et al. [43] and Ronao and Cho [42] are included. Similarly, the ‘feature engineering and classification’-based approaches (recall Section 1.3.2) developed by Bao and Intille [79] are omitted, whilst the works of Anguita et al. [80] and Shoaib et al. [29] are included.
Anguita et al. developed a hardware-friendly multi-class support vector machine which processed the accelerometer and gyroscope data (at 50 Hz) from a waist-worn smartphone (i.e., attached to a belt worn about the waist) to identify activities of daily living in a cohort of 30 participants aged between 19 and 48 years. From these six channels, they extracted 561 spatial or spectral features (every 1.25 seconds using 50% overlapping windows) to identify six activities with a sensitivity between 72% and 96%: walking (95.6%), walking upstairs (72.1%), walking downstairs (79.7%), standing (92.2%), sitting (96.4%), and lying (100%) [80].
Shoaib et al. evaluated the utility of a smartphone’s internal sensors for the purposes of human activity recognition. They studied ten male participants, aged between 25 and 30 years, whilst a smartphone was firmly fixed to their body with a strap at one of five positions on their body (right and left front jeans pocket, on a belt near the right hip, right wrist, right upper arm). A smartphone application recorded the accelerometer, gyroscope, and magnetometer data at 50 Hz whilst each participant performed seven activities of daily living (walking, jogging, sitting, standing, biking, walking upstairs, and walking downstairs) [29]. When features were extracted every two seconds (with 50% overlapping windows), the gyroscope-based features proved most effective in identifying periods of walking upstairs and walking downstairs (particularly when the sensor was placed in the jeans pocket or on the belt), whilst features from the magnetometer should only be used if they are independent of heading. Moreover, they advocate against ‘blindly combining different sensors’, suggesting a more manual approach to system and feature design.
Deep learning approaches attempt to tease out more subtle differences, imperceptible by human observation, in wearable sensor data which can be used for the purposes of HAR. Ronao and Cho recruited 30 participants (age range not disclosed) to evaluate the performance of a model for HAR based on deep convolutional neural networks (convnet). The smartphone was placed in a pocket of the participants’ clothing (location on body not disclosed), whilst data from the accelerometer and gyroscope were recorded at 50 Hz [42]. When the data were segmented into 2.5-second intervals with 50% overlap, the convnet could identify six activities: walking (98.99%), walking upstairs (100.00%), walking downstairs (100.00%), standing (93.23%), sitting (88.88%), lying (87.71%); with an overall sensitivity of 94.79%. Before the accelerometer and gyroscope data could be processed, each channel (of six) needed to be normalized by subtracting the mean of each signal, and dividing each channel by the channel’s standard deviation. At this point, 2.5-second data segments were input to a five-layer convnet comprised of three convolutional/pooling layers (with 96, 192, 192 neurons in each layer, respectively), a fully connected layer comprised of 1000 neurons, and a softmax classification layer with six neurons.
Ravi et al. combined features extracted from the spectrogram (i.e., the short-time Fourier transform coefficients) of accelerometer and gyroscope signals (both of which were sampled at either 50 Hz or 200 Hz) with a three-layered network comprised of a temporal convolution layer (15 filters, 80 nodes), fully-connected layer, and soft-max classification layer for the purposes of HAR. Data were obtained from ten subjects (using five different smartphones) who were allowed to place the phone anywhere on their body (or in their hand/bag) whilst they performed six activities of daily living. The total class sensitivity of their model for HAR was 95.7% with class sensitivities of ~95% (running), ~95% (walking), ~96% (cycling), ~96% (casual movement), ~96% (public transport), ~98% (idle), and ~74% (standing). Whilst the features derived from the six-channel spectrogram enabled highly-variable activities to be distinguished from repetitive activities, the absence of time-domain-based features limited the model’s ability to infer the user’s postural orientation, which was further limited by the fact that the phone could be placed at various parts of the body, in the hand, or in a bag [43].
The model for HAR constructed by Gu et al. [81] implemented denoising autoencoders (two layers, 1000 neurons per layer) combined with a softmax classification layer to automate the HAR process. Features were extracted from two-second intervals of data from the smartphone’s accelerometer, gyroscope, magnetometer, and barometer (all of which were sampled at 64 Hz, except for the barometer which was sampled at 32 Hz). Twelve participants (six male, six female) aged between 25 and 35 years were recruited to train the model to recognize eight activities of daily living. When the data from all four sensors were used by the denoising autoencoders (corrupting noise level = 0.5, learning rate = 1 × 10⁻³, weight of sparsity penalty term = 1), the F-measure was 94.04% and the class sensitivities for the eight activities were: stationary (~98%), walking (~92%), stationary but using the phone (~96%), running (~97%), walking upstairs (~94%), walking downstairs (~93%), elevator up (~84%), elevator down (~87%).
A general pitfall of the above deep learning approaches is that they do not inherently allow the training of the neural network to be constrained by the domain knowledge that the smartphone could be placed anywhere on the body and with any orientation. Safeguards against obtaining a classifier model which is not robust to such variability in smartphone placement and orientation include collecting large datasets which capture this variability, or preprocessing the smartphone signals to generate features which are tolerant to such variability; the latter somewhat goes against the spirit of the deep learning approach.

8. Limitations

There are limitations with the study presented herein which need to be acknowledged. The model for HAR developed is dependent on the wearable sensor (i.e., device containing an IMU and barometer, such as a smartphone) remaining in the pants pocket throughout the day, which is not a realistic expectation since the individual’s lower body garments may not always have a suitable pocket, or a pocket large enough to place the wearable sensor. If the wearable sensor is strapped to the thigh, the quaternion-derived feature, ϑ ¯ tilt , k , should always be able to separate standing and sedentary periods. If the device is sporadically removed from the pants pocket whilst the person is moving, it is conceivable that the walking detector (Equation (13)) could ‘learn’ an incorrect upright orientation, q upright , k , thereby reducing the accuracy of the model for HAR until it relearns the correct upright orientation from the next 2.5 s of true walking data; robustness to this scenario will be evaluated in future work.

9. Conclusions and Future Work

This paper developed a model for HAR capable of recognizing six human activities (standing, sedentary, walking, walking upstairs, walking downstairs, as well as postural transitions between the standing and sedentary classes), regardless of the smartphone’s orientation in the pants pocket, by using a quaternion-based complementary filter [63] to estimate the device’s orientation, thereby enabling sensor measurements to be expressed in a consistent frame of reference (the world/global frame). Four New Features were developed, and two were shown to be useful in the classification of human activities, namely g ω ¯ x y , k 2 , which utilized an estimate of the IMU’s orientation to determine the magnitude of the pitch/roll angular velocity, and ϑ ¯ tilt , k , which measured the angle between the recent average orientation and the estimated upright orientation; upright orientation was estimated as the average orientation of the IMU when walking was detected. The success of these quaternion-derived features suggests that existing methods for recognizing human activities would benefit from converting all measurements to the global frame of reference where the feature values would be more consistent, especially if the orientation of the IMU with respect to the body is not fixed.

Author Contributions

Conceptualization, M.B.D.R. and S.J.R.; data curation, M.B.D.R.; formal analysis, M.B.D.R.; investigation, M.B.D.R. and S.J.R.; methodology, M.B.D.R., N.H.L. and S.J.R.; project administration, N.H.L. and S.J.R.; resources, N.H.L. and S.J.R.; supervision, N.H.L. and S.J.R.; validation, M.B.D.R.; visualization, M.B.D.R.; writing—original draft, M.B.D.R., N.H.L. and S.J.R.; writing—review and editing, M.B.D.R., N.H.L. and S.J.R.

Funding

This research was funded by an Australian Research Council Discovery Projects grant (DP130102392).

Acknowledgments

We gratefully acknowledge our colleagues at UNSW, Jingjing Wang and Kejia Wang, and at Neuroscience Research Australia, Stephen Lord, Kim Delbaere, and Matthew Brodie, for their assistance in collecting data for the older cohort.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Average of Multiple Quaternions

Gramkow’s method was used to calculate the average of N quaternions [82]. Note: the quaternion is normalized after each component of q̄ has been calculated (see Equation (A1)). If the scalar component of a quaternion, $q = [q_0\ q_1\ q_2\ q_3]$, in the window was negative (i.e., if $q_0 < 0$), each component of the quaternion was negated (thereby preserving the rotational information, since q and −q represent the same rotation [83]) so that every quaternion lies in the same half-space.
$$\bar{q} = f_{q,\mathrm{avg}}(q_{1},\dots,q_{N}) = \begin{bmatrix}\dfrac{\bar{q}_{0}}{\lVert q\rVert} & \dfrac{\bar{q}_{1}}{\lVert q\rVert} & \dfrac{\bar{q}_{2}}{\lVert q\rVert} & \dfrac{\bar{q}_{3}}{\lVert q\rVert}\end{bmatrix}^{T}\!, \quad \bar{q}_{j} = \frac{1}{N}\sum_{k=1}^{N} q_{j,k}\ \ (j \in \{0,1,2,3\}), \quad \lVert q\rVert = \sqrt{\bar{q}_{0}^{2}+\bar{q}_{1}^{2}+\bar{q}_{2}^{2}+\bar{q}_{3}^{2}} \qquad \text{(A1)}$$
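A minimal NumPy sketch of this averaging step, assuming quaternions are stored as rows [q0, q1, q2, q3], might look as follows (the function name is hypothetical):

```python
import numpy as np

def average_quaternions(quats):
    """Gramkow-style average of N unit quaternions (rows of [q0, q1, q2, q3]).

    Quaternions with a negative scalar component are negated first so that
    all samples lie in the same half-space (q and -q encode the same
    rotation); the components are then averaged and re-normalised (Eq. (A1)).
    """
    q = np.array(quats, dtype=float)        # copy, shape (N, 4)
    q[q[:, 0] < 0.0] *= -1.0                # enforce a common half-space
    q_mean = q.mean(axis=0)                 # component-wise mean
    return q_mean / np.linalg.norm(q_mean)  # normalise to unit length
```

For near-identical rotations, such as the orientation samples within a 2.5 s window of walking, this component-wise mean is a good approximation of the true rotation average.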

Appendix B. Shortest Rotation Between Two Quaternions

The rotation that brings two quaternions, $q_A$ and $q_B$, into coincidence is $q_{AB} = q_A \otimes (q_B)^{*}$, i.e., $q_{AB} \otimes q_B = q_A$. The shortest angle between these quaternions, 0 ≤ ϑ ≤ π, is obtained from $q_{AB,0}$, the scalar component of $q_{AB}$, via Equation (A2).
$$\vartheta = f_{\mathrm{angle}}(q_{A}, q_{B}) = \begin{cases} 2\cos^{-1}(q_{AB,0}), & q_{AB,0} \geq 0 \\ 2\cos^{-1}(-q_{AB,0}), & q_{AB,0} < 0 \end{cases} \qquad \text{(A2)}$$
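Equation (A2) could be implemented along the following lines; the Hamilton-product helper and the function names are hypothetical, and only the scalar component of the relative quaternion is needed for the angle.

```python
import numpy as np

def quat_multiply(a, b):
    """Hamilton product of two quaternions [q0, q1, q2, q3]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def quat_conjugate(q):
    """Conjugate of a quaternion [q0, q1, q2, q3]."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def shortest_angle(q_a, q_b):
    """Shortest rotation angle (0..pi) between q_a and q_b, as in Eq. (A2)."""
    q_ab = quat_multiply(q_a, quat_conjugate(q_b))
    w = np.clip(abs(q_ab[0]), 0.0, 1.0)   # |scalar part| covers both branches of Eq. (A2)
    return 2.0 * np.arccos(w)
```

Using the absolute value of the scalar part handles both branches of Equation (A2), so the same angle is returned for q and −q.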

Appendix C. Ninety-five Percent Confidence Intervals for the Class Sensitivity and Class Specificity of the Hierarchical Models of Human Activity

Table A1. Ninety-five percent confidence intervals for the sensitivity and specificity of each activity class when HMHA were developed with different feature subsets ( f IMU = 100 Hz; f bar = 16 Hz).
Activity | Sensitivity (%): Original Features / New Features / Best Features | Specificity (%): Original Features / New Features / Best Features
Sedentary 92.8 , 94.5 93.6 , 95.2 93.6 , 95.2 99.3 , 99.6 98.4 , 98.9 98.4 , 98.9
Standing 87.1 , 89.5 88.8 , 91.1 88.8 , 91.1 96.4 , 97.2 96.1 , 96.9 96.1 , 96.9
Train YWalking 79.9 , 82.7 25.8 , 28.9 78.2 , 81.0 94.5 , 95.5 97.4 , 98.1 96.0 , 96.9
Test Y *Walking Upstairs 80.5 , 87.9 53.2 , 63.1 80.5 , 87.9 97.7 , 98.3 93.2 , 94.2 98.0 , 98.5
Walking Downstairs 90.7 , 95.9 87.2 , 93.3 90.7 , 95.9 97.1 , 97.7 86.6 , 87.9 97.2 , 97.9
Postural Transitions 64.0 , 76.1 73.4 , 84.2 82.1 , 91.2 98.0 , 98.6 95.3 , 96.1 97.8 , 98.3
Sedentary 92.9 , 93.9 87.5 , 88.8 87.5 , 88.8 94.2 , 94.8 98.6 , 98.9 98.6 , 98.9
Standing 66.9 , 69.2 80.9 , 82.7 80.9 , 82.7 97.2 , 97.5 97.2 , 97.5 97.2 , 97.5
Train OWalking 73.6 , 75.0 24.5 , 25.8 73.4 , 74.8 95.1 , 95.7 96.6 , 97.1 95.9 , 96.5
Test O *Walking Upstairs 54.4 , 62.7 66.0 , 73.8 54.2 , 62.5 96.4 , 96.7 81.4 , 82.2 96.4 , 96.8
Walking Downstairs 83.9 , 89.8 47.9 , 56.7 83.9 , 89.8 94.0 , 94.5 90.3 , 90.9 94.0 , 94.5
Postural Transitions 47.2 , 53.2 83.3 , 87.5 89.9 , 93.3 96.3 , 96.7 94.8 , 95.2 94.9 , 95.3
Sedentary 73.9 , 75.6 86.6 , 87.9 86.6 , 87.9 97.5 , 97.9 98.7 , 99.0 98.7 , 99.0
Standing 59.5 , 61.9 79.8 , 81.8 79.8 , 81.8 93.6 , 94.1 97.1 , 97.5 97.1 , 97.5
Train YWalking 87.7 , 88.7 23.4 , 24.8 85.7 , 86.8 88.6 , 89.5 96.5 , 97.0 93.3 , 94.0
Test O *Walking Upstairs 62.7 , 70.7 73.2 , 80.3 62.9 , 70.9 97.9 , 98.2 79.0 , 79.8 98.1 , 98.4
Walking Downstairs 83.9 , 89.8 44.5 , 53.3 81.5 , 87.8 96.7 , 97.0 91.3 , 91.9 96.9 , 97.2
Postural Transitions 74.4 , 79.5 78.8 , 83.6 86.5 , 90.4 96.9 , 97.3 96.0 , 96.4 96.6 , 97.0
Sedentary 94.5 , 96.0 94.9 , 96.3 94.9 , 96.3 98.7 , 99.2 98.1 , 98.7 98.1 , 98.7
Standing 92.8 , 94.7 88.5 , 90.8 88.5 , 90.8 93.7 , 94.7 96.4 , 97.2 96.4 , 97.2
Train OWalking 61.1 , 64.5 21.5 , 24.4 62.5 , 65.8 97.6 , 98.3 97.6 , 98.3 97.3 , 98.0
Test Y *Walking Upstairs 75.7 , 83.8 46.6 , 56.6 75.7 , 83.8 96.9 , 97.6 94.7 , 95.6 96.7 , 97.4
Walking Downstairs 88.5 , 94.3 84.0 , 90.9 88.5 , 94.3 95.2 , 96.0 86.2 , 87.4 95.0 , 95.8
Postural Transitions 45.0 , 58.3 73.9 , 84.7 82.6 , 91.6 96.2 , 97.0 92.8 , 93.8 95.4 , 96.2
Sedentary 92.6 , 93.4 88.9 , 90.0 88.9 , 90.0 95.6 , 96.1 98.7 , 98.9 98.7 , 98.9
Standing 72.2 , 74.0 82.6 , 84.1 82.6 , 84.1 97.2 , 97.5 97.2 , 97.6 97.2 , 97.6
Train Y&OWalking 71.9 , 73.2 35.0 , 36.4 71.6 , 72.9 94.8 , 95.4 95.2 , 95.7 95.8 , 96.3
Test (Y&O) *Walking Upstairs 69.3 , 75.1 48.3 , 54.8 69.3 , 75.1 96.3 , 96.7 87.4 , 88.0 96.4 , 96.7
Walking Downstairs 89.6 , 93.3 65.1 , 71.4 89.7 , 93.4 93.6 , 94.0 89.2 , 89.8 93.7 , 94.1
Postural Transitions 54.4 , 59.9 81.7 , 85.8 88.6 , 91.9 96.8 , 97.1 95.1 , 95.5 95.8 , 96.1
* The test data were obtained by virtually re-orienting the data from the younger (Y) and older (O) cohort as described in Section 3.1.
Table A2. Ninety-five percent confidence intervals for the sensitivity and specificity of each activity when hierarchical models for human activity recognition were developed with the best features.
Activity | Sensitivity (%): Train 100 Hz/Test 100 Hz, Train 100 Hz/Test 40 Hz, Train 40 Hz/Test 100 Hz | Specificity (%): Train 100 Hz/Test 100 Hz, Train 100 Hz/Test 40 Hz, Train 40 Hz/Test 100 Hz
Sedentary 93.6 , 95.2 93.0 , 94.7 93.2 , 94.9 98.4 , 98.9 98.4 , 98.9 98.5 , 99.0
Standing 88.8 , 91.1 88.6 , 90.9 87.8 , 90.2 96.1 , 96.9 95.7 , 96.5 96.3 , 97.1
Train YWalking 78.2 , 81.0 76.2 , 79.1 78.8 , 81.6 96.0 , 96.9 96.3 , 97.2 95.3 , 96.2
Test Y Walking Upstairs 80.5 , 87.9 84.6 , 91.2 79.4 , 86.9 98.0 , 98.5 97.7 , 98.3 98.2 , 98.7
Walking Downstairs 90.7 , 95.9 94.1 , 98.1 90.7 , 95.9 97.2 , 97.9 96.9 , 97.5 96.8 , 97.5
Postural Transition 82.1 , 91.2 83.2 , 91.9 81.1 , 90.4 97.8 , 98.3 97.9 , 98.5 98.1 , 98.6
Sedentary 87.5 , 88.8 87.7 , 88.9 87.2 , 88.5 98.6 , 98.9 98.5 , 98.8 98.6 , 98.9
Standing 80.9 , 82.7 81.0 , 82.8 79.9 , 81.8 97.2 , 97.5 97.1 , 97.5 97.3 , 97.7
Train OWalking 73.4 , 74.8 73.1 , 74.5 75.6 , 77.0 95.9 , 96.5 95.9 , 96.4 95.7 , 96.2
Test O Walking Upstairs 54.2 , 62.5 58.8 , 67.0 49.3 , 57.7 96.4 , 96.8 96.4 , 96.8 97.3 , 97.6
Walking Downstairs 83.9 , 89.8 85.8 , 91.4 82.8 , 88.9 94.0 , 94.5 93.7 , 94.2 94.2 , 94.7
Postural Transition 89.9 , 93.3 87.6 , 91.3 90.3 , 93.6 94.9 , 95.3 95.3 , 95.7 94.4 , 94.9
Sedentary 86.6 , 87.9 86.4 , 87.7 85.5 , 86.9 98.7 , 99.0 98.7 , 99.0 98.9 , 99.1
Standing 79.8 , 81.8 80.1 , 82.0 77.1 , 79.1 97.1 , 97.5 97.1 , 97.4 97.5 , 97.8
Train YWalking 85.7 , 86.8 86.2 , 87.2 86.9 , 88.0 93.3 , 94.0 93.3 , 93.9 91.9 , 92.6
Test O Walking Upstairs 62.9 , 70.9 69.0 , 76.5 62.6 , 70.5 98.1 , 98.4 98.2 , 98.5 98.5 , 98.7
Walking Downstairs 81.5 , 87.8 83.6 , 89.6 84.1 , 90.0 96.9 , 97.2 96.9 , 97.3 96.5 , 96.9
Postural Transition 86.5 , 90.4 86.0 , 90.0 86.5 , 90.4 96.6 , 97.0 96.8 , 97.2 96.9 , 97.3
Sedentary 94.9 , 96.3 94.5 , 96.0 94.8 , 96.3 98.1 , 98.7 98.0 , 98.6 98.1 , 98.7
Standing 88.5 , 90.8 88.1 , 90.4 88.4 , 90.7 96.4 , 97.2 96.0 , 96.9 96.4 , 97.2
Train OWalking 62.5 , 65.8 59.4 , 62.8 64.3 , 67.6 97.3 , 98.0 97.5 , 98.2 97.2 , 97.9
Test Y Walking Upstairs 75.7 , 83.8 80.3 , 87.6 72.9 , 81.3 96.7 , 97.4 96.3 , 97.0 97.6 , 98.2
Walking Downstairs 88.5 , 94.3 90.2 , 95.5 86.5 , 92.8 95.0 , 95.8 94.5 , 95.3 95.2 , 96.0
Postural Transition 82.6 , 91.6 83.2 , 91.9 83.2 , 91.9 95.4 , 96.2 95.6 , 96.3 94.8 , 95.6
Sedentary 88.9 , 90.0 88.7 , 89.8 88.7 , 89.8 98.7 , 98.9 98.6 , 98.9 98.6 , 98.9
Standing 82.6 , 84.1 82.5 , 84.0 80.5 , 82.1 97.2 , 97.6 97.1 , 97.4 97.6 , 97.9
Train Y&OWalking 71.6 , 72.9 70.5 , 71.8 81.9 , 83.0 95.8 , 96.3 95.8 , 96.3 94.5 , 95.1
Test (Y&O) Walking Upstairs 69.3 , 75.1 73.4 , 78.9 65.3 , 71.3 96.4 , 96.7 96.4 , 96.7 97.9 , 98.1
Walking Downstairs 89.7 , 93.4 91.1 , 94.6 85.4 , 89.8 93.7 , 94.1 93.2 , 93.7 96.4 , 96.7
Postural Transition 88.6 , 91.9 88.0 , 91.3 89.6 , 92.7 95.8 , 96.1 96.0 , 96.4 95.6 , 96.0
Test data were obtained by re-orienting the data from the younger (Y) and/or older (O) cohort (see Section 3.1); IMU data were re-sampled at 40 Hz, barometer data were re-sampled at 20 Hz.

References

  1. Appelboom, G.; Camacho, E.; Abraham, M.E.; Bruce, S.S.; Dumont, E.L.; Zacharia, B.E.; D’Amico, R.; Slomian, J.; Reginster, J.Y.; Bruyère, O.; et al. Smart wearable body sensors for patient self-assessment and monitoring. Arch. Public Health 2014, 72, 28–36. [Google Scholar] [CrossRef] [PubMed]
  2. Rosario, M.D.; Lovell, N.H.; Fildes, J.; Holgate, K.; Yu, J.; Ferry, C.; Schreier, G.; Ooi, S.Y.; Redmond, S.J. Evaluation of an mHealth-based Adjunct to Outpatient Cardiac Rehabilitation. IEEE J. Biomed. Health Inform. 2018, 22, 1938–1948. [Google Scholar] [CrossRef] [PubMed]
  3. Seel, T.; Raisch, J.; Schauer, T. IMU-Based Joint Angle Measurement for Gait Analysis. Sensors 2014, 14, 6891–6909. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  4. Dadashi, F.; Mariani, B.; Rochat, S.; Büla, C.J.; Santos-Eggimann, B.; Aminian, K. Gait and Foot Clearance Parameters Obtained Using Shoe-Worn Inertial Sensors in a Large-Population Sample of Older Adults. Sensors 2014, 14, 443–457. [Google Scholar] [CrossRef] [PubMed]
  5. Wang, K.; Delbaere, K.; Brodie, M.; Lovell, N.; Kark, L.; Lord, S.; Redmond, S. Differences between Gait on Stairs and Flat Surfaces in Relation to Fall Risk and Future Falls. IEEE J. Biomed. Health Inform. 2017, 21, 1479–1486. [Google Scholar] [CrossRef] [PubMed]
  6. Casamassima, F.; Ferrari, A.; Milosevic, B.; Ginis, P.; Farella, E.; Rocchi, L. A Wearable System for Gait Training in Subjects with Parkinson’s Disease. Sensors 2014, 14, 6229–6246. [Google Scholar] [CrossRef] [PubMed]
  7. Patel, S.; Lorincz, K.; Hughes, R.; Huggins, N.; Growdon, J.; Standaert, D.; Akay, M.; Dy, J.; Welsh, M.; Bonato, P. Monitoring motor fluctuations in patients with Parkinson’s disease using wearable sensors. IEEE Trans. Inf. Technol. Biomed. 2009, 13, 864–873. [Google Scholar] [CrossRef]
  8. Hubble, R.P.; Naughton, G.A.; Silburn, P.A.; Cole, M.H. Wearable sensor use for assessing standing balance and walking stability in people with Parkinson’s disease: A systematic review. PLoS ONE 2015, 10, 1–22. [Google Scholar] [CrossRef]
  9. Healy, G.N.; Wijndaele, K.; Dunstan, D.W.; Shaw, J.E.; Salmon, J.; Zimmet, P.Z.; Owen, N. Objectively measured sedentary time, physical activity, and metabolic risk: The Australian diabetes, obesity and lifestyle study (AusDiab). Diabetes Care 2008, 31, 369–371. [Google Scholar] [CrossRef]
  10. Tudor-Locke, C.; Brashear, M.; Johnson, W.; Katzmarzyk, P. Accelerometer profiles of physical activity and inactivity in normal weight, overweight, and obese U.S. men and women. Int. J. Behav. Nutr. Phys. Act. 2010, 7, 60–70. [Google Scholar] [CrossRef]
  11. Westerterp, K. Physical activity assessment with accelerometers. Int. J. Obes. Relat. Metab. Disord. J. Int. Assoc. Study Obes. 1999, 23 (Suppl. 3), S45–S49. [Google Scholar] [CrossRef] [Green Version]
  12. Zhang, K.; Werner, P.; Sun, M.; Pi-Sunyer, F.X.; Boozer, C.N. Measurement of Human Daily Physical Activity. Obes. Res. 2003, 11, 33–40. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  13. Trost, S.G.; McIver, K.L.; Pate, R.R. Conducting accelerometer-based activity assessments in field-based research. Med. Sci. Sports Exerc. 2005, 37, S531–S543. [Google Scholar] [CrossRef] [PubMed]
  14. Atallah, L.; Lo, B.; King, R.; Yang, G.Z. Sensor Positioning for Activity Recognition Using Wearable Accelerometers. IEEE Trans. Biomed. Circuits Syst. 2011, 5, 320–329. [Google Scholar] [CrossRef] [PubMed]
  15. Ridgers, N.D.; Salmon, J.; Ridley, K.; O’Connell, E.; Arundell, L.; Timperio, A. Agreement between activPAL and ActiGraph for assessing children’s sedentary time. Int. J. Behav. Nutr. Phys. Act. 2012, 9, 15. [Google Scholar] [CrossRef] [PubMed]
  16. Steeves, J.A.; Bowles, H.R.; Mcclain, J.J.; Dodd, K.W.; Brychta, R.J.; Wang, J.; Chen, K.Y. Ability of thigh-worn ActiGraph and activPAL monitors to classify posture and motion. Med. Sci. Sports Exerc. 2015, 47, 952–959. [Google Scholar] [CrossRef] [PubMed]
  17. Yang, C.C.; Hsu, Y.L. A Review of Accelerometry-Based Wearable Motion Detectors for Physical Activity Monitoring. Sensors 2010, 10, 7772–7788. [Google Scholar] [CrossRef]
  18. Smith, A. Nearly Half of American Adults are Smartphone Owners. 2012. Available online: http://s4.goeshow.com/bricepac/pie/2013/PDF/Software_and_Apps,_in_the_City_and_on_the_Campus_3-13-13.pdf (accessed on 24 May 2017).
  19. Smith, A. Smartphone Ownership 2013. 2013. Available online: http://boletines.prisadigital.com/PIP_Smartphone_adoption_2013.pdf (accessed on 24 May 2017).
  20. Ting, D.H.; Lim, S.F.; Patanmacia, T.S.; Low, C.G.; Ker, G.C. Dependency on Smartphone and the Impact on Purchase Behaviour. Young Consum. 2011, 12, 193–203. [Google Scholar] [CrossRef]
  21. Sarwar, M.; Soomro, T.R. Impact of Smartphone’s on Society. Eur. J. Sci. Res. 2013, 98, 216–226. [Google Scholar]
  22. Fiordelli, M.; Diviani, N.; Schulz, J.P. Mapping mHealth Research: A Decade of Evolution. J. Med. Internet Res. 2013, 15, 1–14. [Google Scholar] [CrossRef]
  23. Neubeck, L.; Lowres, N.; Benjamin, E.J.; Freedman, S.B.; Coorey, G.; Redfern, J. The mobile revolution—Using smartphone apps to prevent cardiovascular disease. Nat. Rev. Cardiol. 2015, 12, 350–360. [Google Scholar] [CrossRef] [PubMed]
  24. Del Rosario, M.B.; Redmond, S.J.; Lovell, N.H. Tracking the Evolution of Smartphone Sensing for Monitoring Human Movement. Sensors 2015, 15, 18901–18933. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  25. Mellone, S.; Tacconi, C.; Schwickert, L.; Klenk, J.; Becker, C.; Chiari, L. Smartphone-based solutions for fall detection and prevention: The FARSEEING approach. Z. Gerontol. Geriatr. 2012, 45, 722–727. [Google Scholar] [CrossRef] [PubMed]
  26. Zhang, S.; McCullagh, P.; Nugent, C.; Zheng, H. Activity monitoring using a smart phone’s accelerometer with hierarchical classification. In Proceedings of the Sixth International Conference on Intelligent Environments (IE), Kuala Lumpur, Malaysia, 19–21 July 2010; pp. 158–163. [Google Scholar]
  27. Mitchell, E.; Monaghan, D.; Connor, N. Classification of sporting activities using smartphone accelerometers. Sensors 2013, 13, 5317–5337. [Google Scholar] [CrossRef] [PubMed]
  28. Marshall, J. Smartphone sensing for distributed swim stroke coaching and research. In Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing Adjunct Publication, 8–12 September 2013; ACM: New York, NY, USA, 2013. UbiComp ’13 Adjunct. pp. 1413–1416. [Google Scholar]
  29. Shoaib, M.; Bosch, S.; Incel, O.; Scholten, H.; Havinga, P. Fusion of smartphone motion sensors for physical activity recognition. Sensors 2014, 14, 10146–10176. [Google Scholar] [CrossRef] [PubMed]
  30. Antos, S.A.; Albert, M.V.; Kording, K.P. Hand, belt, pocket or bag: Practical activity tracking with mobile phones. J. Neurosci. Methods 2014, 231, 22–30. [Google Scholar] [CrossRef] [PubMed]
  31. Khan, A.M.; Tufail, A.; Khattak, A.M.; Laine, T.H. Activity recognition on smartphones via sensor-fusion and KDA-based SVMs. Int. J. Distrib. Sens. Netw. 2014, 2014, 1–14. [Google Scholar] [CrossRef]
  32. Lu, H.; Yang, J.; Liu, Z.; Lane, N.D.; Choudhury, T.; Campbell, A.T. The jigsaw continuous sensing engine for mobile phone applications. In Proceedings of the 8th ACM Conference on Embedded Networked Sensor Systems, SenSys ’10, Zurich, Switzerland, 3–5 November 2010; ACM: New York, NY, USA, 2010; pp. 71–84. [Google Scholar]
  33. Anjum, A.; Ilyas, M. Activity recognition using smartphone sensors. In Proceedings of the IEEE Consumer Communications and Networking Conference (CCNC), Las Vegas, NV, USA, 11–14 January 2013; pp. 914–919. [Google Scholar]
  34. Khan, A.M.; Lee, Y.K.; Lee, S.Y.; Kim, T.S. A triaxial accelerometer-based physical-activity recognition via augmented-signal features and a hierarchical recognizer. IEEE Trans. Inf. Technol. Biomed. 2010, 14, 1166–1172. [Google Scholar] [CrossRef]
  35. Khan, A.; Lee, Y.K.; Lee, S.; Kim, T.S. Human activity recognition via an accelerometer-enabled-smartphone using kernel discriminant analysis. In Proceedings of the 5th International Conference on Future Information Technology (FutureTech), Busan, Korea, 21–23 May 2010; pp. 1–6. [Google Scholar]
  36. Henpraserttae, A.; Thiemjarus, S.; Marukatat, S. Accurate activity recognition using a mobile phone regardless of device orientation and location. In Proceedings of the International Conference on Body Sensor Networks, Dallas, TX, USA, 23–25 May 2011; pp. 41–46. [Google Scholar]
  37. Yurtman, A.; Barshan, B. Activity Recognition Invariant to Sensor Orientation with Wearable Motion Sensors. Sensors 2017, 17, 1838. [Google Scholar] [CrossRef]
  38. Yurtman, A.; Barshan, B.; Fidan, B. Activity Recognition Invariant to Wearable Sensor Unit Orientation Using Differential Rotational Transformations Represented by Quaternions. Sensors 2018, 18, 2725. [Google Scholar] [CrossRef]
  39. Preece, S.J.; Goulermas, J.Y.; Kenney, L.P.J.; Howard, D.; Meijer, K.; Crompton, R. Activity identification using body-mounted sensors—A review of classification techniques. Physiol. Meas. 2009, 30, 1–33. [Google Scholar] [CrossRef] [PubMed]
  40. Figo, D.; Diniz, P.C.; Ferreira, D.R.; Cardoso, J.M.P. Preprocessing techniques for context recognition from accelerometer data. Pers. Ubiquitous Comput. 2010, 14, 645–662. [Google Scholar] [CrossRef]
  41. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  42. Ronao, C.A.; Cho, S.B. Human activity recognition with smartphone sensors using deep learning neural networks. Expert Syst. Appl. 2016, 59, 235–244. [Google Scholar] [CrossRef]
  43. Raví, D.; Wong, C.; Lo, B.; Yang, G. A Deep Learning Approach to on-Node Sensor Data Analytics for Mobile or Wearable Devices. IEEE J. Biomed. Health Inform. 2017, 21, 56–64. [Google Scholar] [CrossRef] [PubMed]
  44. Milenkoski, M.; Trivodaliev, K.; Kalajdziski, S.; Jovanov, M.; Stojkoska, B.R. Real time human activity recognition on smartphones using LSTM networks. In Proceedings of the International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 21–25 May 2018; pp. 1126–1131. [Google Scholar]
  45. Ordóñez, F.J.; Roggen, D. Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition. Sensors 2016, 16, 115. [Google Scholar] [CrossRef] [PubMed]
  46. Wang, J.; Chen, Y.; Hao, S.; Peng, X.; Hu, L. Deep learning for sensor-based activity recognition: A Survey. Pattern Recognit. Lett. 2018. [Google Scholar] [CrossRef]
  47. Grant, P.M.; Ryan, C.G.; Tigbe, W.W.; Granat, M.H. The validation of a novel activity monitor in the measurement of posture and motion during everyday activities. Br. J. Sports Med. 2006, 40, 992–997. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  48. Fisher, C.J. Using an Accelerometer for Inclination Sensing; Application Note: AN-1057; Analog Devices: Norwood, MA, USA, 2010; pp. 1–8. [Google Scholar]
  49. Reiff, C.; Marlatt, K.; Dengel, D.R. Difference in caloric expenditure in sitting versus standing desks. J. Phys. Act. Health 2012, 9, 1009–1011. [Google Scholar] [CrossRef]
  50. Mansoubi, M.; Pearson, N.; Clemes, S.A.; Biddle, S.J.; Bodicoat, D.H.; Tolfrey, K.; Edwardson, C.L.; Yates, T. Energy expenditure during common sitting and standing tasks: examining the 1.5 MET definition of sedentary behaviour. BMC Public Health 2015, 15, 516–523. [Google Scholar] [CrossRef]
  51. Petersen, C.B.; Bauman, A.; Tolstrup, J.S. Total sitting time and the risk of incident diabetes in Danish adults (the DANHES cohort) over 5 years: A prospective study. Br. J. Sports Med. 2016, 50, 1382–1387. [Google Scholar] [CrossRef] [PubMed]
  52. Åsvold, B.O.; Midthjell, K.; Krokstad, S.; Rangul, V.; Bauman, A. Prolonged sitting may increase diabetes risk in physically inactive individuals: An 11 year follow-up of the HUNT Study, Norway. Diabetologia 2017, 60, 830–835. [Google Scholar] [CrossRef] [PubMed]
  53. Wilmot, E.G.; Edwardson, C.L.; Achana, F.A.; Davies, M.J.; Gorely, T.; Gray, L.J.; Khunti, K.; Yates, T.; Biddle, S.J.H. Sedentary time in adults and the association with diabetes, cardiovascular disease and death: Systematic review and meta-analysis. Diabetologia 2012, 55, 2895–2905. [Google Scholar] [CrossRef] [PubMed]
  54. Tran, B.; Falster, M.O.; Douglas, K.; Blyth, F.; Jorm, L.R. Health Behaviours and Potentially Preventable Hospitalisation: A Prospective Study of Older Australian Adults. PLoS ONE 2014, 9, e93111. [Google Scholar] [CrossRef] [PubMed]
  55. Biswas, A.; Oh, P.I.; Faulkner, G.E.; Bajaj, R.R.; Silver, M.A.; Mitchell, M.S.; Alter, D.A. Sedentary time and its association with risk for disease incidence, mortality, and hospitalization in adults: A systematic review and meta-analysis. Ann. Intern. Med. 2015, 162, 123–132. [Google Scholar] [CrossRef]
  56. Del Rosario, M.B.; Wang, K.; Wang, J.; Liu, Y.; Brodie, M.; Delbaere, K.; Lovell, N.H.; Lord, S.R.; Redmond, S.J. A comparison of activity classification in younger and older cohorts using a smartphone. Physiol. Meas. 2014, 35, 2269–2286. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  57. Khan, A.; Siddiqi, M.; Lee, S.W. Exploratory data analysis of acceleration signals to select light-weight and accurate features for real-time activity recognition on smartphones. Sensors 2013, 13, 13099–13122. [Google Scholar] [CrossRef]
  58. Chou, J.C.K. Quaternion kinematic and dynamic differential equations. IEEE Trans. Robot. Autom. 1992, 8, 53–64. [Google Scholar] [CrossRef]
  59. Del Rosario, M.B.; Khamis, H.; Ngo, P.; Lovell, N.H.; Redmond, S.J. Computationally-Efficient Adaptive Error-State Kalman Filter for Attitude Estimation. IEEE Sens. J. 2018, 18, 9332–9342. [Google Scholar] [CrossRef]
  60. Welford, B.P. Note on a method for calculating corrected sums of squares and products. Technometrics 1962, 4, 419–420. [Google Scholar] [CrossRef]
  61. Antonsson, E.K.; Mann, R.W. The frequency content of gait. J. Biomech. 1985, 18, 39–47. [Google Scholar] [CrossRef]
  62. Jiménez, A.R.; Seco, F.; Prieto, J.C.; Guevara, J. Indoor pedestrian navigation using an INS/EKF framework for yaw drift reduction and a foot-mounted IMU. In Proceedings of the 2010 7th Workshop on Positioning Navigation and Communication, Dresden, Germany, 11–12 March 2010; pp. 135–143. [Google Scholar]
  63. Del Rosario, M.B.; Lovell, N.H.; Redmond, S.J. Quaternion-Based Complementary Filter for Attitude Determination of a Smartphone. IEEE Sens. J. 2016, 16, 6008–6017. [Google Scholar] [CrossRef]
  64. Madgwick, S.; Harrison, A.; Vaidyanathan, R. Estimation of IMU and MARG orientation using a gradient descent algorithm. In Proceedings of the IEEE International Conference on Rehabilitation Robotics, Zurich, Switzerland, 29 June–1 July 2011; pp. 1–7. [Google Scholar]
  65. Elvira, V.; Nazábal-Renteria, A.; Artés-Rodríguez, A. A novel feature extraction technique for human activity recognition. In Proceedings of the 2014 IEEE Workshop on Statistical Signal Processing (SSP), Gold Coast, VIC, Australia, 29 June–2 July 2014; pp. 177–180. [Google Scholar]
  66. Higgins, W.T. A comparison of complementary and Kalman filtering. IEEE Trans. Aerosp. Electron. Syst. 1975, AES-11, 321–325. [Google Scholar] [CrossRef]
  67. Sabatini, A.; Genovese, V. A sensor fusion method for tracking vertical velocity and height based on inertial and barometric altimeter measurements. Sensors 2014, 14, 13324–13347. [Google Scholar] [CrossRef]
  68. Bar-Shalom, Y.; Li, X.R.; Kirubarajan, T. Estimation with Applications to Tracking and Navigation; John Wiley & Sons, Inc.: New York, NY, USA, 2002. [Google Scholar]
  69. Guo, G. Pressure Altimetry Using the MPL3115A2; Application Note: AN4528; Freescale Semiconductor Ltd.: Austin, TX, USA, 2012; pp. 1–13. [Google Scholar]
  70. Sabatini, A.M.; Ligorio, G.; Mannini, A.; Genovese, V.; Pinna, L. Prior-to- and Post-Impact Fall Detection Using Inertial and Barometric Altimeter Measurements. IEEE Trans. Neural Syst. Rehabil. Eng. 2016, 24, 774–783. [Google Scholar] [CrossRef]
  71. Capela, N.A.; Lemaire, E.D.; Baddour, N. Feature selection for wearable smartphone-based human activity recognition with able bodied, elderly, and stroke patients. PLoS ONE 2015, 10, e0124414. [Google Scholar] [CrossRef] [PubMed]
  72. Capela, N.A.; Lemaire, E.D.; Baddour, N. Improving classification of sit, stand, and lie in a smartphone human activity recognition system. In Proceedings of the IEEE International Symposium on Medical Measurements and Applications, Turin, Italy, 7–9 May 2015; pp. 473–478. [Google Scholar]
  73. Breiman, L.; Friedman, J.; Stone, C.J.; Olshen, R.A. Classification and Regression Trees; The Wadsworth and Brooks-Cole Statistics-Probability Series; Taylor & Francis: Abingdon, UK, 1984. [Google Scholar]
  74. Esposito, F.; Malerba, D.; Semeraro, G.; Kay, J. A comparative analysis of methods for pruning decision trees. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 476–491. [Google Scholar] [CrossRef]
  75. Aggarwal, J.; Ryoo, M. Human Activity Analysis: A Review. ACM Comput. Surv. 2011, 43, 1–43. [Google Scholar] [CrossRef]
  76. Chawla, N.V. Overview. In Data Mining and Knowledge Discovery Handbook; Springer: Boston, MA, USA, 2010; pp. 875–886. [Google Scholar]
  77. Shannon, C.; Weaver, W. The Mathematical Theory of Communication; Number v. 1 in The Mathematical Theory of Communication; University of Illinois Press: Champaign, IL, USA, 1949. [Google Scholar]
  78. Li, F.; Shirahama, K.; Nisar, M.A.; Köping, L.; Grzegorzek, M. Comparison of Feature Learning Methods for Human Activity Recognition Using Wearable Sensors. Sensors 2018, 18, 679. [Google Scholar] [CrossRef]
  79. Bao, L.; Intille, S.S. Activity Recognition from User-Annotated Acceleration Data. In Pervasive Computing; Ferscha, A., Mattern, F., Eds.; Springer: Berlin/Heidelberg, Germany, 2004; pp. 1–17. [Google Scholar]
  80. Anguita, D.; Ghio, A.; Oneto, L.; Parra, X.; Reyes-Ortiz, J.L. Human activity recognition on smartphones using a multiclass hardware-friendly support vector machine. In Proceedings of the 4th International Conference on Ambient Assisted Living and Home Care, IWAAL’12, Vitoria-Gasteiz, Spain, 3–5 December 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 216–223. [Google Scholar]
  81. Gu, F.; Khoshelham, K.; Valaee, S.; Shang, J.; Zhang, R. Locomotion Activity Recognition Using Stacked Denoising Autoencoders. IEEE Internet Things J. 2018, 5, 2085–2093. [Google Scholar] [CrossRef]
  82. Gramkow, C. On averaging rotations. Int. J. Comput. Vis. 2001, 42, 7–16. [Google Scholar] [CrossRef]
  83. Dam, E.B.; Koch, M.; Lillholm, M. Quaternions, Interpolation and Animation; Technical Report; University of Copenhagen: Copenhagen, Denmark, 1998. [Google Scholar]
Figure 1. Six common ways that the inertial measurement unit (IMU) might be placed in the pants pocket (assuming a seated position). The cylinder in each panel represents the orientation of the participant’s right thigh whilst seated (with the knee to the right-hand side of each image). The dashed lines labeled x, y, and z denote the original device reference frame, 1 q, whilst the orthogonal basis defined by the vectors e 1, e 2, and e 3 illustrates the generated device orientation. In panels (a–d) the IMU is located on the anterior surface of the thigh (i.e., a pocket on the front of the pants), whilst in panels (e,f) the IMU is located on the lateral surface of the thigh (i.e., a pocket on the outer side of the pants).
Figure 2. Effect of removing the yaw from an arbitrary orientation (the orthogonal basis defined by the vectors e 1 (in blue), e 2 (in red), and e 3 (in green)) by aligning it with the x-axis of the standard basis x, y, z: (a) the orientation with an arbitrary yaw angle; (b) the same orientation with the yaw component removed (see Equations (2)–(5)). Note: the light blue vector is e 1,xy, the e 1 basis vector of the orientation projected onto the x–y plane; the pitch and roll angles are preserved.
Figure 3. Two alternative methods for measuring the tilt. The limitations of the traditional tilt angle variable (i.e., the angle between the red basis vector and the z-axis of the global frame of reference (GFR)) are clear, making it impossible to separate standing (stick figure in gold) and sedentary periods (stick figure in grey). (a) When the red basis vector runs along the length of the individual’s leg, the magnitude of the tilt angle changes by ≈π/2 radians (i.e., 90°) between standing and sedentary periods, making it easy to discriminate these postures; (b) when the red basis vector runs along the mediolateral axis of the individual’s leg, the magnitude of the tilt angle of the red vector remains relatively unchanged between standing and sedentary periods, resulting in confusion between the two postures.
Figure 4. Illustration of how the activity classes can be separated using (a) a hierarchical description of human activity. A schematic for achieving the separation using: (b) only the original features (i.e., features (1)–(4)); (c) only the new features (i.e., features (5)–(8)); and (d) the best features from all eight original and new features (i.e., features (3)–(6)) in Table 2. Each blue rectangle represents a classification and regression tree (CART) [73] implemented in MATLAB 2013b with ‘ClassificationTree.fit’. The CART algorithm used ‘uniform’ prior class probabilities to ensure that the thresholds selected accounted for any class imbalances.
Figure 5. (a) Normalized frequency histogram for the Δ P k feature and three activity classes (walk downstairs (in red), walking (in blue), and walk upstairs (in light blue), visualized as stacked bars); (b) an example of the decision tree used at each node of the HMHA, i.e., the blue rectangles in Figure 4. The thresholds x 1 and x 2 are derived from Figure 5a according to the classification and regression tree algorithm [73].
Figure 6. The normalized histograms of the active (i.e., the walking, walking upstairs, walking downstairs, and postural transition classes pooled together) and inactive (i.e., the standing and sedentary classes pooled together) classes for the features ω ¯ bpf , k 2 and g ω ¯ x y , k 2 . Panels (a,c) are generated from the training data (i.e., the pooled data from the younger and older cohorts, respectively). Panels (b,d) are generated from the test data (i.e., the pooled data from the younger and older cohort after they have been virtually re-oriented using the quaternions in Figure 1b–f). The bar charts in all panels are ‘stacked’. Note that the Shannon entropy for the quaternion-derived feature is both smaller and consistent, irrespective of the data it is calculated from, which suggests that it will be better at distinguishing the activity classes and is orientation invariant.
Figure 7. The normalized histograms of the sedentary and standing classes (illustrated in Figure 4) for the features Θ ¯ tilt , k and ϑ ¯ tilt , k . Panels (a,c) are the histograms obtained when the training data are used (i.e., the pooled data from the younger and older cohorts, respectively). Panels (b,d) are the histograms obtained when the test data are used (i.e., the pooled data from the younger and older cohort after it had been virtually re-oriented using each of the quaternions in Figure 1b–f). The bar charts in all panels are ‘stacked’. Note that the entropy for the quaternion-derived feature is consistently smaller, which suggests that it will be better at distinguishing between activity classes.
Figure 8. The normalized histograms of the walking (including walking up or down) and postural transition classes (illustrated in Figure 4) for the features a ¯ lpfdif , k 2 and Δ ϑ ¯ tilt , k . Panels (a,c) are the histograms obtained when the training data are used (i.e., the data from the younger and older cohorts, respectively). Panels (b,d) are the histograms obtained when the test data are used (i.e., the data from the younger and older cohort after it had been re-oriented with each of the quaternions in Figure 1b–f). The bar charts in all panels are ‘stacked’. Note that the Shannon entropy for the original feature is consistently lower than the quaternion-derived feature which suggests that it will be better at distinguishing the activity classes.
Figure 9. The normalized histograms of the walk, walking upstairs, and walking downstairs classes for the features Δ P k and v ¯ z , k . Panels (a,c) are the histograms obtained when the training data are used (i.e., the data from the younger and older cohorts, respectively). Panels (b,d) are the histograms obtained when the test data are used (i.e., the data from the younger and older cohort after it had been re-oriented with each of the quaternions in Figure 1b–f). The bar charts in all panels are ‘stacked’. Note how the Shannon entropy of each feature remains constant whether it is computed from the training or test data (which, remember, is the training data re-oriented), confirming that both features are invariant to the initial orientation, as expected.
Figure 10. The column titled ‘Original Features’ (i.e., panels (i–v)) corresponds to hierarchical models of human activity that were trained and tested with the features developed in our previous work [56] using the hierarchical model of human activity (HMHA) illustrated in Figure 4b; ‘New Features’ (i.e., panels (vi–x)) corresponds to models trained and tested with the features developed herein using the HMHA illustrated in Figure 4c; ‘Best Features’ (i.e., panels (xi–xv)) corresponds to models trained and tested with a combination of the original and new features using the HMHA illustrated in Figure 4d; ‘Best Features (40 Hz)’ (i.e., panels (xvi–xx)) is equivalent to ‘Best Features’ with the IMU data re-sampled to 40 Hz and the barometer data to 20 Hz.
Figure 11. Class sensitivity for a hierarchical model of human activity recognition using the Best Features: (i–v) trained with the IMU data at 100 Hz and tested with the IMU data at 100 Hz after it has been re-oriented; (vi–x) trained with the IMU data at 100 Hz and tested with the IMU data at 40 Hz after it has been re-oriented; (xi–xv) trained with the IMU data at 40 Hz and tested with the IMU data at 100 Hz after it has been re-oriented.
Table 1. Tuning parameters of the computationally-efficient adaptive error-state Kalman filter.
Sampling Rate | σ_G | c_a | c_m | N_short | N_long | ξ_a | N_m | ξ_xy
f s = 100 Hz0.010.10.99749175
f s = 40 Hz0.010.10.99319135
Units: σ_G in rad/s; N_short, N_long, and N_m in samples; ξ_a in m·s⁻²; ξ_xy in normalized units/s.
Table 2. Features extracted from the accelerometer, gyroscope, and barometric altimeter.
No. | Feature | Description
(1) | $\bar{\omega}^{2}_{\mathrm{bpf},k} = \frac{1}{N}\sum_{j=i}^{k}\left(\omega_{\mathrm{bpf},x}^{2}+\omega_{\mathrm{bpf},y}^{2}+\omega_{\mathrm{bpf},z}^{2}\right)_{j}$ | average squared band-pass-filtered angular velocity
(2) | $\bar{\Theta}_{\mathrm{tilt},k} = \frac{1}{N}\sum_{j=i}^{k}\cos^{-1}\left(a_{\mathrm{lpf},y}\big/\sqrt{a_{\mathrm{lpf},x}^{2}+a_{\mathrm{lpf},y}^{2}+a_{\mathrm{lpf},z}^{2}}\right)_{j}$ | average inclination angle
(3) | $\bar{a}^{2}_{\mathrm{lpfdif},k} = \frac{1}{N}\sum_{j=i}^{k}\left(a_{\mathrm{lpfdif},x}^{2}+a_{\mathrm{lpfdif},y}^{2}+a_{\mathrm{lpfdif},z}^{2}\right)_{j}$ | average squared band-pass-filtered acceleration
(4) | $\Delta P_{k} = \frac{1}{N}\sum_{j=i}^{k}\Delta p_{j}$ | average differential pressure
(5) | ${}^{g}\bar{\omega}^{2}_{xy,k} = \frac{1}{N}\sum_{j=i}^{k}\left({}^{g}\omega_{x}^{2}+{}^{g}\omega_{y}^{2}\right)_{j}$ | average squared pitch/roll angular velocity
(6) | $\bar{\vartheta}_{\mathrm{tilt},k} = \frac{1}{N}\sum_{j=i}^{k}\vartheta_{\mathrm{tilt},j}$ | average of the shortest rotation between the upright and average orientations
(7) | $\Delta\bar{\vartheta}_{\mathrm{tilt},k} = \left|\bar{\vartheta}_{\mathrm{tilt},k}-\bar{\vartheta}_{\mathrm{tilt},k-N}\right|$ | change in the shortest rotation between the upright and average orientations
(8) | $\bar{v}_{z,k} = \frac{1}{N}\sum_{j=i}^{k}\dot{z}_{j}$ | average velocity in the vertical direction of the estimated GFR
Note: i = k − N + 1; N is the number of samples in the 2.5 s analysis window (⌊2.5 × f_IMU⌋). The variables ω_bpf, a_lpfdif, a_lpf, and p correspond to the filtered signals described in [56]; the tilt angle follows [48].
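As a small worked example of the windowing convention in the note above (i = k − N + 1, N = ⌊2.5 × f_IMU⌋), the sketch below computes features (6) and (7) for one window from a series of per-sample shortest-rotation angles (which would come from the Appendix B calculation); the names and the 100 Hz rate are assumptions.

```python
import numpy as np

F_IMU = 100                          # assumed IMU sampling rate (Hz)
N = int(np.floor(2.5 * F_IMU))       # samples per 2.5 s analysis window

def tilt_feature(theta_tilt, k):
    """Feature (6): mean of the per-sample shortest-rotation angles
    theta_tilt[j] over the window j = k - N + 1 ... k."""
    i = k - N + 1
    return float(np.mean(theta_tilt[i:k + 1]))

def tilt_change_feature(theta_tilt, k):
    """Feature (7): absolute change in feature (6) between the current
    window and the window one full window-length earlier."""
    return abs(tilt_feature(theta_tilt, k) - tilt_feature(theta_tilt, k - N))
```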
Table 3. Ninety-five percent confidence intervals for the Cohen’s Kappa and total class sensitivity when hierarchical models for human activity recognition were developed with different features.
Cohen’s Kappa (κ, 95% confidence interval)
Train | Test | Original Features | New Features | Best Features | Best Features (40 Hz)
Y | Y | [0.820, 0.837] | [0.606, 0.629] | [0.827, 0.844] | [0.824, 0.841]
O | O | [0.685, 0.697] | [0.478, 0.491] | [0.721, 0.733] | [0.730, 0.741]
Y | O | [0.675, 0.688] | [0.468, 0.481] | [0.782, 0.793] | [0.779, 0.790]
O | Y | [0.760, 0.780] | [0.590, 0.613] | [0.765, 0.784] | [0.761, 0.781]
Y&O | (Y&O) | [0.702, 0.713] | [0.544, 0.556] | [0.732, 0.742] | [0.778, 0.787]
Total Class Sensitivity (ϱ, 95% confidence interval, %)
Train | Test | Original Features | New Features | Best Features | Best Features (40 Hz)
Y | Y | [86.7, 88.0] | [69.0, 70.8] | [87.2, 88.5] | [87.0, 88.3]
O | O | [77.6, 78.5] | [57.3, 58.4] | [79.9, 80.7] | [80.6, 81.5]
Y | O | [77.7, 78.6] | [56.3, 57.3] | [84.9, 85.6] | [84.8, 85.5]
O | Y | [82.0, 83.5] | [67.5, 69.3] | [82.3, 83.7] | [82.0, 83.5]
Y&O | (Y&O) | [78.4, 79.2] | [63.9, 64.8] | [80.3, 81.1] | [84.1, 84.8]
Test data were obtained by re-orienting the data from the younger (Y) and/or older (O) cohort (see Section 3.1); for the Best Features (40 Hz) column, IMU data were re-sampled at 40 Hz and barometer data at 20 Hz.
Table 4. Ninety-five percent confidence intervals for the Cohen’s Kappa and total class sensitivity (%) when a hierarchical model of human activity (HMHA) was developed with the Best Features at different sampling rates.
Train | Test | Cohen’s Kappa (κ, 95% CI): train 100 Hz/test 100 Hz, train 100 Hz/test 40 Hz, train 40 Hz/test 100 Hz | Total Class Sensitivity (ϱ, 95% CI, %): train 100 Hz/test 100 Hz, train 100 Hz/test 40 Hz, train 40 Hz/test 100 Hz
Y | Y | [0.827, 0.844] [0.819, 0.837] [0.823, 0.841] | [87.2, 88.5] [86.6, 87.9] [87.0, 88.3]
O | O | [0.721, 0.733] [0.720, 0.732] [0.728, 0.740] | [79.9, 80.7] [79.8, 80.6] [80.5, 81.3]
Y | O | [0.782, 0.793] [0.786, 0.796] [0.777, 0.788] | [84.9, 85.6] [85.2, 85.9] [84.6, 85.4]
O | Y | [0.765, 0.784] [0.752, 0.772] [0.769, 0.789] | [82.3, 83.7] [81.3, 82.8] [82.6, 84.1]
Y&O | (Y&O) | [0.732, 0.742] [0.726, 0.736] [0.776, 0.786] | [80.3, 81.1] [79.9, 80.6] [84.0, 84.7]
Test data were obtained by re-orienting the data from the younger (Y) and/or older (O) cohort (see Section 3.1); for the 40 Hz conditions, IMU data were re-sampled at 40 Hz and barometer data at 20 Hz.
Table 5. Comparison of thresholds when HMHA were developed with the Best Features at different sampling rates.
Training Data | Feature (units) | Threshold (100 Hz) | Threshold (40 Hz) | Rule *
Y | ${}^{g}\bar{\omega}^{2}_{xy,k}$ (rad²·s⁻²) | 0.232 | 0.195 | Inactive if ${}^{g}\bar{\omega}^{2}_{xy,k}$ ≤ threshold, else Active
O |  | 0.249 | 0.237 |
Y&O |  | 0.230 | 0.202 |
Y | $\bar{\vartheta}_{\mathrm{tilt},k}$ (radians) | 0.668 | 0.689 | Standing if $\bar{\vartheta}_{\mathrm{tilt},k}$ ≤ threshold, else Sedentary (i.e., sitting or lying)
O |  | 0.616 | 0.617 |
Y&O |  | 0.640 | 0.617 |
Y | $\bar{a}^{2}_{\mathrm{lpfdif},k}$ (m²·s⁻⁴) | 75.6 | 93.4 | Any Walking § if $\bar{a}^{2}_{\mathrm{lpfdif},k}$ ≤ threshold, else Postural Transition
O |  | 36.4 | 32.8 |
Y&O |  | 46.7 | 46.4 |
Y | $\Delta P_{k}$ (hPa·s⁻¹) | −0.107 | −0.101 | Walking Downstairs if $\Delta P_{k}$ ≤ threshold, else Walking
O |  | −0.067 | −0.068 |
Y&O |  | −0.062 | −0.094 |
Y | $\Delta P_{k}$ (hPa·s⁻¹) | 0.128 | 0.142 | Walking if $\Delta P_{k}$ ≤ threshold, else Walking Upstairs
O |  | 0.092 | 0.105 |
Y&O |  | 0.092 | 0.119 |
Inactive — any of the standing or sedentary (sitting/lying) classes; active — any of the walking, walking upstairs, walking downstairs, or postural transition classes; any walking § — either of the walking, walking upstairs, or walking downstairs classes. * Each rule corresponds to a node of the HMHA illustrated in Figure 4d.
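To show how these thresholds act as a cascade of rules (one per node of the HMHA in Figure 4d), the sketch below classifies a single analysis window using the Y&O, 100 Hz column of Table 5; the dictionary keys and the function name are hypothetical.

```python
# Thresholds from the Y&O, 100 Hz column of Table 5 (units as in the table)
THRESHOLDS = {
    'gw_xy2':    0.230,   # rad^2 s^-2, inactive vs. active
    'tilt':      0.640,   # rad,        standing vs. sedentary
    'a_lpfdif2': 46.7,    # m^2 s^-4,   any walking vs. postural transition
    'dP_down':  -0.062,   # hPa s^-1,   walking downstairs vs. walking
    'dP_up':     0.092,   # hPa s^-1,   walking vs. walking upstairs
}

def classify_window(f, thr=THRESHOLDS):
    """Apply the Table 5 rule cascade to one window of features.

    f is a dict with the windowed features from Table 2: 'gw_xy2'
    (feature 5), 'tilt' (feature 6), 'a_lpfdif2' (feature 3), 'dP' (feature 4).
    """
    if f['gw_xy2'] <= thr['gw_xy2']:                       # inactive branch
        return 'standing' if f['tilt'] <= thr['tilt'] else 'sedentary'
    if f['a_lpfdif2'] > thr['a_lpfdif2']:                  # active branch
        return 'postural transition'
    if f['dP'] <= thr['dP_down']:
        return 'walking downstairs'
    return 'walking' if f['dP'] <= thr['dP_up'] else 'walking upstairs'
```

Swapping in the 40 Hz column, or a cohort-specific column, changes only the threshold values, not the structure of the cascade.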
