Pervasive Lying Posture Tracking

Automated lying-posture tracking is important in preventing bed-related disorders, such as pressure injuries, sleep apnea, and lower-back pain. Prior research studied in-bed lying posture tracking using sensors of different modalities (e.g., accelerometer and pressure sensors). However, there remain significant gaps in research regarding how to design efficient in-bed lying posture tracking systems. These gaps can be articulated through several research questions, as follows. First, can we design a single-sensor, pervasive, and inexpensive system that can accurately detect lying postures? Second, what computational models are most effective in the accurate detection of lying postures? Finally, what physical configuration of the sensor system is most effective for lying posture tracking? To answer these important research questions, in this article we propose a comprehensive approach for designing a sensor system that uses a single accelerometer along with machine learning algorithms for in-bed lying posture classification. We design two categories of machine learning algorithms based on deep learning and traditional classification with handcrafted features to detect lying postures. We also investigate what wearing sites are the most effective in the accurate detection of lying postures. We extensively evaluate the performance of the proposed algorithms on nine different body locations and four human lying postures using two datasets. Our results show that a system with a single accelerometer can be used with either deep learning or traditional classifiers to accurately detect lying postures. The best models in our approach achieve an F1 score that ranges from 95.2% to 97.8% with a coefficient of variation from 0.03 to 0.05. The results also identify the thighs and chest as the most salient body sites for lying posture tracking. 
Our findings in this article suggest that, because accelerometers are ubiquitous and inexpensive sensors, they can be a viable source of information for pervasive monitoring of in-bed postures.


Introduction
Keeping track of in-bed lying postures and the transitions between them provides useful clinical information about patients' mobility [1,2], risk of developing hospital-acquired pressure injuries [3], hidden death in epilepsy [4], and quality of sleep [5]. In-bed postures and activities of hospital patients are usually monitored manually through visual observation, which is a labor- and cost-intensive task. Therefore, researchers have proposed to monitor in-bed postures continuously and non-obtrusively using wearable sensors.
Automatic in-bed lying posture tracking systems have been developed using data collected from sensors of different modalities, such as accelerometers [6,7,8], load cell sensors [9], pressure sensors [10,11], infrared cameras [12], electrocardiogram waveforms [13], and multi-modal systems [14,15,16]. Pressure mats and load sensor systems impose a high cost on end-users and often require calibration. Camera-based systems usually raise setup and privacy issues for end-users, and their data are more difficult to analyze than data from wearable sensors [17]. Other works have used multiple wearable sensors on different body locations for continuous lying-posture detection, which causes discomfort to end-users and impedes long-term monitoring. To address these issues, we develop a traditional machine learning (ML) model for lying posture detection using a single accelerometer; the model is an ensemble of decision tree classifiers with hand-engineered time-domain features.
Deep learning (DL) has emerged as the leading approach in computer vision, voice recognition, and natural language processing in recent years. Deep neural networks are known to learn high-level features for a specific problem domain, which makes DL models suitable for human posture estimation. Moreover, DL models tend to reduce the overhead of feature engineering compared to traditional machine learning models [18]. To date, no studies have explored the possibility of applying deep neural networks to acceleration-based lying posture tracking. We develop a deep learning model for lying posture detection using a single accelerometer to investigate the possibility of replacing traditional feature-based machine learning models with deep neural networks and, therefore, reducing the burden of feature engineering. Our deep learning model, the adaptive long short-term memory network (AdaLSTM), is a long short-term memory (LSTM) network that uses an adaptive learning rate method with a decaying learning rate schedule.
More specific contributions of this paper are as follows. We (1) investigate the efficacy of a single accelerometer for lying-posture tracking using feature-engineered machine learning models and deep LSTM networks; (2) identify the set of optimal time-domain features for accurate lying posture detection; (3) compare traditional machine learning with deep learning in the recognition of lying postures; and (4) evaluate nine different body sites to determine the most appropriate site to attach the accelerometer for accurate lying posture tracking. The main findings of our extensive experimental evaluation are as follows. We identified the amplitude, mean, minimum, and maximum values of the lateral and vertical axes of the accelerometer as the optimal set of time-domain features for accurate lying posture tracking. The proposed ensemble tree classifier achieved 97.1% F1-Score and 0.14 CoV when applied to the data from the sensor on the right thigh, and AdaLSTM obtained 95.3% F1-Score and 0.15 CoV when applied to the data from the chest sensor. These results identify the thighs and the chest as the optimal locations of the accelerometer for accurate lying posture tracking.

Background & Related Studies
In this section, we discuss previous studies on lying posture tracking using wearable accelerometers. We divide these studies by the number of wearable sensors into (1) multi-sensor and (2) single-sensor lying posture tracking.

Multi-Sensor Lying Posture Tracking
Kwasnicki et al. [8] proposed a lightweight sensing platform for monitoring sleep quality and posture using three wearable accelerometers placed on both arms and the chest. They applied K-nearest neighbor, naive Bayes, and decision tree classifiers to the mean and variance of each axis of the signal from all three accelerometers. Their models achieved 99.5% average accuracy in detecting the four major lying postures (i.e., supine, prone, and the two lateral postures). In another study, Fallmann et al. proposed a lying posture detection algorithm using three accelerometers on the chest and the legs. Their algorithm first classified the time windows into stable and non-stable windows using the acceleration moving-variance method, and then classified the features into the prone, supine, and lateral postures. Their model achieved an average accuracy of 83.6% [19]. Moreover, Wrzus et al. [6] developed a 99.7% accurate classification model that uses chest and thigh accelerometry data and the angular orientation of the upper body along the vertical axis to classify lying postures.
The above-mentioned algorithms, which rely on data from multiple sensors attached to different locations on the user's body, cause discomfort and limit usability, especially for long-term monitoring during sleep. Moreover, sensor rotation during sleep might alter the angular axes of the sensors relative to each other and therefore decrease the accuracy of orientation-based lying posture tracking. In this paper, we propose lying posture detection algorithms that use data from only a single accelerometer, which can be placed on one of nine different body locations, including the chest, thighs, ankles, arms, and wrists. Models that use a single accelerometer are more comfortable for end-users and are therefore favored over multi-sensor models.

Single-sensor Lying Posture Tracking
Razjouyan et al. [20] developed a lying posture detection algorithm based on a single accelerometer on the user's chest. They used a logistic regression model on 43 time-domain features extracted from the magnitude of the tri-axial accelerometer signal. The proposed model achieved 87.8% accuracy in detecting the supine, prone, and lateral lying postures for 21 users. In another study, Zhang et al. [7] assessed the possibility of using a single accelerometer on the chest to detect the lying posture during sleep. They applied a linear discriminant analysis (LDA) classifier to the mean value of each axis of the acceleration signal and achieved an overall accuracy of 99% in classifying lying postures (supine, prone, and laterals). However, the authors did not assess the effect of sensor location on the accuracy of lying posture tracking. In another study, Chang et al. developed a system that captured information about sleep events using a smartwatch. Their system distinguished the supine, prone, and lateral sleep postures with an average precision of 96%. Their algorithm detected sleeping postures by combining the positions of both hands and classifying features with a template-based Euclidean distance matching approach [21]. However, the performance of such a model is highly dependent on the quality of the pre-defined hand positions and sleep posture templates. Furthermore, possible sensor rotations during sleep might affect the accuracy of hand position recognition and, therefore, of lying posture detection. Moreover, Jeng et al. [22] proposed a sleep position detection algorithm that achieved 90% accuracy in detecting the supine, prone, and lateral postures using data collected from an accelerometer on the user's wrist. Their model applied a support vector machine classifier with a linear kernel and a random forest of 100 trees to the mean value of the signal.

Methodologies
Figure 1 shows the overall architecture of the proposed ML and DL lying posture tracking. The overall training process includes two steps: data preparation and model development.

Data Preparation
We define an episode of data as a sequence of signals collected from one subject while performing a run of a specific posture (e.g., lying supine). The raw accelerometer signal and lying posture labels (e.g., supine, prone, and lateral) are fed into the data processing unit. The processing unit normalizes the signal to remove possible subject-based variations and segments it into different lying posture episodes based on the labels.
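The data preparation step can be sketched as follows. The text does not state which normalization is used, so the per-subject, per-axis z-score below is an assumption, and the function names `zscore_per_axis` and `segment_episodes` are hypothetical.

```python
from itertools import groupby
from statistics import mean, pstdev

def zscore_per_axis(samples):
    """Scale each axis of one subject's recording to zero mean and unit variance.

    Assumption: z-scoring stands in for the unspecified normalization step.
    """
    axes = list(zip(*samples))                            # one tuple per axis
    stats = [(mean(a), pstdev(a) or 1.0) for a in axes]   # avoid dividing by zero
    return [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in samples]

def segment_episodes(samples, labels):
    """Split a labelled recording into (label, samples) runs of a single posture."""
    episodes, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        episodes.append((label, samples[start:start + length]))
        start += length
    return episodes

# Toy tri-axial recording: two supine samples followed by two left-side samples.
signal = [(0.0, 0.1, 9.8), (0.2, 0.1, 9.6), (-9.7, 0.0, 0.3), (-9.9, 0.2, 0.1)]
labels = ["supine", "supine", "left", "left"]
episodes = segment_episodes(zscore_per_axis(signal), labels)
```

Each element of `episodes` pairs a posture label with the normalized samples of one uninterrupted run of that posture.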

Traditional Lying Posture Tracking
The proposed traditional lying posture tracking consists of two main steps: (1) feature preparation, in which an exhaustive set of features is extracted from each input episode; and (2) ensemble model learning, which trains an ensemble of 100 decision trees on the features and lying posture labels.

Feature Preparation
We extract 48 time-domain features from a sliding window over each episode of data (lying supine, prone, and on the left side) with a 50% overlap. We set the window size to the minimum episode length in the training dataset (e.g., 96 samples equal to 3.8 seconds). The selected features, shown in Table 1, such as the amplitude, mean, standard deviation, and angle of the signal, have proven useful in human posture and activity recognition applications [23,24]. We compute a set of meta-features $f_i = \{f_i^1, f_i^2, \ldots, f_i^{48}\}$ for the $i$-th episode by averaging each extracted feature over the windows of that episode.
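A minimal sketch of the windowing and averaging described above is given below. The five per-axis features are only an illustrative subset of the 48 features in Table 1; defining amplitude as maximum minus minimum is an assumption, and the function names are hypothetical.

```python
from statistics import mean, pstdev

def sliding_windows(episode, size, overlap=0.5):
    """Yield windows of `size` samples; consecutive windows share `overlap` of them."""
    step = max(1, int(size * (1 - overlap)))
    for start in range(0, len(episode) - size + 1, step):
        yield episode[start:start + size]

def window_features(window):
    """Per-axis mean, standard deviation, minimum, maximum, and amplitude."""
    feats = []
    for axis in zip(*window):
        feats += [mean(axis), pstdev(axis), min(axis), max(axis), max(axis) - min(axis)]
    return feats

def episode_meta_features(episode, size):
    """Average every feature over all windows of one episode."""
    rows = [window_features(w) for w in sliding_windows(episode, size)]
    return [mean(col) for col in zip(*rows)]

# Toy episode of 8 tri-axial samples; a window of 4 samples gives 3 windows.
episode = [(0.1 * t, 0.0, 9.8) for t in range(8)]
meta = episode_meta_features(episode, size=4)
```

Averaging over windows yields one fixed-length feature vector per episode regardless of the episode duration.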

Ensemble Model Learning
A total of 48 features are obtained from the previous step. We train an ensemble of 100 decision trees on the features and lying posture labels. We choose bagging as the ensemble method to reduce the variance of the decision tree and overfitting to the existing data. A decision tree is selected as the weak learner because of the high dimension of the input features. As shown in Figure 1, we fit 100 decision trees on 100 random subsets of the original dataset, each with a randomly selected subset of features, to minimize the correlation between individual trees. The final prediction is the majority vote over the decisions of the individual trees [25].
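The bagging procedure can be illustrated with the sketch below. To stay short, it uses depth-1 decision trees (stumps) as weak learners instead of full decision trees, so it is a simplified stand-in and not the classifier evaluated in this paper. The number of features per tree, the seed, and the toy data are assumptions.

```python
import random
from collections import Counter

def fit_stump(X, y, feature_ids):
    """Find the best single-threshold split over the given features."""
    best = None
    for f in feature_ids:
        for thr in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    if best is None:                       # no valid split: predict the majority label
        maj = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), maj, maj)
    return best[1:]

def fit_bagging(X, y, n_trees=100, n_features=2, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]              # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)      # random feature subset
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return trees

def predict(trees, row):
    """Majority vote over the individual stump decisions."""
    votes = [l if row[f] <= thr else r for f, thr, l, r in trees]
    return Counter(votes).most_common(1)[0][0]

# Toy meta-features (hypothetical z-axis mean, minimum, maximum) for two postures.
X = [(9.1, 8.7, 9.5), (9.3, 8.9, 9.6), (9.0, 8.5, 9.4), (9.2, 8.8, 9.7),
     (-7.9, -8.3, -7.4), (-7.7, -8.1, -7.2), (-8.0, -8.4, -7.5), (-7.8, -8.2, -7.3)]
y = ["supine"] * 4 + ["prone"] * 4
trees = fit_bagging(X, y)
```

Because every stump sees a different bootstrap sample and feature subset, individual errors are less correlated, which is what lets the vote reduce variance.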

Deep Lying Posture Tracking
Recurrent Neural Networks (RNNs) are a type of deep learner that is well-suited to modeling sequential data. However, RNNs fail to learn long-term dependencies in the data due to the problem of vanishing/exploding gradients. The Long Short-Term Memory (LSTM) network was introduced to address this issue and capture long-term dependencies in sequential time series data [26]. LSTM networks have shown promising results on time series classification tasks [27,28]. LSTM captures long-distance dependencies in sequential data through the integration of memory cells and RNNs [29]. Bidirectional long short-term memory (bi-LSTM) networks were introduced as an extension to LSTM networks. The bi-LSTM architecture consists of two LSTMs that train in two directions; therefore, it is capable of extracting long-term data dependencies in both the forward and backward directions [30]. Equation 1 and Equation 2 show the mathematical formulation of a bidirectional LSTM with $L$ hidden layers. At time-step $t$, each unit in layer $(i)$ receives a set of parameters from the previous time-step (previous state $\overrightarrow{h}^{(i)}_{t-1}$) and two sets of parameters from the previous layer (states of the previous layer, $\overrightarrow{h}^{(i-1)}_{t}$ and $\overleftarrow{h}^{(i-1)}_{t}$): one input from the left-to-right RNN and one from the right-to-left RNN. The input to each unit at level $(i)$ is the output of the RNN at layer $(i-1)$ at the same time-step $t$. The output $y$ at each time-step $t$, in Equation 3, is the result of propagating the input parameters through all hidden layers.
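For reference, a commonly used formulation of a deep bidirectional recurrent network that matches this description is sketched below. The weight matrices $W^{(i)}$, $V^{(i)}$, and $U$, the biases $b^{(i)}$ and $c$, and the functions $f$ and $g$ are notation assumed here for illustration; in an LSTM, $f$ stands for the gated cell update rather than a plain activation function.

```latex
\begin{align}
\overrightarrow{h}^{(i)}_{t} &= f\!\left(\overrightarrow{W}^{(i)} h^{(i-1)}_{t}
  + \overrightarrow{V}^{(i)} \overrightarrow{h}^{(i)}_{t-1} + \overrightarrow{b}^{(i)}\right) \\
\overleftarrow{h}^{(i)}_{t} &= f\!\left(\overleftarrow{W}^{(i)} h^{(i-1)}_{t}
  + \overleftarrow{V}^{(i)} \overleftarrow{h}^{(i)}_{t+1} + \overleftarrow{b}^{(i)}\right) \\
y_{t} &= g\!\left(U \left[\overrightarrow{h}^{(L)}_{t} ; \overleftarrow{h}^{(L)}_{t}\right] + c\right)
\end{align}
```

Here $h^{(i-1)}_{t} = [\overrightarrow{h}^{(i-1)}_{t} ; \overleftarrow{h}^{(i-1)}_{t}]$ is the concatenation of the two directional states of the layer below, and $h^{(0)}_{t}$ is the input sample at time-step $t$.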
In the next section, we define lying posture prediction from the sequences of raw sensor data as an optimization problem, then design a deep learning architecture using Bi-LSTM networks as the solution.
Problem: We have $N$ sequences of variable lengths, where each sequence $X_i$ is assigned a label $Y_i = (y_{i1}, y_{i2}, \ldots, y_{iK})$ using max-likelihood classification, and $y_{ij}$ is the likelihood of the $j$-th class for the $i$-th sequence. Given these, the problem is to estimate a set of labels $\hat{Y} = \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_N\}$ such that the difference between the actual and estimated label sets is minimized. We compute this difference as the cross-entropy between the estimated labels $\hat{Y}$ and the actual labels $Y$, summed over the sequences.
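Written out with the symbols defined in this section, and assuming the standard multi-class cross-entropy without normalization, this sum takes the form

```latex
E = -\sum_{i=1}^{N} \sum_{j=1}^{K} y_{ij} \log \hat{y}_{ij}
```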
where $N$ is the number of samples and $K$ is the number of classes (e.g., three lying postures). Symbols $y_{ij}$ and $\hat{y}_{ij}$ are the actual and estimated likelihood of the $j$-th class for the $i$-th sequence, respectively. Equation 4 is not an accurate representation of the error because the sequences may have different lengths. Therefore, given a set of scalars $M = \{m_1, m_2, \ldots, m_N\}$, where $m_i$ is the length of the sequence $X_i$, Equation 4 can be reformulated as below.
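One form consistent with this description is sketched below. Weighting each sequence's term directly by its length $m_i$ is an assumption, since the text states only that the sum is weighted by sequence length.

```latex
E_{M} = -\sum_{i=1}^{N} m_{i} \sum_{j=1}^{K} y_{ij} \log \hat{y}_{ij}
```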
where the error is a weighted sum (e.g., weighted by the length of the sequences) of the cross-entropy between the actual and estimated labels. The objective of the sequence classification model is to minimize Equation 5.
Deep Learning Architecture: To solve this problem, we design an Adaptive LSTM (AdaLSTM), an LSTM network that uses an adaptive learning rate method with a decaying learning rate schedule. AdaLSTM receives the sequences of raw accelerometer data as input and estimates one label for each sequence. As shown in Figure 1, the input episodes/sequences of raw accelerometer data are fed to a Bi-LSTM layer with ten hidden units. The training process of the Bi-LSTM includes back-propagation in two directions with the objective of minimizing the error. Three fully connected layers multiply the output of the Bi-LSTM layer (e.g., a sequence of tri-axial accelerometer data) by the weight matrix and add the bias vector. The output of the fully connected layers is fed to a softmax layer, which is a multi-class generalization of the logistic sigmoid function.
We compute the cross-entropy loss for multi-class classification based on the likelihood of the softmax function. We set the maximum number of epochs to 100. We set the initial learning rate and the decay rate of the squared-gradient moving average to 0.01 and 0.99, respectively. To reduce the amount of padding in the mini-batches and make the training more suitable for CPU, we chose the mini-batches to be short sequences of size 27. The Adam optimizer [31] is used for training the neural network through backpropagation.
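The optimizer settings above can be illustrated on a one-parameter toy problem. The initial learning rate (0.01) and the squared-gradient decay rate (0.99) come from the text; the first-moment decay `beta1`, the constant `eps`, and the exponential form of the learning rate decay are assumptions, because the exact schedule is not specified.

```python
import math

def adam_minimize(grad, theta, steps=100, lr0=0.01, beta1=0.9, beta2=0.99,
                  eps=1e-8, decay=0.99):
    """Run Adam updates on a scalar parameter with a decaying learning rate."""
    m = v = 0.0
    for t in range(1, steps + 1):
        lr = lr0 * decay ** (t - 1)            # assumed exponential decay schedule
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g        # moving average of the gradient
        v = beta2 * v + (1 - beta2) * g * g    # moving average of the squared gradient
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta

# Toy objective (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta = adam_minimize(lambda th: 2 * (th - 3.0), theta=0.0)
```

Since Adam normalizes the step by the gradient magnitude, each step is at most roughly the current learning rate, so the decay schedule directly bounds how far the parameters can move late in training.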

Datasets & Preprocessing
We perform training and validation of the models on two publicly available datasets: (1) the Class-Act dataset [32], which contains three major lying postures (supine, prone, and left side) from 22 users; and (2) the integration of the Class-Act and Daily & Sport Activities (DAS) [33] datasets, which contains four major lying postures (supine, prone, left side, and right side) from 8 users.

Class-Act: Datasets from a Human Posture/Activity classification
Class-Act is a human posture and activity classification dataset from 22 healthy participants (7 females and 15 males, ages between 20 and 36) [32]. During the activities, the participants wore nine accelerometers sampled at 30 Hz on nine different body locations: the chest, left and right thighs, left and right ankles, left and right arms, and left and right wrists, as shown in Figure 2a. Class-Act was collected in a controlled manner based on three pre-defined protocols with different combinations of activities or postures. The activities were walking, sitting, standing, lying supine, lying prone, lying on the left side, kneeling, and crawling. Figure 2b shows the prevalence of the extracted episodes for the lying postures only. The durations of the episodes for lying supine, prone, and on the left side were 12.2±3.6, 11.9±3.6, and 12.10±3.3 seconds, respectively.

Daily and Sports Activities Dataset (DAS)
The DAS dataset contains data from eight subjects (four females and four males, ages between 20 and 30) who performed 19 activities of daily living for five minutes each [33]. The participants wore five inertial sensor units embedding a tri-axial accelerometer on the chest, right and left wrists, and right and left thighs. The sensors were calibrated to acquire data at a sampling frequency of 25 Hz. We only used the lying supine and lying on the right side episodes.

Integrated Dataset
We combined the lying posture episodes from the Class-Act and DAS datasets for the common sensor locations (i.e., chest, right and left wrists, and right and left thighs). We segmented each episode in the DAS dataset into 15 episodes of 500 samples (i.e., 20 seconds). Since the DAS dataset only contained episodes of lying on the back (i.e., supine) and lying on the right side, we under-sampled the DAS dataset prior to combining the datasets to maintain class balance. The final dataset contains 75 episodes of supine (i.e., 26.3%), 57 episodes of prone (i.e., 20.0%), 73 episodes of left side (i.e., 25.6%), and 80 episodes of right side (i.e., 28.1%) for the chest, right and left wrists, and right and left thighs.
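The segmentation arithmetic and the class proportions above can be checked with a short sketch. The helper `split_fixed` is hypothetical, and the recording is a placeholder rather than real sensor data.

```python
def split_fixed(recording, length):
    """Cut a recording into consecutive, non-overlapping episodes of `length` samples."""
    return [recording[i:i + length]
            for i in range(0, len(recording) - length + 1, length)]

SAMPLING_HZ = 25                                   # DAS sampling frequency
recording = list(range(5 * 60 * SAMPLING_HZ))      # placeholder for one five-minute activity
episodes = split_fixed(recording, 500)             # 500 samples = 20 s at 25 Hz

# Episode counts of the integrated dataset and their shares in percent.
counts = {"supine": 75, "prone": 57, "left side": 73, "right side": 80}
total = sum(counts.values())
shares = {k: round(100 * v / total, 1) for k, v in counts.items()}
```

A five-minute recording at 25 Hz holds 7,500 samples, which gives exactly 15 episodes of 500 samples, and the 285 episodes split into the percentages reported above.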

Comparison Metrics and Implementation Details
We validated the proposed models using 10-fold cross-validation and leave-one-subject-out (LOSO) validation to minimize overfitting to a specific subject or a specific pattern of performing a lying posture. We report accuracy, F1-Score, and balanced accuracy as the evaluation metrics of the proposed models. We perform a coefficient of variation (CoV) analysis [34] to compare the proposed algorithms against the state-of-the-art.
Accuracy is defined as the average effectiveness of the classifier over all the class labels.
where $l$ is the number of class labels. F1-score is defined as the relation between the data's positive labels and the labels predicted by the classifier.
where $l$ is the number of class labels. Balanced accuracy is defined as the average of the true positives and true negatives for each class label.
where $l$ refers to the number of classes in the classification task. For each class $C_i$ (i.e., three lying posture classes), $P_i$ is the number of samples with a positive label (i.e., $C_i$), $N_i$ is the number of samples with a negative label (i.e., not $C_i$), $TP_i$ refers to the samples that are correctly classified as $C_i$, $FP_i$ are the samples that are falsely classified as $C_i$, $TN_i$ are the samples that are correctly not classified as $C_i$, and $FN_i$ are the samples that are falsely not classified as $C_i$ [35,36].
where precision refers to the average agreement of the actual class labels and classifier-predicted labels, and recall is the average effectiveness of the classifier to identify each class label.Precision and recall are computed by the following equations.
We compute the coefficient of variation (CoV) [34] for each model over different body locations as $\mathrm{CoV} = \sigma / \mu$, where $\sigma$ and $\mu$ are, respectively, the standard deviation and the average of the F1-Score in lying posture detection over different folds.
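The metrics described in this section can be computed from a confusion matrix as sketched below. The formulas are assumptions based on the verbal definitions: per-class accuracy, precision, and recall are macro-averaged over the $l$ classes, F1 is the harmonic mean of the macro-averaged precision and recall, balanced accuracy averages $\frac{1}{2}(TP_i/P_i + TN_i/N_i)$, and CoV uses the population standard deviation.

```python
from statistics import mean, pstdev

def per_class_counts(cm):
    """Return (TP, FP, FN, TN) per class; rows are actual and columns predicted labels."""
    total = sum(map(sum, cm))
    out = []
    for i in range(len(cm)):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp
        fp = sum(row[i] for row in cm) - tp
        out.append((tp, fp, fn, total - tp - fp - fn))
    return out

def metrics(cm):
    """Macro-averaged metrics; assumes every class is predicted at least once."""
    c = per_class_counts(cm)
    precision = mean(tp / (tp + fp) for tp, fp, fn, tn in c)
    recall = mean(tp / (tp + fn) for tp, fp, fn, tn in c)
    return {
        "accuracy": mean((tp + tn) / (tp + fp + fn + tn) for tp, fp, fn, tn in c),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "balanced_accuracy": mean(0.5 * (tp / (tp + fn) + tn / (tn + fp))
                                  for tp, fp, fn, tn in c),
    }

def cov(scores):
    """Coefficient of variation: standard deviation divided by the mean."""
    return pstdev(scores) / mean(scores)

# Hypothetical 3-class confusion matrix (supine, prone, left side).
cm = [[9, 1, 0],
      [1, 8, 1],
      [0, 0, 10]]
result = metrics(cm)
fold_cov = cov([0.9, 1.0, 0.8])   # hypothetical per-fold F1-Scores
```

A lower `fold_cov` means the F1-Score varies less relative to its mean, which is how CoV is read in the comparisons that follow.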

Results
In this section, we extensively evaluate the performance of the ensemble tree and AdaLSTM classifiers independently and against each other. We report the validation metrics, including accuracy, F1-score, and balanced accuracy, for both classification models. We further perform a coefficient of variation (CoV) analysis to compare the performance of the models against the state-of-the-art.

Raw data Inspection
Before our main analysis, we investigate the variations in the pattern of the raw accelerometer data across different subjects and lying postures in the Class-Act dataset. We visually inspect the data by computing the mean and standard deviation of the acceleration over all the episodes collected from different subjects while maintaining each lying posture. These patterns reveal two issues with the Class-Act dataset: (1) the accelerometer data captured from one of the subjects was not converted to physical acceleration units and was stored in raw analog format; and (2) the data collectors labeled multiple episodes of different lying postures as the same posture for one of the subjects. Figure 3 visualizes the changes in 3-axis acceleration for each lying posture in the Class-Act dataset after resolving these issues (i.e., correcting the sensor output and the labels). The solid line shows the mean, and the shaded area shows the standard deviation over the episodes of a specific lying posture. The results show that the y-axis (vertical) always reports values near 0 m/s². In contrast, the x-axis (lateral) shows a mean value near -8.0 m/s² for lying on the left side, while it almost always reports a mean near 0 m/s² for the other two postures. Moreover, the z-axis (horizontal) reports mean accelerations around +9.2 m/s² and -7.8 m/s² for the supine and prone postures, respectively, but 0 m/s² for the lateral posture. Therefore, the lateral axis (x-axis) appears to be sensitive to lying on the side, and the horizontal axis (z-axis) appears to be sensitive to the supine and prone postures. These numbers are expected: while the user is lying on one side, the x-axis of the accelerometer on the chest is parallel to gravity and therefore reports values around the gravitational acceleration (approximately ±10 m/s², i.e., ±1 g). The same occurs when the user lies on the back (supine) or front (prone), except that the z-axis becomes parallel to gravity and reports values near ±10 m/s². Depending on the direction of the body (supine, prone, and lateral lying postures), these values can be negative or positive. Consequently, we expect these two axes to be more informative for classification than the y-axis.
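The relation between gravity and posture suggests a simple rule of thumb, sketched below for illustration only; it is not one of the classifiers evaluated in this paper. The sketch assumes chest-sensor readings in m/s² (gravity is about 9.8), a hypothetical threshold of 5.0, and a positive x-axis mean for the right side by mirror symmetry with the left side.

```python
from statistics import mean

def posture_from_gravity(episode, threshold=5.0):
    """Guess the lying posture from the per-axis means of a chest-worn accelerometer."""
    mx, my, mz = (mean(axis) for axis in zip(*episode))
    # The axis aligned with gravity carries most of the roughly 9.8 m/s^2 reading.
    if abs(mz) >= abs(mx) and abs(mz) > threshold:
        return "supine" if mz > 0 else "prone"
    if abs(mx) > threshold:
        return "left side" if mx < 0 else "right side"   # right-side sign is assumed
    return "unknown"

supine_guess = posture_from_gravity([(0.1, 0.0, 9.2)] * 3)
left_guess = posture_from_gravity([(-8.0, 0.1, 0.2)] * 3)
prone_guess = posture_from_gravity([(0.0, 0.0, -7.8)] * 3)
```

Such a rule depends on a fixed sensor orientation, which is one reason learned models are preferable when the sensor can rotate or is worn on a limb.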

Traditional Machine Learning
In this section, we validate the feature-based ensemble tree classifier in detecting three major lying postures (supine, prone, and left side) using the Class-Act dataset including 12 subjects and nine sensor locations.

Feature Engineering
Figure 4 shows the feature importance for lying posture tracking as determined by the ensemble tree classifier trained on the data from the chest, left thigh, and wrist sensors. Feature importance is one of the outputs of the ensemble tree classification. The y-axis in this figure is the feature importance estimated by the ensemble tree, computed by summing the changes in the mean squared error due to splits on each feature and dividing the sum by the number of branch nodes in the tree [37]. The x-axis shows features 1 to 48, as listed in Table 1. Based on the results, features 4, 7, 10, and 13 form the set with the highest importance. Table 1 shows that these features are the median, mean, maximum, and minimum of the x-axis (the lateral axis in Figure 3) of the tri-axial accelerometer. Features 6, 9, 12, and 15 form the second most important set; based on Table 1, these are the median, mean, maximum, and minimum of the z-axis of the tri-axial accelerometer signal. These results match the observations in Figure 3 on the sensitivity of the x-axis to lying on the left side and of the z-axis to the supine and prone postures.

Lying Posture Detection
Table 2 reports the average and standard deviation of the accuracy, balanced accuracy, and F1-score of lying posture detection using the ensemble tree classifier. Overall, body locations that are less susceptible to nocturnal movement during sleep, such as the chest and the left thigh, demonstrate high performance in lying posture detection, while sensor locations such as the arms and the wrists perform poorly. In particular, the ensemble tree classifiers trained on the data collected from the chest, thighs, or ankles achieve 89.8%-96.2% accuracy, 89.8%-96.2% balanced accuracy, and 89.8%-96.2% F1-Score. These values drop when the sensor is placed on the upper body parts such as the arms and the wrists (78.6%-89.5% accuracy, 62.9%-84.0% balanced accuracy, and 64.1%-81.6% F1-Score).
The performance decline in the upper body parts originates from inter-subject variations in the placement of the arms and the wrists and from their nocturnal movements, such as bending and rotating, while the same lying posture is maintained. The trend in the standard deviation of accuracy, balanced accuracy, and F1-Score across different subjects is also in concordance with the hypothesis that more within-subject variation (e.g., 9.2% to 26.7% standard deviation in accuracy, balanced accuracy, and F1-Score) is observed when the sensor is worn on the wrists and the arms compared to the chest, thighs, and ankles (e.g., 6.2% to 15.6% standard deviation in accuracy, balanced accuracy, and F1-Score). Moreover, Figure 5 visualizes the confusion matrices of lying posture detection for the ensemble tree classifier trained on the Class-Act dataset. The confusion matrices for the thigh, chest, and ankle data show more promising results than those for the arms and wrists. In particular, the classifiers trained on the left thigh, right thigh, and chest misclassify 5.5%, 8.1%, 5.5%, and 7.6% of the lying episodes. The misclassification rate increases to 15.2% and 15.7% for the left ankle and left arm locations, and to 39.1%, 31.9%, and 28.93% for the right ankle, right arm, and right wrist locations.

Deep Sequence Learning
In this section, we evaluate the performance of the AdaLSTM classifier in detecting three major lying postures. Table 3 shows the mean and standard deviation of the average accuracy, balanced accuracy, and F1-score of the model on the Class-Act dataset, including nine sensor locations and 12 subjects. AdaLSTM achieves 89.8%-96.2% average accuracy, 89.8%-96.2% average balanced accuracy, and 89.8%-96.2% average F1-Score when applied to the data collected from the sensor worn on the chest, thighs, or ankles. However, the performance drops to 78.6%-89.5% average accuracy, 67.1%-84.0% average balanced accuracy, and 64.1%-81.6% average F1-Score when the sensor is on the arms or wrists. The within-subject standard deviation in the accuracy, balanced accuracy, and F1-Score is higher when the sensor is placed on the arms and the wrists (11.7%-23.8%) compared to the thighs and chest (6.9%-13.7%). These results are consistent with the findings of Skarpsno et al., who showed in 2,107 subjects that the duration of nocturnal movements during sleep was higher in the arms and upper back than in the thighs [38]. AdaLSTM is the most accurate when applied to the data collected from the sensor on the right thigh (96.2% ± 8.1 accuracy, 96.2% ± 8.1 balanced accuracy, and 96.2% ± 8.1 F1-score). The model on the left wrist achieves the lowest performance, with 78.6% ± 12.5% accuracy, 67.1% ± 19.1% balanced accuracy, and 64.1% ± 21.7% F1-Score.
Figure 6 shows the confusion matrices of the AdaLSTM classifier trained on the data from the thighs, ankles, arms, and wrists using the Class-Act dataset. The models trained on the chest, left thigh, right thigh, left ankle, and right ankle confuse 2.5%, 1.5%, 6.1%, 3.0%, and 8.1% of the lying episodes, respectively. However, the share of misclassified episodes increases to 33.5%, 19.8%, 30.9%, and 29.4% for the left arm, right arm, left wrist, and right wrist classifiers. We note that the higher misclassification rate for the left arm sensor than for the right arm sensor is due to the confusion of the left side and prone postures.

Deep Learning vs. Traditional Machine Learning
In this section, we investigate the possibility of replacing feature-based machine learning models with deep recurrent neural networks (RNNs). Figure 7 compares the mean and CoV of the F1-Score and accuracy metrics between the AdaLSTM and ensemble tree classifiers. These classifiers are evaluated using the Class-Act dataset, including 12 subjects and nine sensor locations on the body. As shown in Figure 7a, AdaLSTM achieves 2%-10% higher accuracy and 3%-9% higher F1-Score than the ensemble tree classifier when applied to the data collected from the sensor on the chest, the thighs, or the ankles. The gap between the performance of the two classifiers increases to 3%-15% in accuracy when tested on the data collected from the arms or the wrists. Since CoV represents the ratio of variation to the mean of a metric, lower CoV values indicate a more promising classification performance. As shown in Figure 7b, AdaLSTM achieves lower CoV values over all the sensor locations; therefore, it generalizes better to cross-subject variations than the ensemble tree classifier. This gap between the performance of the models demonstrates that deep RNNs are more capable of capturing higher-level patterns in noisy data with high variance across subjects, such as the data collected from the sensor on the wrists or the arms.
In addition, we performed the Kruskal-Wallis test between the CoV values of the AdaLSTM and ensemble tree classifiers to identify any significant difference between the medians of the two groups. The test on the CoV of the F1-Score and accuracy yields p-values of 0.100 and 0.006, respectively. These results could not reject the null hypothesis and therefore show no significant difference between the performance of the two classifiers. Note that both 0.100 and 0.007 are marginally bigger than α = 0.005, which suggests collecting more samples.
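For two groups, the Kruskal-Wallis statistic and its p-value can be computed as in the sketch below. It omits the tie correction and uses the chi-square approximation with one degree of freedom, which is an assumption about how the test was run; the CoV values are made up for illustration.

```python
import math

def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def kruskal_two_groups(a, b):
    """Return the H statistic (no tie correction) and its chi-square p-value (1 df)."""
    ranks = average_ranks(list(a) + list(b))
    n = len(ranks)
    ra, rb = sum(ranks[:len(a)]), sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (ra ** 2 / len(a) + rb ** 2 / len(b)) - 3 * (n + 1)
    h = max(h, 0.0)                        # guard against tiny negative rounding errors
    p = math.erfc(math.sqrt(h / 2))        # chi-square survival function with 1 df
    return h, p

# Hypothetical CoV values for two classifiers over three sensor locations each.
h, p = kruskal_two_groups([0.06, 0.08, 0.10], [0.15, 0.20, 0.25])
```

The null hypothesis of equal medians is rejected only when `p` falls below the chosen significance level α.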

Comparison with the State-of-the-Art
We compare the performance of the proposed models against the state-of-the-art in lying posture detection using a single accelerometer. The proposed and competing models are described as follows.
• ET is the proposed feature-based classifier, which is an ensemble of decision trees trained on 48 time-domain features.
• AdaLSTM is the proposed deep learning model, which is an adaptive long short-term memory network with Adam optimizer and decaying learning rate.
• LDA, proposed by Zhang et al., is a linear discriminant analysis (LDA) classifier trained on the mean value of the signal [7].
• SVM, proposed by Jeng et al., is a multi-class linear kernel support vector machine classifier trained on the mean value of the tri-axial accelerometer signal [22].
• LSTM is a long short-term memory network with the same structure as the AdaLSTM but with a fixed learning rate of 0.01.
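The feature-based models listed above operate on summary statistics of each accelerometer window: the linear baselines use the per-axis mean, and the paper names amplitude, mean, minimum, and maximum as the most useful time-domain features. The sketch below shows how such features might be computed for one window. It is an illustration under assumptions: amplitude is taken to be the max-minus-min range, the function names are hypothetical, and only four of the 48 features are covered because the full list is not given here.

```python
from statistics import mean


def axis_features(samples):
    """Compute four time-domain features for one axis of a window."""
    lo, hi = min(samples), max(samples)
    return {
        "mean": mean(samples),
        "min": lo,
        "max": hi,
        "amplitude": hi - lo,  # assumption: amplitude = peak-to-peak range
    }


def window_features(window):
    """Flatten per-axis features of a window of (x, y, z) samples."""
    features = {}
    # zip(*window) regroups the samples into one sequence per axis.
    for axis_name, axis_samples in zip("xyz", zip(*window)):
        for key, value in axis_features(axis_samples).items():
            features[f"{axis_name}_{key}"] = value
    return features


# Hypothetical window of tri-axial readings in units of g (illustrative only).
example_window = [(0.02, 0.98, -0.10), (0.04, 0.97, -0.12), (0.03, 0.99, -0.09)]
print(window_features(example_window))
```

The mean-only baselines would keep just the three `*_mean` entries, whereas a tree ensemble can draw on all of them.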

Class-Act Dataset
Table 4 compares the mean and CoV of the F1-Score of the proposed models, AdaLSTM and ET, against the state-of-the-art deep learning and feature-based models on the Class-Act dataset. The Class-Act dataset contains data from three major lying postures (supine, prone, and left side) and nine different sensor locations. Since CoV is the ratio of variation to mean for a metric, a lower F1-Score CoV value represents a more promising model. The linear feature-based classifiers, LDA and SVM, obtain >88.3% F1-Score and <0.26 CoV when applied to data from the thighs, ankles, and chest; however, their performance drops significantly to 55.0%-82.2% F1-Score and 0.22-0.36 CoV when the sensor is moved to the arms or wrists. The competing deep learning model, LSTM, maintains 83.7%-92.5% mean F1-Score for the thigh, ankle, and chest locations, and 51.6%-75.5% mean F1-Score for the arm and wrist locations. AdaLSTM outperforms the competing deep learning and feature-based classifiers at all body locations except the right thigh and right wrist. It achieves 91.5%-98.2% mean F1-Score and 0.06-0.17 CoV for the thigh, ankle, and chest locations. It shows the most promising result when applied to the sensor on the left thigh, with 98.2% F1-Score and 0.06 CoV. These results demonstrate the power of deep learning and the salience of the left thigh in detecting lying postures for a new subject. We note that neural networks with simpler structures, such as the LSTM with a fixed learning rate in this paper, could not automatically extract useful features and patterns from the limited raw data; therefore, choosing proper parameters for deep learning models is a crucial factor in their performance. Figures 5 and 6 show the confusion matrices of the ET and AdaLSTM classifiers for the sensor on the thighs and the wrists. As shown, both classifiers mislabel a few of the episodes when applied to data from the sensor on the thighs.

Integrated Dataset
Table 5 shows the mean F1-Score and CoV values of lying posture detection on the integrated dataset, which covers four major lying postures (supine, prone, left side, and right side) and five sensor locations (thighs, wrists, and chest). The results are leave-one-subject-out validated because this is a more realistic validation scenario for human lying posture tracking.
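Leave-one-subject-out validation, as used above, holds out every sample of one subject per fold so that the model is always tested on a person it has not seen. The following minimal sketch generates such folds from a list of per-sample subject identifiers; the function name and the identifiers are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids holds one identifier per sample, in sample order.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical subject label for each of five samples (illustrative only).
sample_subjects = ["s1", "s1", "s2", "s3", "s3"]

for subject, train_idx, test_idx in leave_one_subject_out(sample_subjects):
    print(subject, "train:", train_idx, "test:", test_idx)
```

Per-fold F1-Scores collected this way are the values over which the mean and CoV are then computed.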
As shown in Table 5, the ET and AdaLSTM classifiers achieve a promising range of F1-Score (i.e., 63.3% for the right wrist to 97.0% for the chest) across all the body locations. The ET classifier obtains the highest mean F1-Score when the sensor is placed on the right thigh (i.e., 97.3%) and right wrist (i.e., 78.6%) locations, while AdaLSTM achieves the highest F1-Score values for the chest (i.e., 95.3%) and the right wrist (i.e., 74.2%) among all the algorithms. The linear classifiers, such as LDA [7] and SVM [22], achieve higher F1-Score (i.e., in the range 90.4%-97.2%) than the proposed models when applied to data collected from the sensor on the thighs and the chest. The linear relationship between the lying posture and the accelerometer readings explains the superiority of the state-of-the-art for these locations. On the other hand, extra movements of the hands during lying introduce noise and non-linearity into the data collected by the sensors placed on the wrists; therefore, the F1-Score values of the linear classifiers drop to 33.5% and 44.6% for the left and right wrists.
Moreover, the proposed models show a lower F1-Score CoV (variation-to-mean ratio) than the state-of-the-art techniques. The ensemble tree classifier achieves CoV of 0.14 and 0.15 for the right thigh and the chest, respectively, and AdaLSTM achieves CoV of 0.39 and 0.44 for the left and the right wrist, respectively, whereas the CoV of the linear models, LDA and SVM, increases to the range 0.53-0.80 for the left and right wrist locations.

Discussion
We compared the accuracy of lying posture tracking at nine different body locations in this study. When the ensemble tree classifier is trained on data collected from the sensors on the chest and thighs, lying posture tracking achieves the highest performance and the least cross-subject variation, while the wrist and arm classifiers show the lowest performance and the highest within-subject variation. These results suggest that individuals might make arbitrary and dissimilar hand movements during the same lying posture. Figure 8 compares the confusion matrices of the ensemble tree classifiers trained on data from the chest, thighs, and wrists from the integrated dataset. The chest, left thigh, and right thigh classifiers confuse 3.7%, 6.6%, and 2.1% of the lying episodes, respectively, while this ratio increases to 55.1% for the left wrist and 61.4% for the right wrist sensor. The confusion between the supine and prone postures in the wrist sensors is caused by wrist rotations while lying. In particular, the right side posture is mainly confused with prone when the sensor is on the left wrist and with supine when the sensor is on the right wrist. Moreover, the majority of the confusions between the right side and supine postures occur when the sensor is on the right wrist, and those between the left side and prone postures occur when the sensor is on the left wrist. These results are mainly due to the similar sensor position during the postures that are confused with each other. For example, the left wrist holds similar positions when the user lies on the back (i.e., supine) and lies on the left side, depending on the rotation of the wrist. Also, the right wrist might adopt the same position when the user lies on the front (i.e., prone) and lies on the right side.
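The share of confused episodes and the dominant confusion pair discussed above can both be read off a confusion matrix. The sketch below shows one way to do so; it assumes rows are true postures and columns are predicted postures, and the matrix entries are hypothetical counts rather than the values behind Figure 8.

```python
def misclassification_rate(confusion):
    """Return the fraction of episodes that fall off the diagonal."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


def most_confused_pair(confusion, labels):
    """Return (true_label, predicted_label) of the largest off-diagonal cell."""
    best = None
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if i != j and (best is None or count > best[0]):
                best = (count, labels[i], labels[j])
    return best[1], best[2]


# Hypothetical counts; rows are true postures, columns are predictions.
postures = ["supine", "prone", "left side", "right side"]
wrist_confusion = [
    [40, 8, 1, 1],
    [9, 38, 2, 1],
    [1, 2, 45, 2],
    [1, 1, 3, 45],
]

print(misclassification_rate(wrist_confusion))
print(most_confused_pair(wrist_confusion, postures))
```

Comparing these two summaries across sensor sites gives a compact view of where and between which postures a classifier fails.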
We further investigated the possibility of replacing traditional machine learning with deep learning. Our study showed that deep RNNs such as LSTM can replace traditional machine learning classifiers as long as they are adequately designed. Figure 9 shows the confusion matrices for lying posture detection using AdaLSTM on data collected from the sensor on the chest, the thighs, and the wrists. These results follow a trend similar to that of the ensemble tree classifier: 3.1%, 4.5%, and 2.1% of the lying episodes are misclassified when the sensor is worn on the chest, the left thigh, and the right thigh, respectively, while the misclassification rate increases to 20.3% and 12.2% for the left wrist and the right wrist classifiers, respectively. AdaLSTM thus confuses 34.8 and 49.2 percentage points fewer lying episodes than the ensemble tree classifier (55.1% and 61.4%) when the sensor is placed on the left wrist and the right wrist, respectively. These results show the ability of deep RNNs to capture non-linear relations in the data through non-linear operations at a higher level of abstraction. In addition, deep RNNs such as AdaLSTM do not require feature engineering. One major drawback of deep learning is the difficulty of interpreting the features extracted in the deeper layers of the network. Moreover, these models are computationally expensive and require large datasets for training to achieve promising results [29].
The fact that end-to-end deep neural networks could not improve the performance significantly compared to the feature-based classifiers points to the lack of sufficient data as a limitation of this study [18]. We believe adding more data to the training dataset will further improve the performance of AdaLSTM, especially for data from the sensors on the wrists and the arms. We plan to address this issue in two directions: (1) conduct an extensive multi-modality data collection from a large number of participants performing different lying postures, including the main postures and their variations; (2) produce signal-/feature-level synthetic data using data augmentation techniques such as rotation, permutation, time-warping, scaling, magnitude-warping, and jittering [39], sequence-to-sequence learning techniques [40], and generative adversarial networks [41].
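Three of the augmentation techniques named above (jittering, scaling, and rotation) can be sketched for a window of tri-axial samples as follows. This is an illustration under assumptions, not the planned implementation: the noise levels, the choice of the z axis as rotation axis, and all function names are hypothetical.

```python
import math
import random


def jitter(window, sigma, rng):
    """Add independent Gaussian noise to every axis of every sample."""
    return [tuple(v + rng.gauss(0.0, sigma) for v in sample) for sample in window]


def scale(window, sigma, rng):
    """Multiply the whole window by one random factor drawn around 1.0."""
    factor = rng.gauss(1.0, sigma)
    return [tuple(v * factor for v in sample) for sample in window]


def rotate_about_z(window, angle_rad):
    """Rotate every (x, y, z) sample about the z axis by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in window]


# Hypothetical window of readings in units of g (illustrative only).
rng = random.Random(0)  # seeded so the sketch is reproducible
window = [(0.02, 0.98, -0.10), (0.04, 0.97, -0.12), (0.03, 0.99, -0.09)]

augmented = [
    jitter(window, sigma=0.01, rng=rng),
    scale(window, sigma=0.05, rng=rng),
    rotate_about_z(window, math.radians(5.0)),
]
print(len(augmented), "synthetic windows generated")
```

Because posture is inferred from sensor orientation, rotation angles would presumably need to stay small enough that the augmented window still corresponds to the original posture label.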

Conclusion
We implemented a traditional machine learning classifier, an ensemble tree with time-domain features, and a deep recurrent neural network, AdaLSTM, with a decaying learning rate to detect four major lying postures (supine, prone, left side, and right side) using a single tri-axial accelerometer sensor. We identified the amplitude, mean, minimum, and maximum values of the lateral and vertical axes as the optimal set of time-domain features for accurate lying posture tracking using a single accelerometer sensor. We determined the optimal wearing sites of a single accelerometer sensor to accurately detect lying postures. Finally, we evaluated the performance of the proposed models against deep learning and feature-based state-of-the-art lying posture tracking methods using two publicly available human posture and activity datasets. The proposed AdaLSTM using data from the left thigh and AdaLSTM on the chest achieved the highest F1-Scores (98.5% for the left thigh and 97.8% for the chest) and the lowest coefficients of variation (0.07 for the left thigh and 0.03 for the chest) compared to the other models and sensor locations for the Class-Act dataset. The proposed ensemble tree classifier achieved 97.1% F1-Score and 0.14 CoV when applied to data from the sensor on the right thigh, and AdaLSTM obtained 95.3% F1-Score and 0.15 CoV when applied to data from the chest sensor, on the integration of the Class-Act and DAS datasets including 20 subjects. These results demonstrated that the thighs and the chest are the optimal locations for the accelerometer sensor for accurate lying posture tracking.

Figure 1: The process of training feature-based ensemble trees and Deep LSTM classifiers.

Figure 2: (a) Visualization of accelerometer sensor positioning, and (b) activity prevalence for Class-Act dataset

Figure 3: Mean and standard deviation of the magnitude of the accelerometer sensor data for different lying postures over all the subjects.

Figure 4: Importance of the extracted features from sensor data for lying posture tracking.

Figure 5: Confusion matrix of the ensemble tree classifier in classifying lying postures into supine, prone, and left side for the thighs, ankles, chest, arms, and wrists locations.

Figure 6: Confusion matrix of AdaLSTM classifier in classifying lying postures into supine, prone, and left side for the thighs, ankles, arms, and wrists locations.

Figure 7: Comparison between the mean and CoV of F1-Score (%) of the ensemble tree and AdaLSTM classification models for nine body locations on the Class-Act dataset using LOSO validation.

Figure 8: Confusion matrix of ensemble tree classifier in classifying lying postures into supine, prone, and left side for the thighs, ankles, arms, and wrists locations.

Figure 9: Confusion matrix of AdaLSTM classifier in classifying lying postures into supine, prone, and left side for the thighs, ankles, arms, and wrists locations.

Table 2: Performance (%) of the ensemble tree classification in lying-posture detection for nine different body locations on the Class-Act dataset.

Table 3: Performance (%) of the sequence classification using AdaLSTM in lying-posture detection for nine different body locations on the Class-Act dataset.

Table 4: Comparison between the mean value and coefficient of variation for the F1-Score of detecting lying postures for different sensor placements and classifiers, including Ensemble Trees (ET), Linear Discriminant Analysis (LDA), LSTM with fixed learning rate (LSTM), and Adaptive LSTM (AdaLSTM). The highest F1-Score value and the lowest CoV that the models achieve for each location under LOSO validation are shown in bold.

Table 5: Comparison between the mean value and coefficient of variation for the F1-Score of detecting lying postures for different sensor placements and classifiers, including Ensemble Trees (ET), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), LSTM with fixed learning rate (LSTM), and Adaptive LSTM (AdaLSTM), for leave-one-subject-out validation.