Upper Body Posture Recognition Using Inertial Sensors and Recurrent Neural Networks

Featured Application: In this study, a wearable system that can recognize human posture was developed. By using a long short-term memory-based recurrent neural network (LSTM-RNN) architecture, this system was able to classify posture from data measured with an inertial measurement unit (IMU). Our results can serve as a reference for future development of wearable systems that correct human posture and mitigate risks of spinal deformity.

Abstract: Inadequate sitting posture can cause imbalanced loading on the spine and result in abnormal spinal pressure, which serves as the main risk factor contributing to irreversible and chronic spinal deformity. Therefore, sitting posture recognition is important for understanding people's sitting behaviors and for correcting inadequate postures. Recently, wearable devices embedded with microelectromechanical systems (MEMS) sensors, such as inertial measurement units (IMUs), have received increased attention in human activity recognition. In this study, a wearable device embedded with IMUs and a machine learning algorithm were developed to classify seven static sitting postures: upright, slump, lean, right and left bending, and right and left twisting. Four 9-axis IMUs were uniformly distributed between the thoracic and lumbar regions (T1-L5) and aligned on the sagittal plane to acquire kinematic information about subjects' backs during static-dynamic alternating motions. Time-domain features served as inputs to a signal-based classification model that was developed using a long short-term memory-based recurrent neural network (LSTM-RNN) architecture, and the model's classification performance was used to evaluate the relevance between sensor signals and sitting postures. Overall results from performance evaluation tests indicate that this IMU-based measurement and LSTM-RNN structural scheme is appropriate for sitting posture recognition.


Introduction
The spine is an important bony structure that provides support to the human body while also providing other main functions, such as protecting the spinal cord and nerve roots, resisting external forces, and enabling the performance of all human body movements. Among the five spine segments, the thoracic spine consists of 12 vertebrae (T1 to T12) and is the largest segment of the spine. Its kyphotic curve allows the spine to bear loads anteriorly and to resist tension posteriorly, protecting the spinal cord while moving or bending the body. Since the thoracic spine provides most of the stability and support for the entire trunk [1], spine deformity resulting from poor posture often occurs in this segment.
Spinal deformities are abnormal alignments or curves of the vertebral column resulting from uneven loading on the contralateral side, with many contributing factors. For example, poor posture can cause acute or chronic forms of spinal deformities. Human body posture is primarily maintained by the musculoskeletal system, which includes the spine as an important component. In order to maintain body balance and stability, the relative positions of different spine segments are rearranged, resulting in changes in the spinal curve, and this type of structural support provided by the spine can result in some degree of deformity [2]. Spinal deformities can be divided into kyphosis, lordosis, and scoliosis, and the degree of deformation is often determined by X-ray radiography and Cobb angle measurements in clinical practice [3]. Poor posture refers to postures that increase joint stress [4] and is often the main cause of spinal deformities. According to Griegel-Morris et al. [5], poor posture is prevalent among healthy people. People with more severe postural abnormalities increase their risk of experiencing muscle pain. In addition, people suffering from chronic pain exhibit poor control over maintaining an upright posture while sitting and tend to unconsciously change their sitting postures to poor postures over time [6], resulting in a vicious cycle of pain and poor posture.
Several devices, including optoelectronic motion analysis systems [7,8], radiography [9,10], and pressure sensors [11,12], are used for monitoring and measuring posture in research and clinical practice. However, the aforementioned methods require the following: (1) specific sites, such as laboratories; (2) technical personnel for data collection; (3) lengthy periods for data analysis; and (4) high system setup costs [13]. Thus, the use of miniaturized monitoring devices [14][15][16][17] has grown rapidly in recent years due to their convenience. In addition, these devices can quickly analyze posture changes and correct poor postures, achieving long-term posture correction. Wong et al. [17] used three tri-axial accelerometers to monitor sitting postural changes and compared the results with those obtained from an optoelectronic motion analysis system. For postural measurements, the RMS error was approximately 2° in quasi-static conditions, while the error was relatively large in dynamic conditions. Therefore, a gyroscope was added to the system in order to reduce errors in collecting dynamic information. In addition, Petropoulos et al. [14] used two inertial measurement units (IMUs) to perform binary classifications of normal and abnormal postures. Their results showed that the mean square error was approximately 0.15 for the pitch angle, and the heading angle was within a ±10° range of motion, indicating that collected IMU signals can be used to calculate back curvature. However, this system is only suitable for monitoring sitting posture under static conditions, since it has only been confirmed for small-angle motions. Even though the systems mentioned above demonstrated high accuracy for posture recognition in static conditions, recognition of postures under dynamic conditions still needs to be improved.
Human activity recognition (HAR) aims to identify specific movements based on different sensing signals. This technique has been developed for multiple applications, such as fall detection systems, disease prevention, the Internet of Medical Things (IoMT), etc. [18]. In addition, smart wearable devices present an application of HAR for collecting kinematic parameters of the human body by using sensors. Sensor data are used to identify human activity with the aid of deep learning and other machine learning algorithms. According to Demrozi et al. [19], although many researchers prefer classic machine learning (CML) models because of their small datasets and lower dimensionality of input data, deep learning (DL) models exhibit greater accuracy in large activity datasets because DL networks are capable of feature learning or automatic feature extraction. In addition, DL networks have shown the ability to recognize human activities, such as walking, climbing ladders, and falling [20]. Therefore, HAR research that uses DL models, such as the Convolutional Neural Network (CNN) [21][22][23], the Long Short-Term Memory Network (LSTM) [24][25][26], and the Recurrent Neural Network (RNN) [27,28], has recently increased in popularity.
In this study, we aim to recognize human unsupported sitting postures by using a deep learning algorithm. IMUs were uniformly distributed along subjects' spines in order to collect kinematic parameters of subjects' backs in static and dynamic conditions. For the deep learning algorithm, an LSTM-based RNN (LSTM-RNN) was used to classify seven common unsupported sitting postures by using raw data from IMUs as inputs. With the aim of preventing chronic spinal problems, this system may be useful for long-term tracking of sedentary people as well as for providing alerts regarding poor posture.


Description of Sitting Posture
Unsupported sitting is often performed in human activity research in clinical studies because of musculoskeletal and neural system involvement in human activity [29]. The human body relies on interactions between the musculoskeletal and neural systems in the trunk to maintain balance under static and dynamic conditions. In this study, seven common sitting postures related to spinal deformities, defined as single-axis rotations (upright, slump, lean, right and left bending, and right and left twisting), were selected for subjects to perform (Figure 1).

Sensing Device
A Next Generation IMU (NGIMU, x-io Technologies, Bristol, UK) was used to collect kinematic data. The NGIMU is a 9-axis MEMS sensor that consists of a 3-axis accelerometer, 3-axis gyroscope, and 3-axis magnetometer and features an onboard attitude and heading reference system (AHRS) sensor fusion algorithm, which provides real-time spatial outputs, including quaternions and Euler angles. Sensor outputs included quaternions, a rotation matrix, Euler angles (0~2000°/s), linear acceleration (0~16 g), and compass heading information (0~1300 µT). For the sitting posture monitoring device, 4 IMUs were uniformly distributed between the thoracic and lumbar segments (T1-L5) and tightly fixed on cloth [14,15,17,30] along the sagittal plane (Figure 2). The IMUs' sampling rate was 50 Hz, which is sufficient for HAR tasks [31]. In addition, the relevance between the number of IMU sensors used in this device and the machine learning classifier's performance was also investigated in this study. The number of IMUs used in the device was reduced to three sensors (1-3 and 2-4 as a group), two sensors (1-2, 2-3, and 3-4 as a group), and one single sensor, respectively, during human trials.
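The AHRS outputs orientation as both quaternions and Euler angles. As a minimal illustration of how the two representations relate (the Z-Y-X aerospace convention used here is an assumption, not necessarily the NGIMU firmware's), a quaternion can be converted to Euler angles as follows:

```python
import numpy as np

def quaternion_to_euler(w, x, y, z):
    """Convert a unit quaternion to Euler angles (roll, pitch, yaw) in
    degrees, using the Z-Y-X (yaw-pitch-roll) convention."""
    roll = np.degrees(np.arctan2(2 * (w * x + y * z), 1 - 2 * (x**2 + y**2)))
    # Clip guards against domain errors from floating-point round-off.
    pitch = np.degrees(np.arcsin(np.clip(2 * (w * y - z * x), -1.0, 1.0)))
    yaw = np.degrees(np.arctan2(2 * (w * z + x * y), 1 - 2 * (y**2 + z**2)))
    return roll, pitch, yaw

# The identity quaternion corresponds to zero rotation.
print(quaternion_to_euler(1.0, 0.0, 0.0, 0.0))
```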
All collected data were used to train the same machine learning model, and performance was evaluated by comparing F1 scores (Equation (1)) between different sensor combinations, where the F1 score is the harmonic mean of precision (TP/(TP + FP)) and recall (TP/(TP + FN)), TP is the number of true positives, FP is the number of false positives, and FN is the number of false negatives [32]. F1 scores are useful for evaluating machine learning systems, especially in multi-class classification. Accuracy was defined as the ratio of true positives to all samples (Equation (2)).
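Since Equations (1) and (2) are stated only in prose here, a minimal sketch of the same definitions may help (the 3-class confusion matrix below is illustrative, not the study's data):

```python
import numpy as np

def precision_recall_f1(tp, fp, fn):
    """Equation (1) in prose: F1 is the harmonic mean of precision
    TP/(TP + FP) and recall TP/(TP + FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def accuracy(confusion):
    """Equation (2) in prose: ratio of true positives (the diagonal)
    to all samples."""
    confusion = np.asarray(confusion)
    return np.trace(confusion) / confusion.sum()

# Toy confusion matrix: rows = true class, columns = predicted class.
cm = np.array([[8, 1, 1],
               [0, 9, 1],
               [1, 0, 9]])
print(accuracy(cm))
```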

Procedure of Human Trials
This study involved 6 healthy adults (4 males and 2 females) with ages ranging from 20 to 65. Subjects were required to fulfill the following criteria: no critical illness within the previous year, no neuro-musculoskeletal injury within the previous year, and no recent spinal treatment. The study was conducted with the approval of the Office of Human Research designated by Taipei Medical University, Taiwan (TMU-JIRB N201802061).
Subjects were asked to sit on a chair with their backs straightened. They were then asked to follow instructions that were displayed on a screen in order to perform seven unsupported sitting postures. MATLAB (MathWorks, Natick, MA, USA) was used to guide subjects through a series of images of sitting postures, which changed every 5 s. Subjects were instructed to maintain the displayed posture until the next posture appeared on the screen. In addition, subjects were allowed to move without constraints with respect to speed, amplitude, and path of movement during posture changes. This lack of constraint in trial procedures is essential for the functionality of portable devices; for example, differences in velocity, travel path, and moved distance between trials add randomness to training datasets, thus underscoring the practicality of portable devices in daily activities, which usually exhibit few movement constraints. In total, 28 postures were performed in a single trial, with each posture displayed four times. Postures were displayed in random order to improve generalization in machine learning. Nine-axis IMU signals were collected at a sampling rate of 50 Hz and transmitted via Wi-Fi.
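The cue schedule above (28 cues per trial, each of the seven postures shown four times in randomized order, 5 s each) can be sketched as follows. The study displayed the cues with MATLAB; this Python sketch only illustrates the sequence generation, with posture names taken from the text:

```python
import random

POSTURES = ["upright", "slump", "lean",
            "right bending", "left bending",
            "right twisting", "left twisting"]

def make_trial_sequence(repeats=4, seed=None):
    """Build one trial: each of the seven postures appears `repeats`
    times (28 cues in total), shuffled to randomize presentation order."""
    rng = random.Random(seed)
    sequence = POSTURES * repeats
    rng.shuffle(sequence)
    return sequence

# Each cue would be held for 5 s while 9-axis IMU data stream at 50 Hz.
print(len(make_trial_sequence(seed=0)))  # 28
```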

Data Labeling
In previous studies of human activity recognition, researchers manually labeled data according to video recordings, but this method was time and labor consuming. Hence, in order to simplify the data labeling process, accelerometer data for supervised learning were labeled by using a designated trial protocol. As described in the previous section, each trial contained 28 alternating postures with a 5-s period for each posture. The presentation order of each posture was also recorded. Within each 5-s period, movements were divided into two states: static, defined as holding a posture, and dynamic, defined as changing postures. Time rates of acceleration signals were compared to a threshold in order to determine these two motion states. Recognizing such regular transitions, the labeling task divided an entire trial into serial sub-problems with 5-s durations, so labels for our dataset were generated automatically. Compared to manual labeling with video recordings, this protocol is more efficient, but the obscure "labeling midpoint" in motion transitions, e.g., changing from a slump posture to a lean posture, was difficult to define. Upright postures served as midpoints for changing phases, as subjects could not avoid passing through an upright posture when transitioning from slump to lean postures or from left twisting to right twisting postures. This made the determination of upright postures more difficult than that of other postures. For simplification, in motion-changing phases, a dynamic period was separated into two classes by the ratio of the previous and next static phases, and respective labels were assigned to each part. This provided more randomness to machine learning classifiers in order to improve generalization in real-world applications.
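The labeling logic described above can be sketched as follows; the threshold value and signal shapes are illustrative assumptions, not the study's actual parameters:

```python
import numpy as np

def label_motion_states(accel, threshold=0.05):
    """Mark each sample as static (True) or dynamic (False) by comparing
    the per-sample rate of change of the acceleration magnitude to a
    threshold (here in g per sample; the value is an assumption)."""
    magnitude = np.linalg.norm(accel, axis=1)          # accel: (N, 3) array
    rate = np.abs(np.diff(magnitude, prepend=magnitude[0]))
    return rate < threshold

def split_transition(start, stop, prev_len, next_len):
    """Split a dynamic segment [start, stop) between the previous and next
    posture labels, proportionally to the neighboring static-phase lengths."""
    cut = start + round((stop - start) * prev_len / (prev_len + next_len))
    return (start, cut), (cut, stop)

# A 50-sample transition between a 200-sample and a 300-sample static phase:
print(split_transition(100, 150, 200, 300))  # ((100, 120), (120, 150))
```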

Machine Learning Algorithm
For the machine learning algorithm, a recurrent neural network (RNN) was used as the architecture for the machine learning model. RNNs contain cyclic connections that make them more powerful in modeling sequence data when compared to feedforward neural networks. RNNs have been widely used in language modeling, object detection, and speech recognition and have received increased attention in biomechanics. RNNs are often used to construct sequence-to-sequence models and have been applied to HAR tasks because of RNNs' competence in temporal-domain processing [24,[33][34][35][36]. RNNs can transition between hidden states of previous time steps to subsequent time steps, processing and passing down time-dependent sequences in chronological order. Thus, sequences obtained from time-dependent signals are suitable as RNN inputs. However, as a result of long-term dependency [37], RNN performance decreases as input sequences lengthen. In order to solve this problem, the long short-term memory (LSTM) cell was designed. An LSTM contains four units in its structure, including an input gate, an output gate, a memory cell, and a forget gate, and can decide whether input data are stored in memory cells, effectively solving the long-term dependencies that occur in RNNs [38,39].
A long short-term memory (LSTM)-based RNN was used to improve information extraction in the temporal domain (Figure 3). Two LSTM-based layers were present in the model structure [25,32]. Each contained 128 LSTM units, with a rectified linear unit (ReLU) applied as the activation function. The experimental setup for the training procedure is listed in Table 1. The model was trained on a GPU for 600 epochs by using the Adam optimizer and 10-fold cross validation to prevent overfitting. One epoch occurs when an entire dataset is passed forward and backward through the neural network exactly once. The Adam optimizer is an algorithm for first-order gradient-based optimization of stochastic objective functions. Cross validation is a technique for evaluating predictive models by partitioning the original samples into a training set to train models and a test set to evaluate them. Raw data from the IMU sensors were used as model inputs. The model output was one of seven classes representing the different sitting postures.
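The gate structure described above can be made concrete with a single LSTM step written in NumPy. This is only an illustration of the cell equations, not the trained two-layer, 128-unit network from the study; the gate ordering, input width, and random initialization are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One LSTM time step with stacked weights (gate order: i, f, o, g)."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # stacked pre-activations, shape (4n,)
    i = sigmoid(z[:n])              # input gate: admit new information
    f = sigmoid(z[n:2 * n])         # forget gate: discard old memory
    o = sigmoid(z[2 * n:3 * n])     # output gate: expose memory
    g = np.tanh(z[3 * n:])          # candidate cell state
    c = f * c_prev + i * g          # memory cell carries long-term context
    h = o * np.tanh(c)              # hidden state passed to the next step
    return h, c

# 36 inputs would correspond to four 9-axis IMUs concatenated per sample.
rng = np.random.default_rng(0)
n_in, n_hid = 36, 128
W = 0.01 * rng.standard_normal((4 * n_hid, n_in))
U = 0.01 * rng.standard_normal((4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = lstm_cell(rng.standard_normal(n_in), np.zeros(n_hid), np.zeros(n_hid), W, U, b)
print(h.shape)  # (128,)
```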
Appl. Sci. 2021, 11, x FOR PEER REVIEW

Results
Six healthy adults (four males and two females), aged 20~65, were involved in this study to simulate seven sitting postures in daily life: upright, slump, lean, right and left bending, and right and left twisting postures. Raw signals were collected by using IMUs during one single trial in each corresponding axis (Figure 4). Data labeling showed serial transitions during trials (Figure 5).

The confusion matrix of classification results is shown for the LSTM-RNN machine learning classifier (Figure 6). For the seven sitting postures, the model's average accuracy increased to 99.0 ± 0.3% in 600 epochs, and the average F1 score was approximately 0.966 ± 0.012 (Table 2). In order to evaluate the performance of the trained model, a testing sequence was input with labels. According to the leave-one-subject-out (LOSO) method, test sets consisted of persons not included in the training. Training and validation sets were created by using a 95% to 5% split [32]. The unused dataset from the sixth subject was tested (Figure 7), resulting in a model accuracy of 81.2%. These results show the practicality and potential of IMU-based applications using LSTM-RNNs in sitting posture detection.

In order to determine the optimal number of sensors, we evaluated the relevance between the number of sensors and the classifier's performance. After excluding specific sensors, the classifier's performance was compared with that of the four-sensor device, which was designated as the control group in this test. According to the comparison (Figure 8), model accuracy decreased as the number of sensors was reduced. The combination of three sensors covering the thoracic area (B1~B3) exhibited an accuracy of 98.1%.
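The leave-one-subject-out protocol can be sketched as follows (the subject IDs are placeholders and the training step is elided; this only illustrates the split logic):

```python
def leave_one_subject_out(subject_ids):
    """Yield (train_subjects, test_subject) folds: each fold holds out one
    subject entirely, so the test set contains no one seen in training."""
    for held_out in subject_ids:
        train = [s for s in subject_ids if s != held_out]
        yield train, held_out

for train, test in leave_one_subject_out([1, 2, 3, 4, 5, 6]):
    # Train on the five remaining subjects (with a 95%/5% train/validation
    # split inside `train`), then evaluate on the held-out subject.
    print(test, train)
```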

Discussion
This study demonstrated a process of data labeling and investigated unconstrained movements to improve HAR research. The sitting postures investigated in this study are sustained for relatively long periods in daily life. Hence, this protocol not only simplifies time-consuming procedures but is also appropriate for sitting posture detection. In Table 2, the highest and lowest accuracies were 99.3% and 98.3% for lean and upright postures, respectively. These results show the importance of postural features in this application. Upright postures, as described in the data labeling section, served as the "labeling midpoint" for many posture-changing phases, and our model showed the poorest accuracy on this class. In contrast, holding lean or bending postures results in large tilt/bending angles in the sagittal and coronal planes, respectively, and these changes allow IMUs to collect detectable changes when compared to other postures.
Accelerometers have been widely used in human motion analysis [16,17,19]. In this study, we used acceleration data from IMUs to conduct data labeling. Prior to classification, 28 intermittent phases were extracted by using a static-dynamic alternating sequence of approximately 5 s each. During posture transitions, the signal was divided into two parts and assigned to the previous and next postures, respectively, in order to enhance the model's general applicability. However, this form of labeling for transition phases is imprecise and causes misclassifications for obscure definitions of posture, e.g., upright postures during changes from right bending to lean postures, as the third green rectangle in Figure 7 shows. For real-world sitting posture correction devices, upright postures are more important to classify than other postures. Misidentification of upright postures (such as when transitioning from one posture to another) may be an issue with our labeling method and could be mitigated by giving subjects clearer instructions about upright postures before trials.
In this study, we used an LSTM-RNN to conduct sequence-to-sequence classification. High classifier accuracy indicates the effectiveness of this IMU-RNN scheme for sitting posture detection. In addition, the model exhibited higher classification accuracy for flexion/extension (lean postures) than for axial rotation (twisting postures) due to the larger motion range of the sensors during flexion/extension. This is related to sensor orientation: twisting rotates all sensors about the longitudinal axis, especially the lowest sensor, but signals on the other axes did not change significantly when compared to other postures, and this small difference caused relatively poorer performance for these classes. However, this model showed slight overfitting. As shown in Figure 7, when subjects were in a static state, the model recognized postures more accurately than in dynamic transition states, as mentioned by Rivera et al. [40]. Some feature extraction techniques could address this problem; for example, an averaging filter can condense signals in dynamic periods with parts of signals in static periods to improve accuracy over a shorter period. Further investigation of feature selection and preprocessing is beyond the scope of this article. In addition, even though all seven postures were performed by each subject individually, differences in velocity, travel path, and moved distance between trials added randomness to the training dataset. Because movements were unconstrained and therefore natural, and despite a training sample of five subjects, we believe that the resulting data represent a significant portion of the range of normal spinal postures, although this is, of course, limited to a certain extent.
In addition, we observed the importance of the "order" of postures in trials; more specifically, postures around (changed from or into) slump and upright postures caused relatively greater classification error, as shown in green rectangles in Figure 7. This type of error was caused by the order of labeling procedures. It is difficult to define the changing point from right bending to slump postures (around the 1000th sample in Figure 7) because upright postures are similar to the combination of these two postures. Consequently, our classifier assigned upright postures to this changing period rather than slump postures. A more appropriate method of defining ground truth is using fuzzy logic, since most human body motions are dynamic or "quasi-static" rather than totally static, i.e., no change in signals. Changing postures occur frequently in daily activity, and it is impossible for changes from one posture to another to occur discontinuously, as our ground truth shows in Figure 7. In addition, as a result of anatomical differences between people and unconstrained trials, upright postures become more difficult to differentiate from all other postures, and misclassification error is higher than in other classes as illustrated by accuracy and F1 scores in Table 2.
Regarding the number of sensors, the average classification accuracy decreased with the number of sensors and the dimensionality of the training features. Accuracy and F1 scores were significantly reduced when using fewer than three sensors, demonstrating that a minimum of three sensors was required for posture recognition, especially in dynamic conditions. Among devices using three sensors, the three-sensor series (B1~B3) positioned at the upper trunk displayed greater accuracy than the three-sensor series positioned at the lower trunk (B2~B4). This is probably because the B4 sensor was positioned at L5, which remained almost static between coplanar rotations, such as right and left bending [41]. Conversely, the upper arrangement of sensors B1~B3 exhibited a higher accuracy of 98.1%, which was comparable to the accuracy of the device using four sensors, indicating an optimal arrangement in the sensing system.

This study has some limitations. Firstly, we evaluated only six subjects. With more subjects, it may be possible to match user-independent subjects to a subset of the overall subjects based on weight, height, and various activity habits, which would likely improve user-independent classification performance. Secondly, the average accuracy when training and testing with 10-fold cross validation was 99.0%, but when testing with the excluded subject, the average accuracy decreased to 81.2%. Similar variations were also reported in other papers on human motion classification [42,43]; our model, similarly, seems to perform adequately when recognizing familiar patterns but performs imperfectly across all segments of a population. Further studies will be conducted with more postures and more subjects in order to develop a neural network that achieves comparable generalization performance.

Conclusions
In this study, a wearable device was developed for sitting posture monitoring by combining IMU technology with a machine learning algorithm. An LSTM-based recurrent neural network was used to classify seven common sitting postures by inputting raw signals collected by using IMUs during human trials. The results showed that this system can effectively distinguish different sitting postures. In addition, a strategy for reducing the number of sensors is suggested, since fewer sensors would result in improved power management in such devices. The recommended number of sensors is three, which still provides good performance without compromising model robustness. With further research, this procedure may be improved by using faster automatic methods for data processing as well as by including more data for training and augmenting the database.