Development of Classification Algorithms for the Detection of Postures Using Non-Marker-Based Motion Capture Systems

Abstract: The rapid development of algorithms for skeletal postural detection with relatively inexpensive contactless systems and cameras opens up the possibility of monitoring and assessing the health and wellbeing of humans. However, the evaluation and confirmation of posture classifications are still needed. The purpose of this study was therefore to develop a simple algorithm for the automatic classification of human postures. The most affordable solution for this project was a Kinect V2, which identifies 25 joints and was used to record movements and postures for data analysis. A total of 10 subjects volunteered for this study. Three algorithms were developed in Matlab for the classification of different postures. These were based on a total error of vector lengths, a total error of angles, the multiplication of these two parameters and the simultaneous analysis of the first and second parameters. A base of 13 exercises was then created to test the recognition of postures by the algorithms and to analyze subject performance. The best results for posture classification were shown by the second algorithm, with an accuracy of 94.9%. The average degree of correctness of the exercises among the 10 participants was 94.2% (SD 1.8%). It was shown that the proposed algorithms provide the same accuracy as that obtained from machine learning-based algorithms and algorithms with neural networks, but have lower computational complexity and do not need resources for training. The algorithms developed and evaluated in this study have demonstrated a reasonable level of accuracy, and could potentially form the basis for developing a low-cost system for the remote monitoring of humans.


Introduction
Demographic ageing in humans means that to date, 12% of the global population are aged over 60 years, and this number is likely to double within a few decades [1]. Ageing leads to a higher prevalence of complications that may benefit from exercise therapy. Such an increase in ageing will mean that the rapid development of science and medicine, as well as the introduction of new technologies and methodologies utilized by health systems, will be needed. Increased knowledge utilized the recognition of postures together with trajectories, which resulted in an accuracy of posture estimation of 91.9%, and detection of movements of 95.16% [21].
Recent advances in machine learning have led to the use of machine learning algorithms in many studies, including posture classification [22,23]. One such study classified sitting postures with conventional and deep learning-based algorithms, using body pressure distribution data from pressure sensors [22]. After classifying the sitting postures with several classifiers, average and maximum classification rates of 97.20% and 97.94%, respectively, were obtained from nine subjects with a support vector machine using the radial basis function kernel. A comparison of a convolutional neural network (CNN) with conventional machine learning algorithms showed the effectiveness of the CNN-based approach [23] (average accuracy = 0.953). However, machine learning-based algorithms suffer from a computational complexity that can prevent real-time implementation (a point stressed by the authors of [22]), and they need resources for training.
These examples of previous research in the use of posture recognition algorithms provide strong arguments for the continued research and development of such algorithms.
The aim of this research was to develop simpler and more efficient identification algorithms for posture and exercise classification within healthy participants, as well as to evaluate these using Kinect V2. The main contributions of our work can be summarized as follows. Three algorithms for the classification of different postures were developed and evaluated. These algorithms were based on a total error of vector lengths, a total error of angles, and the multiplication of these two parameters, and their effectiveness was demonstrated. To compare the effectiveness of the classification algorithms, a database was created from the descriptions of the 573 known postures, as well as 903 postures which were not related to them. The algorithms presented in this study were shown to be reasonably accurate, and could potentially form the basis for developing a simple system for the remote monitoring of rehabilitation involving exercise therapy.
The remainder of this paper is organized as follows. In Section 2, we describe the Microsoft Kinect V2-based approach to the automatic classification of human exercise movement and present three algorithms for posture classifications. In Section 3, we compare the effectiveness of the three developed classification algorithms by means of a database that was created from the descriptions of the 573 known postures and 903 postures which were not correctly performed. In Section 4, we discuss the results and how they can be interpreted from the perspective of previous studies, and of the working hypotheses. Future research directions also are highlighted. Finally, we present the conclusions in Section 5.

Participants
Ten healthy young adults (mean ± standard deviation age: 23.4 ± 4.1 years; six males with body mass: 72.7 ± 4.7 kg and height: 179.7 ± 4.2 cm; four females with body mass: 51.5 ± 2.6 kg and height: 163.3 ± 2.8 cm) participated in forming the exercise database. A healthy male (age 35, weight 75 kg and height 184 cm) and a healthy female (age 23, weight 50 kg and height 165 cm) were used to form the independent reference posture database. This research was completed as part of the state project of the Ministry of Health of Russia and was approved by the Ethics Committee of the Ilizarov Scientific Center for Restorative Traumatology and Orthopaedics (17 May 2018, protocol No.2(57)). All participants read the information sheet before the experiment. Written informed consent was obtained from all the participants.

Posture Description
A 3D sensor (Microsoft Kinect V2) was used to record movement, as it is able to recognize different subjects, track their movement and create a skeleton comprising 25 points (Figure 1), each of which may be described by three-dimensional coordinates (i.e., along the X, Y and Z axes). Any movement consists of a series of postures. Eighteen joints were used to describe a posture in a series of volunteer subjects. It was decided to exclude the joints numbered 16, 20, 21, 22, 23, 24 and 25 (Figure 1) from the algorithms, as they demonstrated high inconsistency in tracking accuracy. A total of 40 parameters were therefore calculated, based on 18 points: 17 were vector lengths (Table 1) and 23 were angles. However, each algorithm used a different number of parameters, as described in Section 2.3.
Table 1. Vector lengths used for the algorithm, where numbers represent the joint as shown in Figure 1.

The vector lengths were calculated relative to a position on the centerline of the torso (see point "2", Figure 1), as it had minimal errors in tracking. As each subject had a different body shape, the lengths between joints were not consistent across subjects, and it was therefore decided to normalize them by the participants' heights using the following formula [24]:

$$L = \frac{\sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}}{h}$$

where $x_0$, $y_0$ and $z_0$ represent the coordinates of the midpoint of the back, $x$, $y$ and $z$ are the coordinates of the point for which the distance is calculated, and $h$ is the participant's height.
Eleven angles were used in algorithms to describe postures and movements, as shown in Figure 2 and Table 2. For all 11 joints, the angles were between two vectors in 3D space. However, for the shoulder, hip and knee, the angles were calculated in the frontal and sagittal planes only. The angles were calculated as the angle between two 3D vectors:

$$\alpha = \arccos \frac{x_1 x_2 + y_1 y_2 + z_1 z_2}{\sqrt{x_1^2 + y_1^2 + z_1^2}\,\sqrt{x_2^2 + y_2^2 + z_2^2}}$$

where $x_n$, $y_n$ and $z_n$ are the coordinates of the vectors obtained by the differences between points, according to Table 1.
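The two descriptor quantities described above, a height-normalized vector length and the angle between two 3D vectors, can be illustrated with a minimal Python sketch. This is not the authors' Matlab implementation: the function names, the tuple representation of joints and the choice of degrees as the angle unit are assumptions made for illustration.

```python
import math


def vector(p_from, p_to):
    """Vector from one joint to another, as the difference of their coordinates."""
    return tuple(b - a for a, b in zip(p_from, p_to))


def normalized_length(point, origin, height):
    """Euclidean distance from `origin` (assumed: the mid-back joint) to `point`,
    divided by the participant's height so that body size cancels out."""
    return math.dist(point, origin) / height


def angle_between(v1, v2):
    """Angle between two 3D vectors via the arccos of the normalized dot product.
    Returned in degrees (an assumption; the source does not state the unit)."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        raise ValueError("zero-length vector has no defined angle")
    # Clamp so that rounding noise cannot push the value outside acos's domain.
    cos_a = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_a))
```

For the shoulder, hip and knee, where angles are taken in the frontal and sagittal planes only, one possible approach (not stated in the source) is to set the coordinate perpendicular to the plane to zero before calling `angle_between`.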

Experimental Protocol
A database of 12 postures, containing postures and exercise movements performed by ten subjects, was created to validate the algorithms (Table 3, Figures 3 and 4). Each subject was asked to do 13 exercises and repeat each one at least 25 times. Subjects were allowed to rest if they felt fatigued. On average, it took around four hours to record 13 exercise movements for each participant. Exercise movements were randomized for each subject.
Table 3. Reference database of postures for the two people recorded and used for the classification of other participants.

No.  Posture
1  Hand outstretched
2  Hands down (neutral posture)
3  Hands on waist
4  Right hand up
5  Left hand up
6  Both hands up
7  Hands forward
8  Right knee up (hands on waist)
9  Left knee up (hands on waist)
10  Both hands to the head
11  Right hand to the side
12  Left hand to the side

Appl. Sci. 2020, 10, x 6 of 15

The movement exercises were described as a sequence of postures. The simplest movement was described by the start and the end position. In some cases, however, there were more complex sequences of movements where the middle phase movement comprised a combination of several postures. A total of thirteen different exercise test movements were eventually used in the study, as shown in Table 4.

No.  Exercise
2  Hands down-hands up
3  Hands at the sides-right hand up
4  Hands at the sides-left hand up
5  Hands at the sides-hands to the head
6  Hands on the belt-right knee up
7  Hands on the belt-left knee up
8  Hands at the sides-hands forward
9  Hands down-hands forward
10  Hands up-hands forward
11  Hands forward-right hand to the side
12  Hands forward-left hand to the side
13  Hands down-hands forward-hands up-hands outstretched
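Because each exercise is described as an ordered sequence of postures, a per-frame stream of posture labels can be checked against that sequence. The Python sketch below shows one way this could be done; it is an assumption-laden illustration, not the procedure used in the study. The names `collapse` and `count_repetitions`, the use of `None` for unclassified frames, the rule that an unexpected posture resets progress, and the mapping of the exercise "Hands down-hands up" to posture numbers (2, 6) from Table 3 are all hypothetical.

```python
from itertools import groupby


def collapse(frame_labels):
    """Drop unclassified frames (None) and merge runs of identical labels,
    so a held posture spanning many frames counts once."""
    kept = (label for label in frame_labels if label is not None)
    return [label for label, _ in groupby(kept)]


def count_repetitions(frame_labels, exercise):
    """Count complete, in-order occurrences of the posture sequence `exercise`.

    Assumed rule: a posture that is neither the next expected one nor the
    starting posture resets the progress through the exercise.
    """
    if not exercise:
        raise ValueError("exercise must contain at least one posture")
    count, step = 0, 0
    for label in collapse(frame_labels):
        if label == exercise[step]:
            step += 1
        elif label == exercise[0]:
            step = 1  # the subject restarted from the initial posture
        else:
            step = 0  # unexpected posture: discard the partial attempt
        if step == len(exercise):
            count += 1
            step = 0
    return count
```

With this rule, a label stream that alternates between posture 2 and posture 6 yields one repetition per completed 2-to-6 transition, while a stream that passes through an unrelated posture in between yields none.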

Accuracy Evaluation of Postures and Movement Exercises
The accuracy, specificity and sensitivity were calculated based on the formulas described in [25]. Postures were classified by comparing the recorded posture descriptor ($D_i$) with the descriptors in a reference database ($D_j$), which gives a distance $Er_i$ between the recorded and the reference posture for each pose $i$. A descriptor is composed of two types of parameters (vector lengths and angles), and thus two types of errors were calculated: the total error of the lengths of vectors and the total error of angles.
The first was calculated as the sum of the absolute differences between the parameters:

$$Er^{L}_{i} = \sum_{k=1}^{17} \left| D_i(k) - D_j(k) \right|$$

where $D_i(k)$, $k = 1, \dots, 17$, are the parameters that are responsible for the lengths of the vectors. The total error of angles for posture $i$ was calculated using the formula

$$Er^{A}_{i} = \sum_{k=18}^{40} \left| D_i(k) - D_j(k) \right|$$

where $D_i(k)$, $k = 18, \dots, 40$, are the parameters responsible for the values of the angles. Based on these types of errors, three algorithms for the posture classification assessment were developed. To classify a posture, the recorded values should be equal to or almost equal to those in the reference database, so that the algorithm can define the correct posture classification from the data set collected. This was achieved by setting a threshold for each of the three algorithms: on the total error of vector lengths (first algorithm), on the total error of angles (second algorithm) and on the product of these two errors (third algorithm).
To evaluate the most accurate algorithm for posture detection, the classification database was made using the descriptions of either "correct" or "incorrect" postures. In our study, all subjects were young and healthy, and it was therefore sufficient to use two people for the posture reference database. However, the reference database would need to be more complex if participants had disabilities or varied in age group.
To justify the accuracy of exercise movement classification, a database with a set of sequenced postures in the correct order was made, as shown in the examples in Figure 5. Matlab was used for data collection and analysis.
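The error-based classification described in this section, in which a total error of vector lengths, a total error of angles, or their product is compared against a threshold, can be sketched in Python as follows. This is a minimal illustration rather than the authors' Matlab code. It assumes a 40-element descriptor whose first 17 entries are vector lengths and whose remaining 23 are angles, and it assumes that the closest reference posture is chosen before the threshold is applied; the source does not spell out either detail, and all function names are hypothetical.

```python
N_LENGTHS = 17  # descriptor entries 1..17: normalized vector lengths
N_PARAMS = 40   # descriptor entries 18..40: angles


def length_error(d, ref):
    """Total error of vector lengths: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(d[:N_LENGTHS], ref[:N_LENGTHS]))


def angle_error(d, ref):
    """Total error of angles: sum of absolute differences."""
    return sum(abs(a - b)
               for a, b in zip(d[N_LENGTHS:N_PARAMS], ref[N_LENGTHS:N_PARAMS]))


def product_error(d, ref):
    """Product of the two totals, as in the multiplication-based algorithm."""
    return length_error(d, ref) * angle_error(d, ref)


def classify(d, references, error_fn, threshold):
    """Return the label of the closest reference posture, or None when even
    the closest one exceeds `threshold` (posture treated as unknown).

    `references` maps a posture label to its reference descriptor.
    Picking the minimum-error reference is an assumption of this sketch.
    """
    best_label, best_err = None, float("inf")
    for label, ref in references.items():
        err = error_fn(d, ref)
        if err < best_err:
            best_label, best_err = label, err
    return best_label if best_err <= threshold else None
```

Passing `length_error`, `angle_error` or `product_error` as `error_fn` switches between the three threshold criteria without changing the rest of the pipeline; the threshold values themselves would have to be tuned on labelled data.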

Classification Algorithms
To compare the effectiveness of different classification algorithms, a database was created from the descriptions of the 573 known postures, as shown in Table 3, and 903 postures which were not correct. Using this database, the sensitivity, specificity and accuracy of the three algorithms were tested (Figures 6 and 7). The mean sensitivity for the first algorithm was 92.5%, while for the second it was 98.95% and for the third it was 96.5%. Table 5 demonstrates detailed statistical results for the three algorithms. Figure 8 shows receiver operator characteristic (ROC) curve results for the three algorithms. The mean intersection of sensitivity and specificity for the first algorithm was 75.7%, while for the second it was 94.1% and for the third it was 87.7%. The mean accuracy for the first algorithm was 76.6%, while for the second it was 94.9% and for the third it was 89.3%. The area under the ROC curves for the first algorithm was 0.862, while for the second it was 0.986 and for the third it was 0.966.
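As an illustration of how sensitivity, specificity, accuracy and the area under a ROC curve can be obtained for a threshold-based classifier, the Python sketch below uses the standard definitions. It simplifies the evaluation to a binary accept/reject decision on a single error value per posture, which is an assumption made here; the study's own formulas are those of [25], and the function names are hypothetical.

```python
def confusion_counts(errors, is_correct, threshold):
    """Count outcomes when a posture is accepted if its error <= threshold.
    `is_correct[i]` is True when posture i really is a known posture."""
    tp = fp = tn = fn = 0
    for err, truth in zip(errors, is_correct):
        accepted = err <= threshold
        if accepted and truth:
            tp += 1
        elif accepted and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


def roc_auc(errors, is_correct):
    """Area under the ROC curve, sweeping the threshold over every observed
    error value and integrating with the trapezoidal rule.
    Requires at least one correct and one incorrect posture."""
    points = [(0.0, 0.0)]  # threshold below every error: nothing accepted
    for t in sorted(set(errors)):
        tp, fp, tn, fn = confusion_counts(errors, is_correct, t)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2
    return auc
```

Sweeping the threshold in this way also makes it straightforward to locate the point where sensitivity and specificity are closest to each other, if such a balanced operating point is wanted.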

Number of Exercises Performed by Participants
Each participant performed at least 390 exercises in total. Table 6 demonstrates detailed information on the number of exercises performed by each participant. The highest accuracy for movement exercises was demonstrated by the second algorithm, as shown in Figure 9. The average identification ratio of correct movement classification among participants was 94.3% (SD 1.7%). The average identification of correct exercises was 94.2% (SD 1.8%).

Discussion
The aim of this study was to determine accurate posture and exercise classification algorithms with low-cost sensors such as Microsoft Kinect, which has also led to the development of different virtual rehabilitation programs [13,26]. The use of such sensors can have many advantages. Firstly, they highlight interactivity and motivation, and they can also be used at home. This is important for people who live in remote areas, where there may not be experts who are locally available. In addition, the technique can be adapted to the needs of any patient group [27], or animals [28][29][30][31].
The comparison of this sensor with a professional optical motion capture system has demonstrated that it has the accuracy sufficient for both the tasks and data generation capability needed by specialists in the field of rehabilitation [8].
However, how to evaluate the correctness of an exercise remains an open question, as the literature on the topic is limited to a small number of articles [7,21]. Previous research reported a best accuracy of 91.9% for posture classification and of 95.16% for movement classification [21]. This study demonstrated a slight increase in the accuracy by using three different algorithms and by setting up a threshold level for: total error of vector lengths; total error of angles; and multiplication of vector errors by angle errors (as in [21]). The classification accuracy of the algorithms was obtained by calculating sensitivity and specificity, with the best result shown by the algorithm using the total error of angles (94.9%). This algorithm showed better results than the previous research, which was based on a multiplication of the total errors. This new algorithm also requires considerably fewer parameters for the classification of postures and exercise movements. The previous study, which showed the best accuracy for the posture classification, used 30 variables of the posture descriptor, such as angles and vector lengths [21]. However, the second algorithm in this research used only 17 variables of the posture descriptor, which significantly improved the efficiency of the method.
In our study, when evaluating the classification accuracy of the exercises, we used results for the average accuracy of each participant and the average accuracy of the exercises, which were 94.3% (SD 1.7%) and 94.2% (SD 1.8%), respectively. Those results are practically the same as those of the previous research [21], but our algorithm, as mentioned above, requires considerably fewer parameters for the classification of postures and exercise movements. More advanced marker-based motion capture systems can also be used to improve the classification accuracy of algorithms. Previous research [32] has demonstrated that tracking passive markers with Oqus (Qualisys) cameras gave a static error of 0.15 mm and a dynamic error of 0.26 mm, with much higher tracking frequencies than those used by the Kinect V2 sensor.
The determination of human posture can be applied not only to the creation of applications for rehabilitation, but also to monitoring the lives of older people, such as in the recording of a sudden fall. According to statistics, 28-35% of people over 65 years of age experience a fall [33], after which they often need a period of rehabilitation. Such a monitoring system could detect a person's posture, and alert relatives, neighbors or close friends in cases where the person's positional data indicates the possibility of a heart attack, stroke or other complication; such a posture, for example, could be lying down on the floor. Time is crucial in attending to such situations, as it is directly correlated with the person's recovery.
More studies are required to develop classification algorithms for the various medical applications mentioned, as this study had a number of limitations, outlined below.

1. Limited tested sample size and reference database for healthy subjects.
2. Healthy and young subjects were recruited without any disabilities.
3. Different races, nationalities and type of disability may influence the results, as well as affect anthropometric data.
4. Kinect sensors are not consistent in data collection for different environments, and different types of clothing can significantly change the accuracy of the detection of joints, as was noticed in our study.
Future planned research is to use the Qualisys system to improve the algorithm by reducing the number of limitations.
Video analysis is widely applied in the context of human movement detection, and real-time implementation using reliable algorithms based on the postural recognition of healthy persons should provide postural data that can be used to assess the effectiveness of clinically prescribed exercise regimes for patients, as well as allow for variations in exercise regime, dependent on the data collected. Such data would be useful in optimized treatment by exercise therapy.
The advantages of such an approach could also be extended to veterinary applications. Very few studies address the automatic video-based analysis of animals, for example of canine behavior as a means of monitoring animal health and wellbeing [28][29][30], and some of these studies use a 3D Kinect camera to detect joint position. In [28], the authors present a system capable of identifying static postures for canines that does not rely on hand-labeled data at any point, although the system can only identify the "standing," "sitting" and "lying" postures with approximately 70%, 69% and 94% accuracy, respectively. Paper [29] presents a depth-based tracking system for the automatic detection of animals' postures and body segments, as well as an exhaustive evaluation of the performance of several classification algorithms, based on both a supervised and a knowledge-based approach. Furthermore, Barnard et al. addressed the problem of automatic behavioral analysis of kenneled dogs using 3D video monitoring [30]. Dog body segment detection was done using standard Structural Support Vector Machine classifiers, and the automatic tracking of the dog was also implemented. However, this tool still leaves considerable room for improvement.
A number of studies were also found in the literature using wide-ranging applications in the biomechanics of animals, as well as in prosthetics to prevent injuries, monitoring rehabilitation after surgical operations, choosing the appropriate orthopedic devices and prostheses, training and others [34][35][36]. Therefore, the classification algorithm of posture can also be useful in not only human medicine, but also veterinary applications, influencing veterinary intervention using exercise regimes, as well as monitoring animals' health and behavior. Further studies using the Qualisys system and neural network, which would be trained to recognize a dog's skeleton using cost-effective video cameras, are planned; so far, such work has only been carried out for humans.

Conclusions
Virtual or home rehabilitation using modern technologies can improve health and quality of life for many people and animals. The algorithms for posture and movement classification used in this study demonstrated good results using an optical sensor. These algorithms can also be used in other motion capture systems as a simpler and less resource-intensive alternative to machine learning and neural network algorithms, thus increasing accuracy.
The posture and movement classification algorithm may also be used to monitor incidental falls in the elderly population that can be associated with heart failure or a stroke, and initiate a call for help.
As for animals, this technique may also be applied for measuring the time budget of animals, indicating the amount or proportion of time that animals spend in different behaviors as a measure for common ethological and welfare parameters [37].