Differences in Motion Accuracy of Baduanjin between Novice and Senior Students on Inertial Sensor Measurement Systems

This study aimed to evaluate the motion accuracy of novice and senior students in Baduanjin (a traditional Chinese sport) using an inertial sensor measurement system (IMU). Study participants were nine novice students, 11 senior students, and a teacher. The motion data of all participants were measured three times with the IMU. Using the motions of the teacher as the standard motions, we used dynamic time warping to calculate the distances between the motion data of the students and the teacher to evaluate the motion accuracy of the students. The distances between the motion data of the novice students and the teacher were greater than those between the senior students and the teacher (p < 0.05 or p < 0.01). These initial results showed that the IMU and the corresponding mathematical methods could effectively distinguish the differences in motion accuracy between novice and senior students of Baduanjin.


Introduction
Traditional Chinese sport has been a compulsory component of Physical Education (PE) in universities in China since 2002 [1]. Although there are various traditional Chinese sports to choose from, 76.7% of universities taught martial arts in their PE curriculum [2]. In 2016, the Communist Party of China and the Chinese government adopted the 'Healthy China 2030' national health plan [3]. In this plan, Baduanjin was identified as a traditional Chinese sport that was promoted and supported by the government. This resulted in increased Baduanjin teaching and research in universities throughout the country [4].
Although universities in China must incorporate traditional Chinese sports into their PE curriculum, there have been problems with its implementation. These include a high student-teacher ratio, uninteresting forms of teaching-learning resources, and an incomplete assessment system. These three problems adversely affected the requirements for teaching quality set by the People's Republic of China Ministry of Education [5,6]. Although the high student-teacher ratio has been a problem since 2005, it has yet to be resolved [7,8]. Teachers are not able to provide individual guidance to each student because of the large number of students in the class. As a result, teachers cannot correct all the students' mistakes, and students are not aware of their incorrect movements [9].
In recent years, motion capture (Mocap) has been widely applied in fields such as clinical and sports biomechanics to distinguish between different types of motions or analyze differences between

Overview
This study consists of three parts: recruiting and selecting participants, capturing the motion data of Baduanjin participants with the IMU, and processing and analysing the motion data. We invited teachers and students from a university in Southwest China to participate in the study and divided them into three groups: teachers, novice students, and senior students. We captured the motion data of all participants with the IMU while they practised Baduanjin. The motion data were converted to quaternions and analysed in two different ways. The first way was based on the quaternions of the full motion: dynamic time warping (DTW) was used to calculate the distances between the quaternions of the teacher and those of the two groups of students (novice and senior), and the motion accuracy of the students was expressed by these distances. DTW is a classic similarity method for solving the time-warping issue in similarity computation of time series [27]. Compared with other methods, namely the hidden Markov model (HMM) and symbolic aggregate approximation (SAX), DTW takes less time [28,29]. Because, in actual teaching, students need feedback in real time and a large amount of student data must be processed, we adopted DTW in this study. The second way used extracted key-frames: DTW was applied to the quaternions of the key-frames to calculate the distances. Finally, based on the distance data, an independent sample T-test or Mann-Whitney U test was used to determine whether the two groups of students (novice and senior) differed in motion accuracy (see Figure 1).

Recruiting and Selecting Participants
In this study, we invited a martial arts PE teacher and undergraduate students to participate in the study. The inclusion criteria for participants are as follows:
Teacher: martial arts PE teacher, former national martial arts athlete, with an undergraduate and master's degree in traditional Chinese sports (martial arts specialization), and more than ten years' experience teaching Baduanjin.
Novice students: undergraduate students in the university with no experience of Baduanjin, without a disability and no clinical or mental illness.
Senior students: undergraduate students in the university who have passed Baduanjin in their PE course, without a disability, and no clinical or mental illness.
Participants read the information sheet that outlined the purpose and procedure of the study. Those who agreed to participate were given the consent form to sign.

Sensors 2020, 20, x FOR PEER REVIEW 5 of 25

Capturing Motion Data of Participants on IMU
Baduanjin is a traditional Chinese martial art for fitness. The speed of motions is relatively slow [30]. We used IMU to capture the motion data of the teacher, novice, and senior students for eight standard motions of Baduanjin as shown in Figure 2. This IMU includes 17 inertial sensor units, and each unit comprises a 3-axis gyroscope, a 3-axis accelerometer, and a 3-axis magnetometer; together they measure and record the rotation angle data of 17 position points of human movement. Sers et al. [32] compared the IMU used in our study with a gold standard optoelectronic system (Vicon), and confirmed the IMU's effectiveness in measuring motion accuracy. The supporting software of the IMU, Axis neuron software developed by Noitom, transforms the recorded data into Biovision Hierarchy (BVH) motion files.

Capturing Motion Data
Before measuring the motion data, the teacher and senior students practised Baduanjin for 30 min. As the novice students had not learned Baduanjin, they followed the demonstration of the teacher practising Baduanjin for 30 min. After the practice, the motion data of participants were measured by IMU. No feedback was given to students during practice.

Extracting and Converting Raw Data
The raw data were converted into BVH files by the Axis neuron software. BVH is a file format developed by Biovision to store skeleton hierarchy information and three-dimensional motion data [33]. A BVH file comprises two parts: one stores the skeleton hierarchy information and the other stores the motion information. The skeleton hierarchy information includes the connection relationships between joint points and the offsets of the child joint points from their parent skeleton points. In the skeleton hierarchy, the first skeleton point is defined as Root, which is the parent of all other skeleton points. The motion information stores the global translation and the rotation of Root in each frame of the movement. The global translation is the position coordinate (X position, Y position, and Z position) in the world coordinate system, and the rotation is given by the rotation components (X rotation, Y rotation, and Z rotation) as Euler angles [33]. The motion information of the other skeleton points is recorded as rotations relative to their parent points. The IMU used 17 sensors to measure motion data at 17 points of the body, and the recorded order of the rotation of each point is Z rotation, Y rotation, and X rotation. The skeleton hierarchy information of BVH on the IMU and the skeleton model are shown in Figure 3.
In the BVH file, the rotation data are recorded as Euler angles of the 17 skeleton points. Some issues with rotation data expressed as Euler angles (gimbal lock and singularity problems) were overcome using quaternions [34]. A quaternion is a 4-dimensional hyper-complex number, expressing a three-dimensional vector space over the real numbers [35]. We used four-tuple notation to represent a quaternion as follows:

$$q = [w, x, y, z]$$

In this quaternion, w is the scalar component, and x, y, z are the vector components. Therefore, the format of the rotation data from the BVH files was converted from Euler angles to quaternions. If the order of rotation in Euler angles is z, y, x, and α, β, γ represent the rotation angles of the object around the x, y, and z axes, the corresponding quaternion is obtained as follows:

$$\begin{aligned}
w &= \cos(\gamma/2)\cos(\beta/2)\cos(\alpha/2) + \sin(\gamma/2)\sin(\beta/2)\sin(\alpha/2) \\
x &= \cos(\gamma/2)\cos(\beta/2)\sin(\alpha/2) - \sin(\gamma/2)\sin(\beta/2)\cos(\alpha/2) \\
y &= \cos(\gamma/2)\sin(\beta/2)\cos(\alpha/2) + \sin(\gamma/2)\cos(\beta/2)\sin(\alpha/2) \\
z &= \sin(\gamma/2)\cos(\beta/2)\cos(\alpha/2) - \cos(\gamma/2)\sin(\beta/2)\sin(\alpha/2)
\end{aligned}$$
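The conversion from z, y, x Euler angles to a quaternion can be illustrated with a minimal Python sketch. The function names are hypothetical, and the assumption that BVH channels arrive in degrees in Z, Y, X order follows the description of the file format above rather than any code from the study (which used Matlab).

```python
import math


def euler_zyx_to_quaternion(alpha, beta, gamma):
    """Return [w, x, y, z] for rotations alpha, beta, gamma (radians)
    about the x, y and z axes, applied in z, y, x order."""
    ca, sa = math.cos(alpha / 2), math.sin(alpha / 2)
    cb, sb = math.cos(beta / 2), math.sin(beta / 2)
    cg, sg = math.cos(gamma / 2), math.sin(gamma / 2)
    w = cg * cb * ca + sg * sb * sa
    x = cg * cb * sa - sg * sb * ca
    y = cg * sb * ca + sg * cb * sa
    z = sg * cb * ca - cg * sb * sa
    return [w, x, y, z]


def bvh_rotation_to_quaternion(z_deg, y_deg, x_deg):
    """Hypothetical helper: BVH channels are assumed to be degrees,
    listed as Z rotation, Y rotation, X rotation."""
    return euler_zyx_to_quaternion(
        math.radians(x_deg), math.radians(y_deg), math.radians(z_deg)
    )


if __name__ == "__main__":
    # A 90 degree turn about the z axis only.
    print(bvh_rotation_to_quaternion(90.0, 0.0, 0.0))
```

A quaternion produced this way already has unit norm, up to floating-point rounding, which is convenient for the later distance calculations.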

Extracting Key-Frames
After extracting the motion data, we used key-frames extraction to reduce the motion data. Due to the limited storage and bandwidth capacity available to users, the large amount of motion data collected on Mocap may restrict its application [36]. Key-frames extraction, which extracts a small number of representative key-frames from a long motion sequence, is widely used in motion analysis. This technology can reduce the data amount, which facilitates data storage and subsequent data analysis [36,37].

Extraction of Key-Frames on Inter-Frame Pitch
We used the distance between quaternions to evaluate the inter-frame pitch between frames and set a threshold of inter-frame pitch to extract key-frames [38]. The method represents the rotation data of each skeleton point as a quaternion and uses a simple form to evaluate the distance between two quaternions. The inter-frame pitch between two frames is assessed by the sum of the distances between the quaternions of every point. The process consists of three steps: calculating the distance between quaternions, calculating the inter-frame pitch between frames, and extracting key-frames using the set threshold of inter-frame pitch.

1. The distance between quaternions

To evaluate the distance between two quaternions, the conjugate quaternion q* of a quaternion q = [w, x, y, z] is defined as follows:

$$q^{*} = [w, -x, -y, -z]$$

and the quaternion norm ||q|| is defined as follows:

$$\lVert q \rVert = \sqrt{w^{2} + x^{2} + y^{2} + z^{2}}$$

then:

$$q\,q^{*} = \lVert q \rVert^{2}$$

When the quaternion norm ||q|| is 1, which means:

$$w^{2} + x^{2} + y^{2} + z^{2} = 1$$

the quaternion is a unit quaternion. A quaternion is converted to a unit quaternion by dividing it by its norm. From the definitions of conjugate quaternion, quaternion norm, and unit quaternion, we can define the inverse of a quaternion (q^{-1}) as follows [39]:

$$q^{-1} = \frac{q^{*}}{\lVert q \rVert^{2}}$$

According to Shunyi et al. [38], for two unit quaternions q_1 and q_2, a distance between q_1 and q_2 is defined (Equation (9)). Therefore, we converted the rotation of a skeleton point from Euler angles into a quaternion, normalized it into a unit quaternion, and finally calculated the difference between any two quaternions of the point according to Equation (9).
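The standard quaternion definitions used here (conjugate, norm, unit quaternion, inverse) can be checked with a short Python sketch. The function names are illustrative and not taken from the study; the product is the usual Hamilton product, included only to verify that a quaternion multiplied by its inverse gives the identity.

```python
import math


def q_conjugate(q):
    """Conjugate: negate the vector part."""
    w, x, y, z = q
    return [w, -x, -y, -z]


def q_norm(q):
    """Euclidean norm of the four components."""
    return math.sqrt(sum(c * c for c in q))


def q_normalize(q):
    """Scale q to unit length by dividing by its norm."""
    n = q_norm(q)
    if n == 0.0:
        raise ValueError("cannot normalize the zero quaternion")
    return [c / n for c in q]


def q_inverse(q):
    """Inverse: conjugate divided by the squared norm."""
    n2 = sum(c * c for c in q)
    return [c / n2 for c in q_conjugate(q)]


def q_multiply(a, b):
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return [
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    print(q_multiply(q, q_inverse(q)))  # close to [1, 0, 0, 0]
```

For a unit quaternion the inverse and the conjugate coincide, which is why the rotations are normalized before any distance is computed.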

Calculation of Inter-Frame Pitch between Two Frames
We used the sum of the differences between the quaternions at 17 skeleton points to evaluate the inter-frame pitch between two frames. The human motion represented by the BVH file is a sequence of discrete-time vectors, and it remains so after conversion to quaternions [38]. The weightage of the different points needs to be taken into account when calculating the inter-frame pitch because of the tree structure (parent-child) of the BVH format. Referring to the methods used in previous research [38,40], and the relationship structure between the skeleton points on the IMU in this study (see Figure 3), we assigned the weightage values of the 17 skeleton points as shown in Table 1.
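If t_1 and t_2 are two frames in a sequence of frames, we defined the inter-frame pitch between t_1 and t_2 as the weighted sum of the quaternion distances over all skeleton points:

$$D(t_1, t_2) = \sum_{i=1}^{n} W_i \, d\big(q_i(t_1), q_i(t_2)\big) \qquad (10)$$

In Equation (10), n represents the total number of skeleton points (n = 17), W_i represents the weightage of each skeleton point (shown in Table 1), q_i represents the quaternion of each skeleton point, and d(·,·) denotes the distance between two quaternions according to Equation (9).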
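As an illustration of the weighted sum, the sketch below computes an inter-frame pitch in Python. Two things are assumptions rather than the study's definitions: the quaternion distance is taken to be the rotation angle 2·arccos(|q_1 · q_2|), a common stand-in for the distance of Equation (9), and the weights in the example are placeholders, not the values of Table 1.

```python
import math


def quaternion_distance(q1, q2):
    """Assumed distance: rotation angle between two unit quaternions.
    This stands in for Equation (9) and may differ from it."""
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    return 2.0 * math.acos(min(1.0, dot))  # clamp against rounding above 1


def inter_frame_pitch(frame_a, frame_b, weights):
    """Weighted sum of per-point quaternion distances between two frames.
    Each frame is a list of unit quaternions, one per skeleton point."""
    if not (len(frame_a) == len(frame_b) == len(weights)):
        raise ValueError("frames and weights must have the same length")
    return sum(
        w * quaternion_distance(qa, qb)
        for w, qa, qb in zip(weights, frame_a, frame_b)
    )


if __name__ == "__main__":
    identity = [1.0, 0.0, 0.0, 0.0]
    quarter_turn_z = [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)]
    # Two skeleton points with placeholder weights 2 and 1.
    print(inter_frame_pitch([identity, identity],
                            [quarter_turn_z, identity], [2.0, 1.0]))
```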

Key-frames extraction on the set threshold of inter-frame pitch
Based on the inter-frame pitch between two frames, we set key_frame as an array storing the quaternions corresponding to the key-frames of the motion; key_num as a vector storing the serial numbers of the key-frames, where key_num1 is the time-series number of the first key-frame; and current_key as the last frame in key_num. λ is a preset threshold value of inter-frame pitch, which is determined mainly by the required compression rate of frames. The algorithm steps are shown in Figure 4.
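Because Figure 4 is not reproduced here, the following Python sketch shows only one plausible reading of the threshold procedure: the first frame is a key-frame, and a new key-frame is recorded whenever the inter-frame pitch from current_key exceeds λ. Appending the last frame is an added assumption so that the sequence can later be reconstructed by interpolation. The `pitch` argument is any function of two frames, for example an inter-frame pitch as defined above.

```python
def extract_key_frames(frames, pitch, threshold):
    """Return the indices of key-frames (key_num in the text).

    frames    : sequence of frames (any representation)
    pitch     : function(frame_a, frame_b) -> float
    threshold : preset inter-frame pitch threshold (lambda in the text)
    """
    if not frames:
        return []
    key_num = [0]          # the first frame is always a key-frame
    current_key = 0        # index of the most recent key-frame
    for t in range(1, len(frames)):
        if pitch(frames[current_key], frames[t]) > threshold:
            key_num.append(t)
            current_key = t
    last = len(frames) - 1
    if key_num[-1] != last:
        key_num.append(last)   # assumption: keep the final frame as well
    return key_num


if __name__ == "__main__":
    # Toy one-dimensional "frames" with absolute difference as the pitch.
    toy = [0.0, 0.2, 0.4, 1.2, 1.3, 2.5, 2.6]
    keys = extract_key_frames(toy, lambda a, b: abs(a - b), 1.0)
    print(keys, "compression rate:", len(keys) / len(toy))
```

A larger threshold yields fewer key-frames and therefore a lower compression rate, which matches the trade-off reported for this method.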

Motion reconstruction error
The purpose of motion reconstruction is to rebuild the same number of frames as the original sequence by interpolating the non-key-frames between adjacent key-frames [38,41]. First, the position coordinates (in the world coordinate system) of each point were calculated from the point hierarchy and the relative rotation angles between the points in the BVH file. Second, given that p_{t1} and p_{t2} are the positions of a point in adjacent key-frames at times t_1 and t_2, p_t (the position of the point in a non-key-frame at time t) is calculated by linear interpolation between p_{t1} and p_{t2} as follows [41]:

$$p_t = p_{t_1} + \frac{t - t_1}{t_2 - t_1}\,\big(p_{t_2} - p_{t_1}\big)$$

The algorithm steps of motion reconstruction are shown in Figure 5. In this study, we used the position error of the human posture to calculate the reconstruction error between the reconstructed frames and the original frames [38]. Assuming m_1 is the original motion sequence and m_2 is the motion sequence reconstructed from the key-frames, the reconstruction error E(m_1, m_2) is evaluated from the distance of human posture, which measures the position error of the human posture [42]. In this distance, m represents the total number of skeleton points, $p^{i}_{1,k}$ is the position of point k in frame i of the original motion sequence, and $p^{i}_{2,k}$ is the position of point k in frame i of the reconstructed sequence.
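A minimal Python sketch of the reconstruction step is given below. The linear interpolation follows the description above; the error measure, however, is an assumption: it averages over frames the sum of Euclidean distances between corresponding points, which may differ in normalization from the formula used in the study. Function names are illustrative, and key-frame indices are assumed to be strictly increasing and to include the first and last frames.

```python
import math


def reconstruct(key_indices, key_positions, n_frames):
    """Rebuild n_frames frames by linear interpolation between key-frames.

    key_indices   : sorted frame indices, starting at 0, ending at n_frames - 1
    key_positions : for each key-frame, a list of (x, y, z) point positions
    """
    frames = [None] * n_frames
    pairs = list(zip(key_indices, key_positions))
    if len(pairs) == 1:                      # degenerate single-frame motion
        frames[pairs[0][0]] = list(pairs[0][1])
    for (t1, p1), (t2, p2) in zip(pairs, pairs[1:]):
        for t in range(t1, t2 + 1):
            s = (t - t1) / (t2 - t1)         # interpolation weight in [0, 1]
            frames[t] = [
                tuple(a + s * (b - a) for a, b in zip(pa, pb))
                for pa, pb in zip(p1, p2)
            ]
    return frames


def reconstruction_error(original, rebuilt):
    """Assumed error: mean over frames of summed per-point distances."""
    total = 0.0
    for f1, f2 in zip(original, rebuilt):
        total += sum(math.dist(a, b) for a, b in zip(f1, f2))
    return total / len(original)


if __name__ == "__main__":
    # One point moving along x, with a small bump in y at frame 2.
    original = [[(float(t), 1.0 if t == 2 else 0.0, 0.0)] for t in range(5)]
    rebuilt = reconstruct([0, 4], [original[0], original[4]], 5)
    print(reconstruction_error(original, rebuilt))
```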

Extraction of Key-Frames on Clustering
A problem with the key-frames extraction on inter-frame pitch is that the compression rate of the key-frames with the same inter-frame threshold for different actions may vary considerably [40]. As the eight motions of Baduanjin are quite different, the key-frames extraction on the inter-frame pitch may cause some motions to be compressed too much, and some motions not compressed enough. Therefore, we also chose another way to extract key-frames on clustering. This method was used for key-frames with the pre-set compression rate [43].

1. K-means clustering algorithm

The K-means clustering algorithm is an iterative partition clustering algorithm. In this key-frame extraction method, we used the K-means clustering algorithm to cluster the 3D coordinates ([x, y, z]) of the skeleton points in the original frames. Assume that the total length of the original frames is N and that i denotes the i-th frame. p_i is the vector of the 3D coordinate positions of all relevant skeleton points of the i-th frame. Therefore, the collection of vectors of the 3D coordinate data of every point of the original frames is (p_1, p_2, ..., p_i), p_i ∈ R^N. According to the K-means clustering algorithm, the data of skeleton points (R^N) in the frames are clustered into K (K ≤ N) clusters as follows [44]:

Step 1: Randomly select K cluster centroids u_1, u_2, ..., u_K from R^N;
Step 2: Repeat the following process until convergence. For the p_i corresponding to one frame, we calculated its distance from each cluster centroid (u_j, j ∈ K) and classified it into the class corresponding to the minimum distance [45]:

$$D = \min_{j} \lVert p_i - u_j \rVert$$

In this equation, D represents the minimum distance between the cluster centroids and p_i, and p_i is classified into the class j for which the distance is smallest.
For each class j, the cluster centroid (u_j) of that class was recalculated:

$$u_j = \frac{\sum_{i} r_{ij}\, p_i}{\sum_{i} r_{ij}}$$

In this equation, r_ij is 1 when p_i is classified as j; otherwise, it is 0.
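The two alternating steps can be sketched in plain Python as follows. This is a generic K-means loop written for illustration, not the study's Matlab code; the `seed` and `max_iter` parameters are assumptions added to make the example reproducible and guaranteed to stop.

```python
import math
import random


def k_means(points, k, max_iter=100, seed=0):
    """Cluster `points` (equal-length tuples) into k clusters.

    Returns (centroids, labels), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    # Step 1: pick k distinct points as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2a: assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:       # assignments stable: converged
            break
        labels = new_labels
        # Step 2b: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                # keep the old centroid if empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels


if __name__ == "__main__":
    data = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
            (10.0, 10.0, 10.0), (10.1, 10.0, 10.0), (10.0, 10.1, 10.0)]
    centroids, labels = k_means(data, 2)
    print(labels)
```

In the setting described above, each element of `points` would be one frame flattened into a 51-component vector (17 points × 3 coordinates).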

Key-frames extraction
Using the above K-means clustering algorithm, we extracted K cluster centroids from the original frames. Each cluster is formed from the 3D coordinates of the 17 points in the original frames. Therefore, one cluster centroid consists of 51 (17 × 3) components. Based on these cluster centroids, we extracted the key-frames by calculating the Euclidean distance between each point of the cluster centroid and the corresponding point coordinates in the original frames. The steps to extract key-frames are as follows:

Start
Input: the 3D coordinate data of every point of the original frames, and the number of key-frames to be extracted, K;
Step 1: Use the K-means clustering algorithm to calculate the cluster centroids of the K clusters;
Step 2: Calculate the Euclidean distance of the 3D coordinates between each point of the cluster centroid and the corresponding point of the original frames. min(dis(u_mj, p_ij)) means that, after calculating the distances between cluster m and all original frames, point j of the frame p_i for which dis(u_mj, p_ij) is minimum is recorded as 1; otherwise, it is recorded as 0. The index i of the p_i corresponding to the maximum value of C_m is the sequence number of a key-frame;
Step 3: The sequence numbers of the key-frames are arranged in ascending order after extraction. If the first frame and the last frame of the original frames are not included in the key-frames, they must be added to the key-frames.
End

In this key-frames extraction, the number of key-frames can be preset. The key-frames for a given compression rate are obtained by presetting the compression rate as follows [42]:

$$K = c\_rate \times N$$

where K is the number of key-frames to be extracted, c_rate is the compression rate of the key-frames to be obtained, and N is the total number of original frames. After extracting the key-frames, we reconstructed the motion and evaluated the reconstruction error as described above.
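Steps 2 and 3 can be read as a per-point vote, sketched below in Python. The interpretation is an assumption: for each cluster centroid, every skeleton point votes for the original frame in which that point lies closest to the centroid, and the frame with the most votes (taken here as the maximum of C_m) becomes the key-frame. The centroids are passed in directly so that the example stays self-contained; rounding K to the nearest integer is likewise an assumption.

```python
import math


def number_of_key_frames(c_rate, n_frames):
    """K = c_rate * N, rounded to a whole number of frames (at least 1)."""
    return max(1, round(c_rate * n_frames))


def key_frames_from_centroids(frames, centroids):
    """Select one key-frame per centroid by per-point voting.

    frames    : list of frames, each a list of (x, y, z) point positions
    centroids : list of centroids with the same per-point layout
    """
    n_frames = len(frames)
    n_points = len(frames[0])
    keys = set()
    for centroid in centroids:
        votes = [0] * n_frames
        for j in range(n_points):
            # Frame in which point j is closest to the centroid's point j.
            nearest = min(
                range(n_frames),
                key=lambda i: math.dist(centroid[j], frames[i][j]),
            )
            votes[nearest] += 1
        keys.add(max(range(n_frames), key=votes.__getitem__))
    # Step 3: always include the first and last frames, in ascending order.
    keys.update({0, n_frames - 1})
    return sorted(keys)


if __name__ == "__main__":
    frames = [[(float(i), 0.0, 0.0), (0.0, float(i), 0.0)] for i in range(5)]
    centroids = [[(2.1, 0.0, 0.0), (0.0, 1.9, 0.0)]]
    print(key_frames_from_centroids(frames, centroids))
    print(number_of_key_frames(0.15, 100))
```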

Evaluating the Motion Accuracy of Motion Data
In this study, we referred to previous studies [13,46] to evaluate the motion accuracy of student motions by assessing the differences between students' motions and the teacher's motions. Because individuals move at different speeds, the two motions being compared form time series of different lengths. We chose DTW, a well-established method that accounts for such differences, to evaluate the difference in the motions between teachers and students [47]. Since DTW, unlike other methods such as HMM and SAX, requires no training stage, it takes less time. First, the derived quaternions were normalized to unit length: a quaternion q = [w, x, y, z] then satisfies ||q|| = 1, that is, w^2 + x^2 + y^2 + z^2 = 1. Therefore, three components (x, y, z) out of the four components (w, x, y, z) of the quaternions can be used to represent the rotations of the skeleton points over a temporal domain. Then, we used DTW to evaluate the difference between two sequences of motions on the skeleton points. We first assessed the difference between two motions on a single skeleton point. For example, consider two sequences of quaternions for one skeleton point, one from the teacher, q_tea(t), and one from a student, q_stu(t), with lengths n and m:

$$\begin{aligned}
q_{tea}(t) &= q_{tea}(1), q_{tea}(2), \ldots, q_{tea}(i), \ldots, q_{tea}(n) \\
q_{stu}(t) &= q_{stu}(1), q_{stu}(2), \ldots, q_{stu}(j), \ldots, q_{stu}(m)
\end{aligned} \qquad (20)$$
The vectors in the quaternion arrays consist of the three components (x, y, z) of the quaternions. A distance matrix (n × m) is constructed to align the quaternions of the two sequences. The element (i, j) of the matrix represents the Euclidean distance dis(q_tea(i), q_stu(j)) between the two points q_tea(i) and q_stu(j):

$$dis\big(q_{tea}(i), q_{stu}(j)\big) = \sqrt{\big(x_{tea}(i) - x_{stu}(j)\big)^{2} + \big(y_{tea}(i) - y_{stu}(j)\big)^{2} + \big(z_{tea}(i) - z_{stu}(j)\big)^{2}}$$

In the distance matrix, many paths run from the upper-left corner to the lower-right corner. We used Φ_k to represent any point on these paths: Φ_k = (Φ_tea(k), Φ_stu(k)), where Φ_tea(k) takes values in 1, 2, ..., n; Φ_stu(k) takes values in 1, 2, ..., m; and k takes values in 1, 2, ..., T (T = n × m). We chose as the warping path the path whose cumulative distance is the smallest of all paths [39]:

$$DTW\big(q_{tea}(t), q_{stu}(t)\big) = \min_{\Phi} \sum_{k} dis\big(q_{tea}(\Phi_{tea}(k)),\, q_{stu}(\Phi_{stu}(k))\big)$$

Then, the distance DTW(q_tea(t), q_stu(t)) is obtained through dynamic programming, using the standard recurrence on the cumulative distance γ(i, j) [47]:

$$\gamma(i, j) = dis\big(q_{tea}(i), q_{stu}(j)\big) + \min\{\gamma(i-1, j),\ \gamma(i, j-1),\ \gamma(i-1, j-1)\}, \qquad DTW\big(q_{tea}(t), q_{stu}(t)\big) = \gamma(n, m)$$

To prevent wrong matching caused by excessive time warping, the warping path was constrained near the diagonal of the matrix by setting a global warping window for DTW [48,49]. In this study, the global warping window was set to 10 percent of the entire window span: 0.1 × max(n, m). The cumulative distance of the warping path, which represents the difference in rotation between teacher and student at a skeleton point, is given in Equation (22). Then, the macro difference between the students' motions and the teacher's motions was evaluated by taking the average of the cumulative distances of all the skeleton points as follows:

$$Dis(m_{tea}, m_{stu}) = \frac{1}{n} \sum_{i=1}^{n} DTW\big(q_i^{tea}(t), q_i^{stu}(t)\big)$$

In this equation, m_tea represents the teacher's motion sequence, m_stu represents the student's motion sequence, q_i is the sequence of quaternion vectors of skeleton point i in the two motion sequences, and n is the total number of skeleton points.
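The windowed DTW and the averaging over skeleton points can be sketched in Python as below. This is an illustrative implementation of the standard recurrence, not the study's Matlab code. One detail is an added assumption: the window is widened to at least |n − m| so that a path to the lower-right corner always exists when the two sequences differ in length by more than 10 percent.

```python
import math


def dtw_distance(seq_a, seq_b, window_ratio=0.1):
    """Cumulative DTW distance between two sequences of (x, y, z) vectors,
    with a global warping window of window_ratio * max(n, m)."""
    n, m = len(seq_a), len(seq_b)
    # Assumption: widen the band to |n - m| so that cost[n][m] is reachable.
    w = max(int(window_ratio * max(n, m)), abs(n - m))
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in seq_a only
                                 cost[i][j - 1],      # step in seq_b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def motion_distance(teacher, student, window_ratio=0.1):
    """Average DTW distance over skeleton points.

    teacher, student : one sequence of (x, y, z) vectors per skeleton point
    """
    per_point = [
        dtw_distance(t, s, window_ratio) for t, s in zip(teacher, student)
    ]
    return sum(per_point) / len(per_point)


if __name__ == "__main__":
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(dtw_distance(a, b))  # the repeated sample is absorbed by warping
```

The per-point values returned by `dtw_distance` correspond to the point-level comparisons, and `motion_distance` to the macro difference between a student and the teacher.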
Finally, data of the differences were analysed using IBM SPSS Statistics 25.0 to assess if there were significant differences in the motion accuracy of the two groups of students (novice and senior students) on the whole and each point. We used the independent sample T-test on data with normal distribution and the Mann-Whitney U test on data with non-normal distribution.

Demographic Characteristics of Participants
We recruited 21 participants for this study: a martial arts teacher, nine undergraduate students who had not learned Baduanjin (novice students), and 11 undergraduate students who had completed the Baduanjin course (senior students). All participants gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the University of Malaya Research Ethics Committee (UM.TNC2/UMREC-558). The demographic characteristics of the students are shown in Table 2, and the mean duration of each of the eight motions is shown in Table 3. We measured all participants three times with the IMU, resulting in 63 sets of motion data.

Differences in Motion Accuracy between Novice and Senior Students on Original Frames
Algorithms explained in the data analysis section were coded with Matlab R2018b. Independent sample T-tests and Mann-Whitney U tests were used to assess the differences in motion accuracy of novice and senior students.
Before assessing macro differences, we assessed the normality of the original-frames data using the Shapiro-Wilk test (see Table 4). From Table 4, we can see that the data of the groups on Motions 2, 3, and 4 were normally distributed (p > 0.05), whereas the others were not. Therefore, we assessed the differences in motion accuracy of Motions 2, 3, and 4 between novice and senior students using independent sample T-tests (see Table 5). The differences in the motion accuracy of the other motions between novice and senior students were assessed using Mann-Whitney U tests (see Table 6).

Table 5. Differences in motion accuracy between novice and senior students on original frames (using the independent sample T-test).

From Tables 5 and 6, we can see significant differences (p < 0.05 or p < 0.01) in motion accuracy of all eight motions between novice and senior students. The differences in motion accuracy between the teacher and senior students were lower than the differences in motion accuracy between the teacher and novice students.

We also evaluated the difference in motion accuracy on each skeleton point between novice and senior students ( Figure 6).
From Figure 6, we found that out of the 17 points on eight motions of Baduanjin, there were significant differences in the motion accuracy between novice and senior students for some points. For example, in Motion 1, there were significant differences in motion accuracy between the two groups at the head and neck (points 8 and 9) and the right upper limb (points 10, 11, and 12).

Compression Rate and Reconstruction Error of Two Different Key-Frames Extraction Methods
Motion accuracy is assessed based on key-frames. In this study, we chose two methods to extract key-frames. In the key-frames extraction method on inter-frame pitch, we selected different thresholds (0.1, 0.5, 1.0, 1.5, 2.0) to extract key-frames and evaluated the compression rate and the reconstruction error of the corresponding key-frames at each threshold. The results are shown in Table 7.

Table 7. Compression rate and reconstruction error of corresponding key-frames on inter-frame pitch.

Table 7 shows considerable differences in the compression rates of the different motions extracted under the same threshold. When the threshold value was set to 1 for obtaining key-frames using the inter-frame pitch, the average compression rates of the eight motions of Baduanjin ranged from 7.08% to 20.78%. Moreover, when the threshold value increased, the number of key-frames decreased, which decreased the compression rate. However, the error of motion reconstruction also increased. Based on the data in Table 7, among the five preset values, the compression rate and reconstruction error of the extracted key-frames are relatively reasonable when the threshold is 1.

In the other key-frames extraction method, on clustering, we chose different compression rates (5%, 10%, 15%, 20%, 25%) to extract key-frames and evaluated the reconstruction error of the resulting key-frames. The results are shown in Table 8. From Table 8, we can see that as the compression rate increases, the error of motion reconstruction decreases. When the compression rate increased from 5% to 15%, the reconstruction error dropped sharply, but when the compression rate increased from 15% to 25%, the decrease in reconstruction error levelled off. Among the five preset values, the compression rate and reconstruction error of the extracted key-frames were relatively reasonable when the preset compression rate was 15%.

Differences in Motion Accuracy on Key-Frames
The differences in motion accuracy on key-frames between novice and senior students are shown in Tables 9 and 10.
From the results of the key-frames on clustering, the motion accuracy of the eight motions of novice and senior students were significantly different. This result is consistent with the result based on the original frames. However, on the key-frames of inter-frame pitch on five different thresholds, there was no significant difference in motion accuracy between the two groups in Motion 7.
The differences in motion accuracy of points between the two groups on key-frames were also evaluated. Figure 7 shows the results on the key-frames of inter-frame pitch when the threshold was set to 1. From Figures 6 and 7, we find that there was a difference between the results on the original frames and the key-frames on the inter-frame pitch. When there was a significant difference in motion accuracy between the two groups at a point, we set the point to 1; otherwise, it was set to 0. Then, we evaluated the correlation between the results on the original frames and on the different key-frames using the Kendall correlation coefficient test (see Figures 8 and 9).
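The comparison of two binary significance patterns can be illustrated with a small Python sketch. It computes Kendall's tau-b, the tie-corrected variant; that the study used tau-b specifically is an assumption, since the text only names the Kendall correlation coefficient. The example vectors are made up for illustration and are not the study's results.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences (handles ties)."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    concordant = discordant = ties_x_only = ties_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue                 # tied in both: ignored by tau-b
            if dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt(
        (concordant + discordant + ties_x_only)
        * (concordant + discordant + ties_y_only)
    )
    if denom == 0.0:
        return float("nan")              # undefined if a vector is constant
    return (concordant - discordant) / denom


if __name__ == "__main__":
    # 1 = significant group difference at a skeleton point, 0 = none.
    on_original_frames = [1, 1, 0, 0, 1, 0]
    on_key_frames = [1, 0, 0, 0, 1, 0]
    print(kendall_tau_b(on_original_frames, on_key_frames))
```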
The results for key-frame extraction on inter-frame pitch show that when the threshold value was 0.1, the result of the differences in motion accuracy on the key-frames was highly correlated with the result based on the original frame (Kendall coefficient of points in each motion is higher than 0.7 except for Motion 7). However, when the threshold was 0.1, the compression rates of the key-frames were higher. As shown in Table 7, when the threshold was 0.1, the compression rate of each motion exceeded 50%. For key-frames extraction on clustering, there is a high correlation when the compression rate is 0.1. The Kendall coefficient of points in each motion is higher than 0.7 except for Motion 5, where the coefficient was 0.63.
Figure 8. The Kendall coefficient of differences between skeleton points based on two different methods (on the original frames and the key-frames on inter-frame pitch).
Figure 9. The Kendall coefficient of differences between skeleton points based on two different methods (on the original frames and the key-frames on clustering).
We also tested the mean processing time for using DTW to calculate the distances between motions on original frames and key-frames (Table 11). From Table 11, the processing time on the key-frames is lower than that on the original frames. Therefore, using key-frames can effectively decrease data processing time.
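To make the link between sequence length and processing time concrete, here is a minimal sketch of a textbook dynamic time warping (DTW) distance. It is an illustration under assumptions, not the implementation timed in Table 11: frames are reduced to single numbers, the local cost is the absolute difference, and no warping-window constraint is applied. In this form, one cost cell is filled per pair of frames, so the work grows with the product of the two sequence lengths. The function names are hypothetical.

```python
def dtw_distance(a, b):
    """Textbook DTW distance between two non-empty numeric sequences.

    Uses the standard recurrence
        D[i][j] = cost(i, j) + min(D[i-1][j], D[i][j-1], D[i-1][j-1])
    and keeps only two rows of the cost matrix in memory.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for i in range(1, len(a) + 1):
        cur = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]


def dtw_cell_count(n_frames_a, n_frames_b):
    """Number of cost cells this unconstrained DTW has to fill."""
    return n_frames_a * n_frames_b


# A sequence and a time-stretched copy of it align perfectly.
print(dtw_distance([0, 1, 2], [0, 1, 1, 2]))

# Hypothetical sizes: 1000 original frames versus 150 key-frames per motion.
print(dtw_cell_count(1000, 1000), dtw_cell_count(150, 150))
```

With these hypothetical sizes, keeping 15% of the frames in both sequences reduces the number of cells to about 2.25% of the original. This is consistent in direction with the lower processing times reported for key-frames, although actual timings depend on the implementation.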

Discussion
With the mathematical methods used in this study, the overall distances between the motion data of the novice students and the teacher were greater than those between the motion data of the senior students and the teacher for all eight motions of Baduanjin. Because the motion data analyzed were the rotation data of specific skeleton points measured by the IMU, and the teacher's motions were taken as the standard, the results show that the motions of the senior students were closer to the standard motions. Therefore, the IMU can effectively distinguish the differences in motion accuracy in Baduanjin between novice and senior students.
When using the original frames to evaluate the differences at 17 skeleton points in eight motions between novice and senior students, the results show that the differences in motion accuracy between the two groups on skeleton points varied for the different motions. For Motion 1, the differences between the two groups were mainly concentrated on the head-spine segment and upper limbs, especially the right upper limb. These differences mean that the motion errors of novice students relative to senior students were mainly concentrated on these joints. The results are consistent with the common motion errors described in the official book: "When holding the palms up, the head is not raised enough, or the arms are not raised enough" [30]. For Motion 4, the common motion errors are described in the official book as: "Rotating head and arm are insufficient" [30]. This description indicates that the main errors occur in the head-spine segment and bilateral upper limbs. However, significant differences in skeleton points were found at the bilateral upper limbs but not at the head-spine segment. This discrepancy may be related to the small number of participants in this study.
In this study, we also used two methods to extract key-frames. Extracting key-frames can effectively compress the raw data and thus decrease the data storage space [40,41]. This matters because the repetitiveness of action exercises in the teaching process generates an extremely large amount of raw data. From the results, both key-frame extraction methods can effectively compress the raw data. We also found that data processing could be accelerated on key-frames. However, when using key-frames on inter-frame pitch, the compression rates differed between motions, and the differences in skeleton points were not consistent with the results on the original frames. In contrast, there was high consistency between the results on the key-frames on clustering and the results on the original frames, especially when the compression rate was 15%. Therefore, we can use key-frames to replace the original frames to evaluate motion accuracy of Baduanjin in order to decrease data storage space and processing time.
However, the small number of participants in our study limits the application of the results. As the participants were from a university in China, the results might only be suitable for university students in China because different populations have variations in anatomical characteristics, physiological characteristics, and athletic ability.
Based on our results, the IMU can effectively distinguish the difference in the motion accuracy of Baduanjin between novice and senior students. Therefore, in future work, we can develop a system using the IMU to evaluate the motion quality of students and provide feedback to teachers and students. Such a system would be able to assist teachers in correcting errors in the motions of students immediately.

Conclusions
These initial results show that, based on the original frames, the IMU and the corresponding mathematical methods can effectively distinguish the motion accuracy of all eight motions of Baduanjin between novice and senior students. Furthermore, the IMU can identify the differences between the novice and senior students on the specific skeleton points of the eight motions of Baduanjin. The results regarding key-frames on clustering were highly correlated with the results of the original frames, which means that, to a certain extent, key-frames can replace the original frames to decrease the data storage space and processing time.