1. Introduction
Traditionally, educational evaluation has focused on the ultimate achievement of students at a given point in time. Formative evaluation breaks this pattern and shifts the focus to the whole process of learning. Formative assessment is a method of educational evaluation that emphasizes real-time observation, feedback, and guidance of students during the learning process [
1]. Through continuous monitoring and timely feedback of students’ learning processes, formative assessment provides educators with a powerful tool to improve students’ comprehensive literacy in sports activities [
2]. In physical education (PE) teaching, formative assessment is not only a tool, but also a key factor to promote students’ all-round development [
3].
In the context of physical education, formative assessment serves as a dynamic and process-oriented feedback mechanism that has increasingly been recognized as a crucial approach for enhancing both student learning outcomes and instructional quality [
4]. Unlike summative assessment, which focuses on evaluating final performance, formative assessment emphasizes continuous feedback, diagnosis, and instructional adjustment during the learning process [
5]. This approach is particularly valuable in skill-based physical education activities, such as Baduanjin or Tai Chi, where students exhibit considerable individual variability, and final performance alone is insufficient to reflect their learning trajectories. Baduanjin is a traditional Chinese Qigong exercise that dates back to the Song Dynasty (960–1279 AD). Unlike vigorous physical exercise, Baduanjin is characterized by a low-to-moderate intensity and is suitable for individuals across age groups, including older adults and patients with chronic conditions. Each movement targets different muscle groups and is believed to promote energy flow (Qi), improve flexibility, and enhance mind–body coordination. Given its accessibility and holistic benefits, Baduanjin has been widely implemented as a non-pharmacological intervention in physical and cognitive health programs [
6,
7]. A complete demonstration of the Baduanjin routine can be accessed online at
https://www.youtube.com/watch?v=bZhlLAoDzPA (accessed on 14 March 2023). The details of each motion are provided in Supplementary S1. Through formative assessment, instructors can identify subtle errors in movement execution, provide timely and personalized feedback, and ultimately support students in developing body awareness, self-regulation, and autonomous learning skills. Furthermore, recent studies have demonstrated that integrating digital tools, such as wearable devices and motion capture systems, into formative assessment enhances the objectivity and precision of evaluations while also increasing students’ engagement and motivation [
8].
In implementing formative assessment, various methods are employed in physical education to comprehensively understand students’ performance during physical activities. These methods include real-time observation and feedback, student self-assessment, peer evaluation, and video analysis [
5,
9,
10]. While these approaches facilitate timely correction of students’ movements and provide guidance, they also exhibit certain limitations. Specifically, subjective assessments such as real-time observation and peer evaluation may suffer from inconsistencies in evaluation criteria, thereby compromising the objectivity of assessment [
11]. Moreover, some assessment methods may be time-consuming, particularly when dealing with a large number of students, making it challenging for teachers to provide feedback and correction for each individual student’s movements [
12,
13]. Additionally, the utilization of technological aids is constrained by equipment limitations and insufficient school resources, resulting in some students being unable to fully benefit from such tools. Therefore, it is increasingly necessary to construct an objective, automated, and scalable movement assessment tool in physical education that can provide real-time, data-driven support for formative assessment.
The integration of digital technologies into PE teaching, such as motion capture systems, computer vision, and wearable sensors, has opened up new opportunities for formative assessment [
14]. Among them, inertial measurement unit (IMU)-based motion capture technology has demonstrated high potential due to its low cost, portability, and ability to provide objective quantitative data on body movements in real-time [
14,
15,
16]. Previous studies have applied IMU in domains such as gait analysis, rehabilitation, and athletic training to track movement patterns, assess motion quality, and provide biofeedback [
17,
18]. Recent studies have demonstrated the growing potential of wearable IMUs in sports and human motion analysis. Deep learning models trained on raw IMU data can achieve high accuracy in estimating motion speed, especially when sensors are placed on the lower limbs, such as the shoe [
19]. Distributed IMU arrays have also been successfully applied in outdoor sports equipment testing, such as skill evaluation, enabling objective, high-frequency data collection in real-world conditions [
20]. As wearable technology advances, IMUs and GPS sensors are becoming increasingly central in delivering real-time, personalized feedback in both professional and consumer sports contexts [
21].
Furthermore, recent pedagogical research emphasizes that technological innovation in formative assessment must be grounded in both educational theory and subject-specific practice. Digital assessment tools can foster more individualized, inclusive, and efficient PE learning environments by enabling real-time feedback, self-regulation, and teacher support [
18]. Thus, integrating IMU into PE instruction aligns with current trends in evidence-based pedagogical reform. Baduanjin, a traditional Chinese qigong routine consisting of eight standardized movements, is widely used in school and community physical education for its safety, accessibility, and benefits to both physical and mental health [
22,
23,
24]. However, its accurate practice requires coordination of body posture, balance, and breathing rhythm—making it challenging to evaluate in large-group PE classes using subjective observation alone.
The present study aims to develop an objective movement assessment system using IMU to support formative assessment in Baduanjin-based PE. The system is designed to recognize motion sequences and evaluate motion accuracy based on predefined criteria, enabling teachers to receive accurate real-time feedback on students’ practice. This research not only contributes to the development of innovative assessment tools but also bridges modern technology and traditional physical culture. By establishing this objective assessment framework, we seek to empower physical educators with evidence-based tools to enhance teaching efficiency, support personalized learning, and improve overall instructional quality in PE.
2. Methods
2.1. Study Design and Participants
This study employed a quasi-experimental, cross-sectional design to evaluate whether an IMU-based motion capture system (Perception Neuron 2.0) can effectively differentiate Baduanjin motion accuracy across learners with different experience levels. A total of 20 undergraduate students (aged 18–22 years; 12 females and 8 males) from a university in Southwest China participated. They were assigned to either a Novice Group (n = 9) or an Experienced Group (n = 11) based on whether they had completed and passed the university’s Baduanjin course. Novice participants had no previous experience in Baduanjin and received a 30 min introductory session before motion capture. Expert grading was performed independently by two instructors with over ten years of Chinese martial arts teaching experience. All evaluation data were anonymized by replacing student names with randomly assigned participant IDs, and no identifying information was provided to the evaluators. The experts were blinded to participants’ experience levels and group assignments. Inclusion criteria required that participants had no previous formal instruction in Baduanjin, demonstrated the physical capacity to engage in low-to-moderate intensity exercise as verified through a pre-participation health screening, and provided written informed consent prior to enrolment. Participants were excluded if they presented with musculoskeletal injuries, cardiovascular disorders, or other medical conditions likely to interfere with safe participation; possessed advanced or competitive-level training experience in martial arts, Tai Chi, or Qigong; were unable to commit to the full schedule of instructional and assessment sessions; or accumulated an absence rate exceeding 20% over the course of the intervention. Ethical approval was obtained from the Research Ethics Committee of the University of Malaya (UM.TNC2/UMREC-558), and informed consent was collected from all participants.
2.2. Procedure
This research was divided into three stages (
Figure 1). Stage 1 assessed the capability of the IMU system to differentiate between motion accuracy levels: two groups of students with different levels of Baduanjin motion accuracy were recruited to verify that IMU MoCap could distinguish between them. Stage 2 involved the development and verification of methods for assessing and recognizing Baduanjin motions using a labeled dataset. Students and teachers from a university in southwestern China were recruited to create this dataset, and two commonly used types of motion recognition methods, sample-based and sequence-based, were verified. Stage 3 applied the selected methods to build a formative assessment system for Baduanjin PE and evaluated its efficacy in an actual Baduanjin course, in which students were recruited and their motions were recorded and tested while Baduanjin was taught using the system.
2.3. Stage 1: Verifying IMU Effectiveness in Discriminating Baduanjin Motion Accuracy
This study employed a commercial IMU system, the Perception Neuron 2.0 (Noitom Technology, Miami, FL, USA, 2018), which has been widely used in motion capture research [
25,
26]. To our knowledge, no prior research has investigated the use of an IMU to distinguish the motion accuracy of Baduanjin movements. Each participant completed three motion capture trials using the Perception Neuron 2.0 system, following standardized instructions and wearing the full-body IMU suit (
Figure 2). The captured data were processed to extract joint rotation data, which were then converted into quaternion format for analysis. The dynamic time warping (DTW) algorithm was used to compare student performance with the expert model, providing quantitative distance measures that represent motion accuracy.
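The Euler-to-quaternion step mentioned above can be illustrated with a minimal Python sketch. It is not the implementation used in this study, which was built in Matlab 2020b. The function names and the default channel order `"YXZ"` are assumptions made for illustration only; the actual rotation order should be read from the CHANNELS line of the BVH file being processed.

```python
import math

def axis_quaternion(axis, degrees):
    """Unit quaternion (w, x, y, z) for a rotation about a single coordinate axis."""
    half = math.radians(degrees) / 2.0
    s = math.sin(half)
    return (math.cos(half),
            s if axis == "X" else 0.0,
            s if axis == "Y" else 0.0,
            s if axis == "Z" else 0.0)

def quat_multiply(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def euler_to_quaternion(angles_deg, order="YXZ"):
    """Compose per-axis rotations in channel order (assumed order, see lead-in).

    angles_deg holds one angle per letter of `order`, in degrees.
    """
    q = (1.0, 0.0, 0.0, 0.0)  # identity rotation
    for axis, angle in zip(order, angles_deg):
        q = quat_multiply(q, axis_quaternion(axis, angle))
    return q

# Hypothetical joint rotation taken from one BVH frame.
joint_quaternion = euler_to_quaternion((30.0, 45.0, 10.0))
```

Because the result is a unit quaternion, each joint orientation has a representation that is free of the gimbal-lock configurations that affect Euler angles.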
Perception Neuron 2.0 contains multiple inertial sensor units, including a three-axis gyroscope, three-axis accelerometer, and three-axis magnetometer [
27]. A total of 17 inertial sensing units were used in the investigation. The motion data captured by Perception Neuron 2.0 are output as a Biovision Hierarchy (BVH) file generated by its supporting software (Axis Neuron Pro, version 3.8, Noitom Technology, Miami, FL, USA). The BVH file format was established to store skeleton information and motion data [
28]. In this research, the rotation data for each skeleton point in the BVH files were extracted to identify and assess motions. The BVH skeletons consist of 17 skeleton points, and rotation data are expressed in Euler angles. The rotation data were transformed into quaternions because Euler angles are subject to gimbal lock and singularities [
16,
24]. All participants completed three motion captures using Perception Neuron 2.0. The novice group was captured immediately after 30 min of initial Baduanjin learning. The captured motion data were converted into quaternions, and dynamic time warping (DTW) was used to calculate the distances between the standard motions (captured from the invited teacher) and the motions of the two student groups, as a measure of the students’ motion accuracy [
18]. Since the captured motions consist of 17 skeleton points, the DTW distances were calculated between corresponding skeleton points and then averaged. In addition, to avoid incorrect matching caused by excessive time warping in DTW, the global warping window was set to 10% of the entire window span, which constrains the warp path to stay near the diagonal of the matrix [
29].
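The windowed DTW comparison described above can be sketched in Python as follows. This is an illustrative sketch rather than the study's Matlab code, and it rests on several assumptions that the text does not specify: the local cost is taken to be the angle between unit quaternions, the band half-width is 10% of the longer sequence (widened when the two sequences differ in length so that a path exists), and the names `quat_distance`, `dtw_distance`, and `motion_distance` are hypothetical.

```python
import math

def quat_distance(p, q):
    """Angle in radians between two unit quaternions; q and -q count as the same rotation.

    Assumed local cost; the source does not state which one was used.
    """
    dot = abs(sum(a * b for a, b in zip(p, q)))
    return 2.0 * math.acos(min(1.0, dot))

def dtw_distance(seq_a, seq_b, window_ratio=0.10):
    """DTW distance between two quaternion sequences with a global band constraint."""
    n, m = len(seq_a), len(seq_b)
    # Band half-width: 10% of the longer sequence, never smaller than the
    # length difference, otherwise the final cell could not be reached.
    w = max(int(window_ratio * max(n, m)), abs(n - m), 1)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = quat_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def motion_distance(student, reference, window_ratio=0.10):
    """Average the per-joint DTW distances (one quaternion sequence per skeleton point)."""
    per_joint = [dtw_distance(s, r, window_ratio) for s, r in zip(student, reference)]
    return sum(per_joint) / len(per_joint)
```

Under this sketch, a smaller `motion_distance` means the student's motion is closer to the teacher's reference motion, and the band keeps the warp path near the diagonal of the cost matrix.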
2.4. Stage 2: Development and Verification of Motion Recognition Methods
In this research, two common types of methods (sample-based and sequence-based) were applied to assess motion accuracy and recognize the motions within Baduanjin. In the sample-based methods, features were extracted from the motion data and reduced in dimensionality to prevent data redundancy. In the sequence-based methods, keyframes were extracted to prevent data redundancy and reduce processing time. Both were supervised classification methods trained on the dataset of Baduanjin motions captured from students and teachers. Motions were assigned labels for motion accuracy or motion name, and the classifiers were trained on the motion data with these labels. Once the model parameters of a classifier had been trained, unlabeled motions could be classified. Experts in Baduanjin were invited to grade the motion accuracy of the captured motions, and their assessment results were used as the accuracy labels for training.
2.4.1. Developing a Dataset of Baduanjin Motions
In this research, the dataset of Baduanjin motions was captured from the 20 undergraduate students and 1 professional teacher recruited in the first stage and from 35 students recruited later. The dataset includes all eight standard Baduanjin motions, as described in Supplementary S1. The second batch of recruited students carried out the Baduanjin motions once. Two professional Chinese martial art teachers with more than ten years of experience teaching Baduanjin at the university were invited to assess the motion accuracy according to the videos recorded when capturing the students’ motions. The assessment applied the grading method used in the Baduanjin course, in which the motion accuracy of Baduanjin motions is divided into three grades: Fail, Pass, and Good. The Kendall test on the two teachers’ assessment results shows that the Kendall consistency for each of the eight Baduanjin movements is above 0.8, indicating that the teachers’ assessments are highly consistent.
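As an illustration of how such a consistency value can be computed, the sketch below implements Kendall's coefficient of concordance W with a correction for tied ranks. That the "Kendall test" reported here refers to W is an assumption, since the text does not name the exact statistic. The numeric coding Fail = 0, Pass = 1, Good = 2, the example grades, and all function names are hypothetical.

```python
def average_ranks(scores):
    """1-based ranks in which tied scores share the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i..j
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks

def tie_correction(scores):
    """Sum of (t^3 - t) over the groups of tied scores given by one rater."""
    counts = {}
    for s in scores:
        counts[s] = counts.get(s, 0) + 1
    return sum(t ** 3 - t for t in counts.values())

def kendalls_w(ratings):
    """Kendall's W for m raters scoring the same n items.

    ratings: one list of scores per rater. Raises ZeroDivisionError when
    every rater gives the same grade to all items.
    """
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(r) for r in ratings]
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2.0
    s = sum((t - mean_total) ** 2 for t in totals)
    denom = m ** 2 * (n ** 3 - n) - m * sum(tie_correction(r) for r in ratings)
    return 12.0 * s / denom

# Hypothetical grades from two teachers for five students on one motion.
teacher_a = [0, 1, 2, 2, 1]
teacher_b = [0, 1, 2, 1, 1]
agreement = kendalls_w([teacher_a, teacher_b])
```

The tie correction matters in this setting because a three-grade scale produces many tied ranks.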
2.4.2. The Sample-Based Methods
In sample-based methods, each motion is represented by features taken from motion data. Multiple classifier models are trained on the extracted features and motion labels and then used to classify unlabeled motions. Previous researchers have used time-domain, frequency-domain, and wavelet features to recognize motions [
30,
31]. This research used time-domain features: mean, variance, standard deviation, skewness, kurtosis, and quartile deviation. Since the motion data comprise 17 skeleton points, the number of features extracted from one motion was 17 × 3 × 6 = 306. The extracted features were normalized to the range [0, 1], and principal component analysis was used to reduce their dimensionality before model training [
32]. Based on the features and labels of motions, the classifiers, including k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Naive Bayes (NB), Logistic Regression, Decision Tree (DT), Back Propagation neural network (BPNN), Radial basis function neural network (RBFNN) and One-dimensional CNN (1D-CNN) were trained to assess and recognize motions. The sample-based methods in this research were constructed and verified using Matlab 2020b as the platform.
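The feature extraction and normalization steps can be sketched as follows. This is a minimal standard-library Python illustration, not the study's Matlab code. It assumes each frame is a flat list of 17 × 3 = 51 channel values, uses population moments for variance, skewness, and kurtosis, and relies on the default quartile method of `statistics.quantiles`; the source does not state which estimators were used. The principal component analysis and classifier training steps are omitted, and all function names are hypothetical.

```python
import statistics

def channel_features(values):
    """Six time-domain features of one channel (needs at least two frames)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # population variance (assumed)
    std = var ** 0.5
    if std > 0:
        skew = sum((v - mean) ** 3 for v in values) / n / std ** 3
        kurt = sum((v - mean) ** 4 for v in values) / n / std ** 4
    else:
        skew = kurt = 0.0  # constant channel: higher moments are undefined
    q1, _, q3 = statistics.quantiles(values, n=4)
    return [mean, var, std, skew, kurt, (q3 - q1) / 2.0]

def motion_features(motion):
    """Feature vector of one motion: 51 channels x 6 features = 306 values."""
    features = []
    for channel in zip(*motion):  # transpose frames into per-channel series
        features.extend(channel_features(list(channel)))
    return features

def min_max_normalize(rows):
    """Scale each feature column to [0, 1] across all motions in the dataset."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]
```

In a full pipeline, the normalized rows would then be passed to a dimensionality reduction step and a classifier such as k-NN or SVM.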
2.4.3. The Sequence-Based Methods
Unlike sample-based methods, sequence-based methods do not extract features but analyze the quaternion motion data directly as time-series data. Considering the limited storage space and bandwidth available to users in teaching settings, keyframes were extracted to reduce the volume of motion data and improve application adaptability. The study used k-means clustering to extract the number of keyframes corresponding to a preset compression rate [
33]. To confirm the efficacy of keyframe extraction and obtain a reasonable compression ratio, motions were reconstructed from the keyframes by interpolation and the motion reconstruction error was calculated; on this basis, the compression ratio was set to 15% [
34,
35]. Based on the keyframes and labels of motions, the sequence-based methods, including DTW combined with classifiers (k-NN, SVM, NB, Logistic Regression, DT, BPNN, and RBFNN), Hidden Markov Model (HMM), Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), and Gated Recurrent Units (GRUs), were applied to train the models to assess and recognize motions. The sequence-based methods in this research were constructed and verified using Matlab 2020b as the platform.
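The keyframe extraction and reconstruction check can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the study's procedure: k-means is initialized deterministically from evenly spaced frames, the frame nearest to each centroid is taken as a keyframe, the first and last frames are always kept so that interpolation spans the whole motion (so the keyframe count can slightly exceed the target), and reconstruction uses plain linear interpolation of frame vectors, whereas spherical interpolation would be more appropriate for quaternions. All function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_keyframes(frames, compression=0.15, iterations=20):
    """Return sorted indices of keyframes chosen by k-means over frame vectors."""
    n = len(frames)
    k = max(2, round(compression * n))
    dim = len(frames[0])
    # Assumed initialization: evenly spaced frames serve as starting centroids.
    centroids = [list(frames[round(i * (n - 1) / (k - 1))]) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for idx, frame in enumerate(frames):
            nearest = min(range(k), key=lambda c: squared_distance(frame, centroids[c]))
            clusters[nearest].append(idx)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(frames[i][d] for i in members) / len(members)
                                for d in range(dim)]
    keys = {0, n - 1}  # always keep both ends (assumption, see lead-in)
    for c, members in enumerate(clusters):
        if members:
            keys.add(min(members, key=lambda i: squared_distance(frames[i], centroids[c])))
    return sorted(keys)

def reconstruct(frames, keys):
    """Rebuild every frame by linear interpolation between neighboring keyframes."""
    rebuilt = []
    for left, right in zip(keys, keys[1:]):
        for i in range(left, right):
            t = (i - left) / (right - left)
            rebuilt.append([a + t * (b - a) for a, b in zip(frames[left], frames[right])])
    rebuilt.append(list(frames[keys[-1]]))
    return rebuilt

def reconstruction_error(frames, rebuilt):
    """Mean Euclidean distance between original and reconstructed frames."""
    return sum(squared_distance(f, r) ** 0.5 for f, r in zip(frames, rebuilt)) / len(frames)
```

Repeating this check over several values of `compression` and comparing the resulting errors is one way a compression ratio such as 15% could be chosen.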
2.5. Stage 3: Developing a Formative Assessment System and Taking Objective User Test
The objective user test was conducted to evaluate the effectiveness of the developed formative assessment system in Baduanjin PE and to examine whether the implemented motion assessment and recognition methods could be applied in a teaching context. Undergraduate students with no prior experience in Baduanjin, no physical disabilities, and no clinical or mental illnesses were recruited. All participants completed the eight-week Baduanjin PE curriculum, which comprised eight lessons. After each lesson, students were required to repeat the movements learned in that session. Their performance was evaluated both by the course instructor using traditional manual assessment and by the formative assessment system. The evaluation considered two aspects: the accuracy of each individual motion and the integrity of the motion sequence, including missing motions and sequence errors. This process was repeated after every lesson, and the results from the instructor and the system were compared at the end of the course to assess the system’s validity.
The formative assessment system was designed as a general framework for assessing motion accuracy and detecting missing or incorrectly ordered motions. The framework supports the integration of different classification algorithms; in the present implementation, the system served as a platform to test motion analysis methods without claiming optimality for any specific model. Further comparative studies with alternative algorithms will be undertaken to determine the most effective approach for this application. The system processes IMU-captured motion data in a continuous workflow. Motion data in BVH file format are imported, skeletal information is extracted, and joint rotation data are converted into quaternion format. Feature values are then computed and reduced in dimensionality using principal component analysis to minimize redundancy. The chosen classification algorithm is applied to assess individual motion accuracy and to recognize motion types. Recognized sequences are compared with a reference sequence to detect missing motions or ordering errors, and the results are output for further analysis and instructional feedback.
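The final sequence check can be illustrated with a short Python sketch. It assumes that the classifier has already produced a list of recognized motion names and that the reference sequence contains the motions taught so far. Using a longest common subsequence to separate missing motions from misplaced ones is an illustrative choice, not necessarily the method used in the system, and the names `check_sequence` and `Motion-N` are hypothetical.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of two lists."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    i = j = 0
    common = []
    while i < n and j < m:
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return common

def check_sequence(recognized, reference):
    """Report reference motions that are missing and performed motions that are out of order."""
    missing = [m for m in reference if m not in recognized]
    in_order = set(longest_common_subsequence(reference, recognized))
    misplaced = [m for m in recognized if m in reference and m not in in_order]
    return {"missing": missing, "misplaced": misplaced}

# Hypothetical Lesson 4 check: the student skips Motion-3.
reference = ["Motion-1", "Motion-2", "Motion-3", "Motion-4"]
report = check_sequence(["Motion-1", "Motion-2", "Motion-4"], reference)
```

A report of this form could be stored after every lesson to trace which motions a student forgets or performs in the wrong order over the course.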
2.6. Statistical Analysis
The calculated distances were analyzed in IBM SPSS Statistics 25.0, using the independent-samples t-test for normally distributed data and the Mann–Whitney U test for non-normally distributed data, to evaluate the significance of the differences in motion accuracy between motions
4. Discussion and Implications
This study aimed to address a pressing challenge in Chinese physical education—the difficulty of conducting formative assessment in large-class settings, particularly for traditional Chinese sports such as Baduanjin. The results demonstrated that commercial IMU-based motion capture, specifically Perception Neuron 2.0, could effectively distinguish between students of differing skill levels in terms of motion accuracy. The verification stage showed statistically significant differences in DTW-based distance metrics between novices and experienced practitioners, affirming the feasibility of using this technology to support formative assessments in physical education.
The results of Stage 1 show significant differences in the distances between each of the two student groups’ motions and the teacher’s motions, verifying that the motion data captured by the chosen commercial IMU MoCap system, Perception Neuron 2.0, could effectively distinguish Baduanjin motions with different motion accuracy (
Table 1 and
Table 2). Based on the verification results from Stage 1, Stage 2 developed and selected appropriate methods for assessing motion accuracy and recognizing the motions of Baduanjin. Two different types of methods (sample-based and sequence-based) were applied. Using the built dataset of Baduanjin motions, the results show that the sample-based
k-NN method was selected for assessing motion accuracy because of its high accuracy and short processing time. For recognizing motions, several methods achieved accuracy over 99%, and the chi-square test showed no significant difference between them in the current results; the sample-based SVM method was therefore selected on the basis of processing time. The findings clearly indicate that experienced students performed Baduanjin movements with significantly greater accuracy than novices. This aligns with previous research on skill acquisition in traditional and modern sports, which shows that experienced practitioners typically display smoother kinematic trajectories, reduced intra-movement variability, and enhanced postural control. Our results extend these observations to the context of Baduanjin and underscore that the level of practice and familiarity plays a key role in improving biomechanical execution. Moreover, the system successfully identified specific student learning difficulties, such as forgetting previously learned motions during the acquisition of new ones, highlighting its potential for real-time educational intervention. The formative assessment system for Baduanjin PE was developed using the methods for assessing motion accuracy and recognizing motions that were selected in the verification stage, and an objective user test of the system was then carried out. The test results show that the system’s accuracy in recognizing students’ motions reached 99.77%, and that the consistency (Kendall test) between the system and the teacher in assessing students’ motion accuracy exceeded 0.8. These results show that the developed formative assessment system effectively assesses motion accuracy and recognizes motions. In addition, using the formative assessment system, problems students face in the learning process can be detected immediately.
For example, using the recognized motions of students as a metric, the system reveals the problem of forgetting motions, which often occurs when learning Baduanjin. The system detected three students who forgot Motion 3 in Lesson 4. Lesson 4 requires students to learn new motions, which leads some students to forget previously learned motions. This phenomenon is seen for Student ID 1: the formative assessment system shows that the student was unable to remember Motion 3 (during Lesson 4) and Motion 6 (during Lesson 6). Lastly, the developed formative assessment system can trace the learning process. For Student ID 1, for example, the recorded results clearly show the progress in accuracy for all motions throughout the eight-week learning process. Therefore, the objective user test results indicate that the first-generation formative assessment system can assess students during the learning process and discover the mistakes they make.
Formative assessment refers to assessment intended to discover students’ problems during the learning process; it usually consists of a small number of items but requires frequent measurement. Formative assessment can show how well students are progressing and provide teachers with important information for managing instruction. In contrast, summative assessment does not consider students’ development, problems arising in the learning process, or feedback from teachers [
3]. While previous research on formative assessment tools has primarily focused on mainstream sports or dynamic activities, this study is one of the few to validate such technologies in slow, meditative traditional Chinese exercises. Compared to yoga, Tai Chi, or martial arts, which have also benefited from similar analyses, Baduanjin poses unique challenges due to its subtle motion characteristics and emphasis on internal flow. Despite this, our approach demonstrated similar efficacy in quantifying motion accuracy and identifying learning gaps. This reinforces the argument that motion tracking technologies are not limited to high-intensity sports and can be effectively adapted to diverse physical disciplines. Similar trends have been observed in studies on other sports, such as Tai Chi, yoga, or martial arts: experienced practitioners typically demonstrate smoother joint trajectories, reduced movement variability, and better postural stability [
36,
37]. Our study provides additional empirical support that motion capture-based assessments can distinguish between skill levels not only in dynamic or competitive sports but also in slow, meditative forms of exercise like Baduanjin. This contributes to the growing literature on the quantification of traditional Chinese exercises using wearable technology.
The results of this study hold significant implications for the field of physical education. Firstly, the successful development of a formative assessment system capable of real-time monitoring of student movements and accurate evaluation of motion accuracy, achieved through the use of commercial IMU technology, provides physical education teachers with an effective tool. This tool enables them to better understand students’ performance during the learning process, promptly identify issues, and offer personalized guidance and support. This contributes to enhancing teaching quality and enables students to develop their physical literacy more comprehensively. Secondly, the study results demonstrate the immense potential of commercial IMU technology in the context of traditional Chinese physical education. Despite limited research in this area, the study validates the effectiveness of this technology in recognizing motion accuracy. This provides a reliable theoretical basis for future efforts to promote the use of IMU MoCap technology in traditional Chinese physical education. Furthermore, the study confirms the practicality and feasibility of the formative assessment system. Through objective user testing, the system demonstrates excellent performance in student motion recognition and high consistency with teacher evaluations. This indicates that the system can serve as an effective tool to help teachers manage classrooms, provide personalized guidance to students, and promptly identify and address issues encountered during the learning process. Practical implications of this study extend to physical education and health promotion. The ability to objectively assess movement quality using wearable sensors offers a promising tool for PE teachers to provide personalized instruction and track student progress. Additionally, the framework developed here may be adapted for other sports or rehabilitation programs where form and accuracy are crucial. 
For example, similar motion evaluation systems could be applied in yoga, gymnastics, or elderly balance training, aligning well with current trends in digitized, data-driven sport pedagogy [
38,
39].
Although the present study focused on Baduanjin movements, the proposed IMU-based formative assessment framework is not limited to this discipline. With appropriate adjustments to the reference motion library and accuracy criteria, the system could be applied to assess other structured movement practices, such as Tai Chi, yoga, and martial arts, as well as technical skills in sports like gymnastics, golf, and swimming. Beyond sports and traditional exercises, the approach could be extended to rehabilitation and physical therapy settings to monitor patients’ progress, provide objective feedback, and support remote or home-based training. These examples illustrate the potential for broader application of this technology in movement assessment and skill acquisition monitoring across diverse domains.
This study has several limitations that should be considered when interpreting the findings. The sample consisted of full-time students from a single institution, which may limit the generalizability of the results to other populations. Future research should recruit participants from a wider range of age groups, backgrounds, and physical ability levels to improve external validity. The IMU-based assessment system was tested in a controlled indoor setting, and its performance under different environmental conditions was not evaluated. Further studies should assess its reliability and accuracy in varied real-world contexts, such as outdoor classes or settings with limited space. The evaluation framework focused primarily on kinematic accuracy and did not incorporate physiological or perceptual measures, which could provide a more comprehensive understanding of learning outcomes. Future work could integrate motion data with physiological indicators and self-reported measures. Finally, while the system demonstrated potential for enhancing formative assessment, the need for technical setup and calibration may hinder large-scale adoption in schools with limited resources. Streamlining system deployment and automating data processing should be priorities for future development to support broader implementation.