Skill Classification of Youth Table Tennis Players Using Sensor Fusion and the Random Forest Algorithm

Sheu, Yung-Hoh; Huang, Cheng-Yu; Tai, Li-Wei; Tai, Tzu-Hsuan; Wu, Sheng K.

doi:10.3390/bdcc10020062


by Yung-Hoh Sheu ¹, Cheng-Yu Huang ¹, Li-Wei Tai ¹, Tzu-Hsuan Tai ¹ and Sheng K. Wu ²,*

¹ Computer Science & Information Engineering, National Formosa University, Yunlin 632, Taiwan

² Department of Sport Performance, National Taiwan University of Sport, Taichung 404, Taiwan

* Author to whom correspondence should be addressed.

Big Data Cogn. Comput. 2026, 10(2), 62; https://doi.org/10.3390/bdcc10020062

Submission received: 10 December 2025 / Revised: 24 January 2026 / Accepted: 13 February 2026 / Published: 15 February 2026

(This article belongs to the Section Artificial Intelligence and Multi-Agent Systems)


Abstract

This study addresses the issue of inaccurate results in traditional table tennis player classification, which is often influenced by subjective judgment and environmental factors, by proposing a youth table tennis player classification system based on sensor fusion and the random forest algorithm. The system utilizes an embedded intelligent table tennis racket equipped with an ICM20948 nine-axis sensor and a wireless transmission module to capture real-time acceleration and angular velocity data during players’ strokes while synchronously employing a camera with OpenPose to extract joint angle variations. A total of 40 players’ stroke data were collected. Due to the limited sample size of top-tier players, the Synthetic Minority Over-sampling Technique (SMOTE) was applied, resulting in a final dataset of 360 records. Multiple key motion indicators were then computed and stored in a dedicated database. Experimental results showed that the proposed system, powered by the random forest algorithm, achieved a classification accuracy of 91.3% under conventional cross-validation, while subject-independent LOSO validation yielded a more conservative accuracy of 70.89%, making it a valuable reference for coaches and referees in conducting objective player classification. Future work will focus on expanding the dataset of domestic high-performance athletes and integrating precise sports science resources to further enhance the system’s performance and algorithmic models, thereby promoting the scientific selection of national team players and advancing the intelligent development of table tennis.

Keywords: IMU sensor; table tennis; random forest

1. Introduction

With the rapid development of sports technology, athlete performance assessment and classification have gradually shifted from traditional experience-based judgments to data-driven and AI-assisted intelligent approaches [1]. Classification systems play an essential role in competitive sports, influencing training planning, performance management, competition grouping, and fairness in athlete selection. However, most existing classification methods still rely heavily on subjective judgments by referees or coaches, making the results vulnerable to external factors such as venue conditions or on-site performance, thereby reducing consistency and objectivity.

Table tennis is characterized by fast and delicate movements, and subtle technical differences are often too fine to be distinguished by the naked eye. As a result, traditional classification methods based on age or ranking cannot accurately represent actual technical proficiency. Establishing a systematic, standardized, and objective classification mechanism would therefore not only enhance fairness but also serve as a valuable reference for training and potential assessment.

Existing studies mainly focus on single-modality data, such as deep-learning-based pose estimation [2,3] or inertial measurement unit (IMU) sensor recordings [4]. However, relying on a single modality is insufficient to capture the full complexity of movement characteristics. Although IMUs can capture acceleration and angular velocity with real-time responsiveness and convenience, they lack spatial reference. Conversely, vision-based methods are susceptible to lighting and camera angles and often struggle to quantify speed or force with precision—issues that critically affect evaluation accuracy in fast-paced sports such as table tennis. Additionally, most existing research focuses on adult athletes or specific stroke types and lacks datasets and analyses concerning youth players, whose movement patterns are more variable. Insufficient samples may lead to model instability and poor generalization [5]. Furthermore, current systems typically provide only action classification without linking results to technical indicators or training recommendations, limiting their practical utility.

To address these gaps, this study proposes an intelligent classification system for table tennis players by integrating embedded sensors, computer vision, and machine learning. The system utilizes a smart table tennis racket equipped with an ICM20948 nine-axis IMU sensor [6] to capture acceleration and angular velocity during strokes. IMU-based approaches have shown promise in racket sports; for instance, Yang et al. developed the TennisMaster system capable of real-time motion analysis in tennis [7]. However, IMUs alone cannot fully capture detailed posture variations. Therefore, this study additionally incorporates OpenPose-based human pose estimation [8] to extract temporal changes in key joints such as the shoulders, elbows, and hips. By combining sensor data and visual information through multimodal fusion with temporal synchronization and feature integration, the system provides a more comprehensive and objective representation of player stroke characteristics. Supported by machine learning models, this approach is expected to overcome the limitations of subjective evaluation and enhance classification accuracy and scientific rigor.

During data processing, sensor and pose data were synchronized, segmented, and processed for feature extraction. SMOTE was applied during training to mitigate class imbalance and reduce model bias [9]. A Random Forest classifier [10,11] was used to construct the classification model, and its performance was compared with SVM, XGBoost, and Logistic Regression [12,13].

A dataset consisting of 40 youth table tennis players was collected to establish a comprehensive stroke database. Experimental results demonstrate that the proposed multimodal system achieved a classification accuracy of 91.3% under conventional cross-validation, confirming the feasibility and reliability of multimodal fusion for technical evaluation. The system effectively compensates for subjective scoring limitations and enhances the scientific and fair nature of player classification. Future work will expand the dataset and refine the model to further advance intelligent training applications in table tennis.

2. System Principle

2.1. Embedded Smart Table Tennis Racket

The smart table tennis racket developed in this study integrates an embedded motion-sensing and wireless transmission system, as illustrated in Figure 1. The core processing unit is an NUC240SE3AE microcontroller (Nuvoton Technology Corporation of America, America), which is responsible for sensor data acquisition and communication with external devices. Motion signals are captured using an ICM20948 inertial measurement unit (IMU), which provides nine-axis measurements, including tri-axial acceleration, tri-axial angular velocity, and tri-axial magnetic field data.

Although the ICM20948 sensor (TDK InvenSense, Taiwan) supports full nine-axis sensing, only six-axis inertial signals—comprising three-axis acceleration and three-axis angular velocity—were utilized in subsequent data processing and feature extraction. Magnetometer data were intentionally excluded because indoor table tennis environments are prone to magnetic disturbances caused by surrounding metallic structures and electronic equipment, which may introduce instability and measurement noise. Furthermore, acceleration and angular velocity signals are more directly associated with swing dynamics and temporal motion characteristics, making them more suitable for stroke analysis and skill classification. Accordingly, a six-axis inertial data configuration was adopted to enhance signal robustness and analytical reliability.

In addition, the system incorporates a physical button and an RGB LED module to support basic human–machine interactions, such as device pairing and operational status indication.

The racket is powered by a lithium battery regulated through an AP2112 voltage regulator (Diodes Incorporated, Taiwan) to provide a stable 3.3 V system output. A TP4059 charging controller is included to support lithium battery charging via an external USB Type-C 5 V power source. The overall design ensures reliable data acquisition and efficient power management, enabling long-duration stable operation suitable for practical training environments.

Compared with other racket-type sports equipment, table tennis rackets have a significantly smaller form factor, and the limited internal space of the handle poses challenges for embedding sensing devices without compromising grip comfort. Commercial development boards are typically too large to be directly integrated into a table tennis racket. To address this issue, this study designed a custom 43 × 15 mm dual-layer circuit board that integrates both sensing and communication components and can be embedded inside the wooden handle. This design maintains structural integrity, weight balance, and user comfort while ensuring functional performance.

2.2. OpenPose-Based Human Pose Estimation

OpenPose is a real-time human pose estimation system developed by Carnegie Mellon University, capable of extracting multiple human body keypoints from static images or video sequences, including those of the head, torso, and limbs, and representing them through a skeleton-based structural model. Built on deep learning techniques, OpenPose integrates a Convolutional Neural Network (CNN) with the Part Affinity Field (PAF) framework to simultaneously predict joint locations and limb-connection vector fields, achieving high accuracy and efficiency in multi-person pose estimation.

Compared with traditional image-feature-based methods, OpenPose demonstrates superior robustness and real-time performance, maintaining stable operation under varying backgrounds, lighting conditions, and even partial occlusion. Its output includes the two-dimensional coordinates and confidence scores of each keypoint, and it has been widely applied in fields such as human motion analysis, sports performance evaluation, rehabilitation training, interactive entertainment, and human–computer interaction.

In recent years, with the rapid advancement of sports technology, OpenPose has been increasingly adopted in studies focused on posture correction, athlete technique analysis, and performance monitoring, further enhancing the scientific and intelligent development of sports training.

2.3. Random Forest Algorithm

Random Forest is a supervised machine learning algorithm constructed based on the principles of ensemble learning, and it is widely applied to both classification and regression tasks. The method consists of multiple decision trees trained through bootstrap sampling, where each tree selects the optimal splitting condition from a randomly chosen subset of features during node division. This mechanism increases model diversity and reduces the risk of overfitting. Compared with a single decision tree, which is often sensitive to specific data distributions, Random Forest aggregates multiple weak learners and employs a majority voting strategy to effectively average prediction errors, thereby enhancing generalization performance and stability.
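The three ingredients described above (bootstrap sampling, a random feature subset at each split, and majority voting) can be illustrated with a minimal sketch. The function names and the subset size used in the test are hypothetical and chosen only for illustration; this is not the implementation used in the study.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the training set of one tree)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider at one node split."""
    return sorted(rng.sample(range(n_features), k))


def majority_vote(predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]
```

Each tree sees a different resampled dataset and a different feature subset at each node, which is the source of the diversity that the voting step then averages over.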

Random Forest is particularly suitable for handling high-dimensional datasets with heterogeneous features and noise, and it offers advantages such as fault tolerance and intrinsic feature importance evaluation. In the field of sports technology, the algorithm has been widely utilized for sensor data processing, activity classification, technical movement assessment, and intelligent training systems. For instance, when analyzing motion data related to swinging, running, or throwing, Random Forest can effectively identify action types and technical proficiency levels, while also providing quantitative indicators to support training decisions and classification tasks.

2.4. SMOTE Technology

The Synthetic Minority Over-sampling Technique (SMOTE) is a simple yet effective data augmentation method proposed to address the problem of class imbalance. Developed by Professor Nitesh V. Chawla of the University of Notre Dame, SMOTE generates new synthetic samples rather than merely duplicating existing minority class instances, thereby increasing data diversity and expanding the feature distribution of the minority class.

However, SMOTE has certain limitations. When the number of original minority samples is extremely small, the linear interpolation mechanism used to synthesize new samples may produce data points that deviate from the true underlying distribution or even introduce noise, potentially affecting the accuracy of the classification model. To mitigate these issues, subsequent studies have proposed improved variants such as Borderline-SMOTE and SMOTEENN, which aim to enhance applicability and improve the representativeness of the generated samples.
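The linear interpolation step mentioned above can be sketched as follows. This is a simplified illustration of the core idea (one synthetic point between a minority sample and one of its nearest minority neighbours), not a full SMOTE implementation; the function name, the default k, and the fixed random seed are assumptions made for this example.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and a near neighbour."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority samples by squared Euclidean distance to the base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    lam = rng.random()  # interpolation weight in [0, 1)
    return [a + lam * (b - a) for a, b in zip(base, neighbour)]
```

Because every synthetic point lies on a line segment between two existing samples, a very small minority class gives the method few segments to draw from, which is the limitation described above.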

3. System Design

The system design of this study aims to build an intelligent platform that simultaneously captures swing data and pose features by integrating an IMU with OpenPose human pose estimation technology. The overall workflow consists of three core modules, namely data collection, data processing, and algorithm modeling, all integrated around a central database.

First, the embedded IMU records the acceleration and angular velocity of the table tennis racket during swings. Simultaneously, a camera coupled with the OpenPose algorithm extracts the dynamic changes in key joint angles of the player’s swing, forming the image-based data. Next, the two data streams are synchronized and features are extracted, creating a multimodal dataset that combines sensor and image features and enhances the completeness and accuracy of motion analysis. Finally, the system employs a Random Forest algorithm for training and testing the classification model, enabling intelligent assessment of player skill levels.

3.1. Data Collection

To establish a precise and representative player dataset, this study invited professional table tennis players from the National Taiwan University of Sport to participate in swing experiments. This study focused on forehand strokes as the primary action, requiring each player to perform 27 swings per trial, with multiple repeated sessions to ensure sufficient and stable data.

Sensor data were captured in real time by the embedded IMU in the smart racket and transmitted via an RF wireless module to a computer application for storage and management, ultimately creating an individual dataset for each participant to facilitate subsequent feature analysis and modeling.

Simultaneously, cameras were set up to record the swing process at 60 Hz. OpenPose was then used to estimate the player’s pose, extracting the angles of five key joints: left and right elbows, left and right shoulders, and hips. These joints were selected because the shoulders drive upper-limb power, the elbows influence stroke accuracy and swing trajectory, and the hips are central to force transfer and body coordination. This combination effectively represents the main motion characteristics of a table tennis stroke, balancing conciseness with analytical value.
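The text does not state how joint angles were derived from the OpenPose keypoints. One common approach, shown here as an assumption, is to take the angle at the middle keypoint of a three-point chain (for example shoulder, elbow, wrist) from the 2D coordinates. The function name is hypothetical.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle in degrees at keypoint b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 on |cross| and dot is numerically stable near 0 and 180 degrees.
    return math.degrees(math.atan2(abs(cross), dot))
```

Applying this function frame by frame to each of the five joints would give one angle waveform per joint.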

3.2. Data Processing

The data processing workflow of this study primarily consists of sensor waveform segmentation and calculation, OpenPose image data conversion and alignment, and timestamp synchronization.

In the sensor data processing, this study utilized signals measured by an IMU, with Z-axis acceleration selected as the primary analysis axis. The Z-axis, defined as perpendicular to the plane of the racket swing, was used to reflect the velocity and displacement changes in the arm during each stroke. Waveforms were segmented by detecting peaks and troughs, resulting in 27 independent stroke waveforms. For each waveform, four feature indicators—efficiency deviation, mean efficiency, standard deviation, and coefficient of variation—were calculated. These metrics were designed not to quantify physiological efficiency in a strict biomechanical sense, but rather to provide consistency measures reflecting the temporal stability and repeatability of swing motions.

Prior to processing, the raw data returned by the IMU were converted to physical units. The ICM-20948 sensor used in this study employs a 16-bit analog-to-digital converter (ADC) with a signed output range of −32,768 to 32,767. The sensor therefore outputs integer counts that do not directly correspond to physical quantities. To ensure accurate analysis, the raw data were scaled according to the sensor's full-scale range using the formula n × 16/32,768, allowing the measurements to correctly reflect the actual magnitudes of acceleration and angular velocity.
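The conversion can be written as a one-line function. The sketch below assumes the divisor is 2^15 = 32,768 (the number of counts on one side of a signed 16-bit range) and takes the factor 16 from the formula in the text. Interpreting that factor as a ±16 g accelerometer full-scale setting is an assumption; a gyroscope channel would need its own full-scale value.

```python
FULL_SCALE = 16   # factor from the formula in the text (assumed to be +/-16 g)
COUNTS = 32768    # 2**15, counts per side of a signed 16-bit range


def raw_to_physical(n, full_scale=FULL_SCALE):
    """Convert a signed 16-bit raw count n to physical units."""
    return n * full_scale / COUNTS
```

For example, a raw count of 16,384 (half of full scale) maps to 8.0 under these assumptions.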

Even after data conversion, the raw signals may contain high-frequency interference due to environmental vibration, minor wrist tremors, or electronic noise, resulting in irregular waveform fluctuations that can affect the accuracy of motion recognition and feature calculation. Therefore, the signals were further filtered to improve data quality. The unfiltered raw waveforms are shown in Figure 2, where pronounced high-frequency oscillations and noise are observable.

A Butterworth low-pass filter was applied to smooth the signals, with a cutoff frequency of 5 Hz, a sampling frequency of 85 Hz, and a fourth-order configuration to balance response speed and smoothness. Forward-backward filtering was employed to avoid phase delay, ensuring that the timing of stroke peaks remained intact. The filtered waveforms are presented in Figure 3, showing a significant reduction in high-frequency noise and clearer main waveform features, which facilitates subsequent feature extraction and velocity analysis.
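A minimal sketch of zero-phase low-pass filtering with the stated parameters (5 Hz cutoff, 85 Hz sampling, fourth order) follows. To stay self-contained it builds the fourth-order Butterworth filter as two cascaded second-order sections and runs them forward and then backward in plain Python, without the edge padding that library routines such as SciPy's `filtfilt` apply. All function names are hypothetical, and the study's actual implementation is not given in the text.

```python
import math


def butter4_lowpass_sections(cutoff_hz, fs_hz):
    """Two normalized biquads whose cascade is a 4th-order Butterworth low-pass."""
    w0 = 2.0 * math.pi * cutoff_hz / fs_hz
    cos_w0, sin_w0 = math.cos(w0), math.sin(w0)
    sections = []
    for k in (1, 3):  # Butterworth pole pairs at angles pi/8 and 3*pi/8
        q = 1.0 / (2.0 * math.cos(k * math.pi / 8.0))
        alpha = sin_w0 / (2.0 * q)
        a0 = 1.0 + alpha
        b = [(1.0 - cos_w0) / (2.0 * a0), (1.0 - cos_w0) / a0, (1.0 - cos_w0) / (2.0 * a0)]
        a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
        sections.append((b, a))
    return sections


def run_biquad(b, a, x):
    """Apply one second-order section (direct form I, zero initial state)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def zero_phase_lowpass(x, cutoff_hz=5.0, fs_hz=85.0):
    """Filter forward, then backward, so the phase delays cancel."""
    sections = butter4_lowpass_sections(cutoff_hz, fs_hz)
    y = list(x)
    for _ in range(2):  # pass 1: forward; pass 2: on the reversed signal
        for b, a in sections:
            y = run_biquad(b, a, y)
        y.reverse()
    return y
```

Because the signal passes through the filter twice, the magnitude response is squared, so the attenuation at the cutoff is about 6 dB rather than 3 dB. Whether "fourth-order" in the text refers to the single-pass design or the combined response is not specified.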

For waveform segmentation, it was necessary to first identify corresponding cutting points, which required feature point detection. Potential troughs were detected using an inverted peak-finding method (negating the signal) and a minimum distance threshold was applied to prevent a single stroke from being mistakenly identified as multiple feature points. To eliminate noise-induced false signals, only troughs with negative values were retained, consistent with the physical interpretation of the racket’s backswing motion.

To determine the final 27 valid stroke feature points, candidate troughs were sorted by depth and the minimum distance threshold was adjusted iteratively. The threshold was gradually reduced, and at each step, the system checked whether exactly 27 appropriately spaced, non-overlapping troughs could be selected. As shown in Figure 4, once 27 qualified troughs were successfully detected, they were designated as the final stroke feature points and arranged in chronological order for subsequent feature calculation and stroke analysis.
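The trough selection described in the two paragraphs above can be sketched as a greedy search. Local minima with negative values are ranked deepest first, and the minimum spacing is lowered step by step until the required number of well-separated troughs is found. The function names, the starting distance, and the greedy selection rule are assumptions made for this illustration.

```python
def find_negative_troughs(signal):
    """Indices of local minima whose value is below zero."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] < signal[i - 1] and signal[i] <= signal[i + 1] and signal[i] < 0
    ]


def select_strokes(signal, n_strokes=27, start_distance=100, min_distance=1):
    """Return (sorted trough indices, spacing used), or None if no spacing works."""
    # Deepest troughs first, so the most prominent strokes are kept.
    candidates = sorted(find_negative_troughs(signal), key=lambda i: signal[i])
    for distance in range(start_distance, min_distance - 1, -1):
        chosen = []
        for idx in candidates:
            if all(abs(idx - c) >= distance for c in chosen):
                chosen.append(idx)
                if len(chosen) == n_strokes:
                    return sorted(chosen), distance
    return None
```

Starting from a large spacing and relaxing it favours the widest separation that still yields the expected stroke count, which reduces the chance of counting one stroke twice.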

After identifying the trough feature points, each Z-axis acceleration waveform was segmented and features were extracted. Zero-crossing points were used to determine where the signal transitioned from positive to negative or vice versa, representing natural boundaries of the stroke waveforms. For each detected trough, the system identified the nearest subsequent zero-crossing point as the endpoint of that stroke; if no zero-crossing point existed after the trough, the end of the signal was used to ensure that each waveform fully encompassed the entire stroke motion. Figure 5 illustrates that the segment between the two markers represents a complete segmented stroke waveform.
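The endpoint rule described above (nearest zero-crossing after the trough, otherwise the end of the signal) can be sketched as follows. The function name is hypothetical, and treating a sample that is exactly zero as a crossing is an assumption.

```python
def stroke_end(signal, trough_idx):
    """Index of the first zero-crossing after trough_idx, else the last index."""
    for i in range(trough_idx, len(signal) - 1):
        # A sign change between consecutive samples, or an exact zero, is a crossing.
        if signal[i] * signal[i + 1] < 0 or signal[i + 1] == 0:
            return i + 1
    return len(signal) - 1
```

Slicing the waveform between a trough and the index returned here gives one segmented stroke.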

Average Efficiency

\frac{1}{N} \sum_{i=1}^{N} \mathrm{DEV}_{i}

(1)

Efficiency Deviation

\frac{E_{1} - E_{5}}{E_{1}} \times 100\%

(2)

Standard Deviation

\sqrt{\frac{\sum_{i=1}^{5} (E_{i} - \mathrm{EFF}_{\mathrm{Avg}})^{2}}{5}}

(3)

Coefficient of Variation

\frac{\mathrm{SD}}{\mathrm{EFF}_{\mathrm{Avg}}} \times 100\%

(4)
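The four indicators in Equations (1) to (4) can be computed with a few lines of code. The sketch below makes two assumptions that the text does not confirm: the per-segment values written as DEV_i and E_i are treated as the same quantity, and the fixed count of five is generalized to the length of the input list. The function and key names are hypothetical.

```python
import math


def consistency_features(values):
    """Return the four indicators of Equations (1)-(4) for one signal channel."""
    n = len(values)
    mean = sum(values) / n                                     # Eq. (1)
    deviation = (values[0] - values[-1]) / values[0] * 100.0   # Eq. (2): first vs. last
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)   # Eq. (3): divides by n
    cv = sd / mean * 100.0                                     # Eq. (4)
    return {
        "mean_efficiency": mean,
        "efficiency_deviation": deviation,
        "standard_deviation": sd,
        "coefficient_of_variation": cv,
    }
```

With the values 10, 9, 8, 7, 6, the mean is 8, the efficiency deviation is 40%, the standard deviation is the square root of 2, and the coefficient of variation is about 17.7%.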

During data acquisition, the IMU signals from the smart table tennis racket and the video data used for OpenPose-based joint angle estimation were collected independently and were not hardware-synchronized. As a result, temporal misalignment between the two sensing modalities may occur due to manual recording and system latency.

To address this issue, a post-processing temporal alignment procedure was applied during data processing. Specifically, the first detected swing motion was used as a common reference point for synchronization. For the IMU data, the onset of the initial swing was identified based on the prominent peak in the Z-axis acceleration signal. For the vision-based data, the corresponding initial motion was determined from temporal changes in joint kinematics. The time stamps of these initial motion events were aligned, and the subsequent data streams were synchronized accordingly.
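The alignment step amounts to estimating one time offset from the first swing and applying it to every later sample. The sketch below assumes that the IMU stream runs at the 85 Hz quoted for the filter and the video at the 60 Hz stated for the camera, and that the onset of the first swing has already been located in both streams. The function and parameter names are hypothetical.

```python
IMU_RATE_HZ = 85.0    # sampling frequency quoted for the Butterworth filter
VIDEO_RATE_HZ = 60.0  # camera frame rate stated in Section 3.1


def imu_index_to_video_frame(imu_idx, imu_onset_idx, video_onset_frame):
    """Map an IMU sample index to the video frame showing the same instant."""
    # Offset between the two clocks, estimated from the first swing.
    offset_s = imu_onset_idx / IMU_RATE_HZ - video_onset_frame / VIDEO_RATE_HZ
    video_time_s = imu_idx / IMU_RATE_HZ - offset_s
    return round(video_time_s * VIDEO_RATE_HZ)
```

For instance, if the first swing starts at IMU sample 170 (2.0 s) and video frame 90 (1.5 s), the offset is 0.5 s, and IMU sample 255 (3.0 s) corresponds to video frame 150 (2.5 s). A single offset does not correct clock drift between the two devices over a long recording.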

As shown in Figure 6, based on the Z-axis acceleration segmentation points, the joint signals from OpenPose are marked in orange and green according to peak and trough values, respectively. The angular waveforms of the five key joints captured by OpenPose were synchronously segmented to ensure correspondence between sensor data and image data for each individual swing. The same four statistical feature indicators were then calculated for each segmented waveform. Finally, the resulting 24-feature dataset was compiled and imported into the database to serve as the basis for Random Forest-based classification in subsequent player skill assessment.
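The count of 24 features is consistent with six signal channels (the Z-axis acceleration plus five joint angles) multiplied by four indicators per channel. The sketch below shows one way to lay out such a vector. The channel names, indicator names, and ordering are hypothetical, since the text does not specify them.

```python
CHANNELS = [
    "imu_acc_z",
    "left_elbow", "right_elbow",
    "left_shoulder", "right_shoulder",
    "hip",
]
STATS = [
    "efficiency_deviation",
    "mean_efficiency",
    "standard_deviation",
    "coefficient_of_variation",
]

# One name per (channel, indicator) pair: 6 x 4 = 24 features.
FEATURE_NAMES = [f"{channel}__{stat}" for channel in CHANNELS for stat in STATS]


def build_feature_vector(stats_by_channel):
    """Flatten {channel: {indicator: value}} into a list ordered like FEATURE_NAMES."""
    return [stats_by_channel[channel][stat] for channel in CHANNELS for stat in STATS]
```

Keeping a fixed name list alongside the vector makes it straightforward to label feature importance scores later.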

3.3. Algorithm Design

This study primarily employed the Random Forest algorithm for player skill classification. Random Forest is an ensemble learning method that constructs multiple decision trees and determines the final output through majority voting. This approach effectively reduces the instability caused by the bias or noise of a single decision tree, making it particularly suitable for handling the high-dimensional and heterogeneous data in this study, which included both sensor-based and pose-estimation features.

During model training, hyperparameter optimization was performed using Python 3.11. Different combinations of parameters were recursively tested to identify the optimal model configuration. The final selected hyperparameters were as follows: maximum tree depth (max_depth) of 20; minimum number of samples required to split a node (min_samples_split) of 2; minimum number of samples required at a leaf node (min_samples_leaf) of 1; and a total of 600 decision trees (n_estimators = 600) in the Random Forest. This configuration allowed the model to sufficiently learn diverse feature patterns while maintaining a balance between computational efficiency and classification accuracy.
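A parameter search of this kind can be sketched as an exhaustive loop over candidate combinations. In the sketch below, only the reported final configuration comes from the text. The candidate grid, the function names, and the `evaluate` callback (which would normally return a cross-validated accuracy) are hypothetical. The parameter names are the ones quoted in the text, and they coincide with the keyword arguments of scikit-learn's `RandomForestClassifier`, although the text does not name the library used.

```python
from itertools import product

# Final configuration reported in the text.
REPORTED_PARAMS = {
    "n_estimators": 600,
    "max_depth": 20,
    "min_samples_split": 2,
    "min_samples_leaf": 1,
}

# Hypothetical candidate values; the searched ranges are NOT given in the text.
GRID = {
    "n_estimators": [200, 400, 600],
    "max_depth": [10, 20, None],
    "min_samples_split": [2, 5],
    "min_samples_leaf": [1, 2],
}


def grid_search(grid, evaluate):
    """Try every combination in grid and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # e.g. mean cross-validated accuracy
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

The number of model fits grows as the product of the list lengths (36 in this hypothetical grid), which is why such grids are usually kept small.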

To prevent overfitting, the dataset was split into training and testing sets, and model generalization was further evaluated using cross-validation. Additionally, the Random Forest algorithm provides feature importance rankings, enabling researchers to assess the relative contribution of each feature to skill classification. These insights can serve as references for subsequent training guidance or motion optimization.

4. Experiments and Results

4.1. Dataset Construction and Experimental Protocol

This study collected forehand swing motion data from 40 table tennis players across different skill levels. Each participant performed up to four forehand strokes under a controlled experimental protocol. After excluding incomplete or low-quality recordings, a total of 158 valid swing instances were retained for analysis. Each swing instance was treated as an individual sample for feature extraction and represented by a 24-dimensional feature vector extracted from synchronized inertial and vision-based signals, while subject-level dependency was explicitly addressed through LOSO validation. The participants were categorized into three skill levels as summarized in Table 1, which provides the criteria used for this classification.

To address class imbalance among skill groups, the Synthetic Minority Over-sampling Technique (SMOTE) was applied exclusively to the training data during model development. SMOTE-generated samples were not used for testing or performance evaluation. All reported test results were obtained using original, non-synthetic samples to avoid data leakage and preserve evaluation independence. Therefore, SMOTE was employed as a training-time data balancing strategy rather than as a means to increase the effective dataset size.

Two complementary evaluation protocols were adopted. First, conventional cross-validation was used to compare different machine learning models and assess the discriminative capability of the extracted features. Second, leave-one-subject-out (LOSO) cross-validation was conducted as a subject-independent validation strategy to evaluate generalization performance on completely unseen players.

4.2. Cross-Validation Results and Model Comparison

Under the conventional cross-validation setting, the proposed Random Forest-based framework achieved an overall classification accuracy of 91.3%. This evaluation protocol was primarily designed to analyze feature effectiveness and to compare the performance of different classification models under balanced training conditions (Table 2).

Figure 7 presents the confusion matrix obtained from a representative train–test split, in which all test samples were correctly classified under this specific evaluation setting. While this result demonstrates strong feature separability under controlled conditions, it should be emphasized that this performance corresponds to a swing-instance-level split and does not represent subject-independent generalization. Consequently, the observed 100% accuracy should be interpreted cautiously and should not be considered indicative of real-world deployment performance.

Comparative experiments were conducted using XGBoost, Logistic Regression, Gradient Boosting, and SVC. To ensure a fair comparison, hyperparameters for all competing models were either empirically tuned or set according to commonly adopted best practices (Table 3). Across all tested models, the Random Forest classifier consistently achieved competitive accuracy and stable performance, supporting its suitability for the proposed skill classification task.

4.3. Feature Importance Analysis and System Robustness

To further investigate the contribution of individual features to classification performance, feature importance scores were derived from the trained Random Forest model. The resulting feature importance distribution is illustrated in Figure 8, with quantitative comparisons summarized in Table 4.

As shown in Figure 8, Z-axis acceleration and right-shoulder joint angle features exhibited the highest importance values among all extracted parameters. These features capture fundamental swing dynamics and temporal motion characteristics that are less sensitive to player-specific styles. The dominance of such generalizable motion features partially explains why the proposed system maintains reasonable performance under subject-independent validation, as will be further discussed in the subsequent LOSO analysis.

Figure 9 illustrates the multimodal system architecture and sensing redundancy. The integration of inertial and vision-based modalities provides complementary information for skill assessment. In scenarios where one sensing modality experiences degraded signal quality, the remaining modality retains discriminative capability. This multimodal design enhances system robustness and contributes to the stability observed in both cross-validation and subject-independent evaluations.

4.4. Subject-Independent Validation Using LOSO

To rigorously evaluate generalization performance under unseen-subject conditions, LOSO cross-validation was employed. In this protocol, data from one player were entirely excluded from training and used solely for testing, while data from the remaining players formed the training set. This procedure was repeated for all 40 participants, and the final performance was obtained by aggregating results across all folds.

In the LOSO evaluation, SMOTE was not applied in order to avoid generating synthetic samples for unseen subjects and to ensure a conservative and realistic assessment. The LOSO analysis yielded an overall classification accuracy of 70.89%. Although this accuracy is lower than the 91.3% obtained under conventional cross-validation, it more accurately reflects the challenges posed by inter-subject variability and limited dataset size.
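The LOSO protocol can be sketched as a generator that holds out every sample belonging to one player per fold, with accuracy then pooled over all held-out predictions. The function names are hypothetical, and pooling over swing instances (rather than averaging per-player accuracies) is an assumption about how the results were aggregated.

```python
def loso_folds(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def pooled_accuracy(y_true, y_pred):
    """Fraction of correct predictions over all held-out samples."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

Under the pooling assumption, the reported 70.89% is numerically consistent with 112 correct predictions out of the 158 valid swing instances (112/158 ≈ 0.7089). This is an inference from the stated totals, not a figure given in the text.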

Figure 10 shows the confusion matrix derived from the LOSO evaluation. Despite the stricter validation conditions, the system demonstrates a meaningful level of discriminative capability, indicating that it can provide reasonable skill-level estimation for players not included in the training process. This result supports the practical feasibility of the proposed framework as a training assistance tool.

4.5. Discussion of Experimental Findings and Practical Implications

The observed performance gap between conventional cross-validation and LOSO validation underscores the substantial influence of individual motion variability and limited data availability on model generalization. Such discrepancies are well documented in human motion analysis, particularly in studies involving a small number of participants. Rather than signaling model overfitting, the LOSO results offer a more conservative and realistic estimate of real-world performance, reflecting deployment scenarios in which the system must operate on entirely unseen players.

It should be noted that the present study focuses exclusively on forehand strokes performed under controlled experimental conditions. Consequently, the generalization of the proposed approach to other stroke types or to highly dynamic and unpredictable match environments remains an open research question. Moreover, although temporal synchronization between the IMU and vision modules was addressed through post-processing alignment, real-world applications may still be affected by factors such as occlusion, intermittent sensor dropout, or data transmission latency. In the current implementation, trials with severe signal loss were repeated; however, future work will aim to enhance system robustness through improved synchronization mechanisms, sensor redundancy, and more resilient learning algorithms.

In addition to the formal LOSO subject-independent validation, a supplementary field test was conducted using newly recruited players for illustrative purposes. This validation employed a dual-track comparison between on-site referee assessments and intelligent sensor-based classification, thereby representing a realistic application scenario in which the trained model is applied to new players without any prior exposure during training.

As summarized in Table 5, the 2025 new-player tests indicate that, out of 18 participants, 15 were correctly classified and 3 were misclassified, yielding an overall accuracy of approximately 83% (15/18). Although this supplementary evaluation does not constitute a formal subject-independent cross-validation protocol such as LOSO, it provides qualitative evidence suggesting potential practical usability.

Overall, these findings indicate stable performance, reasonable classification accuracy, and promising practical applicability, highlighting the potential of the proposed system as an auxiliary tool for assessing player skill levels in real-world training and evaluation scenarios. Moreover, the combination of feature importance analysis (Figure 8), the multimodal system architecture (Figure 9), and the subject-independent LOSO evaluation results (Figure 10) collectively indicates that the framework captures consistent motion characteristics that generalize across different players. Despite the inherent limitations of the dataset, these experimental outcomes provide a solid foundation for future large-scale validation and further refinement of the system.

5. Conclusions

This study addresses the limitations of traditional table tennis player classification methods, which rely heavily on subjective judgment and are easily influenced by external and environmental factors. To improve objectivity and consistency in skill-level assessment, this research proposed an intelligent classification framework for youth table tennis players by integrating an embedded inertial sensing system with vision-based human pose estimation and machine learning techniques.

By embedding an IMU into a smart table tennis racket and synchronizing its motion signals with OpenPose-based joint kinematic features, the proposed system captures both racket dynamics and whole-body movement characteristics during forehand strokes. Multimodal feature fusion and Random Forest-based modeling were employed to construct an objective skill classification model. Under controlled experimental conditions with balanced training data, the system achieved a classification accuracy of 91.3% under conventional cross-validation, demonstrating its effectiveness in distinguishing different skill levels and its potential value as a practical auxiliary tool for coaches and referees.

To further evaluate generalization capability, subject-independent validation was conducted using a LOSO protocol. Although performance under this stricter evaluation setting was lower than that observed in conventional cross-validation, the results provide a more realistic reflection of inter-individual variability in human motion and highlight the inherent challenges of skill classification with limited datasets. These findings indicate that the proposed framework is best positioned as a decision-support system that complements expert judgment rather than replacing it, particularly in real-world training and evaluation scenarios involving previously unseen players.

In practical applications, the system may encounter challenges such as partial body occlusion, intermittent sensor disconnection, or temporal misalignment between sensing modalities. To mitigate these issues, quality control mechanisms were incorporated into the experimental procedure, whereby trials affected by severe signal loss or unreliable pose estimation were excluded and repeated. While this approach ensured data reliability in the present study, future implementations will benefit from improved synchronization strategies, enhanced sensor redundancy, and more robust learning algorithms to further increase system resilience.
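A quality-control gate of this kind could look roughly like the sketch below. The source does not specify the acceptance criteria, so the function name `trial_is_usable`, the array layouts, and all three thresholds are assumptions made purely for illustration. Dropped IMU samples are assumed to be stored as NaN, and pose reliability is judged from per-keypoint confidence scores such as those OpenPose reports.

```python
import numpy as np


def trial_is_usable(imu, pose_conf, max_missing=0.10, min_conf=0.30, max_low_conf=0.20):
    """Decide whether a recorded stroke trial is kept or must be repeated.

    imu       : (T, C) array of IMU samples, NaN where a sample was dropped (assumed layout)
    pose_conf : (F, J) array of keypoint confidences in [0, 1] per video frame (assumed layout)
    The thresholds are hypothetical defaults, not values from the study.
    """
    imu = np.asarray(imu, dtype=float)
    pose_conf = np.asarray(pose_conf, dtype=float)

    # Fraction of time steps in which at least one IMU channel is missing.
    missing_ratio = np.isnan(imu).any(axis=1).mean()

    # Fraction of frames in which at least one tracked joint has low confidence.
    low_conf_ratio = (pose_conf < min_conf).any(axis=1).mean()

    return bool(missing_ratio <= max_missing and low_conf_ratio <= max_low_conf)


if __name__ == "__main__":
    clean_imu = np.zeros((100, 6))
    clean_pose = np.full((30, 5), 0.9)

    dropped_imu = clean_imu.copy()
    dropped_imu[:30, 0] = np.nan          # 30% of samples lost on one channel

    print(trial_is_usable(clean_imu, clean_pose))     # trial kept
    print(trial_is_usable(dropped_imu, clean_pose))   # trial flagged for repetition
```

In a deployed system the rejection thresholds would need to be tuned against the sensor's actual dropout behavior and the camera setup.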

Future work will focus on expanding the dataset with a greater number of participants, particularly elite and national-level athletes, and on extending the framework to additional stroke types and more dynamic match conditions. Alternative strategies for handling class imbalance, such as class-weighted learning and advanced oversampling techniques, will also be explored. Leveraging resources from the National Science and Technology Council’s Precision Sports Science initiatives, this research aims to further refine intelligent skill evaluation methods and contribute to data-driven athlete development, training optimization, and scientific player selection in competitive table tennis.
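As one illustration of the class-weighted alternative mentioned above, the following sketch uses scikit-learn's `class_weight="balanced"` option, which reweights each class inversely to its frequency instead of synthesizing new samples. The class counts (50, 40, 10), the random features, and the helper names are made up for the example and do not reflect the study's data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils.class_weight import compute_class_weight


def balanced_weights(y):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count)."""
    classes = np.unique(y)
    weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)
    return dict(zip(classes.tolist(), weights.tolist()))


def fit_weighted_forest(X, y, seed=0):
    """Fit a Random Forest that compensates for imbalance through class weights."""
    clf = RandomForestClassifier(
        n_estimators=100,
        class_weight="balanced",   # no synthetic samples are generated
        random_state=seed,
    )
    return clf.fit(X, y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = np.repeat([0, 1, 2], [50, 40, 10])      # hypothetical imbalanced labels
    X = rng.normal(size=(y.size, 4))            # placeholder features
    print(balanced_weights(y))
    model = fit_weighted_forest(X, y)
    print(model.classes_)
```

With these illustrative counts, the rarest class receives the largest weight, so errors on it contribute more to each tree's split criterion.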

Author Contributions

Conceptualization, S.K.W.; methodology, Y.-H.S.; software, C.-Y.H. and L.-W.T.; validation, C.-Y.H., L.-W.T. and T.-H.T.; formal analysis, C.-Y.H.; investigation, C.-Y.H., L.-W.T. and T.-H.T.; resources, Y.-H.S. and S.K.W.; data curation, C.-Y.H.; writing—original draft preparation, L.-W.T.; writing—review and editing, C.-Y.H.; supervision, Y.-H.S. and S.K.W.; project administration, C.-Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science and Technology Council, Taiwan, under the project "Integration of Information Technology and Sports Medicine in Intelligent Table Tennis: From Taiwan to International Perspectives" (grant number NSTC 115-2425-H-028-003).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board of Dar-Li Hospital, Jen-Ai Medical Foundation (protocol code: 202200001B0C102).

Informed Consent Statement

Informed consent was obtained from each participant in this project.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

IMU	Inertial Measurement Unit
CNN	Convolutional Neural Network
PAF	Part Affinity Fields
SMOTE	Synthetic Minority Over-sampling Technique
SVM	Support Vector Machine
SVC	Support Vector Classification
XGBoost	Extreme Gradient Boosting
MAE	Mean Absolute Error
RMSE	Root Mean Square Error
LED	Light-Emitting Diode
LOSO	Leave-One-Subject-Out

References

Ma, K. A real-time artificial intelligent system for tennis swing classification. In Proceedings of the 2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics (SAMI), Herl’any, Slovakia, 21–23 January 2021; pp. 21–26.
Arvind, S.; Hemalatha, K.L.; Upendra Roy, B.P.; Sandeep, K.S.; Pareek, P.K. Advanced sports performance analysis using deep learning for posture and movement identification. In Proceedings of the 2023 International Conference on Data Science and Network Security (ICDSNS), Tiptur, India, 28–29 July 2023; pp. 1–6.
Sun, Y.; Li, Y. A deep learning method for intelligent analysis of sports training postures. Comput. Intell. Neurosci. 2022, 2022, 2442606.
Li, H.; Liu, X.; Liu, Z.; Zhang, X. Deep Learning-based Human Motion Recognition for Sports Training. IEEE Access 2020, 8, 183947–183957.
Lee, M.; Lee, H. Development of a real-time table tennis training system using computer vision and motion analysis. J. Sports Sci. Med. 2023, 22, 103–110.
Chou, C.-Y.; Chen, Z.-H.; Sheu, Y.-H.; Chen, H.-H.; Sun, M.-T.; Wu, S.K. TTSwing: A dataset for table tennis swing and racket kinematics analysis. Sci. Data 2025, 12, 339.
Yang, D.; Tang, J.; Huang, Y.; Xu, C.; Li, J.; Hu, L.; Shen, G.; Liang, C.J.; Liu, H. TennisMaster: An IMU-based online serve performance evaluation system. In Proceedings of the 8th Augmented Human International Conference (AH ’17); ACM: New York, NY, USA, 2017; pp. 1–8.
Cao, Z.; Hidalgo, G.; Simon, T.; Wei, S.-E.; Sheikh, Y. OpenPose: Realtime multi-person 2D pose estimation using Part Affinity Fields. arXiv 2018, arXiv:1812.08008.
Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
Cervantes, J.; Garcia-Lamont, F.; Rodríguez-Mazahua, L.; Lopez, A. A comprehensive survey on support vector machine classification: Applications, challenges and trends. Neurocomputing 2020, 408, 189–215.
Long, X.; Ma, L.; Sun, R.; Mu, Y. Binomial Logistic Regression and XGBoost Model of Multiple Factors on Employee Well-being. In Proceedings of the 2025 IEEE 8th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 14–16 March 2025; pp. 844–848.

Figure 1. Hardware Architecture of the Embedded Smart Table Tennis Racket.

Figure 2. Raw waveform of IMU.

Figure 3. Filtered waveform.

Figure 4. Cutting point marking.

Figure 5. Cutting completed.

Figure 6. OpenPose joint cutting point.

Figure 7. Confusion matrix under conventional cross-validation (SMOTE = 120).

Figure 8. Feature importance comparison derived from the Random Forest model.

Figure 9. Human posture recognition aided by sensor data anomalies.

Figure 10. Confusion matrix of player skill classification under LOSO subject-independent validation.

Table 1. Grouping of Youth Table Tennis Players.

Groups	Definition
Youth Players	Under 18 years
College A	Top-tier collegiate players
College B	Intermediate collegiate players

Table 2. Model Accuracy under Different SMOTE Augmentation Levels.

Dataset	Total Samples	Accuracy (%)
Original Data	158	79.8
SMOTE to 80 Samples per Class	240	82.2
SMOTE to 100 Samples per Class	300	86.4
SMOTE to 120 Samples per Class	360	91.3

Table 3. Comparison of Random Forest with Other Algorithms.

Metric	Random Forest	Gradient Boosting	XGBoost	Logistic Regression	SVC
MAE	0.0105	0.0139	0.0043	0.1667	0.3727
RMSE	0.0154	0.0197	0.0073	0.0393	0.1967
Recall	1.00	0.79	0.79	0.72	0.67
F1-Score	0.91	0.73	0.73	0.72	0.60

Table 4. Quantitative Feature Importance Derived from Random Forest.

Code	Full Term	Importance
Rshoulder EFF_Value	Right Shoulder Efficiency Deviation	1.91%
Rshoulder EFF_Average	Right Shoulder Mean Efficiency Value	7.96%
Rshoulder E_SD	Right Shoulder Standard Deviation	1.34%
Rshoulder E_CV	Right Shoulder Coefficient of Variation	1.06%
Lshoulder EFF_Value	Left Shoulder Efficiency Deviation	1.77%
Lshoulder EFF_Average	Left Shoulder Mean Efficiency Value	15.23%
Lshoulder E_SD	Left Shoulder Standard Deviation	10.77%
Lshoulder E_CV	Left Shoulder Coefficient of Variation	5.02%
Relbow EFF_Value	Right Elbow Efficiency Deviation	1.90%
Relbow EFF_Average	Right Elbow Mean Efficiency Value	2.43%
Relbow E_SD	Right Elbow Standard Deviation	3.42%
Relbow E_CV	Right Elbow Coefficient of Variation	3.91%
Lelbow EFF_Value	Left Elbow Efficiency Deviation	3.26%
Lelbow EFF_Average	Left Elbow Mean Efficiency Value	5.35%
Lelbow E_SD	Left Elbow Standard Deviation	1.54%
Lelbow E_CV	Left Elbow Coefficient of Variation	3.91%
Hip EFF_Value	Hip Joint Efficiency Deviation	1.76%
Hip EFF_Average	Hip Joint Mean Efficiency Value	5.98%
Hip E_SD	Hip Joint Standard Deviation	6.00%
Hip E_CV	Hip Joint Coefficient of Variation	2.17%
Z EFF_Value	Z Axis Efficiency Deviation	4.38%
Z EFF_Average	Z Axis Mean Efficiency Value	5.23%
Z E_SD	Z Axis Standard Deviation	2.08%
Z E_CV	Z Axis Coefficient of Variation	3.00%
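For a coarser reading of Table 4, the four statistics reported for each joint or axis can be summed into a single per-group share. The sketch below does this with the values transcribed from the table; the grouping itself and the names `TABLE_4` and `rank_groups` are illustrative choices rather than part of the original analysis.

```python
# Importance values (%) transcribed from Table 4, listed for each joint or axis
# in the order EFF_Value, EFF_Average, E_SD, E_CV.
TABLE_4 = {
    "Rshoulder": [1.91, 7.96, 1.34, 1.06],
    "Lshoulder": [1.77, 15.23, 10.77, 5.02],
    "Relbow": [1.90, 2.43, 3.42, 3.91],
    "Lelbow": [3.26, 5.35, 1.54, 3.91],
    "Hip": [1.76, 5.98, 6.00, 2.17],
    "Z": [4.38, 5.23, 2.08, 3.00],
}


def rank_groups(table):
    """Sum the four feature importances per group and sort in descending order."""
    totals = {name: round(sum(values), 2) for name, values in table.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, total in rank_groups(TABLE_4):
        print(f"{name:10s} {total:6.2f}%")
```

Grouped this way, the left-shoulder features carry the largest combined share among the six groups listed in Table 4.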

Table 5. Comparison of System Predictions and Actual Levels for New Players in 2025.

Player ID	Predicted Level	Actual Level	Player ID	Predicted Level	Actual Level
009	College A	Youth Players	068	College B	Youth Players
049	College B	College B	069	Youth Players	Youth Players
053	College B	College B	070	Youth Players	Youth Players
054	College B	College B	072	Youth Players	Youth Players
056	College A	College A	074	Youth Players	Youth Players
057	College B	College B	075	Youth Players	Youth Players
059	College A	College A	076	College B	Youth Players
063	Youth Players	Youth Players	079	College B	College B
065	Youth Players	Youth Players	080	College B	College B

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

