1. Introduction
Professional sports are a massive industry with an annual revenue of USD 159 billion [1]. This leaves great market potential for any technology that could provide a competitive advantage during training. To improve the learning process of athletes, the concept of a biomechanical feedback loop was created [2]. It is a system designed to capture the movement of athletes, recognise mistakes, and provide feedback that athletes can use to improve their technique. It consists of four parts: the athlete who performs the movement; sensors that measure the movement; a processing unit that processes signals from the sensors, recognises useful information, and generates feedback; and actuators that provide the feedback to the athlete. The athlete is part of the loop, as they need to be able to respond to feedback and improve their technique for the next repetition of the movement.
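As an illustrative sketch only, the four parts of the loop map naturally onto a simple processing cycle. The interfaces and the placeholder feedback rule below are our own assumptions, not a system from the literature:

```python
from typing import Protocol
import numpy as np

class Sensor(Protocol):
    def read(self) -> np.ndarray: ...          # measures the movement

class Actuator(Protocol):
    def emit(self, feedback: str) -> None: ... # delivers feedback (e.g., an audio cue)

class ProcessingUnit:
    """Processes sensor signals, recognises useful information, generates feedback."""
    def analyse(self, signal: np.ndarray) -> str:
        # Placeholder rule for illustration: flag repetitions with excessive peak acceleration.
        return "slow down" if np.abs(signal).max() > 30.0 else "good form"

def feedback_loop(sensor: Sensor, unit: ProcessingUnit,
                  actuator: Actuator, repetitions: int) -> None:
    # The athlete closes the loop: they respond to the feedback and adapt
    # their technique before the next repetition of the movement.
    for _ in range(repetitions):
        actuator.emit(unit.analyse(sensor.read()))
```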
The movement can be captured using either cameras or embedded and wearable sensors. Most studies that use deep learning techniques focus on movement captured by cameras [3]. However, cameras require setup and calibration, which is often too time-consuming for professional athletes. For applications aimed at recreational athletes, sensors have the advantage of lower cost: inertial sensors are mass-produced and very cost-effective [4]. For these two reasons, we focus our research only on applications based on data from wearable or embedded sensors.
In a recent review of machine learning in sports, we found that a significant portion of studies used models specific to a single subject [5], highlighting the need for personalised models. This is not surprising, as each individual has their own personal style or technique when performing sports [6]. Few studies compare personalised and general approaches, and among those that do, the results are inconsistent: one study concluded that a personalised model greatly outperformed a universal model shared by everyone [7], while another found no statistical difference between the two options [8]. It is therefore likely that in some applications feedback can be based on a universal technique, while in others it must be personalised to accommodate individual-level differences. This applies regardless of whether the feedback is generated using machine learning methods. In such applications, it is useful to recognise the person performing the movement so that the correct personalised feedback can be provided. For example, gait rehabilitation is shifting towards more personalised assistance systems [9], a direction that is also worth exploring in sports.
We envision the personalised feedback system as a modular framework composed of several components. The first component is responsible for recognising the individual, or even a specific group of individuals with similar personal techniques. The second component analyses the movements against the ideal movement patterns tailored to that individual. Additional components involve the sensors used to capture motion and the actuators that deliver feedback. This paper focuses on person identification, which is required for the first component.
Due to significant differences in personal technique, recognising a person performing sports based only on signals from inertial movement sensors (or similar) is a relatively easy problem. Previous research has achieved good accuracy in multiple studies: a gait-based study achieved up to 97.09% accuracy [10], and a study recognising a small number of golfers from recorded golf swings achieved 100% accuracy [11]. A person can be identified from wearable IMU signals not only during sports activities but also during a variety of other everyday tasks [12,13].
In this paper, we focus on methods with practical applications. The aforementioned studies leave two gaps: they do not explore the minimum data needed to build such models, and their models must be trained on data from every subject they are meant to recognise. We address these gaps by examining the minimum data requirements in terms of the number of samples required per person, a key consideration for practical applications. Person recognition is a unique problem in that the model cannot work without the final user's own data, yet we cannot expect users to record their movement many times before the application starts working. Eliminating the need to retrain the model on the user's device when adding a new person significantly enhances the application's usability.
The challenge specific to few-shot person recognition based on signals from inertial measurement sensors is that the signals vary considerably between individuals. A model trained on data from a subset of people easily learns the signal features needed to recognise those individuals; however, these features can be unique to each person. The problem arises when we want to generalise such a model to recognise people whose data was not present during training.
Furthermore, data collection in sports is often challenging, as each measurement typically requires the deliberate use of specialised sensors, which must be accurately positioned on the athlete's body or equipment for each specific movement. We have found that the majority of machine learning research in sports is currently conducted on relatively small datasets [5]. This highlights the need to develop movement recognition methods in sports that remain effective with limited data, extending beyond the current focus on person identification.
The main objective of our paper is to develop methods that can recognise a person based on signals from an inertial measurement sensor using only a few samples per person. The final solution should support open-set classification, meaning it must be capable of recognising and correctly labelling samples from unknown individuals as unknown.
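To make this objective concrete, the following is a minimal sketch of nearest-prototype few-shot classification with distance-based open-set rejection, in the spirit of prototypical networks [26]. The embedding function, the threshold value, and all names here are illustrative assumptions; the actual model is specified later in the paper.

```python
import numpy as np

def enroll(embed, samples_per_person: dict) -> dict:
    """Build one prototype per person from a few embedded samples.
    Adding a new user only adds a prototype; no retraining is required."""
    return {name: embed(s).mean(axis=0) for name, s in samples_per_person.items()}

def identify(embed, prototypes: dict, recording: np.ndarray, threshold: float) -> str:
    """Nearest-prototype classification with open-set rejection."""
    z = embed(recording[None, ...])[0]
    names = list(prototypes)
    dists = [np.linalg.norm(z - prototypes[n]) for n in names]
    best = int(np.argmin(dists))
    # If even the closest prototype is too far away, label the sample unknown.
    return names[best] if dists[best] < threshold else "unknown"

# Usage with a dummy embedding that flattens the raw signal (100 samples x 3 axes):
embed = lambda x: x.reshape(len(x), -1)
prototypes = enroll(embed, {"A": np.random.randn(5, 100, 3),
                            "B": np.random.randn(5, 100, 3)})
print(identify(embed, prototypes, np.random.randn(100, 3), threshold=25.0))
```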
To summarise, the main contribution of this work is as follows:
- A model for few-shot open-set person identification from IMU signals, built by combining an unsupervised TS2Vec feature extractor with a prototypical network.
The supporting contributions can be summarised as follows:
- A comprehensive analysis of person identification methods using IMU signals across four scenarios: closed-set classification, open-set classification, few-shot classification, and few-shot open-set classification;
- An in-depth investigation into the limitations of standard few-shot learning methods when applied to this task.
The paper is structured as follows. First, we describe the dataset and the data collection process. The Related Work section then introduces the machine learning methods evaluated in this study, followed by a summary of prior research on IMU-based person recognition. The Methods section details the preprocessing procedures, the proposed model, and the evaluation methodology. The Results section presents findings across four scenarios: classification, open-set classification, few-shot classification, and few-shot open-set classification; the proposed model is used in the few-shot classification and few-shot open-set classification scenarios. Finally, we conclude with a discussion of the results and a summary of key findings.
2. Dataset Collection
The experiments presented in this study were conducted using a custom dataset comprising 2850 recordings of dart throws collected from 20 individuals. All participants consented to participate in the study. All recordings were anonymous, and no personal data was collected.
Each recording corresponds to one throw, recorded in real time with a fixed duration of 0.5 s (100 samples), without any temporal normalisation. Throwing darts was selected as the example activity due to the short duration of each motion, which enables efficient data collection. Dart throwing also represents a typical sports-related arm motion, sharing characteristics with other throwing or swinging actions. Therefore, the database is relevant not only for this relatively uncommon sport, but also as a transferable benchmark for broader applications in sports motion analysis. Below, we provide a brief overview of the data collection process.
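The paper does not specify how throws were triggered; purely as an illustration, the sketch below cuts fixed 0.5 s windows from a continuous stream using an acceleration-magnitude trigger, which is our own assumption:

```python
import numpy as np

WINDOW = 100  # 100 samples = 0.5 s at the 200 Hz rate of the measurement system

def extract_throws(acc: np.ndarray, threshold: float = 15.0) -> list:
    """Cut fixed-length windows from a continuous stream.

    acc: (T, 3) gravity-compensated acceleration stream.
    A throw is assumed to start when the acceleration magnitude first
    exceeds `threshold`; this trigger rule is illustrative only.
    """
    mag = np.linalg.norm(acc, axis=1)
    throws, t = [], 0
    while t < len(acc) - WINDOW:
        if mag[t] > threshold:
            throws.append(acc[t:t + WINDOW])  # one 0.5 s recording, no normalisation
            t += WINDOW                       # skip past this throw
        else:
            t += 1
    return throws
```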
Throws were recorded using a custom-built device attached to the back of the athlete's hand. The device was attached with double-sided adhesive tape, with kinetic tape protecting the athlete's hand. Holes in the device are aligned with the knuckles to reduce sensor placement variation from one measurement session to another. The device consists of an Adafruit Feather M0 microcontroller and a Bosch BNO085 sensor. The sensor can achieve a 100 Hz sampling rate in a configuration where all built-in sensors are turned on. It provides linear acceleration measurements with an accuracy of 0.35 m/s² and estimates the gravity vector with an angular error of 1.5° [14]. Based on our initial results, we concluded that a sampling rate of 100 Hz is sufficient for the identification of individuals. Moreover, the accelerations during motion are significantly larger than the sensor's accuracy threshold, ensuring that the measurement accuracy is adequate as well. In Figure 1, we show the device and its attachment to the hand.
We recorded raw data from the accelerometer, magnetometer, and gyroscope, as well as additional signals computed by the sensor's internal sensor fusion algorithms. This enabled us to capture device orientation, gravity-compensated 3D acceleration, and the gravity vector. Sensor data was transmitted wirelessly via Wi-Fi: the measurement system collected data at 200 Hz, while the sampling frequency of the sensor was 100 Hz. In addition to the inertial sensor data, the position and orientation of the measurement device were tracked using a Qualisys motion capture system for external reference and sensor validation; this validation is outside the scope of this paper. The Qualisys data confirms that the IMU measurements are correct: the velocity calculated from the path recorded by Qualisys closely matches the velocity calculated from the acceleration recorded by the IMU.
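As a minimal sketch of this consistency check, assuming uniformly sampled signals and zero initial velocity, the two speed profiles could be computed as follows; the function names and the simple rectangular integration are our own choices, not the validation pipeline used in the study:

```python
import numpy as np

def speed_from_imu(lin_acc: np.ndarray, fs: float = 100.0) -> np.ndarray:
    """Integrate gravity-compensated acceleration (N, 3) to speed (N,).
    Assumes zero initial velocity at the start of the window."""
    v = np.cumsum(lin_acc, axis=0) / fs       # rectangular integration
    return np.linalg.norm(v, axis=1)

def speed_from_mocap(pos: np.ndarray, fs: float) -> np.ndarray:
    """Differentiate the tracked position path (N, 3) to speed (N-1,)."""
    return np.linalg.norm(np.diff(pos, axis=0), axis=1) * fs

# Agreement between the two speed profiles (after resampling to a common
# time base) indicates that the IMU measurements are plausible.
```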
The experiment was designed to replicate a natural darts-playing experience with minimal changes from standard conditions. A standard-size dartboard (45.1 cm diameter, 173 cm centre height) was used, and players threw from the regular distance of 237 cm. To ensure consistent aiming, we used a concentric circle dartboard variant. The participants threw darts in sets of three.
Figure 2 illustrates the hand motion during the throw.
In Table 1, we report the statistics of the dataset used. Participants took part in recording sessions over a span of three months. The number of recordings and recording sessions per participant varied. As we are performing person identification, the number of participants is equal to the number of classes.
The collected signals show relatively large differences between individuals, which can be discerned by the naked eye. In Figure 3, we show signals for two individuals, after applying the preprocessing steps outlined in Section 4.1.
Author Contributions
Conceptualization, V.V., A.K., R.B., L.J., H.W., A.U., Z.Z., and S.T.; methodology, V.V., R.B., L.J., H.W., and A.U.; software, V.V.; validation, A.U.; formal analysis, V.V., R.B., L.J., and H.W.; investigation, V.V., R.B., L.J., and H.W.; resources, V.V., A.K., and A.U.; data curation, V.V. and A.U.; writing—original draft preparation, V.V.; writing—review and editing, V.V., A.K., R.B., L.J., H.W., Z.Z., and A.U.; visualisation, V.V.; supervision, A.K., R.B., L.J., H.W., and A.U.; project administration, R.B. and A.U.; funding acquisition, S.T. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded in part by the Slovenian Research Agency within the research program ICT4QoL (Information and Communications Technologies for Quality of Life), grant number P2-0246, and in part by the Fundamental Research Funds for the Central Universities of China, grant number 2024ZKPYZN01.
Institutional Review Board Statement
All measurements were non-invasive, collected during regular leisure activity, and no sensitive or personally identifiable data were collected. The Rules for the Processing of Applications by the Committee of the University of Ljubljana for Ethics in Research that Includes Work with People (KERL UL) state that if “the research does not go beyond normal everyday (occupational, educational, leisure and other) activities of participants or requires only minimal participation of those involved in the research, and it does not involve the collection of identified personal data”, it does not require ethical assessment. Our study is therefore exempt under these rules. We confirm that all procedures complied with the ethical standards of the 2013 revision of the Declaration of Helsinki and the Code of Ethics of the University of Ljubljana, including respect for informed consent, privacy, and voluntary participation.
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study. Participants were informed of the study’s purpose and their right to withdraw at any time without consequence.
Data Availability Statement
Acknowledgments
During the preparation of this manuscript/study, the authors used Grammarly version 1.2.201.1767 and ChatGPT versions 4 and 5 for the purposes of grammar correction. The authors have reviewed and edited the output and take full responsibility for the content of this publication.
Figure 2 was initially generated using AI and then edited.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Sim, J. Global Sports Industry Revenues to Reach US$260bn by 2033. SportsPro. 2024. Available online: https://www.sportspro.com/news/global-sports-industry-revenue-projection-2033-two-circles/ (accessed on 1 May 2025).
- Kos, A.; Umek, A. Biomechanical Biofeedback Systems and Applications; Springer: Cham, Switzerland, 2018; ISBN 978-3-319-91348-3. [Google Scholar]
- Naik, B.T.; Hashmi, M.F.; Bokde, N.D. A Comprehensive Review of Computer Vision in Sports: Open Issues, Future Trends and Research Directions. Appl. Sci. 2022, 12, 4429. [Google Scholar] [CrossRef]
- Lee, J.; Wheeler, K.; James, D.A. Wearable Sensors in Sport: A Practical Guide to Usage and Implementation; Springer Briefs in Applied Sciences and Technology; Springer: Singapore, 2019; ISBN 9789811337765. [Google Scholar]
- Vec, V.; Tomažič, S.; Kos, A.; Umek, A. Trends in Real-Time Artificial Intelligence Methods in Sports: A Systematic Review. J. Big Data 2024, 11, 148. [Google Scholar] [CrossRef]
- Schöllhorn, W.; Bauer, H. Identifying Individual Movement Styles in High Performance Sports by Means of Self-Organizing Kohonen Maps; Universitätsverlag Konstanz: Konstanz, Germany, 1998. [Google Scholar]
- Sharma, A.; Arora, J.; Khan, P.; Satapathy, S.; Agarwal, S.; Sengupta, S.; Mridha, S.; Ganguly, N. CommBox: Utilizing Sensors for Real-Time Cricket Shot Identification and Commentary Generation. In Proceedings of the 2017 9th International Conference on Communication Systems and Networks (COMSNETS), Bengaluru, India, 4–8 January 2017; pp. 427–428. [Google Scholar]
- Schneider, O.S.; MacLean, K.E.; Altun, K.; Karuei, I.; Wu, M.M.A. Real-Time Gait Classification for Persuasive Smartphone Apps: Structuring the Literature and Pushing the Limits. In Proceedings of the 2013 International Conference on Intelligent User Interfaces, Santa Monica, CA, USA, 19–22 March 2013; ACM: Santa Monica, CA, USA, 2013; pp. 161–172. [Google Scholar]
- Wall, C.; McMeekin, P.; Walker, R.; Hetherington, V.; Graham, L.; Godfrey, A. Sonification for Personalised Gait Intervention. Sensors 2024, 24, 65. [Google Scholar] [CrossRef] [PubMed]
- Yao, Z.-M.; Zhou, X.; Lin, E.-D.; Xu, S.; Sun, Y.-N. A Novel Biometric Recognition System Based on Ground Reaction Force Measurements of Continuous Gait. In Proceedings of the 3rd International Conference on Human System Interaction, Rzeszow, Poland, 13–15 May 2010; pp. 452–458. [Google Scholar]
- Zhang, Z.; Zhang, Y.; Kos, A.; Umek, A. Strain Gage Sensor Based Golfer Identification Using Machine Learning Algorithms. Procedia Comput. Sci. 2018, 129, 135–140. [Google Scholar] [CrossRef]
- Luo, F.; Khan, S.; Huang, Y.; Wu, K. Activity-Based Person Identification Using Multimodal Wearable Sensor Data. IEEE Internet Things J. 2023, 10, 1711–1723. [Google Scholar] [CrossRef]
- Retsinas, G.; Filntisis, P.P.; Efthymiou, N.; Theodosis, E.; Zlatintsi, A.; Maragos, P. Person Identification Using Deep Convolutional Neural Networks on Short-Term Signals from Wearable Sensors. In Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 3657–3661. [Google Scholar]
- Datasheet—LCSC Electronics. Available online: https://www.lcsc.com/datasheet/C5189642.pdf (accessed on 24 September 2025).
- Kiranyaz, S.; Avci, O.; Abdeljaber, O.; Ince, T.; Gabbouj, M.; Inman, D.J. 1D Convolutional Neural Networks and Applications: A Survey. Mech. Syst. Signal Process. 2021, 151, 107398. [Google Scholar] [CrossRef]
- Cover, T.; Hart, P. Nearest Neighbor Pattern Classification. IEEE Trans. Inf. Theory 1967, 13, 21–27. [Google Scholar] [CrossRef]
- Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
- Wang, Z.; Oates, T. Imaging Time-Series to Improve Classification and Imputation. In Proceedings of the 24th International Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015. [Google Scholar]
- Mariani, M.; Appiah, P.; Tweneboah, O. Fusion of Recurrence Plots and Gramian Angular Fields with Bayesian Optimization for Enhanced Time-Series Classification. Axioms 2025, 14, 528. [Google Scholar] [CrossRef]
- Ni, J.; Zhao, Z.; Shen, C.; Tong, H.; Song, D.; Cheng, W.; Luo, D.; Chen, H. Harnessing Vision Models for Time Series Analysis: A Survey. In Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, Montreal, QC, Canada, 16–22 August 2025. [Google Scholar]
- Hasani, R.; Lechner, M.; Amini, A.; Rus, D.; Grosu, R. Liquid Time-Constant Networks. Proc. AAAI Conf. Artif. Intell. 2021, 35, 7657–7666. [Google Scholar] [CrossRef]
- Hendrycks, D.; Gimpel, K. A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks. arXiv 2018, arXiv:1610.02136. [Google Scholar] [CrossRef]
- Balasubramanian, L.; Kruber, F.; Botsch, M.; Deng, K. Open-Set Recognition Based on the Combination of Deep Learning and Ensemble Method for Detecting Unknown Traffic Scenarios. arXiv 2021, arXiv:2105.07635. [Google Scholar] [CrossRef]
- Su, Y.; Kim, M.; Liu, F.; Jain, A.; Liu, X. Open-Set Biometrics: Beyond Good Closed-Set Models. arXiv 2024, arXiv:2407.16133. [Google Scholar]
- Wang, Y.; Yao, Q.; Kwok, J.T.; Ni, L.M. Generalizing from a Few Examples: A Survey on Few-Shot Learning. ACM Comput. Surv. 2020, 53, 1–34. [Google Scholar] [CrossRef]
- Snell, J.; Swersky, K.; Zemel, R.S. Prototypical Networks for Few-Shot Learning. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Vinyals, O.; Blundell, C.; Lillicrap, T.; Kavukcuoglu, K.; Wierstra, D. Matching Networks for One Shot Learning. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016. [Google Scholar]
- Jeong, M.; Choi, S.; Kim, C. Few-Shot Open-Set Recognition by Transformation Consistency. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021. [Google Scholar]
- Golf Swing Classification with Multiple Deep Convolutional Neural Networks. Available online: https://journals.sagepub.com/doi/epub/10.1177/1550147718802186 (accessed on 19 September 2025).
- Andersson, R.; Bermejo-García, J.; Agujetas, R.; Cronhjort, M.; Chilo, J. Smartphone IMU Sensors for Human Identification through Hip Joint Angle Analysis. Sensors 2024, 24, 4769. [Google Scholar] [CrossRef] [PubMed]
- Yue, Z.; Wang, Y.; Duan, J.; Yang, T.; Huang, C.; Tong, Y.; Xu, B. TS2Vec: Towards Universal Representation of Time Series. arXiv 2021, arXiv:2106.10466. [Google Scholar]
- Train_Test_Split. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html (accessed on 11 June 2025).
- TSNE. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html (accessed on 18 June 2025).
- Jiao, L.; Gao, W.; Bie, R.; Umek, A.; Kos, A. Golf Guided Grad-CAM: Attention Visualization within Golf Swings via Guided Gradient-Based Class Activation Mapping. Multimed. Tools Appl. 2024, 83, 38481–38503. [Google Scholar] [CrossRef]
- Tonekaboni, S.; Eytan, D.; Goldenberg, A. Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding. arXiv 2021, arXiv:2106.00750. [Google Scholar]
- Eldele, E.; Ragab, M.; Chen, Z.; Wu, M.; Kwoh, C.K.; Li, X.; Guan, C. Time-Series Representation Learning via Temporal and Contextual Contrasting. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, Montreal, QC, Canada, 19–27 August 2021. [Google Scholar]
Figure 1.
Sensor device with microcontroller, two sensors, and reflective spheres for motion capture. It is glued to the blue kinetic tape protecting the hand.
Figure 2.
Hand motion during the throw.
Figure 3.
Comparison of collected signals between two individuals. Each line corresponds to one throw.
Figure 4.
Model architecture, designed by combining the TS2Vec model and a prototypical network.
Figure 5.
Confusion matrix for the TS2Vec + prototypical network model. Note that the confusion matrix was collected in a later rerun of the experiment, so its accuracy differs slightly from the value reported.
Figure 6.
Differences in embedding quality between classes seen during training and classes unseen during training are obvious for the 1D CNN feature extractor.
Figure 7.
Attention for classes in order from left to right: 0 (seen), 14 (unseen), 7 (seen), and 12 (unseen).
Figure 8.
Attention for samples from class 18, when it was unseen during training (left) versus when it was seen during training (right).
Figure 9.
Embeddings from the unsupervised feature extractor are of much higher quality for unseen classes, compared to Figure 6.
Table 1.
Dataset statistics.
| Statistic | Value |
|---|---|
| Total samples | 2850 |
| Participants | 20 |
| Male participants | 18 |
| Female participants | 2 |
| No. of recording sessions per participant | 1–6 |
| No. of samples for each participant | 783, 984, 60, 33, 96, 81, 54, 57, 60, 60, 54, 60, 60, 60, 57, 57, 60, 60, 57, 57 |
| Age of participants | 21–30 |
Table 2.
Classification results for different input signals.
| Signal Used | Accuracy [%] |
|---|---|
|  | 98 |
|  | 98 |
|  | 100 |
Table 3.
Classification results.
| Model | Accuracy (70/30 Split) [%] | Accuracy (5 Samples) [%] | Accuracy (1 Sample) [%] |
|---|---|---|---|
| 1D CNN | 97.8 | 90.5 | 60.2 |
| KNN | 80.7 | 77.8 | 62.5 |
| SVM | 87.2 | 86.7 | 63.2 |
| 2D CNN + GAF | 97.3 | 67.6 | 30.0 |
| LNN | 94.7 | 53.0 | 29.9 |
Table 4.
Open-set classification results.
| Model | Accuracy on Known Classes [%] | Rejection Rate for Unknown [%] | AUC for Unknown Classes |
|---|---|---|---|
| 1D CNN + threshold | 88.5 | 64.5 | 0.88 |
| 1D CNN + RF + EVT | 89.1 | 95.4 | 0.97 |
| 1D CNN + open-set loss | 81.3 | 52.0 | 0.82 |
Table 5.
Results for few-shot classification.
| Model | Accuracy on Seen Classes [%] | Accuracy on Unseen Classes [%] |
|---|---|---|
| Matching network + 1D CNN | 90.6 | 66.5 |
| Prototypical network + 1D CNN | 98.4 | 73.3 |
| Prototypical network + GAF + 2D CNN | 84.3 | 60.0 |
| Prototypical network + Transformer | 94.9 | 65.6 |
| Prototypical network + LNN | 83.3 | 83.1 |
| TS2Vec + prototypical network 5-shot | 94.7 | 91.8 |
| TS2Vec + prototypical network 1-shot | 80.3 | 77.8 |
Table 6.
Embedding quality for 1D CNN feature extractor.
| Metric | Seen | Unseen | p-Value |
|---|---|---|---|
| Inter-class distance | 1.239 | 1.166 | 0.06 |
| Intra-class distance | 0.259 | 0.632 | 0.001 |
Table 7.
Embedding quality for unsupervised feature extractor.
| Metric | Seen | Unseen | p-Value |
|---|---|---|---|
| Inter-class distance | 2.22 | 2.20 | 0.99 |
| Intra-class distance | 1.67 | 1.96 | 0.03 |
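For reference, the following is a sketch of how inter- and intra-class embedding distances could be computed; the mean pairwise Euclidean definitions are our assumption and may differ from the exact definitions and statistical test used for Tables 6 and 7.

```python
import numpy as np
from itertools import combinations

def intra_class_distance(emb: np.ndarray, labels: np.ndarray) -> float:
    """Mean pairwise distance between embeddings of the same class."""
    d = [np.linalg.norm(a - b)
         for c in np.unique(labels)
         for a, b in combinations(emb[labels == c], 2)]
    return float(np.mean(d))

def inter_class_distance(emb: np.ndarray, labels: np.ndarray) -> float:
    """Mean pairwise distance between class centroids."""
    cents = [emb[labels == c].mean(axis=0) for c in np.unique(labels)]
    return float(np.mean([np.linalg.norm(a - b)
                          for a, b in combinations(cents, 2)]))
```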
Table 8.
Few-shot open-set classification results.
| Model | Accuracy on Known Classes [%] | Rejection Rate for Unknown [%] |
|---|---|---|
| TS2Vec + prototypical network + validation set threshold | 81.5 | 99.6 |
| Prototypical network | 70.4 | 14.1 |
| Prototypical network + SnaTCHer | 60.5 | 72.5 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).