CLUMM: Contrastive Learning for Unobtrusive Motion Monitoring
Abstract
1. Introduction
- Joint tracking by Computer Vision (CV). We remove the effect of the complex operational environment by directly extracting the coordinates of specific joints on the human body using CV techniques, specifically MediaPipe Pose (MPP) [19]. MPP is an open-source framework developed by Google for estimating high-fidelity 2D and 3D coordinates of body joints. It uses BlazePose [20], a lightweight pose estimation network, to detect and track 33 3D body landmarks from videos or images. From the body landmarks identified by MPP, we select key landmarks, guided by ergonomics experts, to formulate the initial features significant to various motion types (a brief extraction sketch is given after this list). This method of extracting joint information also preserves privacy, since the model learns from joint coordinates instead of raw image data. Additionally, MPP is scale- and size-invariant [19], which enables it to handle variations in human size and height.
- SimCLR feature embedding. We address the data bottleneck using a contrastive SSL approach to directly learn representations from camera data without requiring extensive manual labeling. We specifically use SimCLR [21], an SSL method that learns features by maximizing agreement between different augmented views of the same sample using a contrastive loss. We use SimCLR to learn embeddings and identify meaningful patterns and similarities within the extracted joint data depicting various motion categories. The learned representations are further leveraged in a downstream task to identify specific motion types.
- Classification for motion recognition and anomaly detection. Lastly, we leverage the learned representations from the CL training for a downstream classification task involving different motion categories. We train a simple logistic regression model on top of our learned representations to identify different motion categories in a few-shot learning [22] setting (see the second sketch after this list), which demonstrates the robustness and generalizability of our learned representations to downstream tasks. Additionally, we perform outlier analysis by evaluating the ability of our framework to identify out-of-distribution data. We introduce different amounts of outliers with varying deviations from the classes of interest and measure how well our framework identifies these outliers, as well as their effect on its discriminative ability.
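As a brief sketch of the landmark-extraction step, the selected joint coordinates can be pulled from each video frame with the MediaPipe Pose Python API. The landmark indices below follow the pose numbers listed in the landmark table later in the paper; the function name and overall wiring are illustrative rather than the exact pipeline used in this work.

```python
# Sketch of joint-coordinate extraction with MediaPipe Pose (MPP).
import cv2
import mediapipe as mp

SELECTED = [11, 12, 13, 14, 15, 16, 23, 24, 25, 26]  # shoulders, elbows, wrists, hips, knees

def extract_joint_features(video_path):
    """Return one (x, y, z) feature vector per frame for the selected joints."""
    features = []
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.pose.Pose(static_image_mode=False) as pose:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # MPP expects RGB input
            if result.pose_landmarks is None:
                continue  # skip frames where no person is detected
            lm = result.pose_landmarks.landmark
            features.append([c for i in SELECTED for c in (lm[i].x, lm[i].y, lm[i].z)])
    cap.release()
    return features
```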
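For the downstream step, the sketch below fits a softmax (multinomial) logistic regression on a small labeled subset of frozen embeddings with scikit-learn. The synthetic arrays are stand-ins for the learned CLUMM representations and motion labels; the dimensions and subset size are illustrative assumptions.

```python
# Few-shot softmax logistic regression on frozen embeddings (illustrative data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
z_train, y_train = rng.normal(size=(200, 128)), rng.integers(0, 4, 200)  # stand-ins for embeddings / labels
z_test, y_test = rng.normal(size=(100, 128)), rng.integers(0, 4, 100)

few = rng.choice(len(z_train), size=40, replace=False)      # small labeled subset (few-shot setting)
clf = LogisticRegression(max_iter=1000).fit(z_train[few], y_train[few])
print("accuracy:", accuracy_score(y_test, clf.predict(z_test)))
```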
2. Literature Review
2.1. Unobtrusive Human Motion Monitoring
2.2. Self-Supervised Learning
2.3. Section Summary
3. Method Development
3.1. CLUMM Training
3.1.1. Pose Landmark Extraction
3.1.2. Contrastive Self-Supervised Representation Learning
- Random augmentation module: This module applies random data augmentations on input samples $x$. It performs two transformations on a single sample, resulting in two correlated views $\tilde{x}_i$ and $\tilde{x}_j$ treated as a positive pair.
- Encoder module $f(\cdot)$: This module uses a neural network to extract latent space encodings of the augmented samples $\tilde{x}_i$ and $\tilde{x}_j$. The encoder is model agnostic, allowing various network designs to be used. The encoder produces outputs $h_i = f(\tilde{x}_i)$ and $h_j = f(\tilde{x}_j)$, where $f(\cdot)$ is the encoder network.
- Projector head $g(\cdot)$: This small neural network maps the encoded representations into a space where a contrastive loss is applied to maximize the agreement between the views [16]. A multi-layer perceptron is used, which produces outputs $z_i = g(h_i)$ and $z_j = g(h_j)$, where $g(\cdot)$ represents the projector head and $h$ represents the output of the encoder module.
- Contrastive loss function: This learning objective maximizes the agreement between positive pairs.
A. Data Augmentation
- Random jitter: We apply a random jitter to the input samples using a noise signal drawn from a normal distribution with a mean of zero and a standard deviation of 0.5, i.e., $\epsilon \sim \mathcal{N}(0,\,0.5^2)$. Thus, for each input sample $x$, we obtain the jittered view $\tilde{x}_i = x + \epsilon$.
- Random scaling: We scale samples with a random factor drawn from a normal distribution with a mean of zero and a standard deviation of 0.2, i.e., $s \sim \mathcal{N}(0,\,0.2^2)$. Therefore, for each input sample $x$, we obtain the scaled view $\tilde{x}_j$.
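A minimal NumPy sketch of these two augmentations is given below. The noise and scaling distributions follow the values stated above; the exact way the scaling factor is applied (here, multiplying by $1 + s$ so that samples stay near their original magnitude) is an illustrative assumption, as are the function names.

```python
# Illustrative sketch of the two stochastic augmentations that form a positive pair.
import numpy as np

def random_jitter(x, std=0.5):
    """Add zero-mean Gaussian noise (std 0.5) to a joint-coordinate feature vector."""
    return x + np.random.normal(0.0, std, size=x.shape)

def random_scaling(x, std=0.2):
    """Scale the feature vector by 1 + s, with s ~ N(0, 0.2^2) (assumed form)."""
    return x * (1.0 + np.random.normal(0.0, std))

x = np.random.rand(30)                                   # e.g., 10 joints x (x, y, z)
view_i, view_j = random_jitter(x), random_scaling(x)     # one positive pair
```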
B. Encoder Module
C. Projector Head
D. Contrastive Loss Function
Algorithm 1: CLUMM feature representation learning
Step 1: Pose estimation and dataset construction (MediaPipe in this context). For each video frame, extract the selected joint landmarks and append their coordinates to the dataset as a feature vector.
Step 2: SimCLR training. For each mini-batch, apply the two augmentation functions to each sample to form a positive pair, pass both views through the encoder and projector head, compute the pairwise similarity between the projected embeddings, compute the NT-Xent loss in Equation (2), and update the encoder and projector parameters to minimize the loss.
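To make Step 2 concrete, a self-contained PyTorch sketch of one SimCLR training iteration on the joint-coordinate features is shown below. The small MLP encoder, the 30-dimensional input (10 joints with x, y, z coordinates), the temperature value, and all names are illustrative assumptions, not the configuration used in the paper, which employs ResNet backbones.

```python
# Hedged sketch of Step 2 of Algorithm 1: one SimCLR update with the NT-Xent loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss over a batch of positive pairs (z1[k], z2[k])."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)          # 2N x d, unit-norm rows
    sim = (z @ z.T) / temperature                                # pairwise cosine similarities
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))  # drop self-pairs
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])  # positive of k sits at k +/- N
    return F.cross_entropy(sim, targets)

encoder = nn.Sequential(nn.Linear(30, 128), nn.ReLU(), nn.Linear(128, 128))   # f(.)
projector = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))   # g(.)
opt = torch.optim.Adam(list(encoder.parameters()) + list(projector.parameters()), lr=1e-3)

x = torch.rand(256, 30)                                # one mini-batch of joint features
view_i = x + 0.5 * torch.randn_like(x)                 # random jitter
view_j = x * (1.0 + 0.2 * torch.randn(x.size(0), 1))   # random scaling (assumed form)

loss = nt_xent_loss(projector(encoder(view_i)), projector(encoder(view_j)))
opt.zero_grad()
loss.backward()
opt.step()
print(f"NT-Xent loss: {loss.item():.4f}")
```

In the full framework, the encoder would be the ResNet backbone reported in the results, and this update would be repeated over mini-batches drawn from the landmark dataset built in Step 1.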
3.1.3. Finetuning with SoftMax Logistic Regression
3.2. Impact of Outliers on CLUMM Performance
- The average distance from all points in a cluster to their respective centroid
- The maximum distance of any point in a cluster to the centroid
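Both compactness measures can be computed directly from the embeddings assigned to a cluster; the sketch below is a minimal NumPy version with illustrative array names and shapes.

```python
# Mean and maximum Euclidean distance of a cluster's points to its centroid.
import numpy as np

def cluster_compactness(points):
    """points: (n, d) array of embeddings assigned to one cluster."""
    centroid = points.mean(axis=0)
    dists = np.linalg.norm(points - centroid, axis=1)   # distance of each point to the centroid
    return dists.mean(), dists.max()

cluster = np.random.rand(100, 32)                       # stand-in for embeddings of one motion class
mean_dist, max_dist = cluster_compactness(cluster)
```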
3.3. Summary of CLUMM
3.4. Note to Practitioners
4. Case Study
4.1. Data Preparation
4.2. CLUMM Validation on Custom Data
4.3. CLUMM Validation on Public Data
4.4. Results from Outlier Analysis
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Nomenclature
Symbol | Meaning |
---|---|
 | Number of frames in the dataset |
 | $i$-th image frame in the dataset |
 | Feature vector extracted from the $i$-th frame |
 | Dataset matrix containing all feature vectors |
 | Temperature parameter used in the contrastive loss function |
 | Batch size |
 | Encoded representation of the $i$-th input |
 | Projection of an encoded representation into the contrastive loss space |
 | Cosine similarity between two embeddings |
 | Mean distance of all points in a cluster to its centroid |
 | Standard deviation of distances from points to their nearest centroid |
 | Contrastive loss function |
 | Output of the residual block combining weights and skip connection in ResNet |
 | Weight matrix used in residual block computation |
 | Bias term in multinomial logistic regression |
 | Normalized temperature-scaled cross-entropy loss |
References
- Tang, C.I.; Perez-Pozuelo, I.; Spathis, D.; Mascolo, C. Exploring Contrastive Learning in Human Activity Recognition for Healthcare. arXiv 2020, arXiv:2011.11542. Available online: http://arxiv.org/abs/2011.11542 (accessed on 15 August 2024).
- Wang, Y.; Cang, S.; Yu, H. A Survey on Wearable Sensor Modality Centred Human Activity Recognition in Health Care; Elsevier: Amsterdam, The Netherlands, 2019. [Google Scholar] [CrossRef]
- Thiel, D.V.; Sarkar, A.K. Swing Profiles in Sport: An Accelerometer Analysis. Procedia Eng. 2014, 72, 624–629. [Google Scholar] [CrossRef]
- Liu, H.; Wang, L. Human motion prediction for human-robot collaboration. J. Manuf. Syst. 2017, 44, 287–294. [Google Scholar] [CrossRef]
- Iyer, H.; Macwan, N.; Guo, S.; Jeong, H. Computer-Vision-Enabled Worker Video Analysis for Motion Amount Quantification. arXiv 2024, arXiv:2405.13999. Available online: http://arxiv.org/abs/2405.13999 (accessed on 17 August 2024).
- Fernandes, J.M.; Silva, J.S.; Rodrigues, A.; Boavida, F. A Survey of Approaches to Unobtrusive Sensing of Humans. Assoc. Comput. Mach. 2022, 55, 41. [Google Scholar] [CrossRef]
- Pham, C.; Diep, N.N.; Phuonh, T.M. e-Shoes: Smart Shoes for Unobtrusive Human Activity Recognition. In Proceedings of the 2017 9th International Conference on Knowledge and Systems Engineering (KSE), Hue, Vietnam, 19–21 October 2017. [Google Scholar]
- Rezaei, A.; Stevens, M.C.; Argha, A.; Mascheroni, A.; Puiatti, A.; Lovell, N.H. An Unobtrusive Human Activity Recognition System Using Low Resolution Thermal Sensors, Machine and Deep Learning. IEEE Trans. Biomed. Eng. 2023, 70, 115–124. [Google Scholar] [CrossRef]
- Yu, C.; Xu, Z.; Yan, K.; Chien, Y.R.; Fang, S.H.; Wu, H.C. Noninvasive Human Activity Recognition Using Millimeter-Wave Radar. IEEE Syst. J. 2022, 16, 3036–3047. [Google Scholar] [CrossRef]
- Gochoo, M.; Tan, T.H.; Liu, S.H.; Jean, F.R.; Alnajjar, F.S.; Huang, S.C. Unobtrusive Activity Recognition of Elderly People Living Alone Using Anonymous Binary Sensors and DCNN. IEEE J. Biomed. Health Inform. 2019, 23, 693–702. [Google Scholar] [CrossRef] [PubMed]
- Muthukumar, K.A.; Bouazizi, M.; Ohtsuki, T. A Novel Hybrid Deep Learning Model for Activity Detection Using Wide-Angle Low-Resolution Infrared Array Sensor. IEEE Access 2021, 9, 82563–82576. [Google Scholar] [CrossRef]
- Takenaka, K.; Kondo, K.; Hasegawa, T. Segment-Based Unsupervised Learning Method in Sensor-Based Human Activity Recognition. Sensors 2023, 23, 8449. [Google Scholar] [CrossRef]
- Sarker, I.H. Deep Learning: A Comprehensive Overview on Techniques, Taxonomy, Applications and Research Directions; Springer: Berlin/Heidelberg, Germany, 2021. [Google Scholar] [CrossRef]
- Lapuschkin, S.; Wäldchen, S.; Binder, A.; Montavon, G.; Samek, W.; Müller, K.R. Unmasking Clever Hans predictors and assessing what machines really learn. Nat. Commun. 2019, 10, 1096. [Google Scholar] [CrossRef] [PubMed]
- Gui, J.; Chen, T.; Zhang, J.; Cao, Q.; Sun, Z.; Luo, H.; Tao, D. A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends. arXiv 2023, arXiv:2301.05712. Available online: http://arxiv.org/abs/2301.05712 (accessed on 15 August 2024). [PubMed]
- Balestriero, R.; Ibrahim, M.; Sobal, V.; Morcos, A.S.; Shekhar, S.; Goldstein, T.; Bordes, F.; Bardes, A.; Mialon, G.; Tian, Y.; et al. A Cookbook of Self-Supervised Learning. arXiv 2023, arXiv:2304.12210. Available online: http://arxiv.org/abs/2304.12210 (accessed on 17 August 2024).
- Haresamudram, H.; Beedu, A.; Agrawal, V.; Grady, P.L.; Essa, I.; Hoffman, J.; Plötz, T. Masked reconstruction based self-supervision for human activity recognition. In Proceedings of the International Symposium on Wearable Computers, ISWC, Association for Computing Machinery, Mexico City, Mexico, 12–16 September 2020; pp. 45–49. [Google Scholar] [CrossRef]
- Haresamudram, H.; Essa, I.; Plötz, T. Assessing the State of Self-Supervised Human Activity Recognition Using Wearables. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2022, 6, 116. [Google Scholar] [CrossRef]
- Lugaresi, C.; Tang, J.; Nash, H.; McClanahan, C.; Uboweja, E.; Hays, M.; Zhang, F.; Chang, C.-L.; Yong, M.G.; Lee, J.; et al. MediaPipe: A Framework for Building Perception Pipelines. arXiv 2019, arXiv:1906.08172. Available online: http://arxiv.org/abs/1906.08172 (accessed on 15 August 2024).
- Bazarevsky, V.; Grishchenko, I.; Raveendran, K.; Zhu, T.; Zhang, F.; Grundmann, M. BlazePose: On-device Real-time Body Pose tracking. arXiv 2020, arXiv:2006.10204. Available online: http://arxiv.org/abs/2006.10204 (accessed on 28 August 2024).
- Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G. A Simple Framework for Contrastive Learning of Visual Representations. arXiv 2020, arXiv:2002.05709. Available online: http://arxiv.org/abs/2002.05709 (accessed on 18 August 2024).
- Parnami, A.; Lee, M. Learning from Few Examples: A Summary of Approaches to Few-Shot Learning. arXiv 2022, arXiv:2203.04291. Available online: http://arxiv.org/abs/2203.04291 (accessed on 24 October 2024).
- Chen, C.; Jafari, R.; Kehtarnavaz, N. A survey of depth and inertial sensor fusion for human action recognition. Multimed. Tools Appl. 2017, 76, 4405–4425. [Google Scholar] [CrossRef]
- Al-Amin, M.; Qin, R.; Tao, W.; Leu, M.C. Sensor Data Based Models for Workforce Management in Smart Manufacturing. In Proceedings of the 2018 Institute of Industrial and Systems Engineers Annual Conference (IISE 2018), Orlando, FL, USA, 19–22 May 2019. [Google Scholar]
- Noh, D.; Yoon, H.; Lee, D. A Decade of Progress in Human Motion Recognition: A comprehensive survey from 2010 to 2020. IEEE Access 2024, 12, 5684–5707. [Google Scholar] [CrossRef]
- Karayaneva, Y.; Baker, S.; Tan, B.; Jing, Y. Use of Low-Resolution Infrared Pixel Array for Passive Human Motion Movement and Recognition; BCS Learning & Development: Swindon, UK, 2018. [Google Scholar] [CrossRef]
- Yao, L.; Sheng, Q.Z.; Li, X.; Gu, T.; Tan, M.; Wang, X.; Wang, S.; Ruan, W. Compressive Representation for Device-Free Activity Recognition with Passive RFID Signal Strength. IEEE Trans. Mob. Comput. 2018, 17, 293–306. [Google Scholar] [CrossRef]
- Luo, Z.; Zou, Y.; Tech, V.; Hoffman, J.; Fei-Fei, L. Label Efficient Learning of Transferable Representations across Domains and Tasks. arXiv 2017, arXiv:1712.00123. [Google Scholar]
- Singh, A.D.; Sandha, S.S.; Garcia, L.; Srivastava, M. Radhar: Human activity recognition from point clouds generated through a millimeter-wave radar. In Proceedings of the Annual International Conference on Mobile Computing and Networking, MOBICOM, Association for Computing Machinery, Los Cabos, Mexico, 21–25 October 2019; pp. 51–56. [Google Scholar] [CrossRef]
- Foroughi, H.; Aski, B.S.; Pourreza, H. Intelligent Video Surveillance for Monitoring Fall Detection of Elderly in Home Environments. In Proceedings of the 2008 11th International Conference on Computer and Information Technology, Khulna, Bangladesh, 24–27 December 2008; pp. 219–224. [Google Scholar]
- Möncks, M.; Roche, J.; De Silva, V. Adaptive Feature Processing for Robust Human Activity Recognition on a Novel Multi-Modal Dataset. arXiv 2019, arXiv:1901.02858. [Google Scholar]
- Heikenfeld, J.; Jajack, A.; Rogers, J.; Gutruf, P.; Tian, L.; Pan, T.; Li, R.; Khine, M.; Kim, J.; Wang, J. Wearable sensors: Modalities, challenges, and prospects. Lab Chip 2018, 18, 217–248. [Google Scholar] [CrossRef]
- Newaz, N.T.; Hanada, E. A Low-Resolution Infrared Array for Unobtrusive Human Activity Recognition That Preserves Privacy. Sensors 2024, 24, 926. [Google Scholar] [CrossRef] [PubMed]
- Nehra, S.; Raheja, J.L. Unobtrusive and Non-Invasive Human Activity Recognition Using Kinect Sensor. In Proceedings of the Indo-Taiwan 2nd International Conference on Computing Analytics and Networks (Indo-Taiwan ICAN), Minxiong, Chiayi, Taiwan, 7–8 February 2020; pp. 58–63. [Google Scholar]
- Li, H.; Wan, C.; Shah, R.C.; Sample, P.A.; Patel, S.N. IDAct: Towards Unobtrusive Recognition of User Presence and Daily Activities; Institute of Electrical and Electronics Engineers: Piscataway, NJ, USA, 2019; p. 245. [Google Scholar]
- Oguntala, G.A.; Abd-Alhameed, R.A.; Ali, N.T.; Hu, Y.F.; Noras, J.M.; Eya, N.N.; Elfergani, I.; Rodriguez, J. SmartWall: Novel RFID-Enabled Ambient Human Activity Recognition Using Machine Learning for Unobtrusive Health Monitoring. IEEE Access 2019, 7, 68022–68033. [Google Scholar] [CrossRef]
- Alrashdi, I.; Siddiqi, M.H.; Alhwaiti, Y.; Alruwaili, M.; Azad, M. Maximum Entropy Markov Model for Human Activity Recognition Using Depth Camera. IEEE Access 2021, 9, 160635–160645. [Google Scholar] [CrossRef]
- Wu, H.; Pan, W.; Xiong, X.; Xu, S. Human Activity Recognition Based on the Combined SVM&HMM. In Proceedings of the IEEE International Conference on Information and Automation, Hailar, China, 28–30 July 2014; p. 1317. [Google Scholar]
- Lecun, Y.; Bengio, Y.; Hinton, G. Deep Learning; Nature Publishing Group: London, UK, 2015. [Google Scholar] [CrossRef]
- Açış, B.; Güney, S. Classification of human movements by using Kinect sensor. Biomed. Signal Process. Control 2023, 81, 104417. [Google Scholar] [CrossRef]
- Ramos, R.G.; Domingo, J.D.; Zalama, E.; Gómez-García-Bermejo, J. Daily human activity recognition using non-intrusive sensors. Sensors 2021, 21, 5270. [Google Scholar] [CrossRef] [PubMed]
- Choudhary, P.; Kumari, P.; Goel, N.; Saini, M. An Audio-Seismic Fusion Framework for Human Activity Recognition in an Outdoor Environment. IEEE Sens. J. 2022, 22, 22817–22827. [Google Scholar] [CrossRef]
- Quero, J.; Burns, M.; Razzaq, M.; Nugent, C.; Espinilla, M. Detection of Falls from Non-Invasive Thermal Vision Sensors Using Convolutional Neural Networks. Proceedings 2018, 2, 1236. [Google Scholar] [CrossRef]
- Iyer, H.; Jeong, H. PE-USGC: Posture Estimation-Based Unsupervised Spatial Gaussian Clustering for Supervised Classification of Near-Duplicate Human Motion. IEEE Access 2024, 12, 163093–163108. [Google Scholar] [CrossRef]
- Rahayu, E.S.; Yuniarno, E.M.; Purnama, I.K.E.; Purnomo, M.H. Human activity classification using deep learning based on 3D motion feature. Mach. Learn. Appl. 2023, 12, 100461. [Google Scholar] [CrossRef]
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. Available online: https://github.com/liuzhuang13/DenseNet (accessed on 10 October 2024).
- Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. arXiv 2016, arXiv:1610.02357. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. arXiv 2015, arXiv:1512.03385. Available online: http://arxiv.org/abs/1512.03385 (accessed on 10 October 2024).
- Ghelmani, A.; Hammad, A. Self-supervised contrastive video representation learning for construction equipment activity recognition on limited dataset. Autom. Constr. 2023, 154, 105001. [Google Scholar] [CrossRef]
- Lin, L.; Song, S.; Yang, W.; Liu, J. MS2L: Multi-Task Self-Supervised Learning for Skeleton Based Action Recognition. In Proceedings of the MM 2020—Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 2490–2498. [Google Scholar] [CrossRef]
- Yang, Y.; Liu, G.; Gao, X. Motion Guided Attention Learning for Self-Supervised 3D Human Action Recognition. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 8623–8634. [Google Scholar] [CrossRef]
- Lin, W.; Ding, X.; Huang, Y.; Zeng, H. Self-Supervised Video-Based Action Recognition with Disturbances. IEEE Trans. Image Process. 2023, 32, 2493–2507. [Google Scholar] [CrossRef] [PubMed]
- Seo, M.; Cho, D.; Lee, S.; Park, J.; Kim, D.; Lee, J.; Ju, J.; Noh, H.; Choi, D.G. A Self-Supervised Sampler for Efficient Action Recognition: Real-World Applications in Surveillance Systems. IEEE Robot Autom. Lett. 2022, 7, 1752–1759. [Google Scholar] [CrossRef]
- Wickstrøm, K.; Kampffmeyer, M.; Mikalsen, K.Ø.; Jenssen, R. Mixing up contrastive learning: Self-supervised representation learning for time series. Pattern Recognit. Lett. 2022, 155, 54–61. [Google Scholar] [CrossRef]
- Liu, S.; Mallol-Ragolta, A.; Parada-Cabaleiro, E.; Qian, K.; Jing, X.; Kathan, A.; Hu, B.; Schuller, B.W. Audio Self-Supervised Learning: A Survey. arXiv 2022, arXiv:2203.01205. Available online: http://arxiv.org/abs/2203.01205 (accessed on 18 August 2024).
- Schiappa, M.C.; Rawat, Y.S.; Shah, M. Self-Supervised Learning for Videos: A Survey. ACM Comput. Surv. 2022, 55, 1–37. [Google Scholar] [CrossRef]
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning. In Adaptive Computation and Machine Learning Series; MIT Press: Cambridge, MA, USA, 2016; Available online: https://books.google.com/books?id=omivDQAAQBAJ (accessed on 4 February 2024).
- Xie, Z.; Zhang, Z.; Cao, Y.; Lin, Y.; Bao, J.; Yao, Z.; Dai, Q.; Hu, H. SimMIM: A Simple Framework for Masked Image Modeling. Available online: https://github.com/microsoft/SimMIM (accessed on 18 August 2024).
- Ye, W.; Zheng, G.; Cao, X.; Ma, Y.; Zhang, A. Spurious Correlations in Machine Learning: A Survey. arXiv 2024, arXiv:2402.12715. Available online: http://arxiv.org/abs/2402.12715 (accessed on 19 August 2024).
- Jaiswal, A.; Babu, A.R.; Zadeh, M.Z.; Banerjee, D.; Makedon, F. A Survey on Contrastive Self-supervised Learning. arXiv 2020, arXiv:2011.00362. Available online: http://arxiv.org/abs/2011.00362 (accessed on 22 November 2024).
- Tian, Y.; Sun, C.; Poole, B.; Krishnan, D.; Schmid, C.; Isola, P. What Makes for Good Views for Contrastive Learning? arXiv 2020, arXiv:2005.10243. Available online: http://arxiv.org/abs/2005.10243 (accessed on 22 November 2024).
- Shah, K.; Spathis, D.; Tang, C.I.; Mascolo, C. Evaluating Contrastive Learning on Wearable Timeseries for Downstream Clinical Outcomes. arXiv 2021, arXiv:2111.07089. Available online: http://arxiv.org/abs/2111.07089 (accessed on 22 November 2024).
- Ågren, W. The NT-Xent Loss Upper Bound. arXiv 2022, arXiv:2205.03169. Available online: https://arxiv.org/pdf/2205.03169 (accessed on 22 November 2024).
- Kwak, C.; Clayton-Matthews, A. Multinomial Logistic Regression. Nurs. Res. 2002, 51, 404–410. [Google Scholar] [CrossRef] [PubMed]
- Cox, D.R. The regression analysis of binary sequences. J. R. Stat. Soc. Ser. B Methodol. 1958, 20, 215–232. [Google Scholar] [CrossRef]
- Ji, S.; Xie, Y. Logistic Regression: From Binary to Multi-Class; Texas A&M University: College Station, TX, USA, 2024. [Google Scholar]
- Ruder, S. An Overview of Gradient Descent Optimization Algorithms. arXiv 2016, arXiv:1609.04747. Available online: http://arxiv.org/abs/1609.04747 (accessed on 15 August 2024).
- Hong, D.; Wang, J.; Gardner, R. Chapter 1—Fundamentals. In Real Analysis with an Introduction to Wavelets and Applications; Hong, D., Wang, J., Gardner, R., Eds.; Academic Press: Cambridge, MA, USA, 2005; pp. 1–32. [Google Scholar] [CrossRef]
- Xuan, H.; Stylianou, A.; Liu, X.; Pless, R. Hard Negative Examples are Hard, but Useful. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; pp. 126–142. [Google Scholar] [CrossRef]
- Carreira, J.; Zisserman, A. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. arXiv 2017, arXiv:1705.07750. [Google Scholar]
Pose Number | Representation |
---|---|
11 | Left Shoulder |
12 | Right Shoulder |
13 | Left Elbow |
14 | Right Elbow |
15 | Left Wrist |
16 | Right Wrist |
23 | Left Hip |
24 | Right Hip |
25 | Left Knee |
26 | Right Knee |
Dataset | Network | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|---|
Custom | ResNet18 Baseline | 79.6% | 0.801 | 0.797 | 0.796 |
Custom | ResNet50 Baseline | 77.3% | 0.782 | 0.773 | 0.771 |
Custom | CLUMM (ResNet50 Backbone) | 83.5% | 0.833 | 0.835 | 0.831 |
Custom | CLUMM (ResNet18 Backbone) | 90.0% | 0.899 | 0.900 | 0.899 |
UCF Sports Action | ResNet18 Baseline | 78.9% | 0.839 | 0.826 | 0.829 |
UCF Sports Action | ResNet50 Baseline | 77.3% | 0.818 | 0.777 | 0.791 |
UCF Sports Action | CLUMM (ResNet50 Backbone) | 86.9% | 0.869 | 0.865 | 0.865 |
UCF Sports Action | CLUMM (ResNet18 Backbone) | 88.2% | 0.882 | 0.879 | 0.878 |
Number of Outlier Images | Percentage of Outlier Landmarks | Mean Outlier Distance | Max Outlier Distance | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|---|---|---|
None | - | - | - | 90.00% | 0.899 | 0.900 | 0.899 |
100 | 3.5% | 0.30 | 0.80 | 86.76% | 0.867 | 0.867 | 0.865 |
200 | 3.1% | 0.30 | 0.81 | 84.48% | 0.845 | 0.844 | 0.843 |
500 | 3.0% | 0.31 | 0.83 | 84.64% | 0.845 | 0.846 | 0.846 |