A Dynamic Precision Evaluation System for Physical Education Classroom Teaching Behaviors Based on the CogVLM2-Video Model
Abstract
1. Introduction
2. Literature Review
2.1. From Pure Manual to Semi-Automated Traditional Evaluations of Teaching Behaviors in PE Classrooms
2.1.1. Purely Manual Evaluation Stage
2.1.2. Semi-Automated Evaluation Stage
2.2. From Convolutional Neural Network-Based Video Action Recognition to Intelligent Evaluation of Teaching Behaviors in PE Classrooms with CogVLM2-Video Models
2.2.1. Exploratory Stage: Convolutional Neural Network and Temporal Segment Network (2014–2016)
2.2.2. Temporal Modeling Enhancements: Convolutional 3D, Inflated 3D, and Temporal Pyramid Network Models (2016–2024)
2.2.3. Emergence and Application Potential of the CogVLM2-Video Model (2024–2025)
3. Materials and Methods
3.1. System Design Methodology
3.2. Technical Implementation Principles
3.3. Data Analysis
3.3.1. System Architecture Verification and Analysis
3.3.2. Application Validation and Analysis of the System
3.4. Ethical Considerations
4. Results
4.1. Implementation of the CogVLM2-Video Model-Based System for Evaluating Teaching Behaviors in PE Classrooms
4.1.1. Construction of Diversified Evaluation Indicators for Teaching Behaviors as the Foundation for AI Integration and Precise Annotation
4.1.2. Intelligent Data Collection: High-Precision Multi-Camera Setup for Comprehensive Non-Intrusive Data Collection and Pre-Processing
4.1.3. Platform Construction: Front-End and Back-End Technologies with MySQL Database Used to Create an Efficient, Compatible, and Stable System Environment
4.1.4. Model Development: Integrating the CogVLM2-Video Model with Multi-Algorithm Fusion for Automated Annotation and Comprehensive Analysis of Teaching Behaviors
- ➀ CogVLM2-Video Model
- ➁ Intelligent Algorithm Analysis Model
- (1) Information Entropy Analysis
- (2) Redundancy Analysis
- (3) Ratio Analysis
- (4) Temporal Analysis
4.1.5. Model Training
4.2. Application of the CogVLM2-Video Model-Based System for Evaluating Teaching Behavior in PE Classrooms
4.2.1. High Information Entropy, Low Redundancy: Stimulating Classroom Vitality and Eco-Constructivism
4.2.2. From the Macro to the Micro: Comprehensive Deconstruction of the Classroom Structure
- ➀ The Teaching Principle of “Less Talking, More Practicing” Is Implemented, Emphasizing Circulatory Guidance and Evaluation During the Teaching Process
- ➁ Learning and Practicing Structured Motor Skills, with Students Guided to Apply What They Learned
- ➂ Harmonious Interactions Between Teachers and Students, Creating a Positive and Uplifting Classroom Atmosphere
- ➃ High Exercise Density, Ensuring Sufficient, Effective, and Sustainable Exercise Time for Students
- ➄ Low Level of Questioning and Limited Learning Depth
- ➅ Low Level of Application of Modern Information Technology and the Need to Improve Information Literacy
4.2.3. From Point to Area, Precisely Depicting the Teaching Process Framework
5. Discussion
5.1. Advancing Evaluations of Teaching Behaviors in PE Classrooms with Intelligent Technology to Achieve Automation and Precision in Behavioral Analysis
5.1.1. Microservice Architecture and Dual-Layer Database: Novel Integration of Software and Database Design
5.1.2. Behavior Annotation Technology Based on the CogVLM2-Video Model: A New Approach for Automatically Capturing and Classifying Teaching Behaviors
5.1.3. Multi-Algorithm Fusion for Intelligent Analysis: A New Paradigm for Quantifying Teaching Behaviors in PE Classrooms
5.1.4. A New Model for Precise Teaching Feedback Offering a Fully Automated Data Collection, Analysis, and Reporting Platform
5.2. Upholding Ethical Standards and Data Security While Enhancing PE Teachers’ Digital Literacy
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Experimental Metric | Description | Result Value | Explanation
---|---|---|---
Video Data Volume | Number of basketball classroom teaching videos collected and pre-processed | 50 | Collected via the perception layer, including videos with various teaching behaviors |
Number of Action Categories | Categories of teaching behavior actions | 12 | Includes “teacher demonstration,” “student practice,” “interactive Q&A,” etc. |
Data Pre-Processing Time | Average pre-processing time per video | 15 s | Includes denoising, segmentation, and normalization processes |
Action Recognition Accuracy | Accuracy of action recognition based on the CogVLM2-Video model | 90% | Classification accuracy of teaching behaviors by the model |
Annotation Consistency | Consistency between model results and manual annotations | 95% | Manual annotation results were used as a reference |
Data Transmission Latency | Average latency from data collection to visualization | 1.5 s | Includes the entire process of collection, pre-processing, analysis, and front-end display |
Overall System Evaluation | User satisfaction score (1–5 scale) | 4.8 | Based on feedback from 10 participating teachers |
Metric | Description | Result |
---|---|---|
Action Categories | Total number of action categories in the test set | 50 |
Test Videos | Total number of videos in the test set | 200 |
Action Recognition Accuracy | Correct classification rate of action categories | 92% |
Annotation Consistency | Consistency between model predictions and manual annotations | 95% |
Inference Speed | Time to process each frame of video data | 25 ms |
Recall | Proportion of correctly identified positive samples | 90% |
Precision | Proportion of true positives among predicted positives | 93% |
F1 Score | Harmonic mean of precision and recall | 91.50% |
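
The F1 row can be cross-checked against the precision and recall rows. The sketch below applies the standard harmonic-mean definition to the reported values, assuming that 93% and 90% are rounded figures. The function name `f1_score` is illustrative and is not taken from the described system.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values copied from the table above (assumed to be rounded percentages).
precision = 0.93
recall = 0.90

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")  # prints "F1 = 91.48%"
```

With the rounded inputs this gives about 91.48%, close to the reported 91.50%. The small gap may come from rounding in the underlying precision and recall values.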
Category | Proportion (%) | Frequency | Duration (s) |
---|---|---|---|
Explanation and Demonstration | 5 | 12 | 120 |
Guidance and Evaluation | 5 | 11 | 110 |
Technology Use | 0 | 0 | 0 |
Transitions | 8 | 19 | 190 |
Warm-Up Exercises | 10 | 24 | 240 |
Single Technique Practice | 0 | 0 | 0 |
Combined Technique Practice | 6 | 15 | 150 |
Demonstration and Competition | 22 | 51 | 510 |
Fitness Training | 20 | 48 | 480 |
Relaxation Exercises | 3 | 7 | 70 |
Closed Questions—Expected Responses—General Feedback | 3 | 8 | 80 |
Open Questions—Interpretative Responses—Professional Feedback | 0 | 0 | 0 |
Collaborative Learning | 3 | 8 | 80 |
Teacher-Student Competitions | 13 | 30 | 300 |
Peer Discussion and Evaluation | 2 | 4 | 40 |
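
As an illustration of how the information entropy and redundancy named in the outline could be derived from a table like the one above, the following sketch computes Shannon entropy over the duration share of each category. It assumes that redundancy is defined as one minus the ratio of observed to maximum entropy, and that the maximum is taken over all 15 listed categories. The paper's exact formulas are not shown here, so both choices are assumptions, and the two questioning category names are shortened for readability.

```python
import math

# Duration in seconds per behavior category, copied from the table above.
# The two questioning categories use shortened names.
durations = {
    "Explanation and Demonstration": 120,
    "Guidance and Evaluation": 110,
    "Technology Use": 0,
    "Transitions": 190,
    "Warm-Up Exercises": 240,
    "Single Technique Practice": 0,
    "Combined Technique Practice": 150,
    "Demonstration and Competition": 510,
    "Fitness Training": 480,
    "Relaxation Exercises": 70,
    "Closed Questions": 80,
    "Open Questions": 0,
    "Collaborative Learning": 80,
    "Teacher-Student Competitions": 300,
    "Peer Discussion and Evaluation": 40,
}


def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


total = sum(durations.values())
probs = [d / total for d in durations.values()]

entropy = shannon_entropy(probs)
# Assumption: the maximum entropy is taken over all 15 listed categories.
max_entropy = math.log2(len(durations))
# Assumption: redundancy = 1 - H / H_max (standard information-theoretic form).
redundancy = 1 - entropy / max_entropy

print(f"total duration: {total} s")
print(f"entropy: {entropy:.3f} bits (max {max_entropy:.3f})")
print(f"redundancy: {redundancy:.3f}")
```

Under this definition, a lesson whose time is spread evenly over many behavior categories yields entropy near the maximum and redundancy near zero, while a lesson dominated by one or two behaviors yields the opposite.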
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, C.; Yang, F.; Ge, C.; Shao, Z. A Dynamic Precision Evaluation System for Physical Education Classroom Teaching Behaviors Based on the CogVLM2-Video Model. Appl. Sci. 2025, 15, 7712. https://doi.org/10.3390/app15147712