Enhancing Programming Performance, Learning Interest, and Self-Efficacy: The Role of Large Language Models in Middle School Education
Abstract
1. Introduction
2. Literature Review
2.1. Programming Education
2.2. Large Language Models
2.3. Large Language Models and Programming Education
3. Method
3.1. Participants
3.2. Research Design
3.3. Instructional Intervention
3.3.1. Overview of iFLYTEK Spark
3.3.2. Integration of iFLYTEK Spark into Instruction
3.4. Data Collection Instruments
3.4.1. Programming Performance Test
3.4.2. Learning Interest Questionnaire
3.4.3. Self-Efficacy Questionnaire
3.4.4. Semi-Structured Interviews
3.5. Data Analysis
4. Results
4.1. Programming Performance
4.2. Learning Interest
4.3. Self-Efficacy
4.4. Qualitative Findings
4.4.1. Themed Finding 1: LLMs Are Intelligent Learning Companions Fostering Students’ Self-Directed Learning and Mastery of Programming
4.4.2. Themed Finding 2: LLMs Provided Just-in-Time Support and Fostered Dialogic Engagement in Human–Machine Interaction
4.4.3. Themed Finding 3: LLMs Stimulated Exploratory Learning and Supported the Transfer of Programming Knowledge
5. Discussion and Conclusions
5.1. Impact of LLMs on Programming Performance
5.2. Impact of LLMs on Interest in Programming
5.3. Impact of LLMs on Programming Self-Efficacy
5.4. Implications for Programming Teaching
5.5. Limitations and Future Research
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Appendix B
| Measure | Group | N | Mean | Standard Deviation | t-Value | p |
|---|---|---|---|---|---|---|
| Programming knowledge score | EG | 52 | 35.60 | 5.942 | — | 0.265 |
| | CG | 51 | 37.00 | 6.753 | | |
| Programming skill score | EG | 52 | 37.08 | 12.210 | 3.691 | 0.000 *** |
| | CG | 51 | 27.29 | 14.605 | | |
| Total Score | EG | 52 | 72.67 | 15.811 | 2.414 | 0.018 * |
| | CG | 51 | 64.29 | 19.279 | | |

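
The t-values above can be recomputed from the reported group sizes, means, and standard deviations. The following is a minimal sketch, assuming a Student's independent-samples t-test with pooled variance (the table does not state which variant was used); the function name `pooled_t` is illustrative and not taken from the study.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Return (t, df) for a two-sample t-test with pooled variance.

    Assumption: equal-variance Student's t-test; the source table does not
    say whether a pooled or Welch test was used.
    """
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two group means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Summary statistics copied from the table: (n, mean, SD) for EG, then CG.
rows = {
    "Programming skill score": (52, 37.08, 12.210, 51, 27.29, 14.605),
    "Total Score": (52, 72.67, 15.811, 51, 64.29, 19.279),
}

for name, stats in rows.items():
    t, df = pooled_t(*stats)
    print(f"{name}: t({df}) = {t:.2f}")
```

Under that assumption the sketch gives roughly 3.69 for the skill score and 2.41 for the total score, in line with the reported 3.691 and 2.414; small differences would come from rounding in the published means and standard deviations. The knowledge-score row is left out because the table gives no t-value for it.
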
| Measure | Group | N | Mean | Mean Rank | Mann–Whitney U | Z | p |
|---|---|---|---|---|---|---|---|
| Learning Interest | EG | 52 | 4.115 | 58.60 | 933.500 | −2.279 | 0.023 * |
| | CG | 51 | 3.769 | 45.27 | | | |
| Cognitive Interest | EG | 52 | 4.221 | 59.02 | 961.000 | −2.469 | 0.014 * |
| | CG | 51 | 3.833 | 44.84 | | | |
| Affective Interest | EG | 52 | 4.010 | 57.04 | 1064.00 | −1.762 | 0.078 |
| | CG | 51 | 3.706 | 46.86 | | | |

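
The link between the Mean Rank, U, and Z columns can be illustrated with a short sketch, here for the Cognitive Interest and Affective Interest rows. It assumes the standard definitions (U derived from a group's rank sum, Z from the normal approximation) and deliberately omits the tie correction, which needs the raw questionnaire responses; the function names are illustrative only.

```python
import math


def u_from_mean_rank(mean_rank, n_group, n_other):
    """Recover the (smaller) Mann-Whitney U from one group's mean rank."""
    rank_sum = mean_rank * n_group
    u_group = rank_sum - n_group * (n_group + 1) / 2
    # U is conventionally reported as the smaller of the two possible values.
    return min(u_group, n_group * n_other - u_group)


def z_without_tie_correction(u, n1, n2):
    """Normal approximation for U, ignoring ties (an assumption of this sketch)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


# EG mean ranks copied from the table; group sizes are EG = 52, CG = 51.
for name, eg_mean_rank in [("Cognitive Interest", 59.02),
                           ("Affective Interest", 57.04)]:
    u = u_from_mean_rank(eg_mean_rank, 52, 51)
    z = z_without_tie_correction(u, 52, 51)
    print(f"{name}: U = {u:.1f}, uncorrected Z = {z:.2f}")
```

For these two rows the recovered U values land within rounding of the reported 961.000 and 1064.00. The uncorrected Z values come out slightly smaller in magnitude than the reported −2.469 and −1.762, which is the direction expected if the reported figures include a tie correction, since correcting for ties shrinks the standard deviation of U.
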
| Measure | Group | N | Mean | Mean Rank | Mann–Whitney U | Z | p |
|---|---|---|---|---|---|---|---|
| Programming Self-Efficacy | EG | 52 | 3.865 | 58.61 | 933.500 | −2.593 | 0.010 * |
| | CG | 51 | 3.499 | 45.26 | | | |
| Programming Learning Ability | EG | 52 | 4.115 | 57.76 | 1026.500 | −1.985 | 0.047 * |
| | CG | 51 | 3.804 | 46.13 | | | |
| Programming Learning Control | EG | 52 | 3.637 | 58.61 | 982.500 | −2.271 | 0.023 * |
| | CG | 51 | 3.238 | 45.26 | | | |

| Theme | Categories (Frequency Count) | Nodes (Frequency Count) |
|---|---|---|
| Theme 1 | Programming Mastery (12) | Enhanced Code Understanding (10) |
| | | Code Comparison and Analysis (2) |
| | Self-Directed Learning Behaviors (13) | Try Before Asking (4) |
| | | Proactive Use of AI Tools (4) |
| | | Extracurricular Learning Extension (5) |
| Theme 2 | Just-in-Time Feedback (16) | Always Accessible (5) |
| | | Improved Problem-Solving Efficiency (6) |
| | | Immediate Response to Questions (5) |
| | Human–AI Dialog (9) | Explains Like a Teacher (7) |
| | | Intelligent Learning Companion (1) |
| | | Sense of Dialog (1) |
| Theme 3 | Exploratory Learning (13) | Proactively Exploring Knowledge Points (1) |
| | | Changing Question Strategies (3) |
| | | Asking When Facing Difficulties (9) |
| | Knowledge Transfer (4) | Extracurricular Tool Transfer (3) |
| | | Internalizing and Applying AI Strategies (1) |

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Tang, B.; Liang, J.; Hu, W.; Luo, H. Enhancing Programming Performance, Learning Interest, and Self-Efficacy: The Role of Large Language Models in Middle School Education. Systems 2025, 13, 555. https://doi.org/10.3390/systems13070555