Opportunities and Challenges of Big Models in Middle School Mathematics Teaching

Sun, Yuyang; Zou, Jiancheng

doi:10.3390/engproc2025103020


by Yuyang Sun ¹ and Jiancheng Zou ²,*

¹ Beijing Yuying School, Beijing 100036, China
² Department of Mathematics, North China University of Technology, Beijing 100144, China
* Author to whom correspondence should be addressed.
† Presented at the 8th Eurasian Conference on Educational Innovation 2025, Bali, Indonesia, 7–9 February 2025.

Eng. Proc. 2025, 103(1), 20; https://doi.org/10.3390/engproc2025103020

Published: 27 August 2025

(This article belongs to the Proceedings of The 8th Eurasian Conference on Educational Innovation 2025)


Abstract

The influence of large language models (LLMs) has reached education as well. We explored the opportunities and challenges of LLMs in mathematics teaching. In mathematics education, the generative nature of LLMs makes them better suited to teachers, who understand the mathematical knowledge, than to students, who lack the discernment to judge the output. Additionally, we combined programming languages with LLMs, using geometric models as an example, to integrate mathematics and visual representation in a new way. Based on a comparison of problem-solving by ChatGPT and MathGPT and an analysis of their logical reasoning, we conclude that teachers can use large models as auxiliary tools to enhance the quality of mathematics teaching.

Keywords: large language model; artificial intelligence; mathematics education

1. Introduction

The advancement of artificial intelligence (AI) has affected all walks of life, and education is no exception. The Ministry of Education’s “Compulsory Education Mathematics Curriculum Standards (2022 Edition)” pointed out the following: “Attach importance to the role of big data, artificial intelligence, etc. in promoting mathematics teaching reform, improve teaching methods, and promote changes in students’ learning methods.” The advent of large language models (LLMs) has heightened this interest, and more attention is being paid to the application of LLMs in education.

Large models are machine learning models with a large number of parameters and complex computational structures, typically built on deep neural networks. With billions of parameters and training on vast amounts of data, they can handle complex tasks and generalize to unseen data. Based on the type of input data, large models are classified into language models, vision models, and multimodal models. In education, current attention is mainly paid to LLMs, such as Bidirectional Encoder Representations from Transformers (BERT), the Text-to-Text Transfer Transformer (T5), and the Generative Pre-trained Transformer (GPT) series [1], as well as ERNIE Bot, which is based on Enhanced Representation through Knowledge Integration (ERNIE). Among these, ChatGPT has garnered the most attention.

ChatGPT is an AI chatbot launched by OpenAI in November 2022. It is trained on a large amount of text and learns a probabilistic model that predicts the next word from the words and sentences that precede it. Its powerful natural language processing and context understanding capabilities give it broad application prospects in various fields. Cui et al. [2] argued that GPT can support the optimization of teaching content, guidance during the teaching process, the optimization of teaching methods, academic paper writing, and the evaluation of teaching and learning outcomes. GPT can help improve teaching quality and efficiency, and it can also serve as a teaching auxiliary tool, answer professional academic questions, support an independent learning platform, save human resource costs, and reshape the structure of school education [3]. Zhang and Liu [4] used ChatGPT to create intelligent teaching scenarios in middle school geography, promote innovation in geography learning methods, and reshape geography teaching evaluation methods. Guo et al. [5] took the teaching of a data structure course as an example and explored how to use ERNIE Bot, the large language model developed by Baidu, in university teaching. Wang et al. [6] applied AI dialogue robots, represented by Baidu ERNIE Bot and ChatGPT, in programming teaching.
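The idea of predicting the next word from the preceding text can be illustrated with a deliberately tiny bigram model. This is only a sketch of the principle, not how ChatGPT is implemented: real LLMs use deep neural networks over long contexts, and the corpus and function names below are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count how often each word follows each other word in the corpus."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev_word, next_word in zip(words, words[1:]):
            follow_counts[prev_word][next_word] += 1
    return follow_counts


def next_word_probabilities(follow_counts, prev_word):
    """Turn the raw counts for one context word into probabilities."""
    counts = follow_counts.get(prev_word, Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


# Tiny made-up corpus, used only for illustration.
toy_corpus = [
    "the parabola opens upward",
    "the parabola opens downward",
    "the parabola opens upward when a is positive",
]

model = train_bigram_model(toy_corpus)
probabilities = next_word_probabilities(model, "opens")
most_likely = max(probabilities, key=probabilities.get)
print(probabilities)
print(most_likely)
```

The prediction depends only on how often words co-occurred in the training text, which is why such a model can sound fluent without checking whether a statement is mathematically true.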

Li and Liu [7] found that ChatGPT can be used to design lesson plans for teachers and proposed practices for lesson preparation in junior high school mathematics. However, it cannot simulate real teaching or account for the specific learners and teaching behavior involved, so it remains an auxiliary teaching tool, albeit an excellent one. ChatGPT has great potential for improving education, solving mathematical problems, and supporting student learning [8]. For example, it helps teachers and educators generate personalized and relevant educational content to enhance student participation, enthusiasm, and academic performance. ChatGPT is also a valuable tool for educational assessment and evaluation: teachers can use it to evaluate students’ homework and provide feedback. Shakarian et al. [9] studied the performance of ChatGPT on mathematical word problems (MWPs) and found that its performance changes depending on whether it is required to show its work, and that the probability of failure increases linearly with the number of addition and subtraction operations.

To address math problems specifically, Xueersi released the Jiuzhang Large Model (MathGPT). It was independently developed by Tomorrow Advancing Life (TAL) and is a large model for solving and explaining math problems, intended for users and research institutions worldwide. It is also the first large model in China built specifically for math. MathGPT covers elementary, middle, and high school mathematics, including calculation questions, application questions, and algebraic questions. However, question-and-answer interactions on topics other than math are not yet available.

2. LLMs in Middle School Mathematics Teaching

Whether they come from ERNIE Bot, ChatGPT, or MathGPT, the answers given by LLMs are generated from a probability model and might not be accurate. Two answers to the same question may even differ, which is a problem in mathematics, a subject that emphasizes accuracy and clarity. The main users of LLMs should therefore be teachers, who understand the mathematical knowledge, rather than students, who lack the ability to judge the answers. This article mainly analyzes the use of LLMs by teachers, for whom the most obvious change is the transformation of teaching methods. Teachers can use LLMs, but students must use them with caution.
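Why the same question can yield different answers can be sketched with a small sampling example. It assumes, purely for illustration, that a model assigns scores to a few candidate answers and then samples one of them; the candidates, scores, and function names are hypothetical and do not come from any real model.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Convert raw scores into probabilities; lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_answer(candidates, scores, temperature, rng):
    """Draw one candidate answer at random according to its probability."""
    weights = softmax(scores, temperature)
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical candidates and scores for the question "Solve 2x + 3 = 7".
candidates = ["x = 2", "x = 5", "x = -2"]
scores = [2.0, 1.0, 0.5]

rng = random.Random(0)  # fixed seed so the run is repeatable
answers = [sample_answer(candidates, scores, 1.0, rng) for _ in range(200)]
distinct_answers = set(answers)
print(distinct_answers)
```

Even though the correct answer has the highest probability here, repeated sampling also returns the wrong candidates some of the time, which is the behavior a teacher has to watch for.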

2.1. Traditional Teaching Method

Before the emergence of LLMs, mathematics instruction took place through face-to-face interaction between teachers and students, with classroom teaching as the main method and online learning as a supplementary tool. Traditional classroom teaching is a teacher-centered, lecture-based approach in which students passively receive knowledge. This teaching model has been used for a long time and is favored by many teachers. It allows for rapid and efficient knowledge delivery, enabling teachers to complete the required teaching tasks within a specified time. Additionally, it enables immediate feedback from students, allowing the teacher to assess students’ understanding of the mathematical concepts and adjust the teaching process accordingly. However, traditional mathematics teaching methods often fail to inspire students’ enthusiasm for learning, causing them to dislike mathematics. In addition, traditional teaching models cannot accommodate the different learning styles and paces of individual students. Because classroom teaching relies on one-way transmission of knowledge, students lack independent activities and a spirit of inquiry.

2.2. LLMs in Supporting Teaching

LLMs can serve as a teaching assistant for teachers and as a tool to enhance the learning experience for students. They provide teachers with rich lesson planning ideas and materials to refine the design of teaching activities. This, in turn, stimulates students’ curiosity for exploration and discovery, fostering active learning. LLMs can generate interesting mathematical problems and challenges that encourage students to engage their creative thinking and problem-solving abilities. They can also tailor teaching to different student abilities, mitigating the problems of a “one size fits all” approach and enabling students to be taught according to their aptitude. Current education must be student-centered, and each student has his or her own learning style and pace. Because LLMs are trained on a large amount of data and are therefore “well-informed”, they can provide personalized guidance plans based on the different situations of students. With the help of LLMs, teachers can personalize teaching more easily, which greatly reduces the difficulty of teaching students with different aptitudes.

2.3. Integrating Programming Languages to Enrich Teaching Methods

Jiang [10] drew on the difficulties and experience encountered in middle school mathematics teaching and pointed out that ChatGPT, as a language model, cannot combine numbers and shapes. This problem can be solved by combining the large model with a programming language to develop a teaching method that combines numbers and shapes, because code generated by LLMs can be used for graphical plotting. In the past, middle school mathematics teachers rarely used programming languages in teaching because they lacked specialized programming knowledge. LLMs such as ERNIE Bot and ChatGPT remove this barrier: graphical tasks can now be completed by running the code that the LLMs provide. As an example, we used ERNIE Bot and ChatGPT to generate Python 3.10 code that draws a parabola with parameters set interactively by the user.

In the experiment, we asked ERNIE Bot 4.5 and ChatGPT-3.5 each to generate a segment of code. The two responses are presented side by side in Figure 1.

We ran the generated code without any modifications, using Jupyter Notebook 7 with Anaconda. The result from ChatGPT-3.5 is presented in Figure 2.

The result from ERNIE Bot is presented in Figure 3 (the colored boxes were added for annotation and are not produced by the code).

ERNIE Bot produced better results in the experiment. Its parabola could be adjusted by dragging the three sliders in the blue box or by modifying the parameters in the green box.
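A program of the kind described above can be sketched as follows. This is not the code generated by ERNIE Bot or ChatGPT, which is shown only in Figure 1. It is a minimal illustration that assumes matplotlib is installed and uses its `Slider` widget; the function names, slider ranges, and axis limits are choices made here for the example.

```python
def parabola_points(a, b, c, x_min=-10.0, x_max=10.0, n=201):
    """Return x and y coordinates of y = a*x**2 + b*x + c on [x_min, x_max]."""
    step = (x_max - x_min) / (n - 1)
    xs = [x_min + i * step for i in range(n)]
    ys = [a * x * x + b * x + c for x in xs]
    return xs, ys


def vertex(a, b, c):
    """Return the vertex (x, y) of the parabola; a must be non-zero."""
    if a == 0:
        raise ValueError("a must be non-zero for a parabola")
    x_v = -b / (2 * a)
    return x_v, a * x_v * x_v + b * x_v + c


def show_interactive_parabola():
    """Open a plot with three sliders that control a, b, and c."""
    # matplotlib is imported here so the helpers above work without it.
    import matplotlib.pyplot as plt
    from matplotlib.widgets import Slider

    fig, ax = plt.subplots()
    fig.subplots_adjust(bottom=0.3)  # leave room for the sliders
    xs, ys = parabola_points(1.0, 0.0, 0.0)
    (line,) = ax.plot(xs, ys)
    ax.set_ylim(-50, 50)
    ax.grid(True)

    # One slider per coefficient, stacked under the plot.
    sliders = {}
    for i, (name, init) in enumerate([("a", 1.0), ("b", 0.0), ("c", 0.0)]):
        slider_ax = fig.add_axes([0.15, 0.18 - 0.06 * i, 0.7, 0.04])
        sliders[name] = Slider(slider_ax, name, -5.0, 5.0, valinit=init)

    def redraw(_value):
        # Recompute the curve from the current slider positions.
        _, new_ys = parabola_points(
            sliders["a"].val, sliders["b"].val, sliders["c"].val
        )
        line.set_ydata(new_ys)
        fig.canvas.draw_idle()

    for slider in sliders.values():
        slider.on_changed(redraw)
    plt.show()


# Call show_interactive_parabola() to open the interactive plot.
```

In a Jupyter notebook, the sliders respond only when an interactive matplotlib backend is active (for example, `%matplotlib widget`, which requires the ipympl package). The `vertex` helper is included so that a teacher can connect the moving curve back to the algebra.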

3. Challenges in Using LLMs

3.1. Insufficient Accuracy in Problem-Solving

Because mathematics emphasizes accuracy and clarity, the accuracy of problem-solving is crucial when LLMs are used as intelligent tutors for students in math education. To illustrate this, we examined an example. Using the data provided by the official MathGPT website, we compared the results of MathGPT and ChatGPT on the open-source datasets TAL-SCQ5K-CN and TAL-SCQ5K-EN, which consist of single-choice math questions at the elementary, middle, and high school levels, along with detailed solution steps. Each dataset contains 5K questions, split into 3K training questions and 2K test questions (Table 1).

MathGPT outperformed GPT-4 (GPT4-0819) on both datasets. GPT-4 was better at solving English problems (overall accuracy 0.52) than Chinese problems (0.28), whereas MathGPT was better at solving Chinese problems (0.70) than English problems (0.61). On the English problems, the accuracy of the two was relatively close. Neither model is reliable, however: even the best result leaves about 30% of the questions answered incorrectly. Wrong answers prevent LLMs from playing a guiding role, and they can mislead students and harm their understanding. Because students cannot judge whether the knowledge is correct or not, mathematics teachers need to participate in students’ personalized learning as supervisors.
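The accuracy values in Table 1 are simply the number of correct answers divided by the number of questions. The short sketch below recomputes the overall figures from the counts in the table; the dictionary layout and function name are choices made here, and the two splits are labeled only by size.

```python
# (language, split) -> {model: (total questions, correct answers)},
# copied from Table 1.
TABLE_1 = {
    ("Chinese", "2K"): {"GPT4-0819": (2000, 555), "MathGPT": (2000, 1342)},
    ("Chinese", "3K"): {"GPT4-0819": (3000, 869), "MathGPT": (3000, 2176)},
    ("English", "2K"): {"GPT4-0819": (2000, 1014), "MathGPT": (2000, 1181)},
    ("English", "3K"): {"GPT4-0819": (3000, 1576), "MathGPT": (3000, 1850)},
}


def overall_accuracy(language, model):
    """Pool both splits of one language and return correct / total."""
    total = 0
    correct = 0
    for (row_language, _split), cells in TABLE_1.items():
        if row_language == language:
            row_total, row_correct = cells[model]
            total += row_total
            correct += row_correct
    return correct / total


for language in ("Chinese", "English"):
    for model in ("GPT4-0819", "MathGPT"):
        print(language, model, round(overall_accuracy(language, model), 2))
```

Pooling the two splits reproduces the 5000-question rows of the table, so the per-split counts and the overall figures are consistent with each other.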

3.2. Lack of Logical Consistency

The logical reasoning of LLMs is inherently limited by the way they work. LLMs such as ChatGPT are generative models built on statistics: the answers they generate come from the language patterns and statistical probabilities learned from a large amount of text during training. There is no direct logical reasoning process anywhere in the answering process. This inevitably limits LLMs when they deal with logical problems, and the limitation is most obvious in complex reasoning and inference problems. Because logical reasoning is important in middle school mathematics, this limits the use of LLMs there and leads to logical errors.

In addition, each training run requires substantial manpower and material resources, and the training data come from large-scale text corpora on the Internet. LLMs have no knowledge of anything that appears after training, and the Internet corpora used to build the datasets are not always correct; both can lead to errors in answers. In contrast, programming languages produce more reliable results because of their strictness.
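The strictness of a programming language can be put to use for checking an answer that an LLM has produced. The sketch below substitutes claimed roots into a quadratic equation using exact rational arithmetic; the equation, the claimed roots, and the function names are hypothetical and serve only to illustrate the idea.

```python
from fractions import Fraction


def satisfies_quadratic(a, b, c, x):
    """Check exactly whether x is a root of a*x**2 + b*x + c = 0."""
    x = Fraction(x)
    return a * x * x + b * x + c == 0


def check_claimed_roots(a, b, c, claimed_roots):
    """Map each claimed root to True or False after exact substitution."""
    return {
        str(root): satisfies_quadratic(a, b, c, root)
        for root in claimed_roots
    }


# Hypothetical model output for 2x^2 - 5x + 2 = 0: one right, one wrong.
claimed = [Fraction(1, 2), Fraction(3)]
report = check_claimed_roots(2, -5, 2, claimed)
print(report)
```

Unlike a sampled answer, this check gives the same verdict every time it is run, so a teacher can use it to confirm or reject a generated solution before showing it to students.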

3.3. Dependence

Excessive use of LLMs makes people dependent on them, and excessive dependence causes people to lose their basic judgment and creativity. Because LLMs are so convenient, people may become complacent: while enjoying the convenience and the gain in efficiency, they tend not to think carefully. Such passive thinking becomes an obstacle to the improvement of teachers’ abilities. LLMs are auxiliary tools, and the improvement in teaching quality still depends mainly on the judgment and wisdom of teachers.

Author Contributions

The main idea of the paper comes from J.Z. The paper writing was jointly completed by Y.S. and J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the 2023 Annual Project of the 14th Five-Year Plan, Haidian District Education Science, Beijing, China. Grant No. HDGH2023117-6.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

1. Liu, M.; Wu, Z.; Liao, J.; Ren, Y.; Su, Y. Educational applications of LLMs: Principles, status and challenges-from light-weighted BERT to conversational ChatGPT. Mod. Educ. Technol. 2023, 33, 19–28.
2. Cui, Y.; Bai, F.; Zhang, R. Application, risks and countermeasures of ChatGPT in the field of higher education. J. Chongqing Univ. Technol. (Soc. Sci.) 2023, 37, 16–25.
3. Zhu, Y.; Yang, F. ChatGPT/AIGC and educational innovation: Opportunities, challenges, and the future. J. East China Norm. Univ. (Educ. Sci.) 2023, 41, 1.
4. Zhang, J.; Liu, H. Application of ChatGPT in middle school geography teaching: Scenarios, limitations and breakthrough strategies. J. Geogr. Teach. 2024, 5, 16–20.
5. Guo, N.; Dong, Q.; Xu, X.; Xu, S. Study on the teaching method of data structure course based on ERNIE Bot. J. Sci. Educ. 2023, 21, 95–100.
6. Wang, F.; Zhao, Z.; Wang, Y. Exploration and practice of AI in programming teaching. Comput. Educ. 2023, 11, 45–50.
7. Li, Y.; Liu, J. The innovative function and value positioning of ChatGPT in junior high school mathematics lesson preparation under the background of the new curriculum standards. China Educ. Technol. 2024, 3, 109–114.
8. Guo, B.; Zhang, X.; Wang, Z.; Jiang, M.; Nie, J.; Ding, Y.; Yue, J.; Wu, Y. How close is ChatGPT to human experts? Comparison corpus, evaluation, and detection. arXiv 2023, arXiv:2301.07597.
9. Shakarian, P.; Koyyalamudi, A.; Ngu, N.; Mareedu, L. An independent evaluation of ChatGPT on mathematical word problems. arXiv 2023, arXiv:2302.13814.
10. Jiang, X. Research on ChatGPT assisting innovation in middle school mathematics teaching. Creat. Educ. Stud. 2024, 12, 492–499.

Figure 1. (a) Response from ChatGPT; (b) response from ERNIE Bot.

Figure 2. Result from ChatGPT-3.5.

Figure 3. Result of ERNIE Bot.

Table 1. Comparison of MathGPT and ChatGPT on TAL-SCQ5K-CN and TAL-SCQ5K-EN.

Language	Dataset	GPT4-0819 Total	GPT4-0819 Correct	GPT4-0819 Accuracy	MathGPT Total	MathGPT Correct	MathGPT Accuracy
Chinese	TAL-SCQ5K-CN 2K Test	2000	555	0.28	2000	1342	0.67
Chinese	TAL-SCQ5K-CN 3K Training	3000	869	0.29	3000	2176	0.73
Chinese	TAL-SCQ5K-CN	5000	1424	0.28	5000	3518	0.70
English	TAL-SCQ5K-EN 2K Test	2000	1014	0.51	2000	1181	0.59
English	TAL-SCQ5K-EN 3K Training	3000	1576	0.53	3000	1850	0.62
English	TAL-SCQ5K-EN	5000	2590	0.52	5000	3031	0.61

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
