Can Generative Artificial Intelligence Effectively Enhance Students’ Mathematics Learning Outcomes?—A Meta-Analysis of Empirical Studies from 2023 to 2025
Abstract
1. Introduction
2. Literature Review
2.1. GenAI Applications in Mathematics Education
2.2. Meta-Analysis Evidence of the Impact of Artificial Intelligence on Students’ Learning
3. Methods
3.1. Literature Search
3.2. Data Encoding
3.3. Data Analysis
3.4. Experimental Results
4. Results
4.1. GenAI Exerts a Moderate Positive Impact on Students’ Mathematics Learning Outcomes
4.2. Regulatory Effect Analysis
5. Discussion
5.1. Responses to the First Research Question
5.2. Responses to the Second Research Question
5.3. Practical Implications
5.4. Limitations and Future Research
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
| Component | Description |
|---|---|
| Databases/Platforms | Web of Science, EBSCO (e.g., ERIC, APA PsycINFO), CNKI, Google Scholar |
| Time Frame | 1 January 2023–30 September 2025 |
| Search Strategy | Boolean queries were constructed by combining terms from four core conceptual groups using the AND operator: |
| | 1. Technology: (“generative AI” OR “generative artificial intelligence” OR ChatGPT OR “GenAI” OR “large language model” OR “AI-powered” OR “AI-driven”) |
| | 2. Subject: (math OR mathematics OR algebra OR geometry OR calculus OR statistics OR “problem-solving”) |
| | 3. Outcome: (learn OR performance OR achievement OR outcomes OR anxiety OR attitudes OR motivation OR “computational thinking” OR skill) |
| | 4. Population: (student OR pupil OR learner OR “elementary school” OR “primary school” OR “middle school” OR “high school” OR “undergraduate” OR “higher education”) |
| | The specific syntax and field codes were adapted for each database. |
| Additional Searches | Manual screening of reference lists and citation tracking for included studies. |
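
The search strategy above can be illustrated with a minimal sketch that joins the terms of each conceptual group with OR and then joins the four groups with AND. The term lists are copied from the table (with straight quotes); the names `GROUPS` and `build_query` are hypothetical and are not the authors' tooling, and the database-specific field codes mentioned in the table are not modeled.

```python
# Illustrative sketch only: term lists come from the search-strategy table;
# GROUPS and build_query are hypothetical names, not the authors' tooling.
GROUPS = {
    "technology": ['"generative AI"', '"generative artificial intelligence"',
                   "ChatGPT", '"GenAI"', '"large language model"',
                   '"AI-powered"', '"AI-driven"'],
    "subject": ["math", "mathematics", "algebra", "geometry", "calculus",
                "statistics", '"problem-solving"'],
    "outcome": ["learn", "performance", "achievement", "outcomes", "anxiety",
                "attitudes", "motivation", '"computational thinking"', "skill"],
    "population": ["student", "pupil", "learner", '"elementary school"',
                   '"primary school"', '"middle school"', '"high school"',
                   '"undergraduate"', '"higher education"'],
}


def build_query(groups):
    """Join terms within each group with OR, then join the groups with AND."""
    clauses = ["(" + " OR ".join(terms) + ")" for terms in groups.values()]
    return " AND ".join(clauses)


query = build_query(GROUPS)
print(query)
```

In practice the resulting string would still need to be adapted to each platform's syntax, as the table notes.
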

| No. | Author(s) & Year | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Febriantoro et al. (2024) | 2 | 0.5 | 0.5 | 2 | 1 | 0 | 2 | 1 | 2 | 1 | 12 |
| 2 | Polydoros et al. (2025) | 2 | 0.5 | 0.5 | 2 | 1 | 0 | 2 | 1 | 2 | 1 | 12 |
| 3 | Sánchez-Ruiz et al. (2023) | 2 | 0.5 | 0.5 | 1.5 | 1 | 0 | 2 | 1 | 1 | 1 | 10.5 |
| 4 | Wahba et al. (2024) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 5 | Yavich (2025) | 3 | 0.5 | 0.5 | 1.5 | 2 | 0 | 2 | 1 | 2 | 1 | 13.5 |
| 6 | X. Wang and Wei (2025) | 2 | 0.5 | 0.5 | 1.5 | 2 | 0 | 2 | 1 | 1 | 1 | 11.5 |
| 7 | Xing et al. (2025) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 8 | Kadhim and Fares (2025) | 2 | 0.5 | 0.5 | 2 | 1 | 0 | 2 | 1 | 2 | 1 | 12 |
| 9 | Noviyana et al. (2025) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 10 | Luo et al. (2024) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 1 | 1 | 12 |
| 11 | Dasari et al. (2024) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 12 | Karaman and Göksu (2024) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 13 | Utami et al. (2024) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 14 | Xuan et al. (2025) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 15 | Nakavachara et al. (2025) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 16 | Liao (2024) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 17 | X. C. Liu and Zhang (2025) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 18 | Fardian et al. (2025) | 2 | 1 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13.5 |
| 19 | Adelegan (2023) | 2 | 0.5 | 0.5 | 2 | 1 | 0 | 2 | 1 | 2 | 1 | 12 |
| 20 | J. Liu et al. (2025) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
| 21 | Alvarez (2024) | 2 | 0.5 | 0.5 | 1.5 | 1 | 0 | 2 | 1 | 1 | 1 | 10.5 |
| 22 | R. Zhou et al. (2025) | 2 | 0.5 | 0.5 | 2 | 2 | 0 | 2 | 1 | 2 | 1 | 13 |
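
In the table above, each Total appears to be the plain sum of the ten item scores. The following minimal sketch checks that reading for three rows copied from the table; the variable names are hypothetical, and the meaning of the ten items is not restated here.

```python
# Illustrative check: item scores and totals are copied from three table rows.
# The assumption being tested is that Total = sum of the ten item scores.
rows = {
    "Febriantoro et al. (2024)": ([2, 0.5, 0.5, 2, 1, 0, 2, 1, 2, 1], 12),
    "Sánchez-Ruiz et al. (2023)": ([2, 0.5, 0.5, 1.5, 1, 0, 2, 1, 1, 1], 10.5),
    "Yavich (2025)": ([3, 0.5, 0.5, 1.5, 2, 0, 2, 1, 2, 1], 13.5),
}

# Collect any study whose reported total differs from the recomputed sum.
mismatches = [
    name for name, (items, total) in rows.items() if sum(items) != total
]
print(mismatches)
```

Because all scores are multiples of 0.5, the floating-point sums are exact and a direct equality comparison is safe here.
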

| No. | Author(s) & Year | Country | Duration | Stage | Content | Integration Degree | Sample Size | Participants (n) | Cognitive Order (High/Low) | Outcome Type (Cognitive/Non-cognitive) | Learning Mode |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Febriantoro et al. (2024) | Indonesia | Long | Primary School | Geometry | CT | Small | 60 | Low | Cognitive | Collaborative Learning |
| 2 | Polydoros et al. (2025) | Greece | N/A | Primary School | Geometry | IPA | Large | 436 | Low | Cognitive | Independent Learning |
| 3 | Sánchez-Ruiz et al. (2023) | Spain | Long | Tertiary | Integration | IPA | Large | 245 | Low | Cognitive | Independent Learning |
| 4 | Sánchez-Ruiz et al. (2023, a) | Spain | Long | Tertiary | Integration | IPA | Large | 246 | Low | Cognitive | Independent Learning |
| 5 | Sánchez-Ruiz et al. (2023, b) | Spain | Long | Tertiary | Integration | IPA | Large | 241 | Low | Cognitive | Independent Learning |
| 6 | Sánchez-Ruiz et al. (2023, c) | Spain | Long | Tertiary | Integration | IPA | Large | 235 | Low | Cognitive | Independent Learning |
| 7 | Sánchez-Ruiz et al. (2023, d) | Spain | Long | Tertiary | Integration | IPA | Large | 238 | Low | Cognitive | Independent Learning |
| 8 | Sánchez-Ruiz et al. (2023, e) | Spain | Long | Tertiary | Integration | IPA | Large | 240 | Low | Cognitive | Independent Learning |
| 9 | Sánchez-Ruiz et al. (2023, f) | Spain | Long | Tertiary | Integration | IPA | Large | 245 | Low | Cognitive | Independent Learning |
| 10 | Wahba et al. (2024) | Jordan | Short | Tertiary | Statistics | CT | Small | 56 | High | Cognitive | Independent Learning |
| 11 | Yavich (2025) | Israel | Long | Secondary School | Number & Algebra | IPA | Small | 50 | High | Cognitive | Collaborative Learning |
| 12 | X. Wang and Wei (2025) | China | Short | Primary School | Integration | IPA | Large | 105 | N/A | Non-cognitive | Independent Learning |
| 13 | Xing et al. (2025) | USA | Short | Secondary School | Number & Algebra | CT | Large | 212 | Low | Cognitive | N/A |
| 14 | Kadhim and Fares (2025) | Iraq | Long | Secondary School | Integration | IPA | Small | 78 | High | Cognitive | N/A |
| 15 | Noviyana et al. (2025) | Indonesia | N/A | Tertiary | Integration | IPA | Small | 60 | High | Cognitive | Independent Learning |
| 16 | Luo et al. (2024) | China | Long | Tertiary | Integration | IPA | Large | 117 | N/A | Non-cognitive | Collaborative Learning |
| 17 | Luo et al. (2024, a) | China | Long | Primary School | Integration | IPA | Large | 117 | N/A | Non-cognitive | Independent Learning |
| 18 | Dasari et al. (2024) | Indonesia | N/A | Primary School | Statistics | IPA | Small | 20 | Low | Cognitive | Independent Learning |
| 19 | Dasari et al. (2024, a) | Indonesia | N/A | Primary School | Statistics | IPA | Small | 20 | Low | Cognitive | Independent Learning |
| 20 | Karaman and Göksu (2024) | Turkey | Long | Tertiary | Geometry | IPA | Small | 39 | Low | Cognitive | Independent Learning |
| 21 | Utami et al. (2024) | Indonesia | Long | Primary School | Geometry | CT | Small | 51 | Low | Cognitive | Collaborative Learning |
| 22 | Utami et al. (2024, a) | Indonesia | Long | Tertiary | Geometry | CT | Small | 51 | Low | Cognitive | Independent Learning |
| 23 | Utami et al. (2024, b) | Indonesia | Long | Tertiary | Geometry | CT | Small | 51 | High | Cognitive | Independent Learning |
| 24 | Xuan et al. (2025) | Vietnam | Short | Tertiary | Number & Algebra | IPA | Small | 60 | Low | Cognitive | Collaborative Learning |
| 25 | Xuan et al. (2025, a) | Vietnam | Short | Primary School | Number & Algebra | IPA | Small | 60 | Low | Cognitive | Collaborative Learning |
| 26 | Nakavachara et al. (2025) | Thailand | Short | Secondary School | Statistics | IPA | Large | 242 | High | Cognitive | Independent Learning |
| 27 | Liao (2024) | China | Long | Primary School | Integration | IPA | Large | 115 | Low | Cognitive | N/A |
| 28 | Liao (2024, a) | China | Long | Secondary School | Integration | IPA | Large | 115 | N/A | Non-cognitive | Independent Learning |
| 29 | Liao (2024, b) | China | Long | Secondary School | Integration | IPA | Large | 115 | N/A | Non-cognitive | Independent Learning |
| 30 | Liao (2024, c) | China | Long | Secondary School | Integration | IPA | Large | 115 | Low | Cognitive | N/A |
| 31 | Liao (2024, d) | China | Long | Secondary School | Integration | IPA | Large | 115 | Low | Cognitive | N/A |
| 32 | Liao (2024, e) | China | Long | Secondary School | Integration | IPA | Large | 115 | N/A | Non-cognitive | N/A |
| 33 | Liao (2024, f) | China | Long | Secondary School | Integration | IPA | Large | 115 | Low | Cognitive | N/A |
| 34 | Liao (2024, g) | China | Long | Secondary School | Integration | IPA | Large | 115 | N/A | Non-cognitive | N/A |
| 35 | X. C. Liu and Zhang (2025) | China | Long | Secondary School | Integration | IPA | Large | 115 | N/A | Non-cognitive | N/A |
| 36 | Fardian et al. (2025) | Indonesia | N/A | Secondary School | Geometry | IPA | Small | 205 | High | Cognitive | N/A |
| 37 | Fardian et al. (2025, a) | Indonesia | N/A | Secondary School | Geometry | IPA | Small | 22 | High | Cognitive | N/A |
| 38 | Adelegan (2023) | USA | Short | Secondary School | Number & Algebra | IPA | Small | 18 | Low | Cognitive | Independent Learning |
| 39 | Adelegan (2023, a) | Nigeria | Short | Secondary School | Number & Algebra | IPA | Small | 28 | Low | Cognitive | Independent Learning |
| 40 | Adelegan (2023, b) | Finland | Short | Secondary School | Number & Algebra | IPA | Small | 28 | Low | Cognitive | Independent Learning |
| 41 | Adelegan (2023, c) | USA | Short | Secondary School | Number & Algebra | IPA | Small | 62 | Low | Cognitive | Independent Learning |
| 42 | Adelegan (2023, d) | Nigeria | Short | Secondary School | Number & Algebra | IPA | Small | 62 | Low | Cognitive | Independent Learning |
| 43 | Adelegan (2023, e) | Finland | Short | Secondary School | Number & Algebra | IPA | Small | 44 | Low | Cognitive | Independent Learning |
| 44 | J. Liu et al. (2025) | China | Short | Primary School | Number & Algebra | IPA | Large | 104 | Low | Cognitive | Independent Learning |
| 45 | Alvarez (2024) | Philippines | Short | Tertiary | Number & Algebra | IPA | Small | 20 | Low | Cognitive | Independent Learning |
| 46 | R. Zhou et al. (2025) | China | N/A | Tertiary | Statistics | IPA | Small | 29 | Low | Cognitive | Independent Learning |

| Characteristic | Category | Number of Effect Sizes (n) | % of Effect Sizes |
|---|---|---|---|
| Research Design | Quantitative research | 8 | 17.39% |
| | Mixed-methods research | 38 | 82.61% |
| Region | Asia | 31 | 67.39% |
| | Europe | 10 | 21.74% |
| | North America | 3 | 6.52% |
| | Other (MENA, Africa) | 2 | 4.35% |
| Grade Level | Primary School | 10 | 21.74% |
| | Secondary School | 20 | 43.48% |
| | Tertiary | 16 | 34.78% |
| Mathematics Content | Number & Algebra | 12 | 26.09% |
| | Geometry | 8 | 17.39% |
| | Statistics | 5 | 10.87% |
| | Integration | 21 | 45.65% |
| Outcome Type | Cognitive Skills | 38 | 82.61% |
| | Lower-order Cognitive | 30 | 65.22% |
| | Higher-order Cognitive | 8 | 17.39% |
| | Non-cognitive Skills | 8 | 17.39% |
| Intervention Duration | Short-term (≤1 month) | 14 | 30.43% |
| | Long-term (>1 month) | 25 | 54.35% |
| | Not specified | 7 | 15.22% |
| Integration Degree | CT | 6 | 13.04% |
| | IPA | 40 | 86.96% |
| Sample Size | Large | 23 | 50.00% |
| | Small | 23 | 50.00% |
| Learning Mode | Independent Learning | 29 | 63.04% |
| | Collaborative Learning | 6 | 13.04% |
| | Not specified | 11 | 23.91% |
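
The percentage column in the table above is each category's count divided by the 46 effect sizes. As a minimal sketch, the Region rows can be recomputed as follows; the counts are copied from the table, and the names `TOTAL`, `counts`, and `shares` are hypothetical.

```python
# Illustrative recomputation of the "% of Effect Sizes" column for Region.
# Counts are copied from the table; 46 is the total number of effect sizes.
TOTAL = 46
counts = {
    "Asia": 31,
    "Europe": 10,
    "North America": 3,
    "Other (MENA, Africa)": 2,
}

# Percentage share of each category, rounded to two decimal places.
shares = {region: round(100 * n / TOTAL, 2) for region, n in counts.items()}

for region, pct in shares.items():
    print(f"{region}: {pct:.2f}%")
```

The same calculation applies to every other characteristic in the table, since each set of categories partitions the same 46 effect sizes.
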
Appendix B

References
- Adams, J. B. (2015). Bloom’s taxonomy of cognitive learning objectives. Journal of the Medical Library Association, 103(3), 152–153.
- Adelegan, J. (2023). The impact of ChatGPT on students’ performance [Bachelor’s thesis, Lappeenranta–Lahti University of Technology LUT].
- Ali, O., Murray, P. A., Momin, M., Dwivedi, Y. K., & Malik, T. (2024). The effects of artificial intelligence applications in educational settings: Challenges and strategies. Technological Forecasting and Social Change, 199, 123076.
- Al-Smadi, M. (2023). ChatGPT and beyond: The generative AI revolution in education. arXiv.
- Alvarez, J. I. (2024). Evaluating the impact of AI–powered tutors MathGPT and Flexi 2.0 in enhancing calculus learning. Jurnal Ilmiah Ilmu Terapan Universitas Jambi, 8(2), 495–508.
- Assink, M., & Wibbelink, C. J. M. (2016). Fitting three-level meta-analytic models in R: A step-by-step tutorial. The Quantitative Methods for Psychology, 12(3), 154–174.
- Barno, E., & Phelps, G. (2025). Using a multi-agent system and evidence-centered design to integrate educator expertise within generated feedback. Education Sciences, 15(10), 1273.
- Bartolini, A., Batini, F., De Santis, M., Milella, M., Malavasi, P., Morganti, A., Rosati, A., Salvato, R., Signorelli, A., & Sannipoli, M. (Eds.). (2025). La formazione iniziale e continua degli insegnanti: Relazioni, comunicazione, metodi. Pensa MultiMedia.
- Bastani, H., Bastani, O., Sungu, A., Ge, H., Kabakcı, Ö., & Mariman, R. (2024). Generative AI without guardrails can harm learning: Evidence from high school mathematics. Proceedings of the National Academy of Sciences, 121(6), e2321890121.
- Bernard, R. M., Borokhovski, E., Schmid, R. F., Tamim, R. M., & Abrami, P. C. (2014). A meta-analysis of blended learning and technology use in higher education: From the general to the applied. Journal of Computing in Higher Education, 26(1), 87–122.
- Bernardi, M. L., Capone, R., Faggiano, E., & Rocha, H. (2025). Generative AI in mathematics education: Pre-service teachers’ knowledge and implications for their professional development. International Journal of Mathematical Education in Science and Technology, 56(8), 1513–1530.
- Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2017). Introduction to meta-analysis (2nd ed.). John Wiley & Sons.
- Borup, J., Graham, C. R., Short, C. R., & Shin, J. K. (2022). Evaluating blended teaching with the 4Es and PICRAT. In C. R. Graham, J. Borup, M. A. Jensen, K. T. Arnesen, & C. R. Short (Eds.), K-12 blended teaching (Vol. 2): A guide to practice within the disciplines (pp. 39–54). EdTech Books. Available online: https://edtechbooks.org/k12blended_math/evaluating_bt (accessed on 13 November 2025).
- Cevikbas, M., & Kaiser, G. (2021). A systematic review on task design in dynamic and interactive mathematics learning environments (DIMLEs). Mathematics, 9(4), 399.
- Chen, Y., & Hou, H. (2024). A mobile contextualized educational game framework with ChatGPT interactive scaffolding for employee ethics training. Journal of Educational Computing Research, 62(7), 1517–1542.
- Cohen, J. (2009). Statistical power analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.
- Cosentino, G., Anton, J., Sharma, K., Gelsomini, M., Giannakos, M., & Abrahamson, D. (2025). Generative AI and multimodal data for educational feedback: Insights from embodied math learning. British Journal of Educational Technology, 56(5), 1686–1709.
- Dasari, D., Hendriyanto, A., Sahara, S., Suryadi, D., Muhaimin, L. H., Chao, T., & Fitriana, L. (2024). ChatGPT in didactical tetrahedron, does it make an exception? A case study in mathematics teaching and learning. Frontiers in Education, 8, 1295413.
- De Simone, M., Tiberti, F., Barron Rodriguez, M., Manolio, F., Mosuro, W., & Dikoru, E. J. (2025). From chalkboards to chatbots: Evaluating the impact of generative AI on learning outcomes in Nigeria (Policy Research Working Paper No. 11125). World Bank Group.
- Duval, S., & Tweedie, R. L. (2000). Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56(2), 455–463.
- Fardian, D., Suryadi, D., Prabawanto, S., & Jupri, A. (2025). Integrating Chat-GPT in the classroom: A study on linear algebra learning in higher education. International Journal of Information and Education Technology, 15(4), 732–751.
- Febriantoro, F. S., Fatharani, A., Dewi, N. C., & Kurniati, L. (2024). Assessing the efficacy of coding with Scratch and AI interaction using ChatGPT on 5th graders’ math performance and computational thinking. Reforma: Jurnal Pendidikan dan Pembelajaran, 15(1), 78–99.
- Ghazi, S. R., Ullah, K., & Jan, F. A. (2016). Concrete operational stage of Piaget’s cognitive development theory: An implication in learning mathematics. GUJR, 32(1), 10–20.
- Gray, W. M. (1975). The factor structure of concrete and formal operations: A confirmation of Piaget (EDRS Document Reproduction Service No. ED 115 697; TM 004 972). ERIC. Available online: https://eric.ed.gov/ (accessed on 10 October 2025).
- Gu, J., & Yan, Z. (2025). Effects of GenAI interventions on student academic performance: A meta-analysis. Journal of Educational Computing Research, 63(6), 1460–1492.
- Hetmanenko, L., & Khoruzha, L. (2025). Leveraging artificial intelligence to enhance mathematics education and overcome instructional challenges. Innovaciencia, 13(1), e5075.
- Hwang, G.-J., & Tu, Y.-F. (2021). Roles and research trends of artificial intelligence in mathematics education: A bibliometric mapping analysis and systematic review. Mathematics, 9(6), 584.
- Hwang, S. (2022). Examining the effects of artificial intelligence on elementary students’ mathematics achievement: A meta-analysis. Sustainability, 14(20), 13185.
- Kadhim, T. M., & Fares, I. J. (2025). The impact of the generative model supported by artificial intelligence as an advanced organizer on high-order thinking skills among middle school students in mathematics. International Journal of Environmental Sciences, 11(4s), 8–17. Available online: https://theaspd.com/index.php/ijes/article/view/414 (accessed on 13 January 2026).
- Karaman, M. R., & Göksu, İ. (2024). Are lesson plans created by ChatGPT more effective? An experimental study. International Journal of Technology in Education (IJTE), 7(1), 107–127.
- Kasneci, E., Seßler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., Gasser, U., Groh, G., Günnemann, S., Hüllermeier, E., Krusche, S., Kutyniok, G., Michaeli, T., Nerdel, C., Pfeffer, J., Poquet, O., Sailer, M., Schmidt, A., Seidel, T., … Kasneci, G. (2023). ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103, 102274.
- Kim, H. K., Roknaldin, A., Nayak, S., Zhang, X., Yang, M., Twyman, M., & Lu, S. (2024, June 23–26). ChatGPT and me: Collaborative creativity in a group brainstorming with generative AI [Conference proceeding]. ASEE Annual Conference & Exposition, Portland, Oregon.
- Kimmons, R., Draper, D., & Backman, J. (2022). PICRAT. EdTechnica. Available online: https://edtechbooks.org/encyclopedia/picrat (accessed on 10 October 2025).
- Klar, M. (2025). Using ChatGPT is easy, using it effectively is tough? A mixed methods study on K-12 students’ perceptions, interaction patterns, and support for learning with generative AI chatbots. Smart Learning Environments, 12(1), 32.
- Kumar, A., Tak, T. K., Ali, S. M. S., Haque, M., Paralkar, T. A., Kshirsagar, P. R., & Upreti, K. (2025). Predictive modeling of student learning outcomes through cognitive and emotional skill integration. International Research Journal of Multidisciplinary Scope (IRJMS), 6(1), 892–910.
- Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.
- Liao, X. F. (2024). A study of the effects of generative comment feedback on learning achievement, motivation, and self-regulated learning: A case study of middle school mathematics [Master’s thesis, Central China Normal University]. Available online: https://kns.cnki.net/kcms/detail/detail.aspx?dbname=CMFD202402&filename=1024376457.nh (accessed on 13 January 2026).
- Liu, J., Sun, D., Sun, J., Wang, J., & Yu, P. L. H. (2025). Designing a generative AI enabled learning environment for mathematics word problem solving in primary schools: Learning performance, attitudes and interaction. Computers and Education: Artificial Intelligence, 9, 100438.
- Liu, X. C., & Zhang, J. (2025). An empirical study on improving primary school students’ innovative abilities in mathematics teaching supported by generative artificial intelligence. Journal of Western Quality Education, 11(16), 100–103.
- Liu, Z., Zhao, Y., Zuo, H., & Lu, Y. (2025). Perceived satisfaction, perceived usefulness, and interactive learning environments as predictors of university students’ self-regulation in the context of GenAI-assisted learning: An empirical study in mainland China. Frontiers in Psychology, 16, 1599478.
- Luo, H., Liao, X. F., Ru, Q. Q., & Wang, Z. F. (2024). Generative AI-supported teacher comments: An empirical study based on junior high school mathematics classrooms. Journal of Educational Technology Research, 45(5), 58–65.
- Ma, W., Adesope, O. O., Nesbit, J. C., & Liu, Q. (2014). Intelligent tutoring systems and learning outcomes: A meta-analysis. Journal of Educational Psychology, 106(4), 901–918.
- Manzke, L. S., Conrad, C. D., Marchildon, P., Raisinghani, M., & Xie, R. (2025, August 14–16). Artificial intelligence in the classroom: Can GenAI teach effectively? [Conference proceeding]. AMCIS 2025 Proceedings, Montréal, QC, Canada. Available online: https://aisel.aisnet.org/amcis2025/paperathon/paperathon/2 (accessed on 4 November 2025).
- Marzano, D. (2025). Generative Artificial Intelligence (GAI) in teaching and learning processes at the K-12 level: A systematic review. Technology, Knowledge and Learning, 1–41.
- Mustapha, K. B., Yap, E. H., & Abakr, Y. A. (2024). Bard, ChatGPT and 3DGPT: A scientometric analysis of generative AI tools and assessment of implications for mechanical engineering education. Interactive Technology and Smart Education, 21(4), 588–624.
- Nakavachara, V., Potipiti, T., & Chaiwat, T. (2025). Experimenting with generative AI: Does ChatGPT really increase everyone’s productivity? (Puey Ungphakorn Institute for Economic Research Working Paper). Faculty of Economics, Chulalongkorn University.
- Noviyana, H., Rahmawati, F., Kirana, A. R., & Tanod, M. J. (2025). Enhancing elementary students’ mathematical problem-solving skills through AI-assisted problem-based learning. Journal of Integrated Elementary Education, 5(2), 254–268.
- Oh, S. (2025). Evaluating mathematical problem-solving abilities of generative AI models: Performance analysis of o1-preview and gpt-4o using the Korean College Scholastic Ability Test. IEEE Access, 13, 1227–1235.
- Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., & Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372, n71.
- Pando, M., & Leon, M. (2025). Mathematics disciplinary literacy: A case study of a bilingual teacher’s interaction with ChatGPT. Language and Education, 1–20.
- Passolunghi, M. C., De Vita, C., & Pellizzoni, S. (2020). Math anxiety and math achievement: The effects of emotional and math strategy training. Learning and Individual Differences, 79, 101868.
- Pigott, T. D., & Polanin, J. R. (2020). Methodological guidance paper: High-quality meta-analysis in a systematic review. Review of Educational Research, 90(1), 24–46.
- Polydoros, G., Galitskaya, V., Antoniou, A.-S., & Drigas, A. (2025). AI technology integration in elementary geometry and its effects on performance, anxiety levels, learning styles, cognitive styles, and executive functions. Scientific Electronic Archives, 18(2), 1–11.
- Poynton, K. (2015). Cognitive and non-cognitive learning factors: A literature review. Centre for Inspiring Minds.
- Qu, X., Sherwood, J., Liu, P., & Aleisa, N. (2025, April 26–May 1). Generative AI tools in higher education: A meta-analysis of cognitive impact [Conference proceeding]. CHI Conference on Human Factors in Computing Systems, Yokohama, Japan.
- Ramos, D. S., Chaparro, I., Padilla, J., Casallas, R., Cruz, J. C., & Reyes, L. H. (2025). Integrating generative AI with the dialogic model in education: The cognitive-AI synergy framework (CASF). Preprints.org.
- Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86(3), 638–641.
- Sammallahti, E., Finell, J., Jonsson, B., & Korhonen, J. (2023). A meta-analysis of math anxiety interventions. Journal of Numerical Cognition, 9(2), 346–362.
- Sánchez Muñoz, J. A., Flores-Eraña, G., Silva-Campos, J. M., Chavira-Quintero, R., & Olais-Govea, J. M. (2025). GenAI as a cognitive mediator: A critical-constructivist inquiry into computational thinking in pre-university education. Frontiers in Education, 10, 1597249.
- Sánchez-Ruiz, L. M., Moll-López, S., Nuñez-Pérez, A., Moraño-Fernández, J. A., & Vega-Fleitas, E. (2023). ChatGPT challenges blended learning methodologies in engineering education: A case study in mathematics. Applied Sciences, 13(10), 6039.
- Segal, R., & Klemer, A. (2025). Dialogic interactions between mathematics teachers and GenAI: Multi-environment task design and its contribution to TPACK. International Journal of Mathematical Education in Science and Technology, 1–25.
- Shrestha, R., & Yi, M. (2025). Pre-service teachers’ perceptions of adopting generative AI tools in teaching mathematics: Insights from a TPACK-based workshop. In R. Jake Cohen (Ed.), Proceedings of society for information technology & teacher education international conference (pp. 874–879). Association for the Advancement of Computing in Education (AACE). Available online: https://www.learntechlib.org/primary/p/225611/ (accessed on 21 October 2025).
- Song, Y., Kim, J., Liu, Z., Li, C., & Xing, W. (2024). Students’ perceived roles, opportunities, and challenges of a generative AI-powered teachable agent: A case of middle school math class. Journal of Research on Technology in Education, 1–19.
- Srivastava, A., Vaidya, V., Murthy, S., & Dasgupta, C. (2024). GeoSolvAR: Scaffolding spatial perspective-taking ability of middle-school students using AR-enhanced inquiry learning environment. British Journal of Educational Technology, 55(6), 2617–2638.
- Stephenson, D. E. (2022). Effectiveness of individual-level resource building interventions in the workplace: A meta-analysis [Master’s thesis, University of Canterbury].
- Sureda, P., Parra, V., Corica, A. R., Godoy, D., & Schiaffino, S. (2025). On the role of generative AI in fractals teaching: Solutions and class proposals designed by chatbots and mathematics teachers. International Journal of Education in Mathematics Science and Technology, 13(5), 1298–1316.
- Sweller, J. (2011). Cognitive load theory. In J. P. Mestre, & B. H. Ross (Eds.), Psychology of learning and motivation (Vol. 55, pp. 37–76). Academic Press.
- Utami, I. Q., Hwang, W.-Y., & Hariyanti, U. (2024). Contextualized and personalized math word problem generation in authentic contexts using generative pre-trained transformer and its influences on geometry learning. Journal of Educational Computing Research, 62(6), 1384–1419.
- Van Den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2015). Meta-analysis of multiple outcomes: A multilevel approach. Behavior Research Methods, 47(4), 1274–1294.
- Veroniki, A. A., Jackson, D., Viechtbauer, W., Bender, R., Bowden, J., Knapp, G., Reeves, B. C., Higgins, J. P. T., Thomas, J., & Ioannidis, J. P. A. (2016). Methods to estimate the between-study variance and its uncertainty in meta-analysis. Research Synthesis Methods, 7(1), 55–79.
- Wahba, F., Ajlouni, A. O., & Abumosa, M. A. (2024). The impact of ChatGPT-based learning statistics on undergraduates’ statistical reasoning and attitudes toward statistics. EURASIA Journal of Mathematics, Science and Technology Education, 20(7), em2468.
- Walkington, C. (2025). The implications of generative artificial intelligence for mathematics education. School Science and Mathematics, 125(1), 1–8.
- Walkington, C., Pando, M., Lipsmeyer, L. L., Beauchamp, T., Sager, M., & Milton, S. (2025). Middle school girls using generative AI to engage in mathematical problem-posing. Mathematical Thinking and Learning, 1–22.
- Wang, J., & Fan, W. (2025). The effect of ChatGPT on students’ learning performance, learning perception, and higher-order thinking: Insights from a meta-analysis. Humanities and Social Sciences Communications, 12(1), 621.
- Wang, K., & Guo, Z. (2025). Can learners’ use of GenAI enhance learning engagement? A meta-analysis. Education Sciences, 15(12), 1578.
- Wang, X., & Wei, Y. (2025). The influence of Gen-AI assisted learning on primary school students’ math anxiety: An intervention study. Applied Cognitive Psychology, 39(4), e70088.
- Wardat, Y., Tashtoush, M. A., AlAli, R., & Jarrah, A. M. (2023). ChatGPT: A revolutionary tool for teaching and learning mathematics. Eurasia Journal of Mathematics, Science and Technology Education, 19(7), em2286.
- Wu, B., Chang, X., & Hu, Y. (2023). A meta-analysis of the effects of spherical video-based virtual reality on cognitive and non-cognitive learning outcomes. Interactive Learning Environments, 32(7), 3472–3489.
- Wu, J., Tlili, A., Salha, S., Mizza, D., Saqr, M., López-Pernas, S., & Huang, R. (2025). Unlocking the potential of artificial intelligence in improving learning achievement in blended learning: A meta-analysis. Frontiers in Psychology, 16, 1691414.
- Wulff, P., & Kubsch, M. (2025). Learning against the machine: The double edged sword of (Gen)AI in STEM education. International Journal of STEM Education, 12(1), 66.
- Xia, Q., Zhang, P., Huang, W., & Chiu, T. K. F. (2025). The impact of generative AI on university students’ learning outcomes via Bloom’s taxonomy: A meta-analysis and pattern mining approach. Asia Pacific Journal of Education, 1–31.
- Xing, W., Song, Y., Li, C., Liu, Z., Zhu, W., & Oh, H. (2025). Development of a generative AI-powered teachable agent for middle school mathematics learning: A design-based research study. British Journal of Educational Technology, 56, 2043–2077.
- Xuan, S. H., Nguyen, A. T., Nguyen, T., Nguyen, L., Nguyen, H., Pham, N., Phung, T., Ngo, B., Nguyen, V., Nguyen, M., Tran, T., Le, T., Nguyen, K., & FNU, P. (2025). Evaluating the impact of generative AI in mathematics education: A comparative study in Vietnamese high schools. Human Behavior and Emerging Technologies, 2025, 8886206.
- Yanar, A. N., & Ergene, Ö. (2025). Integrating artificial intelligence in education: How pre-service mathematics teachers use ChatGPT for 5E lesson plan design. Journal of Pedagogical Research, 9(2), 158–176.
- Yavich, R. (2025). Improving learning outcomes in advanced mathematics for underprepared university students through AI-driven educational tools. African Educational Research Journal, 13(2), 224–239.
- Ye, X., Zhang, W., Zhou, Y., Li, X., & Zhou, Q. (2025). Improving students’ programming performance: An integrated mind mapping and generative AI chatbot learning approach. Humanities and Social Sciences Communications, 12(1), 558.
- Yi, L., Liu, D., Jiang, T., & Xian, Y. (2025). The effectiveness of AI on K-12 students’ mathematics learning: A systematic review and meta-analysis. International Journal of Science and Mathematics Education, 23(4), 1105–1126.
- Yoon, H., Hwang, J., Lee, K., Roh, K. H., & Kwon, O. N. (2024). Students’ use of generative artificial intelligence for proving mathematical statements. ZDM—Mathematics Education, 56(7), 1531–1551.
- Yu, M., Liu, Z., Long, T., Li, D., Deng, L., Kong, X., & Sun, J. (2025). Exploring cognitive presence patterns in GenAI-integrated six-hat thinking technique scaffolded discussion: An epistemic network analysis. International Journal of Educational Technology in Higher Education, 22(1), 48.
- Zhou, L. (2025). Interdisciplinary teaching quality monitoring for primary majors under “Double-High” policy: A case study of Hefei Preschool Education College. International Journal of Knowledge Management, 21(1), 1–20.
- Zhou, R., He, X., Fan, Q., Li, Y., Li, Y., Xiao, X., & Fang, J. (2025). Exploring ChatGPT-facilitated scaffolding in undergraduates’ mathematical problem solving. Journal of Computer Assisted Learning, 41, e70077.
- Zhuang, Y. (2025). Lessons from using ChatGPT in calculus: Insights from two contrasting cases. Journal of Formative Design in Learning, 9(1), 25–35. [Google Scholar] [CrossRef]


| Screening Stage | Inclusion Criteria | Exclusion Criteria | Literature Count |
|---|---|---|---|
| Initial Screening | 1. Records identified from databases (Web of Science, EBSCO, CNKI), Google Scholar, and other methods | 1. Duplicate records removed (n = 369); 2. Records removed for other reasons (n = 77) | Initial: 2104; After: 1658 |
| Title/Abstract Screening | 1. Records assessed for relevance to the research topic | 1. Records excluded as unrelated to the research topic (n = 1489) | Initial: 1658; After: 169 |
| Full-Text Eligibility Assessment | 1. Experimental/quasi-experimental studies employing GenAI as the intervention, with a control group using traditional instructional methods; 2. Complete effect size data, or data from which effect sizes can be calculated (e.g., means, standard deviations, sample sizes); 3. MERSQI score ≥ 10.5 points | 1. Not an experimental or quasi-experimental design (n = 124); 2. Incomplete data for effect size calculation (n = 22); 3. MERSQI score below 10.5 (n = 1) | Initial: 169; After: 22 |


| Dimension | Category | Description | References |
|---|---|---|---|
| Cognitive or Non-cognitive | Cognitive Skills | Higher-order cognitive skills; Lower-order cognitive skills | (Xia et al., 2025) |
| | Non-cognitive Skills | Affective, motivational, and related abilities | (Xia et al., 2025) |
| Intervention Duration | Short-term | ≤1 month | (Stephenson, 2022) |
| | Long-term | >1 month | (Stephenson, 2022) |
| Sample Size | Small | ≤100 participants | (Bernard et al., 2014) |
| | Large | >100 participants | (Bernard et al., 2014) |
| Education Level | Primary School | Primary school students | (Bartolini et al., 2025) |
| | Secondary School | Junior or senior high school students | (Bartolini et al., 2025) |
| | Tertiary | University students | (Bartolini et al., 2025) |
| Learning Content | Number & Algebra | Number or Algebra | (Yi et al., 2025) |
| | Geometry | Geometry | (Yi et al., 2025) |
| | Statistics | Data and Chance | (Yi et al., 2025) |
| | Integration | Involves two or more of the above core fields | (Yi et al., 2025) |
| Degree of GenAI Integration | CT | Creative Transformation | (Borup et al., 2022) |
| | IPA | Interactive or Passive Augmentation (a combined category for interventions fitting either or both modes) | (Borup et al., 2022) |
| Learning Mode | Independent Learning | Self-directed use of GenAI for learning | (K. Wang & Guo, 2025) |
| | Collaborative Learning | Group-based interaction with GenAI for learning | (K. Wang & Guo, 2025) |

| Model | DF | AIC | BIC | AICc | logLik | LRT | p-Value | QE |
|---|---|---|---|---|---|---|---|---|
| Full | 3 | 94.98 | 100.4 | 95.57 | −44.49 | | | 90.91 |
| Reduced | 2 | 92.98 | 96.6 | 93.27 | −44.49 | 0 | 1 | 90.91 |
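
The information criteria in the model comparison table can be reproduced from the reported log-likelihoods with the standard formulas AIC = −2·logLik + 2k, BIC = −2·logLik + k·ln(n), and AICc = AIC + 2k(k + 1)/(n − k − 1). The sketch below is only an illustration, not the authors' analysis code. It assumes that the DF column gives the number of estimated parameters k and that n = 45 (46 effect sizes minus one fixed-effect coefficient, as is usual for REML fits); neither assumption is stated in the source, although both are consistent with the reported values.

```python
import math


def information_criteria(log_lik: float, k: int, n: int) -> dict:
    """Return AIC, BIC and AICc for a model with k parameters and n observations."""
    deviance = -2.0 * log_lik
    aic = deviance + 2 * k
    bic = deviance + k * math.log(n)
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    return {"AIC": aic, "BIC": bic, "AICc": aicc}


# Assumption (not stated in the source): effective n = 46 effect sizes - 1 = 45.
N_EFFECTIVE = 45

# logLik and parameter counts as reported in the table.
full = information_criteria(log_lik=-44.49, k=3, n=N_EFFECTIVE)
reduced = information_criteria(log_lik=-44.49, k=2, n=N_EFFECTIVE)

# Likelihood-ratio statistic: twice the difference in log-likelihoods.
lrt = 2.0 * (-44.49 - (-44.49))

for name, criteria in (("Full", full), ("Reduced", reduced)):
    print(name, {key: round(value, 2) for key, value in criteria.items()})
print("LRT", lrt)
```

Because both models reach the same log-likelihood, the likelihood-ratio statistic is zero and the extra parameter of the full model only adds to the penalty terms, which is why the reduced model has the lower AIC, BIC, and AICc.
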

| Method | Estimate | SE | 95% CI | τ² | ST | I² (%) | H² | Q | p |
|---|---|---|---|---|---|---|---|---|---|
| DL | 0.535 | 0.098 | [0.343, 0.728] | 0.2142 | 0.463 | 50.5 | 2.02 | 90.91 | 0.001 |
| REML | 0.534 | 0.097 | [0.345, 0.723] | 0.2013 | 0.449 | 48.9 | 1.96 | 90.91 | 0.001 |
| SJ | 0.547 | 0.117 | [0.318, 0.776] | 0.3908 | 0.391 | 65.1 | 2.86 | 90.91 | 0.001 |
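
The DL row of this table can be checked against the usual Q-based definitions of heterogeneity, H² = Q/df and I² = max(0, (Q − df)/Q). The following sketch is a hedged illustration rather than the authors' procedure. It assumes k = 46 effect sizes (the overall n reported in the results tables) and therefore df = 45. The REML and SJ rows rely on their own τ² estimates together with the studies' sampling variances, which are not available here, so they are not recomputed.

```python
def heterogeneity_from_q(q: float, k: int) -> dict:
    """Q-based heterogeneity statistics for k effect sizes (df = k - 1)."""
    df = k - 1
    h2 = q / df                              # H^2: ratio of observed Q to its expectation
    i2 = max(0.0, (q - df) / q) * 100.0      # I^2 in percent, truncated at zero
    return {"df": df, "H2": h2, "I2": i2}


# Q as reported in the table; k = 46 is an assumption taken from the overall n.
stats = heterogeneity_from_q(q=90.91, k=46)
print({key: round(value, 2) for key, value in stats.items()})

# For every row, H^2 and I^2 should satisfy H^2 = 1 / (1 - I^2).
reported_i2 = {"DL": 50.5, "REML": 48.9, "SJ": 65.1}
implied_h2 = {name: 1.0 / (1.0 - i2 / 100.0) for name, i2 in reported_i2.items()}
print({name: round(value, 2) for name, value in implied_h2.items()})
```

The implied H² values agree with the reported ones up to rounding of I², which suggests the three rows are internally consistent on these two columns.
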

| Outcome Variables | n | g | 95% CI | Q | p |
|---|---|---|---|---|---|
| Overall | 46 | 0.534 *** | [0.345, 0.723] | | |
| Cognitive | 38 | 0.596 *** | [0.367, 0.824] | 1.355 | 0.2443 |
| Non-cognitive | 8 | 0.299 | [−0.003, 0.601] | | |

| Outcome Variables | n | g | 95% CI | Q | p |
|---|---|---|---|---|---|
| Cognitive-high | 8 | 0.718 *** | [0.344, 1.092] | 1.735 | 0.42 |
| Cognitive-low | 30 | 0.569 *** | [0.298, 0.840] | | |

| Variables | Moderator | Level | n | g | 95% CI | Q | p |
|---|---|---|---|---|---|---|---|
| Intervention settings | Intervention Duration | Long | 25 | 0.376 *** | [0.172, 0.579] | 3.330 | 0.1892 |
| | | Short | 14 | 0.735 *** | [0.468, 1.002] | | |
| | | N/A | 7 | 0.672 | [−0.466, 1.810] | | |
| | Sample Size | Small | 23 | 0.832 *** | [0.470, 1.193] | 7.501 | 0.0062 |
| | | Large | 23 | 0.336 *** | [0.151, 0.522] | | |
| Educational context | Learning Content | Integration | 21 | 0.256 ** | [0.081, 0.431] | 10.750 | 0.0131 |
| | | Geometry | 8 | 0.906 ** | [0.366, 1.446] | | |
| | | Number & Algebra | 12 | 0.784 *** | [0.469, 1.098] | | |
| | | Statistics | 5 | 0.775 | [−0.742, 2.293] | | |
| | Grade Level | Primary School | 10 | 0.754 ** | [0.196, 1.313] | 3.811 | 0.1487 |
| | | Secondary School | 20 | 0.313 ** | [0.105, 0.520] | | |
| | | Tertiary | 16 | 0.667 *** | [0.285, 1.049] | | |
| GenAI application features | Integration Degree | CT | 6 | 1.164 *** | [0.656, 1.673] | 6.624 | 0.0101 |
| | | IPA | 40 | 0.443 *** | [0.252, 0.634] | | |
| | Learning Mode | Independent Learning | 29 | 0.592 *** | [0.328, 0.856] | 7.372 | 0.0251 |
| | | Collaborative Learning | 6 | 1.008 *** | [0.522, 1.494] | | |

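
The p-values attached to the between-group Q statistics follow from the upper tail of a chi-square distribution. As a rough check, the sketch below evaluates that tail with closed-form expressions for integer degrees of freedom, using only the Python standard library. The function name `q_test_p_value` is hypothetical, and the degrees of freedom are an assumption (number of listed levels minus one); the source does not state them. Only moderators whose listed levels match that assumption are included.

```python
import math


def q_test_p_value(q: float, df: int) -> float:
    """Upper-tail probability of a chi-square(df) variable at q, for integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    y = q / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{j=0}^{df/2 - 1} y^j / j!
        terms = sum(y ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-y) * terms
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{j=1}^{(df-1)/2} y^(j - 1/2) / Gamma(j + 1/2)
    terms = sum(y ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * terms


# (moderator, reported Q, assumed df = listed levels - 1, reported p)
reported = [
    ("Intervention Duration", 3.330, 2, 0.1892),
    ("Sample Size", 7.501, 1, 0.0062),
    ("Grade Level", 3.811, 2, 0.1487),
    ("Integration Degree", 6.624, 1, 0.0101),
]

for moderator, q, df, p_reported in reported:
    p = q_test_p_value(q, df)
    print(f"{moderator}: recomputed p = {p:.4f}, reported p = {p_reported}")
```

Under these assumptions the recomputed values agree with the reported p-values to about four decimal places.
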
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Liu, B.; Zhang, W.; Wang, F. Can Generative Artificial Intelligence Effectively Enhance Students’ Mathematics Learning Outcomes?—A Meta-Analysis of Empirical Studies from 2023 to 2025. Educ. Sci. 2026, 16, 140. https://doi.org/10.3390/educsci16010140

