Educational Measurement with Emerging Technologies: A Systematic Review Through Evidentiary Lens on Granularity and Constructing Measures Theory
Abstract
1. Introduction
2. Theoretical Framework
2.1. Measurement as Evidentiary Process
2.2. Four Building Blocks Theory in Constructing Measures
2.3. Measurement Granularity
3. Methods
3.1. Database Search Strategy
3.2. Eligibility Criteria
3.3. Study Selection and Screening
3.4. Data Extraction and Coding
4. Results and Discussion
4.1. Descriptive Overview
4.1.1. Descriptive Overview of Demographic Information
4.1.2. Descriptive Overview of Analytical Coding
4.2. Emerging Technologies Across Grain Sizes and Building Blocks
4.2.1. Emerging Technologies Across Grain Sizes
4.2.2. Emerging Technologies Across Building Blocks Within Each Grain Size
5. Critical Reflections on Emerging Technologies-Enabled Educational Measurement
5.1. Construct Meaning and Validity Drift
5.2. Robustness and Generalizability
5.3. Fairness and Transparency
5.4. Privacy and Governance
6. Implications and Future Direction
6.1. For Researchers
6.2. For System Designers
6.3. For Educators and Practitioners
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Abdelhalim, S. M., & Alsehibany, R. A. (2025). Integrating AI-powered tools in EFL pronunciation instruction: Effects on accuracy and L2 motivation. Computer Assisted Language Learning, 1–25. [Google Scholar] [CrossRef]
- Abdi, S., Khosravi, H., Sadiq, S., & Gasevic, D. (2019). A multivariate Elo-based learner model for adaptive educational systems. arXiv, arXiv:1910.12581. [Google Scholar] [CrossRef]
- Alfredo, R., Echeverria, V., Zhao, L., Lawrence, L., Fan, J. X., Yan, L., Li, X., Swiecki, Z., Gašević, D., & Martinez-Maldonado, R. (2024). Designing a human-centred learning analytics dashboard in-use. Journal of Learning Analytics, 11(3), 62–81. [Google Scholar] [CrossRef]
- Al Hakim, V. G., Yang, S. H., Liyanawatta, M., Wang, J. H., & Chen, G. D. (2022). Robots in situated learning classrooms with immediate feedback mechanisms to improve students’ learning performance. Computers & Education, 182, 104483. [Google Scholar] [CrossRef]
- AlJarrah, A., Thomas, M. K., & Shehab, M. (2018). Investigating temporal access in a flipped classroom: Procrastination persists. International Journal of Educational Technology in Higher Education, 15(1), 1. [Google Scholar] [CrossRef]
- Alvarez-Garcia, M., Arenas-Parra, M., & Ibar-Alonso, R. (2024). Uncovering student profiles. An explainable cluster analysis approach to PISA 2022. Computers & Education, 223, 105166. [Google Scholar] [CrossRef]
- American Educational Research Association, American Psychological Association & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. American Educational Research Association. [Google Scholar]
- Baker, R. S., & Siemens, G. (2014). Educational data mining and learning analytics. In R. K. Sawyer (Ed.), The Cambridge handbook of the learning sciences (2nd ed.). Cambridge University Press. [Google Scholar] [CrossRef]
- Baral, S., Botelho, A. F., Erickson, J. A., Benachamardi, P., & Heffernan, N. T. (2021). Improving automated scoring of student open responses in mathematics. In 14th international conference on educational data mining (EDM 2021) (pp. 130–138). International Educational Data Mining Society. Available online: https://eric.ed.gov/?id=ED615565 (accessed on 22 November 2025).
- Bennett, R. E. (2015). The changing nature of educational assessment. Review of Research in Education, 39(1), 370–407. [Google Scholar] [CrossRef]
- Bertolini, R., Finch, S. J., & Nehm, R. H. (2021). Testing the impact of novel assessment sources and machine learning methods on predictive outcome modeling in undergraduate biology. Journal of Science Education and Technology, 30(2), 193–209. [Google Scholar] [CrossRef]
- Bilal, M., Omar, M., Anwar, W., Bokhari, R. H., & Choi, G. S. (2025). Bridging the gap: From traditional admissions to data-driven insights for predicting and supporting undergraduate performance. Education and Information Technologies, 30(18), 27085–27110. [Google Scholar] [CrossRef]
- Black, P., Wilson, M., & Yao, S. Y. (2011). Road maps for learning: A guide to the navigation of learning progressions. Measurement: Interdisciplinary Research & Perspective, 9(2–3), 71–123. [Google Scholar] [CrossRef]
- Bonami, B., Piazentini, L., & Dala-Possa, A. (2020). Education, big data and artificial intelligence: Mixed methods in digital platforms. Comunicar, 65, 43–52. [Google Scholar] [CrossRef]
- Borchers, C., Fleischer, H., Schanze, S., Scheiter, K., & Aleven, V. (2025). High scaffolding of an unfamiliar strategy improves conceptual learning but reduces enjoyment compared to low scaffolding and strategy freedom. Computers & Education, 236, 105364. [Google Scholar] [CrossRef]
- Bowen, N. E., & Todd, R. W. (2025). Enhancing ChatGPT-based writing research through effective prompt use. Teaching English with Technology, 25(1), 26–40. [Google Scholar] [CrossRef]
- Bulathwela, S., Verma, M., Pérez-Ortiz, M., Yilmaz, E., & Shawe-Taylor, J. (2022). Can population-based engagement improve personalisation? A novel dataset and experiments. arXiv, arXiv:2207.01504. [Google Scholar] [CrossRef]
- Bulut, O., Gorgun, G., & Yildirim-Erbasli, S. N. (2025). The impact of frequency and stakes of formative assessment on student achievement in higher education: A learning analytics study. Journal of Computer Assisted Learning, 41(1), e13087. [Google Scholar] [CrossRef]
- Butterfuss, R., Roscoe, R. D., Allen, L. K., McCarthy, K. S., & McNamara, D. S. (2022). Strategy uptake in writing pal: Adaptive feedback and instruction. Journal of Educational Computing Research, 60(3), 696–721. [Google Scholar] [CrossRef]
- Cabı, E., & Türkoğlu, H. (2025). The impact of a learning analytics based feedback system on students’ academic achievement and self-regulated learning in a flipped classroom. International Review of Research in Open and Distributed Learning, 26(1), 175–196. [Google Scholar] [CrossRef]
- Cai, Z., Graesser, A. C., Windsor, L., Cheng, Q., Shaffer, D. W., & Hu, X. (2018). Impact of corpus size and dimensionality of LSA spaces from Wikipedia articles on AutoTutor answer evaluation. Journal of Educational Data Mining. Available online: https://par.nsf.gov/biblio/10098439 (accessed on 22 November 2025).
- Campbell, J. L., Quincy, C., Osserman, J., & Pedersen, O. K. (2013). Coding in-depth semistructured interviews: Problems of unitization and intercoder reliability and agreement. Sociological Methods & Research, 42(3), 294–320. [Google Scholar] [CrossRef]
- Cerratto Pargman, T., & McGrath, C. (2021). Mapping the ethics of learning analytics in higher education: A systematic literature review of empirical research. Journal of Learning Analytics, 8(2), 123–139. [Google Scholar] [CrossRef]
- Charleer, S., Vande Moere, A., Klerkx, J., Verbert, K., & De Laet, T. (2017). Learning analytics dashboards to support adviser-student dialogue. IEEE Transactions on Learning Technologies, 11(3), 389–399. [Google Scholar] [CrossRef]
- Chejara, P., Kasepalu, R., Prieto, L. P., Rodríguez-Triana, M. J., Ruiz Calleja, A., & Schneider, B. (2023). How well do collaboration quality estimation models generalize across authentic school contexts? British Journal of Educational Technology, 55(4), 1602–1624. [Google Scholar] [CrossRef]
- Chen, B., Bao, L., Zhang, R., Zhang, J., Liu, F., Wang, S., & Li, M. (2024). A multi-strategy computer-assisted EFL writing learning system with deep learning incorporated and its effects on learning: A writing feedback perspective. Journal of Educational Computing Research, 61(8), 1596–1638. [Google Scholar] [CrossRef]
- Chen, D., Jeng, A., Sun, S., & Kaptur, B. (2023). Use of technology-based assessments: A systematic review covering over 30 countries. Assessment in Education: Principles, Policy & Practice, 30(5–6), 396–428. [Google Scholar] [CrossRef]
- Chen, J. (2024). Effects of learning analytics-based feedback on students’ self-regulated learning and academic achievement in a blended EFL course. System, 124, 103388. [Google Scholar] [CrossRef]
- Chen, S. Y., & Yeh, C. C. (2017). The effects of cognitive styles on the use of hints in academic English: A learning analytics approach. Journal of Educational Technology & Society, 20(2), 251–264. [Google Scholar]
- Chen, Y., Li, J., Liu, Y., Jiang, F., Zhou, A., & Li, Y. (2025). Mining the patterns of teachers’ nonverbal behavior: Automated recognition and systematic exploration. Journal of Educational Computing Research, 63(7–8), 1583–1617. [Google Scholar] [CrossRef]
- Cheung, K. C., Sit, P. S., Zheng, J. Q., Lam, C. C., Mak, S. K., & Ieong, M. K. (2024). A machine-learning model of academic resilience in the times of the COVID-19 pandemic: Evidence drawn from 79 countries/economies in the PISA 2022 mathematics study. British Journal of Educational Psychology, 94(4), 1224–1244. [Google Scholar] [CrossRef]
- Cohen, J., Anglin, K., & Wiseman, E. (2024a). Tailoring teacher supports: A mixed-methods analysis of responses to coaching and self-reflection. AERA Open, 10, 23328584241289876. [Google Scholar] [CrossRef]
- Cohen, J., Wong, V. C., Krishnamachari, A., & Erickson, S. (2024b). Experimental evidence on the robustness of coaching supports in teacher education. Educational Researcher, 53(1), 19–35. [Google Scholar] [CrossRef]
- Cohn, C., Snyder, C., Fonteles, J. H., TS, A., Montenegro, J., & Biswas, G. (2025). A multimodal approach to support teacher, researcher and AI collaboration in STEM+ C learning environments. British Journal of Educational Technology, 56(2), 595–620. [Google Scholar] [CrossRef]
- Costa-Mendes, R., Oliveira, T., Castelli, M., & Cruz-Jesus, F. (2021). A machine learning approximation of the 2015 Portuguese high school student grades: A hybrid approach. Education and Information Technologies, 26(2), 1527–1547. [Google Scholar] [CrossRef]
- Cukurova, M., Khan-Galaria, M., Millán, E., & Luckin, R. (2022). A learning analytics approach to monitoring the quality of online one-to-one tutoring. Journal of Learning Analytics, 9(2), 105–120. [Google Scholar] [CrossRef]
- Çakiroğlu, Ü., & Kahyar, S. (2022). Modelling online community constructs through interaction data: A learning analytics based approach. Education and Information Technologies, 27(6), 8311–8328. [Google Scholar] [CrossRef]
- Daly, P., & Deglaire, E. (2025). AI-enabled correction: A professor’s journey. Innovations in Education and Teaching International, 62(4), 1241–1257. [Google Scholar] [CrossRef]
- Dannath, J., Deriyeva, A., & Paaßen, B. (2025). What is a step? A user study on how to sub-divide the solution process of introductory python tasks. In C. Mills, G. Alexandron, D. Taibi, G. Lo Bosco, & L. Paquette (Eds.), 18th international conference on educational data mining (EDM 2025) (pp. 533–540). International Educational Data Mining Society. Available online: https://eric.ed.gov/?id=ED675667 (accessed on 22 November 2025).
- Darvishi, A., Khosravi, H., Sadiq, S., & Gašević, D. (2022). Incorporating AI and learning analytics to build trustworthy peer assessment systems. British Journal of Educational Technology, 53(4), 844–875. [Google Scholar] [CrossRef]
- de Barros Camargo, C., & Hernández Fernández, A. (2024). Neuropedagogy and neuroimaging of artificial intelligence and deep learning. Educational Process: International Journal, 13(3), 97–115. [Google Scholar] [CrossRef]
- Delgado, A. J., Wardlow, L., McKnight, K., & O’Malley, K. (2015). Educational technology: A review of the integration, resources, and effectiveness of technology in K–12 classrooms. Journal of Information Technology Education: Research, 14, 397–416. [Google Scholar] [CrossRef] [PubMed]
- DiSabito, D., Hansen, L., Mennella, T., & Rodriguez, J. (2025). Exploring the frontiers of generative AI in assessment: Is there potential for a human-AI partnership? New Directions for Teaching and Learning, 2025(182), 81–96. [Google Scholar] [CrossRef]
- Divasón, J., Martínez-de-Pisón, F. J., Romero, A., & Sáenz-de-Cabezón, E. (2023). Artificial intelligence models for assessing the evaluation process of complex student projects. IEEE Transactions on Learning Technologies, 16(5), 694–707. [Google Scholar] [CrossRef]
- Divjak, B., Svetec, B., Horvat, D., & Kadoić, N. (2023). Assessment validity and learning analytics as prerequisites for ensuring student-centred learning design. British Journal of Educational Technology, 54(1), 313–334. [Google Scholar] [CrossRef]
- Doleck, T., Lemay, D. J., Basnet, R. B., & Bazelais, P. (2020). Predictive analytics in education: A comparison of deep learning frameworks. Education and Information Technologies, 25(3), 1951–1963. [Google Scholar] [CrossRef]
- Donthu, N., Kumar, S., Mukherjee, D., Pandey, N., & Lim, W. M. (2021). How to conduct a bibliometric analysis: An overview and guidelines. Journal of Business Research, 133, 285–296. [Google Scholar] [CrossRef]
- Dosaru, D. F., Simion, D. M., Ignat, A. H., Negreanu, L. C., & Olteanu, A. C. (2025). Using GenAI to assess design patterns in student written code. IEEE Transactions on Learning Technologies, 18, 869–876. [Google Scholar] [CrossRef]
- Drinkwater Gregg, K., Ryan, O., Katz, A., Huerta, M., & Sajadi, S. (2025). Expanding possibilities for generative AI in qualitative analysis: Fostering student feedback literacy through the application of a feedback quality rubric. Journal of Engineering Education, 114(3), e70024. [Google Scholar] [CrossRef]
- Firetto, C. M., Murphy, P. K., Starrett, E., Herman, E. A., Greene, J. A., Tang, Y., & Yan, L. (2025). Investigating grade-level and text genre effects in quality talk discussions: An AI-powered discourse analysis of upper primary students’ high-level comprehension. Learning and Instruction, 100, 102208. [Google Scholar] [CrossRef]
- Flodén, J. (2025). Grading exams using large language models: A comparison between human and AI grading of exams in higher education using ChatGPT. British Educational Research Journal, 51(1), 201–224. [Google Scholar] [CrossRef]
- Forkan, A. R. M., Kang, Y.-B., Jayaraman, P. P., Du, H., Thomson, S., Kollias, E., & Wieland, N. (2023). VideoDL: Video-based digital learning framework using AI question generation and answer assessment. International Journal of Advanced Corporate Learning, 16(1), 19–27. [Google Scholar] [CrossRef]
- Frick, T. W., Myers, R. D., & Dagli, C. (2022). Analysis of patterns in time for evaluating effectiveness of first principles of instruction. Educational Technology Research and Development, 70(1), 1–29. [Google Scholar] [CrossRef]
- Fuller, R., Goddard, V. C. T., Nadarajah, V. D., Treasure-Jones, T., Yeates, P., Scott, K., Webb, A., Valter, K., & Pyörälä, E. (2022). Technology enhanced assessment: Ottawa consensus statement and recommendations. Medical Teacher, 44(8), 836–850. [Google Scholar] [CrossRef] [PubMed]
- Gardner, J., & Brooks, C. (2018). Evaluating predictive models of student success: Closing the methodological gap. arXiv, arXiv:1801.08494. [Google Scholar] [CrossRef]
- Gašević, D., Dawson, S., Rogers, T., & Gasevic, D. (2016). Learning analytics should not promote one size fits all: The effects of instructional conditions in predicting academic success. The Internet and Higher Education, 28, 68–84. [Google Scholar] [CrossRef]
- Geckin, V., Kızıltaş, E., & Çınar, Ç. (2023). Assessing second-language academic writing: AI vs. Human raters. Journal of Educational Technology and Online Learning, 6(4), 1096–1108. [Google Scholar] [CrossRef]
- Gelan, A., Fastré, G., Verjans, M., Martin, N., Janssenswillen, G., Creemers, M., Lieben, J., Depaire, B., & Thomas, M. (2018). Affordances and limitations of learning analytics for computer-assisted language learning: A case study of the VITAL project. Computer Assisted Language Learning, 31(3), 294–319. [Google Scholar] [CrossRef]
- Gray, C. C., & Perkins, D. (2019). Utilizing early engagement and machine learning to predict student outcomes. Computers & Education, 131, 22–32. [Google Scholar] [CrossRef]
- Greller, W., & Drachsler, H. (2012). Translating learning into numbers: A generic framework for learning analytics. Journal of Educational Technology & Society, 15(3), 42–57. [Google Scholar]
- Guevara-Flores, K. F., Hernández-Calderón, J. G., & Soto-Mendoza, V. (2023, November 25–26). Enhancing English proficiency test evaluation: Leveraging artificial intelligence for result classification. 2023 10th International Conference on Soft Computing & Machine Intelligence (ISCMI) (pp. 183–187), Mexico City, Mexico. [Google Scholar] [CrossRef]
- Guo, D. (2025, June 6–7). An enhanced evaluation of English teaching quality based on explainable artificial intelligence techniques. 2025 International Conference on Intelligent Computing and Knowledge Extraction (ICICKE) (pp. 1–6), Bengaluru, India. [Google Scholar] [CrossRef]
- Gupta, S., & Sabitha, A. S. (2019). Deciphering the attributes of student retention in massive open online courses using data mining techniques. Education and Information Technologies, 24(3), 1973–1994. [Google Scholar] [CrossRef]
- Hakimi, L., Eynon, R., & Murphy, V. A. (2021). The ethics of using digital trace data in education: A thematic review of the research landscape. Review of Educational Research, 91(5), 671–717. [Google Scholar] [CrossRef]
- Han, F., & Ellis, R. (2020a). Combining self-reported and observational measures to assess university student academic performance in blended course designs. Australasian Journal of Educational Technology, 36(6), 1–14. [Google Scholar] [CrossRef]
- Han, F., & Ellis, R. (2020b). Personalised learning networks in the university blended learning context. Comunicar, 28(62), 19–30. [Google Scholar] [CrossRef]
- Hansel, C. A., Ottenbreit-Leftwich, A., Quick, J. D., Greene, A. H., & Ricci, M. (2024). Gradescope in large lecture classrooms: A case study at Indiana university: How an online grading platform enhanced student learning and instructor feedback in large-scale courses. Journal of Teaching and Learning with Technology, 13(1), 33–48. [Google Scholar] [CrossRef]
- Hao, J., Gan, J., & Zhu, L. (2022). MOOC performance prediction and personal performance improvement via Bayesian network. Education and Information Technologies, 27(5), 7303–7326. [Google Scholar] [CrossRef]
- Harindranathan, P., & Folkestad, J. (2019). Learning analytics to inform the learning design: Supporting instructors’ inquiry into student learning in unsupervised technology-enhanced platforms. Online Learning, 23(3), 34–55. [Google Scholar] [CrossRef]
- Heil, J., & Ifenthaler, D. (2023). Online assessment in higher education: A systematic review. Online Learning, 27(1), 187–218. [Google Scholar] [CrossRef]
- Henríquez, V., Guerra, J., & Scheihing, E. (2024). The impact of an academic counselling learning analytics tool: Evidence from 3 years of use. British Journal of Educational Technology, 55(5), 1884–1899. [Google Scholar] [CrossRef]
- Herodotou, C., Rienties, B., Boroowa, A., Zdrahal, Z., & Hlosta, M. (2019). A large-scale implementation of predictive learning analytics in higher education: The teachers’ role and perspective. Educational Technology Research and Development, 67(5), 1273–1306. [Google Scholar] [CrossRef]
- Hershberger, P. J., Pei, Y., Bricker, D. A., Crawford, T. N., Shivakumar, A., Castle, A., Conway, K., Medaramitta, R., Rechtin, M., & Wilson, J. F. (2024). Motivational interviewing skills practice enhanced with artificial intelligence: ReadMI. BMC Medical Education, 24, 237. [Google Scholar] [CrossRef]
- Hershkovitz, A., Tabach, M., & Cohen, A. (2022). Online activity and achievements in elementary school mathematics: A large-scale exploration. Journal of Educational Computing Research, 60(1), 258–278. [Google Scholar] [CrossRef]
- Hilliger, I., Aguirre, C., Miranda, C., Celis, S., & Pérez-Sanagustín, M. (2022). Lessons learned from designing a curriculum analytics tool for improving student learning and program quality. Journal of Computing in Higher Education, 34(3), 633–657. [Google Scholar] [CrossRef]
- Hirschi, K., Kang, O., Yang, M., Hansen, J. H. L., & Beloin, K. (2025). Artificial intelligence-generated feedback for second language intelligibility: An exploratory intervention study on effects and perceptions. Language Learning, 75(S1), 204–241. [Google Scholar] [CrossRef]
- Holmes, W., Bialik, M., & Fadel, C. (2019). Artificial intelligence in education: Promises and implications for teaching and learning. Center for Curriculum Redesign. [Google Scholar]
- Horikoshi, I., Noguchi, M., & Tamura, Y. (2016). Evaluation of learning unit design with use of page flip information analysis. International Association for Development of the Information Society. Available online: https://eric.ed.gov/?id=ED571426 (accessed on 28 November 2025).
- Hou, R., Bühler, B., Fütterer, T., Bozkir, E., Gerjets, P., Trautwein, U., & Kasneci, E. (2025). Multimodal assessment of classroom discourse quality: A text-centered attention-based multi-task learning approach. arXiv, arXiv:2505.07902. [Google Scholar] [CrossRef]
- Hsieh, H. F., & Shannon, S. E. (2005). Three approaches to qualitative content analysis. Qualitative Health Research, 15(9), 1277–1288. [Google Scholar] [CrossRef]
- Ifenthaler, D., & Yau, J. Y. K. (2020). Utilising learning analytics to support study success in higher education: A systematic review. Educational Technology Research and Development, 68(4), 1961–1990. [Google Scholar] [CrossRef]
- International Organization for Standardization [ISO]. (1995). Guide to the expression of uncertainty in measurement (GUM). International Organization for Standardization. [Google Scholar]
- Joseph, B., & Abraham, S. (2023). Identifying slow learners in an e-learning environment using k-means clustering approach. Knowledge Management & E-Learning, 15(4), 539–553. [Google Scholar] [CrossRef]
- Jovanović, J., Saqr, M., Joksimović, S., & Gašević, D. (2021). Students matter the most in learning analytics: The effects of internal and instructional conditions in predicting academic success. Computers & Education, 172, 104251. [Google Scholar] [CrossRef]
- Khosravi, H., Buckingham Shum, S., Chen, G., Conati, C., Tsai, Y.-S., Kay, J., Knight, S., Martinez-Maldonado, R., Sadiq, S., & Gašević, D. (2022). Explainable artificial intelligence in education. Computers and Education: Artificial Intelligence, 3, 100074. [Google Scholar] [CrossRef]
- Kim, D., Park, Y., Yoon, M., & Jo, I. H. (2016). Toward evidence-based learning analytics: Using proxy variables to improve asynchronous online discussion environments. The Internet and Higher Education, 30, 30–43. [Google Scholar] [CrossRef]
- Kivimäki, V., Pesonen, J., Romanoff, J., Remes, H., & Ihantola, P. (2019). Curricular concept maps as structured learning diaries: Collecting data on self-regulated learning and conceptual thinking for learning analytics applications. Journal of Learning Analytics, 6(3), 106–121. [Google Scholar] [CrossRef]
- Klang, E., Portugez, S., Gross, R., Kassif Lerner, R., Brenner, A., Gilboa, M., Ortal, T., Ron, S., Robinzon, V., Meiri, H., & Segal, G. (2023). Advantages and pitfalls in utilizing artificial intelligence for crafting medical examinations: A medical education pilot study with GPT-4. BMC Medical Education, 23, 772. [Google Scholar] [CrossRef] [PubMed]
- Kokoç, M. (2019). Flexibility in e-learning: Modelling its relation to behavioural engagement and academic performance. Themes in eLearning, 12(12), 1–16. [Google Scholar]
- Kong, X., Liu, Z., Chen, C., Liu, S., Xu, Z., & Tang, Q. (2025). Exploratory study of an AI-supported discussion representational tool for online collaborative learning in a Chinese university. The Internet and Higher Education, 64, 100973. [Google Scholar] [CrossRef]
- Koraishi, O. (2024). The intersection of AI and language assessment: A study on the reliability of ChatGPT in grading IELTS writing task 2. Language Teaching Research Quarterly, 43, 22–42. [Google Scholar] [CrossRef]
- Kortemeyer, G., Nöhl, J., & Onishchuk, D. (2024). Grading assistance for a handwritten thermodynamics exam using artificial intelligence: An exploratory study. Physical Review Physics Education Research, 20(2), 020144. [Google Scholar] [CrossRef]
- Krippendorff, K. (2004). Reliability in content analysis: Some common misconceptions and recommendations. Human Communication Research, 30(3), 411–433. [Google Scholar] [CrossRef]
- Lai, J. W. M., & Bower, M. (2019). How is the use of technology in education evaluated? A systematic review. Computers & Education, 133, 27–42. [Google Scholar] [CrossRef]
- Lai, J. W. M., & Bower, M. (2020). Evaluation of technology use in education: Findings from a critical analysis of systematic literature reviews. Journal of Computer Assisted Learning, 36(3), 241–259. [Google Scholar] [CrossRef]
- Lan, H. (2025, June 6–7). Quality evaluation of talent cultivation in higher vocational education based on artificial intelligence algorithms. 2025 International Conference on Intelligent Computing and Knowledge Extraction (ICICKE) (pp. 1–7), Bengaluru, India. [Google Scholar] [CrossRef]
- Leavy, A., Dick, L., Meletiou-Mavrotheris, M., Paparistodemou, E., & Stylianou, E. (2023). The prevalence and use of emerging technologies in STEAM education: A systematic review of the literature. Journal of Computer Assisted Learning, 39(4), 1061–1082. [Google Scholar] [CrossRef]
- Lee, J., Soleimani, F., Irish, I., Hosmer, J., IV, Yilmaz Soylu, M., Finkelberg, R., & Chatterjee, S. (2022). Predicting cognitive presence in at-scale online learning: MOOC and for-credit online course environments. Online Learning, 26(1), 58–79. [Google Scholar] [CrossRef]
- Lehrer, R. (2021). Accountable assessment [Keynote presentation]. In Research conference 2021: Excellent progress for every student: Proceedings and program. Australian Council for Educational Research. [Google Scholar] [CrossRef]
- Li, H., Xing, W., Li, C., Zhu, W., & Woodhead, S. (2025a). Integrating option tracing into knowledge tracing: Enhancing learning analytics for mathematics multiple-choice questions. Journal of Learning Analytics, 12(1), 322–337. [Google Scholar] [CrossRef]
- Li, H., Xing, W., Zhu, W., Zhang, S., & Liu, Z. (2025b). Should educational AI models include gender attribute? explaining the why based on environmental psychology course with gender imbalance. Journal of Computing in Higher Education, 37(4), 1371–1412. [Google Scholar] [CrossRef]
- Li, R., Liu, Y., & Gao, N. (2025a, June 13–16). On AI assisted formative assessment of blended teaching model: Taking “cultivation of ethics and fundamentals of law” course as an example. 2025 International Conference on Distance Education and Learning (ICDEL) (pp. 166–170), Kunming, China. [Google Scholar] [CrossRef]
- Li, R., Liu, Y., & Gao, N. (2025b, May 14–16). On the effectiveness of formative assessment method assisted by artificial intelligence in college education: Taking “cultivation of ethics and fundamentals of law” course as an example. 2025 5th International Conference on Artificial Intelligence and Education (ICAIE) (pp. 670–674), Suzhou, China. [Google Scholar] [CrossRef]
- Lim, T., Gottipati, S., Cheong, M., Ng, J. W., & Pang, C. (2023). Analytics-enabled authentic assessment design approach for digital education. Education and Information Technologies, 28(7), 9025–9048. [Google Scholar] [CrossRef]
- Lin, C. J., & Hwang, G. J. (2025). Artificial intelligence-supported procedural scaffolding for promoting EFL learners’ writing performance in flipped peer assessment activities. Interactive Learning Environments, 1–15. [Google Scholar] [CrossRef]
- Lin, J., Singh, S., Sha, L., Tan, W., Lang, D., Gašević, D., & Chen, G. (2022). Is it a good move? Mining effective tutoring strategies from human–human tutorial dialogues. Future Generation Computer Systems, 127, 194–207. [Google Scholar] [CrossRef]
- Lin, J. J. (2025). AI-assisted evaluation of problem-solving performance using eye movement and handwriting. Journal of Research on Technology in Education, 57(5), 1019–1043. [Google Scholar] [CrossRef]
- Link, S., Redmon, R., Shamsi, Y., & Hagan, M. (2024). Generating genre-based automatic feedback on English for research publication purposes. CALICO Journal, 41(3), 319–346. [Google Scholar] [CrossRef]
- Liu, C., Feng, Y., & Wang, Y. (2022). An innovative evaluation method for undergraduate education: An approach based on BP neural network and stress testing. Studies in Higher Education, 47(1), 212–228. [Google Scholar] [CrossRef]
- Lokkila, E., Christopoulos, A., & Laakso, M. J. (2023). A data-driven approach to compare the syntactic difficulty of programming languages. Journal of Information Systems Education, 34(1), 84–93. [Google Scholar]
- Lombard, M., Snyder-Duch, J., & Bracken, C. C. (2002). Content analysis in mass communication: Assessment and reporting of intercoder reliability. Human Communication Research, 28(4), 587–604. [Google Scholar] [CrossRef]
- Long, P., & Siemens, G. (2014). Penetrating the fog: Analytics in learning and education. Italian Journal of Educational Technology, 22(3), 132–137. [Google Scholar]
- Ma, X., Pan, W., & Yu, X. N. (2025). Evaluating AI-generated examination papers in periodontology: A comparative study with human-designed counterparts. BMC Medical Education, 25(1), 1099. [Google Scholar] [CrossRef]
- Macarini, L. A., Lemos dos Santos, H., Cechinel, C., Ochoa, X., Rodés, V., Pérez Casas, A., Lucas, P. P., Maya, R., Alonso, G. E., & Díaz, P. (2020). Towards the implementation of a countrywide K-12 learning analytics initiative in Uruguay. Interactive Learning Environments, 28(2), 166–190. [Google Scholar] [CrossRef]
- MacQueen, K. M., McLellan, E., Kay, K., & Milstein, B. (1998). Codebook development for team-based qualitative analysis. Field Methods, 10(2), 31–36. [Google Scholar] [CrossRef]
- Makhlouf, J., & Mine, T. (2020). Analysis of click-stream data to predict STEM careers from student usage of an intelligent tutoring system. Journal of Educational Data Mining, 12(2), 1–18. [Google Scholar]
- Mangaroska, K., Vesin, B., Kostakos, V., Brusilovsky, P., & Giannakos, M. N. (2021). Architecting analytics across multiple e-learning systems to enhance learning design. IEEE Transactions on Learning Technologies, 14(2), 173–188. [Google Scholar] [CrossRef]
- Mari, L., Wilson, M., & Maul, A. (Eds.). (2023). Measurement across the sciences: Developing a shared concept system for measurement (2nd ed.). Springer. [Google Scholar] [CrossRef]
- Marquart, C. L., Hinojosa, C., Swiecki, Z., Eagan, B., & Shaffer, D. W. (2021). Epistemic network analysis (Version 1.7.0) [Software]. University of Wisconsin–Madison.
- Martin, F., Dennen, V. P., & Bonk, C. J. (2020). A synthesis of systematic review research on emerging learning environments and technologies. Educational Technology Research and Development: ETR & D, 68(4), 1613–1633. [Google Scholar] [CrossRef]
- Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149–174. [Google Scholar] [CrossRef]
- Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). Macmillan. [Google Scholar]
- Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13–23. [Google Scholar] [CrossRef]
- Messick, S. (1996). Validity of performance assessments. In Technical issues in large-scale performance assessment (pp. 1–18). National Center for Education Statistics. [Google Scholar]
- Minty, I., Lawson, J., Guha, P., Luo, X., Malik, R., Cerneviciute, R., Kinross, J., & Martin, G. (2022). The use of mixed reality technology for the objective assessment of clinical skills: A validation study. BMC Medical Education, 22(1), 639. [Google Scholar] [CrossRef]
- Mislevy, R. J. (1996). Test theory reconceived. Journal of Educational Measurement, 33(4), 379–416. [Google Scholar] [CrossRef]
- Mislevy, R. J., Almond, R. G., & Lukas, J. F. (2003a). A brief introduction to evidence-centered design. ETS Research Report Series, 2003(1), i–29. [Google Scholar] [CrossRef]
- Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003b). Focus article: On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1(1), 3–62. [Google Scholar] [CrossRef]
- Monllao Olive, D., Huynh, D. Q., Reynolds, M., Dougiamas, M., & Wiese, D. (2020). A supervised learning framework: Using assessment to identify students at risk of dropping out of a MOOC. Journal of Computing in Higher Education, 32(1), 9–26. [Google Scholar] [CrossRef]
- Muresan, A., Cardei, M., & Cardei, I. (2025). Predicting student success with heterogeneous graph deep learning and machine learning models. In 18th international conference on educational data mining (EDM 2025) (pp. 265–275). International Educational Data Mining Society. Available online: https://eric.ed.gov/?id=ED675661 (accessed on 25 November 2025).
- Nahar, K., Shova, B. I., Ria, T., Rashid, H. B., & Islam, A. S. (2021). Mining educational data to predict students performance: A comparative study of data mining techniques. Education and Information Technologies, 26(5), 6051–6067. [Google Scholar] [CrossRef]
- Nam, S., Frishkoff, G., & Collins-Thompson, K. (2017). Predicting students disengaged behaviors in an online meaning-generation task. IEEE Transactions on Learning Technologies, 11(3), 362–375. [Google Scholar] [CrossRef]
- Nasir, J., Kothiyal, A., Bruno, B., & Dillenbourg, P. (2021). Many are the ways to learn: Identifying multi-modal behavioral profiles of collaborative learning in constructivist activities. International Journal of Computer-Supported Collaborative Learning, 16(4), 485–523. [Google Scholar] [CrossRef]
- National Research Council. (2001). Knowing what students know: The science and design of educational assessment. National Academies Press. [Google Scholar]
- Nawahdah, M., Sawalha, H., Salameh, R., & Taha, M. (2025, July 9–10). Evaluating the accuracy and effectiveness of AI-based grading in computer science education. 2025 International Conference on Smart Learning Courses (SCME) (pp. 1–6), Hebron, Palestine. [Google Scholar] [CrossRef]
- Nazaretsky, T., Hershkovitz, S., & Alexandron, G. (2019). Kappa learning: A new item-similarity method for clustering educational items from response data. In 12th international conference on educational data mining (EDM 2019) (pp. 129–138). International Educational Data Mining Society. Available online: https://eric.ed.gov/?id=ED599209 (accessed on 22 November 2025).
- Ngoc, H. D., Hoang, L. H., & Hung, V. X. (2020). Transforming education with emerging technologies in higher education: A systematic literature review. International Journal of Higher Education, 9(5), 252–258. [Google Scholar] [CrossRef]
- Nguyen, Q., Rienties, B., & Whitelock, D. (2020). A mixed-method study of how instructors design for learning in online and distance education. Journal of Learning Analytics, 7(3), 64–78. [Google Scholar] [CrossRef]
- Niknam, M., & Thulasiraman, P. (2020). LPR: A bio-inspired intelligent learning path recommendation system based on meaningful learning theory. Education and Information Technologies, 25(5), 3797–3819. [Google Scholar] [CrossRef]
- Novak, M., Andročec, D., & Picek, R. (2025, September 18–20). Comparison of generative artificial intelligence tools in the assessment of student assignments. 2025 International Conference on Software, Telecommunications and Computer Networks (SoftCOM) (pp. 1–6), Split, Croatia. [Google Scholar]
- Novita, S., Kusuma, P. A., Ratnasari, R. D., Khairani, R. N., Rahmayanthi, D., Noer, A. H., & Purba, F. D. (2022, October 13–15). Mathematics assessment using virtual reality: A study on Indonesian elementary school children. 2022 International Conference on Assessment and Learning (ICAL) (pp. 1–6), Bali, Indonesia. [Google Scholar] [CrossRef]
- Núñez-Regueiro, F., Falcon, S., & Bressoux, P. (2025). Modeling demands-resources fit in teacher education using open-ended data: A methodological-substantive synergy. Education and Information Technologies, 30(18), 26025–26056. [Google Scholar] [CrossRef]
- O’Brien, B. C., Harris, I. B., Beckman, T. J., Reed, D. A., & Cook, D. A. (2014). Standards for reporting qualitative research: A synthesis of recommendations. Academic Medicine, 89(9), 1245–1251. [Google Scholar] [CrossRef]
- O’Connor, C., & Joffe, H. (2020). Intercoder reliability in qualitative research: Debates and practical guidelines. International Journal of Qualitative Methods, 19, 1609406919899220. [Google Scholar] [CrossRef]
- Oğuz, E. (2025). Can generative AI figure out figurative language? The influence of idioms on essay scoring by ChatGPT, Gemini, and Deepseek. Assessing Writing, 66, 100981. [Google Scholar] [CrossRef]
- Olsen, J. K., Aleven, V., & Rummel, N. (2017). Exploring dual eye tracking as a tool to assess collaboration. In A. A. von Davier, M. Zhu, & P. C. Kyllonen (Eds.), Innovative assessment of collaboration (pp. 157–172). Springer International Publishing AG. [Google Scholar] [CrossRef]
- Olsen, J. K., Sharma, K., Rummel, N., & Aleven, V. (2020). Temporal analysis of multimodal data to predict collaborative learning outcomes. British Journal of Educational Technology, 51(5), 1527–1547. [Google Scholar] [CrossRef]
- Ong, N., Zhu, J., & Mossé, D. (2022). Towards including instructor features in student grade prediction. In A. Mitrovic, & N. Bosch (Eds.), 15th international conference on educational data mining (pp. 239–250). International Educational Data Mining Society. Available online: https://eric.ed.gov/?id=ED624131 (accessed on 22 November 2025).
- Ontong, J. M. (2024). Do words matter: Investigating the association between linguistic features of accounting examinations and marks. South African Journal of Education, 44(2), 1–8. [Google Scholar] [CrossRef]
- Opoku, R. A., Pei, B., & Xing, W. (2025). Unveiling accuracy-fairness trade-offs: Investigating machine learning models in student performance prediction. Journal of Learning Analytics, 12(2), 125–139. [Google Scholar] [CrossRef]
- Ortega-Morla, J., Leis, A., Mallo, A., Moran-Fernandez, L., Guerreiro, S., Paz-Lopez, A., Perez-Sanchez, B., Sanchez-Marono, N., Rodriguez-Arias, A., Fontenla-Romero, O., & Bellas, F. (2025). ProgTutor: A robotic-based framework to support teaching and learning of programming fundamentals. IEEE Transactions on Learning Technologies, 18, 783–797. [Google Scholar] [CrossRef]
- Ouyang, F., Dai, X., & Chen, S. (2022). Applying multimodal learning analytics to examine the immediate and delayed effects of instructor scaffoldings on small groups’ collaborative programming. International Journal of STEM Education, 9(1), 45. [Google Scholar] [CrossRef]
- Ouyang, F., Xu, W., Liu, L., Cai, R., & Liu, J. (2024a). The influence of instructor support levels on collaborative knowledge construction. Learning, Culture and Social Interaction, 47, 100841. [Google Scholar] [CrossRef]
- Ouyang, F., Zhang, L., Wu, M., & Jiao, P. (2024b). Empowering collaborative knowledge construction through the implementation of a collaborative argument map tool. The Internet and Higher Education, 62, 100946. [Google Scholar] [CrossRef]
- Padrón-Rivera, G., Rebolledo-Mendez, G., Parra, P. P., & Huerta-Pacheco, N. S. (2016). Identification of action units related to affective states in a tutoring system for mathematics. Journal of Educational Technology & Society, 19(2), 77–86. [Google Scholar]
- Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, R., Glanville, J., Grimshaw, J. M., Hróbjartsson, A., Lalu, M. M., Li, T., Loder, E. W., Mayo-Wilson, E., McDonald, S., … Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372, n71. [Google Scholar] [CrossRef]
- Pan, L., Patterson, N., McKenzie, S., Rajasegarar, S., Wood-Bradley, G., Rough, J., Luo, W., Lanham, E., & Coldwell-Neilson, J. (2020). Gathering intelligence on student information behavior using data mining. Library Trends, 68(4), 636–658. [Google Scholar] [CrossRef]
- Pan, Z., Biegley, L., Taylor, A., & Zheng, H. (2024). A systematic review of learning analytics: Incorporated instructional interventions on learning management systems. Journal of Learning Analytics, 11(2), 52–72. [Google Scholar] [CrossRef]
- Pang, S., Zhang, Y., Zhang, J., Yang, Y., Sun, D., & Xiang, J. (2025). Automatic detection of students’ classroom behavior via long-term classroom videos to predict students’ learning gains. Education and Information Technologies, 30, 26961–26989. [Google Scholar] [CrossRef]
- Pardo, A., Han, F., & Ellis, R. A. (2016). Combining university student self-regulated learning indicators and engagement with online learning events to predict academic performance. IEEE Transactions on Learning Technologies, 10(1), 82–92. [Google Scholar] [CrossRef]
- Pardo, A., & Siemens, G. (2014). Ethical and privacy principles for learning analytics. British Journal of Educational Technology, 45(3), 438–450. [Google Scholar] [CrossRef]
- Pellegrino, J. W. (2014). Assessment as a positive influence on 21st-century teaching and learning: A systems approach to progress. Psicología Educativa, 20(2), 65–77. [Google Scholar] [CrossRef]
- Peng, Y., Wang, Y., & Hu, J. (2023). Examining ICT attitudes, use and support in blended learning settings for students’ reading performance: Approaches of artificial intelligence and multilevel model. Computers & Education, 203, 104846. [Google Scholar] [CrossRef]
- Pereira, F. D., Rodrigues, L., Henklain, M. H. O., Freitas, H., Oliveira, D. F., Cristea, A. I., Carvalho, L., Isotani, S., Benedict, A., Dorodchi, M., & de Oliveira, E. H. T. (2022). Toward human–AI collaboration: A recommender system to support CS1 instructors to select problems for assignments and exams. IEEE Transactions on Learning Technologies, 16(3), 457–472. [Google Scholar] [CrossRef]
- Picasso, F. (2024). Technology-enhanced assessment and feedback practices: A systematic literature review to explore academic development models. Research on Education and Media, 16(2), 2024. [Google Scholar] [CrossRef]
- Plumley, R. D., Bernacki, M. L., Greene, J. A., Kuhlmann, S., Raković, M., Urban, C. J., Hogan, K. A., Lee, C., Panter, A. T., & Gates, K. M. (2024). Co-designing enduring learning analytics prediction and support tools in undergraduate biology courses. British Journal of Educational Technology, 55(5), 1860–1883. [Google Scholar] [CrossRef]
- Prasad, L. T. V., Mythili, M., Balavivekanandhan, A., Sreela, B., Bordoloi, D., & Alphonse, F. R. (2024, October 3–5). AI-enhanced deep learning techniques for evaluating progress in English L2 learners. 2024 8th International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) (pp. 1941–1947), Kirtipur, Nepal. [Google Scholar] [CrossRef]
- Premlatha, K. R., Dharani, B., & Geetha, T. V. (2016). Dynamic learner profiling and automatic learner classification for adaptive e-learning environment. Interactive Learning Environments, 24(6), 1054–1075. [Google Scholar] [CrossRef]
- Radović, S., & Seidel, N. (2025). Uncovering variations in learning behaviors and cognitive engagement among students with diverse learning goals and outcomes. Educational Technology Research and Development, 73(5), 2877–2895. [Google Scholar] [CrossRef]
- Rai, L., Sheng, K., & Liu, F. (2025, July 26–28). Automated essay assessment using generative AI: Evaluating DeepSeek’s performance in university-level grading. 2025 IEEE 8th International Conference on Electronic Information and Communication Technology (ICEICT) (pp. 242–247), Weihai, China. [Google Scholar] [CrossRef]
- Rantanen, P., Saari, M., Virta, U. T., & Abrahamsson, P. (2025, June 2–6). Toward AI evaluation of student essays. 2025 MIPRO 48th ICT and Electronics Convention (pp. 729–734), Opatija, Croatia. [Google Scholar] [CrossRef]
- Reid, D. P., & Drysdale, T. D. (2024). Student-facing learning analytics dashboard for remote lab practical work. IEEE Transactions on Learning Technologies, 17, 1037–1050. [Google Scholar] [CrossRef]
- Retnawati, H., Kardanova, E., Sumaryanto, S., Prasojo, L., Jailani, J., Arliani, E., Hidayati, K., Susanti, M., Lestari, H., Apino, E., Rafi, I., Rosyada, M., Tuanaya, R., Dewanti, S., Sotlikova, R., & Kassymova, G. (2024). A systematic review of the use of technology in educational assessment practices: Lesson learned and direction for future studies. International Journal of Robotics and Control Systems, 4(4), 1656–1693. [Google Scholar] [CrossRef]
- Roa Romero, Y., Tame, H., Holzhausen, Y., Petzold, M., Wyszynski, J.-V., Peters, H., Alhassan-Altoaama, M., Domanska, M., & Dittmar, M. (2021). Design and usability testing of an in-house developed performance feedback tool for medical students. BMC Medical Education, 21(1), 354. [Google Scholar] [CrossRef]
- Rodríguez, D., Guzman, M., Brito, P., & Llorens, R. (2025). Ecological validity of self-perceived voice quality and acoustic measures during voice assessments: An observational study on faculty teachers. Journal of Speech, Language, and Hearing Research, 68(2), 478–490. [Google Scholar] [CrossRef]
- Rodríguez, M. E., Guerrero-Roldán, A. E., Baneres, D., & Karadeniz, A. (2022). An intelligent nudging system to guide online learners. International Review of Research in Open and Distributed Learning, 23(1), 41–62. [Google Scholar] [CrossRef]
- Rohani, N., Gal, K., Gallagher, M., & Manataki, A. (2024). Providing insights into health data science education through artificial intelligence. BMC Medical Education, 24(1), 564. [Google Scholar] [CrossRef] [PubMed]
- Romero, C., & Ventura, S. (2020). Educational data mining and learning analytics: An updated survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 10(3), e1355. [Google Scholar] [CrossRef]
- Rubio, F., Thomas, J. M., & Li, Q. (2018). The role of teaching presence and student participation in Spanish blended courses. Computer Assisted Language Learning, 31(3), 226–250. [Google Scholar] [CrossRef]
- Saint, J., Whitelock-Wainwright, A., Gašević, D., & Pardo, A. (2020). Trace-SRL: A framework for analysis of microlevel processes of self-regulated learning from trace data. IEEE Transactions on Learning Technologies, 13(4), 861–877. [Google Scholar] [CrossRef]
- Sekeroglu, B., Dimililer, K., & Tuncal, K. (2019). Artificial intelligence in education: Application in student performance evaluation. Dilemas Contemporáneos: Educación, Política y Valores, 7(1), 1. [Google Scholar]
- Selwyn, N. (2016). Education and technology: Key issues and debates. Bloomsbury Academic. [Google Scholar]
- Sembey, R., Hoda, R., & Grundy, J. (2024). Emerging technologies in higher education assessment and feedback practices: A systematic literature review. Journal of Systems and Software, 211, 111988. [Google Scholar] [CrossRef]
- Serrano-Mamolar, A., Miguel-Alonso, I., Checa, D., & Pardo-Aguilar, C. (2023). Hacia una metodología de evaluación del rendimiento del alumno en entornos de aprendizaje iVR utilizando eye-tracking y aprendizaje automático [Towards a methodology for evaluating student performance in iVR learning environments using eye-tracking and machine learning]. Comunicar: Revista Científica de Comunicación y Educación, 31(76), 9–20. [Google Scholar] [CrossRef]
- Shabara, R., ElEbyary, K., & Boraie, D. (2024). Teachers or ChatGPT: The issue of accuracy and consistency in L2 assessment. Teaching English with Technology, 24(2), 71–92. [Google Scholar] [CrossRef]
- Shermis, M. D. (2025). Using ChatGPT to score essays and short-form constructed responses. Assessing Writing, 66, 100988. [Google Scholar] [CrossRef]
- Shermis, M. D., & Burstein, J. (Eds.). (2013). Handbook of automated essay evaluation: Current applications and new directions. Routledge. [Google Scholar]
- Shute, V., Rahimi, S., & Smith, G. (2019). Game-based learning analytics in physics playground. In A. Tlili, & M. Chang (Eds.), Data analytics approaches in educational games and gamification systems (pp. 69–93). Springer. [Google Scholar] [CrossRef]
- Shute, V. J., Smith, G., Kuba, R., Dai, C.-P., Rahimi, S., Liu, Z., & Almond, R. (2021). The design, development, and testing of learning supports for the Physics Playground game. International Journal of Artificial Intelligence in Education, 31(3), 357–379. [Google Scholar] [CrossRef]
- Shute, V. J., & Ventura, M. (2013). Stealth assessment: Measuring and supporting learning in video games. MIT Press. [Google Scholar]
- Siemens, G., & Baker, R. S. J. d. (2012). Learning analytics and educational data mining: Towards communication and collaboration. In 2nd international conference on learning analytics and knowledge (LAK’12) (pp. 252–254). Association for Computing Machinery. [Google Scholar] [CrossRef]
- Slade, S., & Prinsloo, P. (2013). Learning analytics: Ethical issues and dilemmas. American Behavioral Scientist, 57(10), 1510–1529. [Google Scholar] [CrossRef]
- Slater, S., & Baker, R. (2019). Forecasting future student mastery. Distance Education, 40(3), 380–394. [Google Scholar] [CrossRef]
- Sosa Neira, E. A., Salinas, J., & De Benito, B. (2017). Emerging technologies (ETs) in education: A systematic review of the literature published between 2006 and 2016. International Journal of Emerging Technologies in Learning, 12(5), 128–149. [Google Scholar] [CrossRef]
- Standen, P. J., Brown, D. J., Taheri, M., Galvez Trigo, M. J., Boulton, H., Burton, A., Hallewell, M. J., Lathe, J. G., Shopland, N., Blanco Gonzalez, M. A., Kwiatkowska, G. M., Milli, E., Cobello, S., Mazzucato, A., Traversi, M., & Hortal, E. (2020). An evaluation of an adaptive learning system based on multimodal affect recognition for learners with intellectual disabilities. British Journal of Educational Technology, 51(5), 1748–1765. [Google Scholar] [CrossRef]
- Steif, P. S., Fu, L., & Kara, L. B. (2016). Providing formative assessment to students solving multipath engineering problems with complex arrangements of interacting parts: An intelligent tutor approach. Interactive Learning Environments, 24(8), 1864–1880. [Google Scholar] [CrossRef]
- Steinbach, M., Fleckenstein, J., Kuklick, L., & Meyer, J. (2025). (De)motivating zero-performing students with negative feedback: Does the salience of performance information matter? Journal of Computer Assisted Learning, 41(4), e70070. [Google Scholar] [CrossRef]
- Stewart, J., Anthony, L., Batty, A. O., Nakamura, K., Nicklin, C., McLean, S., & Tomaru, K. (2025). Can we reliably score meaning recall vocabulary tests using AI? A comparison of human vs. AI scoring. Computer Assisted Language Learning, 1–23. [Google Scholar] [CrossRef]
- Suraworachet, W., Zhou, Q., & Cukurova, M. (2025). University students’ perceptions of a multimodal AI system for real-world collaboration analytics: Lessons learned from a case study. Journal of Computer Assisted Learning, 41(5), e70103. [Google Scholar] [CrossRef]
- Tadjer, H., Lafifi, Y., Seridi-Bouchelaghem, H., & Gülseçen, S. (2022). Improving soft skills based on students’ traces in problem-based learning environments. Interactive Learning Environments, 30(10), 1879–1896. [Google Scholar] [CrossRef]
- Talamás-Carvajal, J. A., Ceballos, H. G., & Hilliger, I. (2025). The facts behind the prophecy: Validating a methodology for identifying behavioural differences in higher education student subpopulations under intervention. Journal of Learning Analytics, 12(2), 211–223. [Google Scholar] [CrossRef]
- Tempelaar, D. (2017). How dispositional learning analytics helps understanding the worked-example principle. In 14th international conference on cognition and exploratory learning in digital age (CELDA 2017) (pp. 117–124). International Association for Development of the Information Society. Available online: https://eric.ed.gov/?id=ED579458 (accessed on 24 November 2025).
- Tempelaar, D., Rienties, B., & Giesbers, B. (2024). Dispositional learning analytics and formative assessment: An inseparable twinship. International Journal of Educational Technology in Higher Education, 21(1), 57. [Google Scholar] [CrossRef]
- Topuz, A. C., Yıldız, M., Taşlıbeyaz, E., Polat, H., & Kurşun, E. (2025). Is generative AI ready to replace human raters in scoring EFL writing? Comparison of human and automated essay evaluation. Educational Technology & Society, 28(3), 36–50. [Google Scholar] [CrossRef]
- Udeozor, C., Chan, P., Russo Abegão, F., & Glassey, J. (2023). Game-based assessment framework for virtual reality, augmented reality and digital game-based learning. International Journal of Educational Technology in Higher Education, 20(1), 36. [Google Scholar] [CrossRef]
- Ulitzsch, E. (2022). Computational psychometrics: New methodologies for a new generation of digital learning and assessment. Psychometrika, 87(4), 1571–1574. [Google Scholar] [CrossRef]
- Vale, E., & Falloon, G. (2024). Using learning analytics to understand K–12 learner behavior in online video-based learning. Online Learning, 28(1), 44–68. [Google Scholar] [CrossRef]
- van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523–538. [Google Scholar] [CrossRef] [PubMed]
- Van Leeuwen, A., & Rummel, N. (2020, March 23–27). Comparing teachers’ use of mirroring and advising dashboards. Tenth International Conference on Learning Analytics & Knowledge (pp. 26–34), Frankfurt, Germany. [Google Scholar] [CrossRef]
- Vignesh, S., Sharmitha, D. K. S., & Libisena, P. S. (2025, January 7–8). AI-powered students’ collaboration and evaluator using LDA. 2025 6th International Conference on Mobile Computing and Sustainable Informatics (ICMCSI) (pp. 1791–1796), Goathgaun, Nepal. Available online: https://ieeexplore.ieee.org/abstract/document/10883069/ (accessed on 24 November 2025).
- Vilanti, T., Luiro, K., Dahlqvist, I., Piipponen, J., Hemminki-Reijonen, U., Tkalcan, S., Ketamo, H., & Koivisto, J. M. (2025). Contraception-related topics in chat dialogues between healthcare students and generative AI patients: A natural language processing analysis. BMC Medical Education, 25(1), 1458. [Google Scholar] [CrossRef] [PubMed]
- Villagrán, C., Nygaard, T., Gaete, M. I., Vera, M., & Cecilio-Fernandes, D. (2024). Enhancing feedback uptake and self-regulated learning in procedural skills training: Design and evaluation of a learning analytics dashboard. Journal of Learning Analytics, 11(2), 138–156. [Google Scholar] [CrossRef]
- Wang, D., Bian, C., & Chen, G. (2024). Using explainable AI to unravel classroom dialogue analysis: Effects of explanations on teachers’ trust, technology acceptance and cognitive load. British Journal of Educational Technology, 55(6), 2530–2556. [Google Scholar] [CrossRef]
- Wang, F., Cheung, A. C., Neitzel, A. J., & Chai, C. S. (2025a). Does chatting with chatbots improve language learning performance? A meta-analysis of chatbot-assisted language learning. Review of Educational Research, 95(4), 623–660. [Google Scholar] [CrossRef]
- Wang, F., Li, N., Cheung, A. C., & Wong, G. K. (2025b). In GenAI we trust: An investigation of university students’ reliance on and resistance to generative AI in language learning. International Journal of Educational Technology in Higher Education, 22(1), 59. [Google Scholar] [CrossRef]
- Wei, Y., Carvalho, P., & Stamper, J. (2025). KCluster: An LLM-based clustering approach to knowledge component discovery. arXiv, arXiv:2505.06469. [Google Scholar] [CrossRef]
- Wen, Y., & Song, Y. (2021). Learning analytics for collaborative language learning in classrooms. Educational Technology & Society, 24(1), 1–15. [Google Scholar] [CrossRef]
- Williamson, B. (2017). Big data in education: The digital future of learning, policy and practice. SAGE. [Google Scholar]
- Wilson, A., Watson, C., Thompson, T. L., Drew, V., & Doyle, S. (2017). Learning analytics: Challenges and limitations. Teaching in Higher Education, 22(8), 991–1007. [Google Scholar] [CrossRef]
- Wilson, J., Huang, Y., Palermo, C., Beard, G., & MacArthur, C. A. (2021). Automated feedback and automated scoring in the elementary grades: Usage, attitudes, and associations with writing outcomes in a districtwide implementation of MI write. International Journal of Artificial Intelligence in Education, 31(2), 234–276. [Google Scholar] [CrossRef]
- Wilson, M. (2018). Making measurement important for education: The crucial role of classroom assessment. Educational Measurement: Issues and Practice, 37(1), 5–20. [Google Scholar] [CrossRef]
- Wilson, M. (2023). Constructing measures: An item response modeling approach (2nd ed.). Routledge. [Google Scholar]
- Wilson, M. (2024a). Finding the right grain-size for measurement in the classroom. Journal of Educational and Behavioral Statistics, 49(1), 3–31. [Google Scholar] [CrossRef]
- Wilson, M. (2024b). What makes measurement important for education? Educational Measurement: Issues and Practice, 43(4), 73–82. [Google Scholar] [CrossRef]
- Wilson, M., Gochyyev, P., & Scalise, K. (2016). Assessment of learning in digital interactive social networks: A learning analytics approach. Online Learning, 20(2), 97–119. [Google Scholar] [CrossRef]
- Wilson, M., & Sloane, K. (2000). From principles to practice: An embedded assessment system. Applied Measurement in Education, 13(2), 181–208. [Google Scholar] [CrossRef]
- Wise, A. F., & Shaffer, D. W. (2015). Why theory matters more than ever in the age of big data. Journal of Learning Analytics, 2(2), 5–13. [Google Scholar] [CrossRef]
- Wools, S., Molenaar, M., & Hopster-den Otter, D. (2019). The validity of technology enhanced assessments—Threats and opportunities. In B. P. Veldkamp, & C. Sluijter (Eds.), Theoretical and practical advances in computer-based educational measurement (pp. 3–19). Springer. [Google Scholar] [CrossRef]
- Wu, J., Wang, J., Lei, S., Wu, F., & Gao, X. (2025). The impact of metacognitive scaffolding on deep learning in a GenAI-supported learning environment. Interactive Learning Environments, 33(9), 5166–5183. [Google Scholar] [CrossRef]
- Xu, J., Wei, T., & Lv, P. (2022, July 24–27). SQL-DP: A novel difficulty prediction framework for SQL programming problems. 15th International Conference on Educational Data Mining (pp. 86–97), Durham, UK. Available online: https://eric.ed.gov/?id=ED624132 (accessed on 27 November 2025).
- Xu, W., & Ouyang, F. (2022). The application of AI technologies in STEM education: A systematic review from 2011 to 2021. International Journal of STEM Education, 9(1), 59. [Google Scholar] [CrossRef]
- Yang, C. C. Y., & Ogata, H. (2023). Personalized learning analytics intervention approach for enhancing student learning achievement and behavioral engagement in blended learning. Education and Information Technologies, 28(3), 2509–2528. [Google Scholar] [CrossRef]
- Yang, T. C. (2023). Application of artificial intelligence techniques in analysis and assessment of digital competence in university courses. Educational Technology & Society, 26(1), 232–243. [Google Scholar] [CrossRef]
- Yang, T. C., Chen, M. C., & Chen, S. Y. (2018). The influences of self-regulated learning support and prior knowledge on improving learning performance. Computers & Education, 126, 37–52. [Google Scholar] [CrossRef]
- Yang, Y., Du, Y., van Aalst, J., Sun, D., & Ouyang, F. (2020). Self-directed reflective assessment for collective empowerment among pre-service teachers. British Journal of Educational Technology, 51(6), 1961–1981. [Google Scholar] [CrossRef]
- Yiğiter, M., & Boduroğlu, E. (2025). Examining the performance of artificial intelligence in scoring students’ handwritten responses to open-ended items. Education and Science, 50, 1–18. [Google Scholar] [CrossRef]
- Zawacki-Richter, O., Marín, V. I., Bond, M., & Gouverneur, F. (2019). Systematic review of research on artificial intelligence applications in higher education—Where are the educators? International Journal of Educational Technology in Higher Education, 16(1), 1–27. [Google Scholar] [CrossRef]
- Zhang, K., & Aslan, A. B. (2021). AI technologies for education: Recent research & future directions. Computers and Education: Artificial Intelligence, 2, 100025. [Google Scholar] [CrossRef]
- Zhang, K., Yılmaz, R., Ustun, A. B., & Karaoğlan Yılmaz, F. G. (2023). Learning analytics in formative assessment: A systematic literature review. Journal of Measurement and Evaluation in Education and Psychology, 14, 359–381. [Google Scholar] [CrossRef]
- Zhang, L., Weitlauf, A. S., Amat, A. Z., Swanson, A., Warren, Z. E., & Sarkar, N. (2020). Assessing social communication and collaboration in autism spectrum disorder using intelligent collaborative virtual environments. Journal of Autism and Developmental Disorders, 50(1), 199–211. [Google Scholar] [CrossRef]
- Zhao, F., Gaschler, R., Schnotz, W., & Wagner, I. (2020). Regulating distance to the screen while engaging in difficult tasks. Frontline Learning Research, 8(6), 59–76. [Google Scholar] [CrossRef]
- Zhao, R., Zhuang, Y., Zou, D., Xie, Q., & Yu, P. L. H. (2023). AI-assisted automated scoring of picture-cued writing tasks for language assessment. Education and Information Technologies, 28(6), 7031–7063. [Google Scholar] [CrossRef]

| Facet | Search Terms | Rationale |
|---|---|---|
| Educational context | educat*; learn*; teach*; pedagog*; student*; instruct*; school; college; university; class*; K-12; K12 | Identify studies situated in formal education across primary, secondary, and higher education. |
| Measurement | measur*; assess*; evaluat*; test; exam* | Identify any form of educational measurement, assessment, evaluation, testing, or exams. |
| Emerging technology | “emerg* technolog*”; AI; “artificial intelligence”; “learning analytics”; “virtual reality”; “augmented reality”; “mixed reality”; “intelligent tutor*”; “adaptive learning” | Identify a broad set of ETs used in education. |
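
The three facets in the search-terms table above can be combined into a single Boolean search string. The following is a minimal Python sketch, assuming the common convention of joining terms with OR within a facet and joining facets with AND. That combination rule, the `FACETS` dictionary, and the `format_term` and `build_query` helpers are illustrative assumptions, not the authors' documented procedure, and individual databases differ in their truncation and phrase syntax.

```python
# Illustrative sketch only. Terms are copied from the search-terms table;
# the OR-within-facet / AND-across-facets rule is an assumed convention.
FACETS = {
    "educational_context": [
        "educat*", "learn*", "teach*", "pedagog*", "student*", "instruct*",
        "school", "college", "university", "class*", "K-12", "K12",
    ],
    "measurement": ["measur*", "assess*", "evaluat*", "test", "exam*"],
    "emerging_technology": [
        "emerg* technolog*", "AI", "artificial intelligence",
        "learning analytics", "virtual reality", "augmented reality",
        "mixed reality", "intelligent tutor*", "adaptive learning",
    ],
}


def format_term(term: str) -> str:
    """Wrap multi-word phrases in double quotes; leave single tokens as-is."""
    return f'"{term}"' if " " in term else term


def build_query(facets: dict[str, list[str]]) -> str:
    """Join terms with OR inside each facet, then join facets with AND."""
    groups = []
    for terms in facets.values():
        groups.append("(" + " OR ".join(format_term(t) for t in terms) + ")")
    return " AND ".join(groups)


if __name__ == "__main__":
    print(build_query(FACETS))
```

Keeping each facet as a separate list makes it straightforward to adjust one facet without touching the others; database-specific field restrictions would still need to be added by hand.
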

| ET Category (Code) | Operational Definition | Typical Indicators in Text |
|---|---|---|
| Generative artificial intelligence/Large language model systems | Implemented generative artificial intelligence or a large language model to generate, transform, or interpret language or code as part of the measurement workflow. | Generative artificial intelligence (GenAI); large language model (LLM); retrieval-augmented generation (RAG); agent-based chatbot |
| Machine learning & deep learning (non-LLM) | Implemented machine learning or deep learning models used for prediction, classification, or representation learning when the core model is not an LLM. | Machine learning (ML); deep learning (DL); artificial neural network/deep neural network (ANN/DNN); support vector machine (SVM); random forest |
| Natural language processing (non-LLM) | Implemented non-LLM language processing used as measurement evidence (feature extraction, classification, or text analytics). | Natural language processing (NLP); term frequency-inverse document frequency (TF-IDF); rule-based text analysis; linguistic feature extraction |
| Automated scoring & feedback systems | Implemented scoring and/or feedback pipeline that converts evidence into scores and/or actionable feedback, regardless of the underlying model family. | automated scoring; automated grading; automated feedback; auto-evaluation pipeline |
| Learning analytics/Educational data mining | Implemented analysis of learner process data (logs/traces) that produces indicators, predictions, or monitoring outputs used for measurement or decision-making. | Learning analytics (LA); educational data mining (EDM); dashboards; early-warning indicators; log-based analytics |
| Knowledge tracing & learner modeling | Implemented modeling of learner knowledge states to infer mastery or trajectories over time. | Knowledge tracing (KT); Bayesian knowledge tracing (BKT); deep knowledge tracing (DKT); additive factors model (AFM); hidden Markov model (HMM) |
| Adaptive systems & Intelligent tutoring systems | Implemented systems that adapt instruction, practice, or support based on inferred learner state (instructional adaptation). | Intelligent tutoring system (ITS); adaptive learning technology (ALT); AI tutor; personalized adaptive system |
| Computer-adaptive assessment & test delivery | Implemented adaptive assessment administration focused on measurement delivery (routing/item selection). | Computer-adaptive testing (CAT); computer-adaptive assessment; adaptive test delivery |
| Multimodal & sensor-based measurement | Implemented multimodal sensing and/or fusion used as measurement evidence. | Multimodal learning analytics (MMLA); electroencephalography (EEG); functional near-infrared spectroscopy (fNIRS); electrodermal activity (EDA); multimodal fusion |
| Speech technologies | Implemented speech-based evidence capture and/or processing used for measurement. | Automatic speech recognition (ASR); speech analytics; transcription-based evidence capture; text-to-speech (TTS) when used in dialog-based measurement |
| Computer vision | Implemented image/video-based evidence capture and/or processing used for measurement (e.g., posture, action, facial or behavioral cues). | Computer vision (CV); video analytics; image recognition; facial/action detection for measurement |
| Immersive/Simulation & Extended reality | Implemented immersive or simulation environments where virtual/augmented/mixed/extended reality interaction is central to performance and evidence generation. | Virtual reality (VR); augmented reality (AR); mixed reality (MR); extended reality (XR); immersive virtual reality (IVR); virtual patient simulation |

| Analytic Coding Dimension | Code | n | % |
|---|---|---|---|
| Grain size | Micro | 839 | 88.88 |
| | Meso | 95 | 10.06 |
| | Macro | 10 | 1.06 |
| Building block | Construct map | 50 | 3.19 |
| | Item design | 157 | 10.01 |
| | Outcome space | 649 | 41.39 |
| | Measurement model | 712 | 45.41 |
| Emerging technology category | Generative AI/LLM systems | 159 | 8.27 |
| | Machine learning & deep learning (ML & DL) | 336 | 17.48 |
| | Natural language processing (NLP) | 37 | 1.93 |
| | Automated scoring & feedback systems | 274 | 14.26 |
| | Learning analytics & educational data mining (LA & EDM) | 499 | 25.96 |
| | Knowledge tracing & learner modeling | 39 | 2.03 |
| | Adaptive systems & intelligent tutoring systems | 145 | 7.54 |
| | Computer-adaptive assessment & test delivery | 9 | 0.47 |
| | Multimodal & sensor-based measurement | 130 | 6.76 |
| | Speech technologies | 141 | 7.34 |
| | Computer vision | 54 | 2.81 |
| | Immersive/simulation & extended reality | 99 | 5.15 |
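
The percentages in the coding table appear to be computed within each coding dimension rather than against a single study count: the grain-size counts sum to 944, the building-block counts to 1568, and the emerging-technology counts to 1922. The larger totals for the last two dimensions are presumably a result of studies carrying more than one tag. The following minimal Python sketch reproduces the grain-size and building-block percentages under that assumption; the function name `percent_shares` is illustrative and does not come from the article.

```python
def percent_shares(counts):
    """Return each code's share of the dimension total, in percent (2 decimals)."""
    total = sum(counts.values())
    return {code: round(100 * n / total, 2) for code, n in counts.items()}


# Counts copied from the coding table above.
grain_size = {"Micro": 839, "Meso": 95, "Macro": 10}
building_block = {
    "Construct map": 50,
    "Item design": 157,
    "Outcome space": 649,
    "Measurement model": 712,
}

print(percent_shares(grain_size))
# {'Micro': 88.88, 'Meso': 10.06, 'Macro': 1.06}
print(percent_shares(building_block))
# {'Construct map': 3.19, 'Item design': 10.01, 'Outcome space': 41.39, 'Measurement model': 45.41}
```

Because each dimension has its own denominator, percentages should be compared only within a dimension, not across dimensions.
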

| Grain Size | Building Block | Block Total ET Tags | Block % Within Grain | Block Entropy H | Block HHI | Top 1 ET | Top 1 n | Top 2 ET | Top 2 n | Top 3 ET | Top 3 n |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Micro | Construct map | 107 | 3.5 | 3.319 | 0.108 | Learning analytics & educational data mining | 20 | Automated scoring & feedback systems | 15 | Machine learning & deep learning | 13 |
| | Item design | 313 | 10.3 | 3.059 | 0.132 | Automated scoring & feedback systems | 74 | Learning analytics & educational data mining | 59 | Immersive/simulation & extended reality | 42 |
| | Outcome space | 1238 | 40.8 | 3.07 | 0.13 | Learning analytics & educational data mining | 296 | Automated scoring & feedback systems | 206 | Machine learning & deep learning | 176 |
| | Measurement model | 1378 | 45.4 | 3.16 | 0.123 | Learning analytics & educational data mining | 347 | Machine learning & deep learning | 259 | Automated scoring & feedback systems | 180 |
| Meso | Construct map | 7 | 2.4 | 2.236 | 0.224 | Machine learning & deep learning | 2 | Generative AI/LLM systems | 1 | Learning analytics & educational data mining | 1 |
| | Item design | 18 | 6.3 | 2.503 | 0.188 | Machine learning & deep learning | 5 | Automated scoring & feedback systems | 3 | Speech technologies | 3 |
| | Outcome space | 123 | 42.9 | 2.34 | 0.249 | Learning analytics & educational data mining | 64 | Machine learning & deep learning | 30 | Generative AI/LLM systems | 7 |
| | Measurement model | 139 | 48.4 | 2.529 | 0.184 | Machine learning & deep learning | 46 | Learning analytics & educational data mining | 50 | Speech technologies | 10 |
| Macro | Construct map | 0 | 0 | 0 | 0 | n/a | 0 | n/a | 0 | n/a | 0 |
| | Item design | 2 | 9.1 | 1 | 0.5 | Generative AI/LLM systems | 1 | Automated scoring & feedback systems | 1 | Machine learning & deep learning | 0 |
| | Outcome space | 6 | 27.3 | 1.459 | 0.361 | Machine learning & deep learning | 3 | Learning analytics & educational data mining | 2 | Speech technologies | 1 |
| | Measurement model | 14 | 63.6 | 2.039 | 0.276 | Machine learning & deep learning | 7 | Learning analytics & educational data mining | 4 | Automated scoring & feedback systems | 1 |
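
The table reports an entropy (H) and a Herfindahl-Hirschman index (HHI) for each block but does not state the formulas. The sketch below assumes the conventional definitions: Shannon entropy in bits, H = sum of p * log2(1/p), and HHI = sum of p squared, where p is each emerging-technology category's share of the block's tags. The function names are illustrative. Under these assumptions, the Macro item design row, whose two tags are fully listed (one each for two categories), yields H = 1 and HHI = 0.5, as tabulated. Most other rows list only the top three categories, so their values cannot be recomputed from the table alone.

```python
import math


def shannon_entropy_bits(counts):
    """Shannon entropy (base 2) of the tag distribution; 0.0 for an empty block."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # p * log2(1/p) is non-negative for every share p in (0, 1].
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)


def hhi(counts):
    """Herfindahl-Hirschman index: sum of squared category shares; 0.0 if empty."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum((c / total) ** 2 for c in counts)


# Macro grain, item design block: two ET tags, one per category (from the table).
macro_item_design = [1, 1]
print(shannon_entropy_bits(macro_item_design))  # 1.0
print(hhi(macro_item_design))                   # 0.5
```

Read together, a higher H and a lower HHI indicate tags spread across many technology categories, while a lower H and a higher HHI indicate concentration in a few.
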
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Yu, L.; Wong, G.K.W.; Zhang, B.; Wang, F. Educational Measurement with Emerging Technologies: A Systematic Review Through Evidentiary Lens on Granularity and Constructing Measures Theory. Educ. Sci. 2026, 16, 661. https://doi.org/10.3390/educsci16040661