Review

Shaping the Future of Higher Education: A Technology Usage Study on Generative AI Innovations

1 Academic Affairs Office, Zhejiang Yuexiu University, Shaoxing 312000, China
2 School of Computer Science, Civil Aviation Flight University of China, Guanghan 618307, China
* Authors to whom correspondence should be addressed.
Information 2025, 16(2), 95; https://doi.org/10.3390/info16020095
Submission received: 17 December 2024 / Revised: 15 January 2025 / Accepted: 25 January 2025 / Published: 27 January 2025
(This article belongs to the Special Issue Generative AI Technologies: Shaping the Future of Higher Education)

Abstract

Generative Artificial Intelligence (GAI) is rapidly reshaping the landscape of higher education, offering innovative solutions to enhance student engagement, personalize learning experiences, and improve academic performance prediction. This study provides an in-depth exploration of GAI applications in educational contexts, drawing insights from 67 case studies meticulously selected from over 300 papers presented at the AIED 2024 conference. The research focuses on eight key themes, ranging from student engagement and behavior analysis to the integration of generative models into educational tools. These case studies illustrate the potential of GAI to optimize teaching practices, enhance student support systems, and provide tailored interventions that address individual learning needs. However, this study also highlights challenges such as scalability, the need for balanced and diverse datasets, and ethical concerns regarding data privacy and bias. Further, it emphasizes the importance of improving model accuracy, transparency, and real-world applicability in educational settings. The findings underscore the need for continued research to refine GAI technologies, ensuring they are scalable, adaptable, and equitable, ultimately enhancing the effectiveness and inclusivity of AI-driven educational tools across diverse higher education environments. It should be noted that this study primarily draws from papers presented at the AIED 2024 conference, which may limit global representativeness and introduce thematic biases. Future studies are encouraged to include broader datasets from diverse conferences and journals to ensure a more comprehensive understanding of GAI applications in higher education.

1. Introduction

Generative Artificial Intelligence (GAI) is at the forefront of transforming higher education, offering innovative solutions to longstanding challenges while creating new opportunities for learning and teaching. By harnessing the capabilities of advanced AI models, GAI enables personalized learning, enhances student engagement, predicts academic performance, and supports career readiness initiatives. These technologies empower educators and institutions to make data-driven decisions, optimize educational practices, and address diverse student needs with unprecedented precision and efficiency. The ongoing integration of GAI into higher education signals a paradigm shift in how learning outcomes are achieved and academic success is supported.
This study provides a comprehensive exploration of the applications and impact of GAI in higher education, grounded in cutting-edge research and practical use cases. It draws from 67 case studies meticulously selected from over 300 papers presented at the Artificial Intelligence in Education 25th International Conference (AIED 2024). As one of the premier global forums for AI and education, AIED 2024 serves as a rich source of authoritative insights into the latest advancements and trends in the field. These case studies showcase how GAI is being implemented across diverse educational settings, highlighting its role in transforming teaching methodologies, enhancing assessment processes, and driving student success through predictive analytics.
To synthesize these findings, case studies are organized into eight interconnected themes, ranging from student engagement and academic performance prediction to educational assessment, each addressing a critical area of application. These themes illustrate the breadth of GAI’s influence, from reshaping traditional instructional models to enabling new, data-driven approaches to education. By focusing on these key areas, this study demonstrates how GAI not only augments existing systems but also facilitates the development of innovative frameworks for teaching and learning.
The contribution of this paper lies in its systematic analysis of how GAI is reshaping higher education through advanced technologies such as large language models (LLMs), machine learning algorithms, predictive analytics, and deep learning techniques. These tools are central to creating adaptive, personalized, and scalable educational solutions. By integrating these technologies, GAI addresses critical challenges such as improving student retention, predicting and preventing dropouts, identifying at-risk students, and assessing career readiness. The insights provided by this research aim to guide educators, policymakers, and researchers in leveraging AI-driven innovations to enhance academic success and institutional effectiveness.
The structure of this paper is as follows. Section 2 delves into the technological foundations of GAI in higher education, exploring the methods and tools that enable its applications. Section 3 discusses the methodological framework of this study for evaluating GAI technologies in higher education. Section 4 presents an in-depth analysis of the case studies, categorizing them into eight thematic areas that reflect the diverse ways in which GAI is applied. Section 5 examines the challenges and limitations hindering GAI’s widespread adoption and scalability, including issues related to data quality, model interpretability, ethical considerations, and user trust. Section 6 discusses future directions, focusing on opportunities for advancing research, refining technologies, and improving the practical deployment of GAI in educational contexts. Finally, Section 7 concludes the paper by summarizing the key findings and offering actionable recommendations for integrating GAI into higher education to support student and institutional success.

2. Technological Foundations of GAI in Higher Education

Generative Artificial Intelligence (GAI) refers to AI technologies capable of creating new content, such as text, code, images, and videos, based on learned patterns and datasets. In higher education, GAI leverages technologies like large language models (LLMs), machine learning (ML), and Natural Language Processing (NLP) to enhance teaching and learning experiences. These technologies introduce innovative approaches to teaching, learning, and institutional management, offering powerful tools to enable personalized learning [1], predict student outcomes, and automate educational processes [2].
By enabling adaptive feedback systems, automated assessment tools, and personalized learning pathways, GAI addresses challenges such as scalability, student engagement, and dropout prevention [3]. For instance, tools like GPT-4 and reinforcement learning algorithms empower educators to make data-driven decisions while providing students with individualized support [4]. Moreover, GAI technologies are increasingly addressing critical challenges such as improving performance prediction, detecting at-risk students early, and fostering adaptive and scalable learning environments [5,6].
This section critically evaluates the technological foundations of GAI in higher education and explores its integration into academic practices. It provides a comprehensive understanding of GAI’s potential by reviewing the core technologies and methodologies underpinning its adoption in higher education, grounded in current literature. Through a detailed analysis of these foundations, the transformative potential of GAI in reshaping educational ecosystems becomes apparent, particularly in improving teaching practices and institutional decision making.

2.1. Large Language Models (LLMs): A Foundation for Interaction

Large language models (LLMs), such as GPT-4, ChatGPT, and BERT, are among the most widely adopted GAI technologies in education. At their core, LLMs are designed to process and generate human-like text, making them invaluable for automating communication and providing personalized feedback. Refs. [7,8] have shown that LLMs excel at tasks such as automated essay grading and delivering personalized instructional support. These models enable conversational agents that engage students in interactive learning, as well as systems that streamline repetitive administrative tasks.
The versatility of LLMs lies in their ability to adapt to diverse contexts. For example, GPT-based systems analyze student inputs to provide context-aware responses, making them particularly effective in knowledge tracing and personalized tutoring. Research further suggests that LLMs are integral to improving student satisfaction by creating dynamic, on-demand learning experiences [9].
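As a minimal illustration of how such context-aware responses might be assembled, the sketch below builds an LLM prompt that grounds feedback in an instructor's rubric. The function name `build_feedback_prompt` and the rubric items are hypothetical, not drawn from any system cited here.

```python
def build_feedback_prompt(student_answer: str, rubric: list[str], topic: str) -> str:
    """Assemble a context-aware prompt for an LLM-based feedback system.

    The rubric criteria are included so the model grounds its feedback
    in the instructor's expectations rather than generic advice.
    """
    criteria = "\n".join(f"- {c}" for c in rubric)
    return (
        f"You are a tutor for the topic '{topic}'.\n"
        f"Rubric criteria:\n{criteria}\n\n"
        f"Student answer:\n{student_answer}\n\n"
        "Give short, encouraging feedback that addresses each criterion."
    )

prompt = build_feedback_prompt(
    "Photosynthesis converts sunlight into glucose.",
    ["Mentions chlorophyll", "Names inputs and outputs"],
    "photosynthesis",
)
print(prompt)
```

In a real deployment, this prompt would be sent to a model endpoint; varying the rubric per assignment is what makes the resulting feedback context-aware rather than generic.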

2.2. Machine Learning (ML): Unlocking Patterns and Predictions

Machine learning models are essential for harnessing the power of data in higher education. Algorithms such as Random Forests, XGBoost, and decision trees excel at analyzing large datasets to uncover patterns that inform actionable insights. For example, ML has been widely adopted to predict student performance and detect early warning signs of disengagement. Refs. [2,10] demonstrate how ML models enable institutions to proactively address dropout risks by identifying trends in student behavior and engagement.
Beyond predictive analytics, ML facilitates the creation of personalized learning pathways by tailoring content recommendations to individual needs. Techniques such as sequence modeling and LSTM networks adaptively optimize instructional materials based on historical student data, significantly improving retention and learning outcomes [1]. These developments position ML as a cornerstone for data-driven education.
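To make the prediction idea concrete, the following is a hand-rolled sketch of a logistic-regression dropout predictor on invented engagement features; it stands in for the Random Forest and XGBoost pipelines cited above, and all data are synthetic.

```python
import math

# Toy engagement features per student: (logins_per_week, avg_quiz_score, forum_posts)
# Label 1 = dropped out. Purely synthetic data for illustration.
data = [
    ((0.5, 0.40, 0), 1), ((1.0, 0.35, 1), 1), ((0.8, 0.50, 0), 1),
    ((6.0, 0.85, 5), 0), ((5.5, 0.90, 4), 0), ((7.0, 0.80, 6), 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Train a tiny logistic regression with batch gradient descent.
w, b, lr = [0.0, 0.0, 0.0], 0.0, 0.1
for _ in range(2000):
    gw, gb = [0.0, 0.0, 0.0], 0.0
    for x, y in data:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
        for i in range(3):
            gw[i] += err * x[i]
        gb += err
    for i in range(3):
        w[i] -= lr * gw[i] / len(data)
    b -= lr * gb / len(data)

def dropout_risk(features):
    """Predicted probability that a student with these features drops out."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features)) + b)

at_risk = dropout_risk((0.7, 0.45, 0))   # low engagement profile
engaged = dropout_risk((6.5, 0.88, 5))   # high engagement profile
print(f"at-risk prob={at_risk:.2f}, engaged prob={engaged:.2f}")
```

The same shape applies at institutional scale: train on historical records, then score current students so advisors can intervene before disengagement becomes dropout.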

2.3. Natural Language Processing (NLP): Bridging Communication Gaps

Natural Language Processing (NLP) technologies are transforming the way educational content is analyzed and delivered. Tools such as BERT and RoBERTa enable the automation of text analysis tasks, including essay grading, sentiment analysis, and the generation of personalized feedback. Refs. [11,12] highlight how NLP systems provide real-time academic support, enabling educators to quickly evaluate and respond to student needs.
In addition to streamlining assessment processes, NLP tools also foster inclusivity. By analyzing linguistic patterns, these models ensure that feedback is equitable and tailored to diverse student populations. Furthermore, sentiment analysis powered by NLP has been shown to improve student engagement by identifying learners who may require additional emotional or academic support [13].
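The flagging idea can be sketched with a deliberately naive lexicon-based scorer; real deployments would use fine-tuned transformer models, and the word lists here are illustrative only.

```python
# Tiny lexicon-based sentiment scorer; a crude stand-in for a fine-tuned
# transformer classifier, used to flag messages that may signal a struggling student.
NEGATIVE = {"confused", "lost", "stressed", "overwhelmed", "failing", "stuck"}
POSITIVE = {"clear", "confident", "enjoyed", "understood", "great"}

def sentiment_score(message: str) -> int:
    """Positive-minus-negative word count; higher means more positive."""
    words = {w.strip(".,!?").lower() for w in message.split()}
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def needs_support(message: str, threshold: int = 0) -> bool:
    """Flag a message for follow-up when its sentiment falls below the threshold."""
    return sentiment_score(message) < threshold

flags = [needs_support(m) for m in [
    "I am completely lost and stressed about the midterm.",
    "The lecture was clear and I enjoyed the exercises.",
]]
print(flags)
```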

2.4. Reinforcement Learning (RL): Adaptive and Interactive Learning Systems

Reinforcement learning (RL) algorithms bring a dynamic dimension to educational technology by continuously adapting to student progress in real time. Unlike static instructional methods, RL models optimize learning paths based on the ongoing interactions between students and systems. As highlighted in [14], RL has been used in intelligent tutoring systems, where lesson difficulty is dynamically adjusted to match a learner’s progress and engagement level.
These adaptive capabilities ensure that students remain appropriately challenged while receiving the support they need to succeed. RL-driven systems are particularly impactful in subjects requiring progressive mastery, such as mathematics and science, where iterative problem solving and personalized feedback are critical.
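The adaptive loop can be illustrated with a minimal multi-armed bandit over difficulty levels, where a reward of 1 stands for an engaged, correct response. This is a toy analogue of the intelligent tutoring systems cited above, with an invented student model, not their actual algorithms.

```python
import random

random.seed(42)
difficulties = ["easy", "medium", "hard"]
values = {d: 0.0 for d in difficulties}   # running reward estimates per difficulty
counts = {d: 0 for d in difficulties}

def choose(epsilon=0.1):
    """Epsilon-greedy: mostly exploit the best-looking difficulty, sometimes explore."""
    if random.random() < epsilon:
        return random.choice(difficulties)
    return max(difficulties, key=lambda d: values[d])

def update(d, reward):
    """Incremental mean update of the reward estimate for difficulty d."""
    counts[d] += 1
    values[d] += (reward - values[d]) / counts[d]

# Simulated student who stays engaged most often on "medium" problems.
true_engagement = {"easy": 0.4, "medium": 0.9, "hard": 0.3}
for _ in range(500):
    d = choose()
    update(d, 1 if random.random() < true_engagement[d] else 0)

best = max(difficulties, key=lambda d: values[d])
print(best)
```

After enough interactions the system settles on the difficulty that keeps this simulated learner productively challenged, which is the core behavior RL-based tutors scale up.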

2.5. Explainable AI (XAI): Building Transparency and Trust

Explainable AI (XAI) frameworks are critical for fostering trust in AI-driven educational systems. By making the decision-making processes of AI models transparent, XAI allows educators to understand and validate predictions and recommendations. For example, grading systems powered by XAI provide detailed explanations for assigned scores, enabling instructors to evaluate the reasoning behind these decisions [15,16].
Moreover, XAI addresses ethical concerns by ensuring accountability and mitigating biases in AI-driven assessments. This transparency not only increases acceptance of AI tools among educators but also encourages their integration into sensitive educational processes, such as student evaluations and resource allocation.
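For a linear grading model, an additive explanation falls out directly: each feature's weight times its value is its contribution to the score. The criteria and weights below are invented for illustration; real XAI grading systems explain far richer models.

```python
# Minimal additive-explanation sketch for a linear grading model: the
# per-criterion contributions sum exactly to the assigned score, so an
# instructor can see why each point was given. Weights are illustrative.
weights = {"thesis_clarity": 2.0, "evidence_use": 1.5, "grammar": 0.5}

def explain_score(features: dict) -> tuple[float, dict]:
    """Return (total score, per-criterion contributions) for one essay."""
    contributions = {k: weights[k] * v for k, v in features.items()}
    return sum(contributions.values()), contributions

score, why = explain_score({"thesis_clarity": 0.8, "evidence_use": 0.6, "grammar": 0.9})
print(f"score={score:.2f}")
for criterion, c in sorted(why.items(), key=lambda kv: -kv[1]):
    print(f"  {criterion}: +{c:.2f}")
```

The key property is that the explanation is exact, not approximate: the contributions add up to the score, which is what makes such systems auditable by instructors.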

2.6. Multimodal Learning Analytics (MMLA): Integrating Diverse Data Sources

Multimodal Learning Analytics (MMLA) represents a powerful approach to understanding student engagement by combining data from various sources. These include learning management systems, classroom interactions, and even physiological signals. For example, ref. [17] highlights how MMLA tools improve learning outcomes by offering educators a holistic view of student performance, enabling tailored interventions.
By synthesizing multimodal data, MMLA supports personalized teaching strategies that address individual learning needs. For example, integrating data from virtual and in-person interactions allows educators to predict engagement levels and adjust instructional methods accordingly. Such insights ensure that no learner is left behind, particularly in blended or hybrid learning environments.
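A minimal fusion sketch, under the simplifying assumption that each signal can be normalized to [0, 1] and combined with fixed weights; real MMLA pipelines align and model these streams far more carefully, and the weights here are invented.

```python
# Toy multimodal fusion: normalize each stream (LMS activity, gaze, physical
# movement) and combine into a single engagement score in [0, 1].

def engagement_score(lms_logins, gaze_on_task_ratio, movement_events,
                     max_logins=20, max_movement=50):
    signals = {
        "lms": min(lms_logins / max_logins, 1.0),
        "gaze": gaze_on_task_ratio,                      # already a ratio
        "movement": min(movement_events / max_movement, 1.0),
    }
    weights = {"lms": 0.4, "gaze": 0.4, "movement": 0.2}  # illustrative weights
    return sum(weights[k] * signals[k] for k in signals)

print(round(engagement_score(10, 0.75, 25), 3))
```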
The integration of these technologies has already shown promising results in enhancing the effectiveness of teaching and learning environments. Predictive models powered by machine learning are helping institutions identify students at risk of dropping out, while tools like LLMs are transforming the way students interact with course content and receive feedback. Similarly, reinforcement learning techniques are being used to design adaptive learning environments, and explainable AI ensures that these systems are transparent and understandable to educators and students.
A word cloud, as shown in Figure 1, visually represents the prominence of these key technologies and methods. It illustrates the relative frequency with which each technology is discussed in the academic and research literature: terms such as Natural Language Processing and deep learning appear most prominently, underscoring their central role in shaping the future of education through GAI technologies. This visualization serves as a quick reference for identifying the key themes and technologies currently at the forefront of research and application, and for understanding which technologies are driving the most interest and innovation in higher education.
In addition, we also provide an index table of key technologies and methods drawn from the case studies, emphasizing their application in educational contexts. Table 1 summarizes these technologies, the representative references from which they are derived, and their specific use cases in higher education settings.
These GAI technologies collectively contribute to building a more data-driven, personalized, and adaptive educational ecosystem by complementing and enhancing each other’s capabilities. Large language models (LLMs), such as GPT-3 and GPT-4, excel in enabling interactive and dynamic learning environments through their ability to provide real-time feedback and personalized learning experiences. Beyond their standalone applications, these models often serve as the interface for other AI technologies, integrating insights from predictive algorithms, such as machine learning (ML), to deliver adaptive, student-centered interventions [29].
For instance, machine learning models, like XGBoost and Random Forests, analyze historical data to identify students at risk of underperformance or dropout. By synthesizing these predictions into actionable insights, institutions can implement targeted interventions to improve retention and academic outcomes [28]. When integrated with NLP-powered feedback systems, ML insights help tailor educational content and feedback to individual students’ needs, fostering more meaningful engagement [14].
Natural Language Processing (NLP) tools, such as BERT and RoBERTa, provide the linguistic backbone for many AI applications in education, particularly in automating assessment tasks, like essay grading and sentiment analysis [31]. These tools not only increase efficiency but also ensure that feedback is context aware and aligned with individual learning trajectories. By combining NLP capabilities with reinforcement learning (RL), adaptive feedback loops can be established to dynamically adjust educational content based on student interactions [38].
In addition to these technologies, hybrid systems combining reinforcement learning (RL) and knowledge tracing (KT) form the core of adaptive learning platforms. RL algorithms continuously refine learning paths by responding to real-time input from students, ensuring that content difficulty and pacing are tailored to maximize engagement and learning outcomes [21]. Knowledge tracing systems, in turn, maintain a continuous record of a student’s knowledge progression, identifying gaps and recommending targeted support [1]. Together, these systems enable educators to create highly personalized, scalable learning environments that address individual needs while leveraging the power of data-driven insights.
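The knowledge-tracing side can be made concrete with the classic Bayesian Knowledge Tracing (BKT) update; the slip, guess, and transit parameters below are illustrative defaults, not values from any cited system.

```python
# One step of Bayesian Knowledge Tracing: update P(student knows the skill)
# after observing a single response, then apply the chance of learning.

def bkt_update(p_know, correct, slip=0.1, guess=0.2, transit=0.15):
    """Return the updated mastery probability after one observed answer."""
    if correct:
        # Bayes: correct answers come from knowing (and not slipping) or guessing.
        posterior = p_know * (1 - slip) / (
            p_know * (1 - slip) + (1 - p_know) * guess)
    else:
        # Wrong answers come from slipping or from not knowing (and not guessing).
        posterior = p_know * slip / (
            p_know * slip + (1 - p_know) * (1 - guess))
    # Chance the skill was learned on this step regardless of the answer.
    return posterior + (1 - posterior) * transit

p = 0.4
p = bkt_update(p, correct=True)   # mastery estimate rises after a correct answer
print(round(p, 4))
```

Repeating this update across a student's response history yields the continuous mastery record that gap-identification and recommendation components consume.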
This ecosystem of interconnected technologies illustrates the transformative potential of GAI in higher education. By integrating foundational AI technologies, such as LLMs, ML, NLP, RL, and KT, institutions can create holistic and adaptive educational solutions that empower both educators and learners. These innovations not only enhance engagement and performance but also pave the way for a more inclusive, equitable, and efficient education system [39].

3. Methodological Framework for Analyzing GAI in Higher Education

This section outlines the methodological framework for evaluating Generative Artificial Intelligence (GAI) technologies in higher education. The framework consists of two key components: a systematic review of the relevant literature and an analytical evaluation of eight case studies. These components provide insights into the role, impact, and challenges of GAI applications in addressing educational needs.

3.1. Rationale for the Review in the Context of Existing Knowledge

Generative Artificial Intelligence has become a transformative tool in higher education, addressing challenges such as engagement, personalized learning, and predictive analytics. However, the growing number of studies necessitates a systematic synthesis to contextualize these advancements and identify their potential and limitations.
This study focuses exclusively on the AIED 2024 conference proceedings, a leading venue for research in artificial intelligence in higher education. The conference’s rigorous peer-review process ensures high-quality contributions, making it a reliable source for exploring the latest advancements of GAI in higher education. The reviewed papers align with broader literature trends as follows:
(1)
Large language models (LLMs) are widely used for personalized learning and automated grading.
(2)
Machine learning (ML) is applied to predict student performance and identify at-risk students.
(3)
Reinforcement learning (RL) is leveraged for adaptive learning environments.
By situating the findings within this context, the review contributes to understanding GAI’s role in enhancing higher education outcomes.

3.2. Criteria for Inclusion in the Conference and Review

3.2.1. AIED 2024 Conference Inclusion Criteria

Papers accepted at the AIED 2024 conference were evaluated based on the following:
(1)
Originality. Novel contributions to AI in education.
(2)
Relevance. Research addressing key challenges in higher education.
(3)
Methodological Rigor. Studies with clear research designs and valid evaluation metrics.
(4)
Impact. Contributions with the potential to address critical gaps or advance the field.

3.2.2. Review Inclusion Criteria

From the conference proceedings, 67 papers were selected for this review based on the following:
(1)
Educational Relevance. Focus on higher education challenges.
(2)
Technological Diversity. Inclusion of LLMs, NLP, ML, and other GAI technologies.
(3)
Empirical Evidence. Preference for papers with measurable outcomes.
(4)
Comprehensive Scope. Representation across applications such as assessment, engagement, and predictive analytics.
Table 2 summarizes the inclusion criteria.

3.3. Systematic Review of the Academic Literature

3.3.1. Search Strategy

The search for relevant studies was conducted exclusively within the AIED 2024 conference proceedings. The following steps were undertaken to identify papers for the review:
(1)
Keywords and Themes. Papers were identified based on their alignment with themes such as “Generative AI in education”, “personalized learning”, “adaptive learning systems”, and “AI-driven educational analytics”.
(2)
Scope of Search. The search included all tracks and sessions of the conference, ensuring comprehensive coverage of GAI applications in higher education.
(3)
Abstract Screening. Abstracts and keywords were screened to determine relevance to this study’s focus on higher education and core GAI technologies.
(4)
Full-Text Review. Papers that passed the abstract screening stage were reviewed in full to ensure they presented empirical findings, actionable insights, or theoretical advancements.
This focused search strategy provided a representative view of recent advancements in GAI technologies, particularly in their application to higher education.
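The keyword-screening step described above might be sketched as follows; the abstracts are invented placeholders, not actual AIED 2024 papers.

```python
# Illustrative sketch of keyword-based abstract screening (step 3 of the
# search strategy). The keyword set mirrors the themes listed in the text.
KEYWORDS = {"generative ai", "personalized learning", "adaptive learning",
            "educational analytics"}

def passes_screening(abstract: str) -> bool:
    """Keep a paper if its abstract mentions any target theme."""
    text = abstract.lower()
    return any(k in text for k in KEYWORDS)

abstracts = [
    "We study generative AI tutors for undergraduate physics.",
    "A survey of campus networking infrastructure.",
    "Adaptive learning paths driven by reinforcement learning.",
]
selected = [a for a in abstracts if passes_screening(a)]
print(len(selected))
```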

3.3.2. Screening and Selection

This systematic review adhered to the PRISMA 2020 guidelines, ensuring transparency and replicability in the study selection process. Over 300 papers were initially identified from the AIED 2024 proceedings. Following the removal of duplicates and the application of inclusion criteria, 67 papers were selected for full-text review. These papers were evaluated based on their alignment with this study’s objectives, focusing on the use of GAI technologies in higher education. The selection process is summarized in Figure 2.

3.3.3. Synthesis of Findings

The 67 selected papers were categorized based on the following dimensions:
(1)
Technological Focus. Papers were grouped according to the core GAI technologies discussed, such as large language models (LLMs), machine learning (ML), Natural Language Processing (NLP), reinforcement learning (RL), and knowledge tracing (KT).
(2)
Application Areas. Studies were further categorized based on their application to higher education challenges, such as personalized learning, predictive analytics, and assessment automation.
(3)
Reported Outcomes. The empirical findings, such as engagement improvements, were extracted to evaluate the impact of GAI technologies.
The synthesis highlighted trends in GAI adoption, identified key areas of progress, and outlined challenges and opportunities for future research.

3.4. Analytical Framework for Evaluating Case Studies

3.4.1. Selection Criteria

The eight case studies were selected to ensure technological diversity, empirical relevance, and applicability to higher education challenges. The following criteria guided the selection process:
(1)
Technological Diversity. The case studies represent a range of GAI technologies, including large language models (LLMs), Natural Language Processing (NLP), machine learning (ML), and explainable AI (XAI).
(2)
Educational Relevance. The cases address critical challenges in higher education, such as engagement, personalized learning, assessment, and early intervention.
(3)
Empirical Evidence. Only case studies with measurable outcomes, such as retention improvements or reduced workload, were included.
(4)
Comprehensive Scope. The selected cases span a wide array of applications, from automated grading to career readiness prediction.
These criteria ensured a representative sample of GAI applications in higher education, enabling a balanced analysis of their strengths and limitations.
The selected case studies, summarized in Table 3, illustrate the diverse applications and measurable outcomes of GAI technologies in higher education.

3.4.2. Evaluation Framework

Each case study was evaluated using a structured framework to assess its role, impact, and challenges. This framework consisted of the following dimensions:
(1)
Technological Role. How GAI technologies address specific challenges.
(2)
Applications Across Contexts. Implementation in areas such as grading, personalized learning, and performance prediction.
(3)
Challenges and Limitations. Barriers to adoption, such as technical complexity and ethical concerns.

4. Case Studies and Applications of GAI in Higher Education

4.1. Leveraging Generative AI for Enhancing Student Engagement and Behavior Analysis

GAI is playing a pivotal role in transforming how higher education addresses student engagement and behavior management. Tools like Learning Analytics Dashboards (LADs), PBChat for diagnosing student behaviors, and multimodal learning analytics in mixed-reality environments are at the forefront of this change. These technologies provide personalized support, real-time insights, and adaptive learning experiences, offering improvements to educational outcomes across various learning contexts. Table 4 summarizes key studies in this area, showcasing the diverse ways GAI is applied to enhance student engagement and behavior analysis.

4.1.1. Enhancing Self-Regulated Learning with Learning Analytics Dashboards (LADs)

The study by [18] explores the use of Learning Analytics Dashboards (LADs) to support self-regulated learning (SRL) in a blended learning environment. By leveraging temporal and sequence data, LADs provide real-time insights into student learning behaviors, promoting engagement and adaptability. Although not explicitly generative AI (GAI) tools, LADs share functional similarities with GAI in their ability to deliver personalized and adaptive learning experiences. Integrating GAI capabilities, such as predictive modeling and context-aware feedback, could enhance these dashboards by forecasting students’ learning trajectories and proactively addressing knowledge gaps. The findings suggest that GAI-powered LADs could revolutionize student engagement and learning effectiveness, offering a promising foundation for future developments in educational technology.

4.1.2. Diagnosing Student Problem Behaviors with PBChat

Ref. [20] introduces PBChat, a domain-specific GAI tool designed to diagnose student problem behaviors in educational contexts. By refining the ChatGLM2 model with the QLoRA algorithm, PBChat provides a scalable and systematic solution for identifying undesirable behaviors, overcoming the limitations of manual observation and subjective interpretation. The tool’s evaluation, which combines automated metrics with human assessments, shows superior accuracy and contextual sensitivity compared to generic language models. This allows for more precise and actionable insights, facilitating timely interventions that enhance classroom management and overall educational outcomes. This study highlights the transformative potential of GAI in addressing behavioral challenges and improving learning environments through systematic diagnostic processes.

4.1.3. Multimodal Learning Analytics in Mixed-Reality Educational Environments

Ref. [17] examines the integration of machine learning (ML) and Multimodal Learning Analytics (MMLA) in a mixed-reality educational setting to teach scientific concepts, like photosynthesis. By analyzing multimodal data—such as students’ physical movements, gaze direction, and emotional states—through 3D simulations, the study offers a comprehensive framework for Interaction Analysis (IA). This “AI-in-the-loop” approach emphasizes the collaboration between AI and educators, where AI aids in data analysis without replacing human judgment. Despite its effectiveness in providing real-time insights into student behaviors, the approach faces challenges related to aligning diverse data types and ensuring seamless integration. Expanding the framework to include real-time instructional support could further enhance its impact, making it a valuable tool for a wide range of educational settings.
These studies demonstrate the diverse and transformative applications of GAI in higher education, from enhancing self-regulated learning with LADs to diagnosing student behaviors with PBChat and analyzing multimodal data in immersive environments. GAI offers personalized, adaptive interventions that have the potential to improve student outcomes and the effectiveness of educational systems. However, challenges such as data integration, scalability, and broader application remain, suggesting that further refinement and exploration are needed to fully unlock the potential of these technologies in education.

4.2. Advancing Educational Assessment with Generative AI

GAI is revolutionizing educational assessment by automating grading, enhancing feedback, and refining learning evaluation processes. The integration of LLMs in grading systems has shown considerable potential for improving assessment practices and personalizing learning experiences. This section highlights several studies demonstrating how GAI is being incorporated into educational assessment frameworks, from improving grading transparency to enhancing complex grading tasks in programming education. Table 5 summarizes key advancements in the use of GAI technologies for educational assessment.

4.2.1. Enhancing Grading Transparency and Interpretability

Ref. [19] explores the use of Neural Additive Models (NAMs) in automatic grading, emphasizing the critical need for transparency in AI systems. NAMs improve grading accuracy while making AI decisions more interpretable, which is essential for educational contexts, especially in the case of generative AI applications. By comparing NAMs with logistic regression and DeBERTa, the study establishes a framework for evaluating the performance and explainability of generative AI technologies. This research contributes to the development of more reliable and understandable grading systems, which is crucial for their acceptance and integration into educational environments.

4.2.2. Improving Assessment in Programming Education

Ref. [2] introduces a machine learning-based approach to automate the assessment of documentation quality in programming assignments. The study demonstrates how large language models (LLMs), like BERT and GPT, can be fine-tuned to classify the relevance of docstrings in relation to the code they describe. By achieving an accuracy rate of 89% with a fine-tuned CodeBERT model, this research highlights the potential of generative AI to enhance the educational experience by providing real-time, comprehensive feedback on student submissions. The findings suggest that GAI can be used to create more efficient and holistic grading systems that evaluate not only code functionality but also documentation quality.
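To show the shape of the task, the sketch below classifies docstring relevance with a naive token-overlap heuristic; it is a crude stand-in for the fine-tuned CodeBERT classifier, whose behavior it does not reproduce.

```python
import re

# Naive baseline for docstring relevance: a docstring is "relevant" if enough
# of its words also appear as identifiers/words in the code it documents.

def tokens(text: str) -> set:
    """Extract lowercase word/identifier tokens from code or prose."""
    return {t.lower() for t in re.findall(r"[A-Za-z]+", text)}

def docstring_relevant(code: str, docstring: str, threshold: float = 0.2) -> bool:
    code_toks, doc_toks = tokens(code), tokens(docstring)
    if not code_toks or not doc_toks:
        return False
    overlap = len(code_toks & doc_toks) / len(doc_toks)
    return overlap >= threshold

code = "def compute_average(grades): return sum(grades) / len(grades)"
good = docstring_relevant(code, "Compute the average of a list of grades.")
bad = docstring_relevant(code, "Open a network socket and send a packet.")
print(good, bad)
```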

4.2.3. Improving Complex Grading Tasks with NLP

Ref. [11] introduces a novel approach to Automated Long Answer Grading (ALAG) using advanced Natural Language Processing (NLP) techniques to assess complex student responses. Unlike traditional short-answer or essay grading systems, this method leverages large language models (BERT and GPT) to evaluate student answers based on a rubric that measures specific criteria. This approach facilitates a more nuanced and personalized assessment process, which can be scaled to accommodate large cohorts of students, thus demonstrating the transformative potential of GAI in grading complex, fact-based responses in educational settings.

4.2.4. Categorizing and Evaluating Student Responses

Ref. [41] investigates how generative AI can be applied to categorize student responses as correct, incorrect, or irrelevant using Natural Language Inference (NLI) techniques with models, like BERT and RoBERTa. This approach identifies omissions in student answers by comparing them to a gold standard, enhancing the grading process in educational contexts. Although the study provides strong foundational insights, it also calls for further research to explore the scalability of this “Marking” task across different disciplines beyond biology, highlighting the broader potential of GAI in the educational assessment landscape.
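The NLI models themselves are beyond a short sketch, but the "Marking" idea of flagging omissions against a gold standard can be illustrated with a token-coverage stand-in. The key points, the answer, and the 0.5 threshold below are all invented for illustration:

```python
def find_omissions(gold_points, student_answer, threshold=0.5):
    # Flag gold-standard key points the student answer fails to cover.
    # An NLI model (e.g., RoBERTa) would replace this coverage heuristic
    # with an entailment judgment per key point.
    answer_tokens = set(student_answer.lower().split())
    omissions = []
    for point in gold_points:
        point_tokens = set(point.lower().split())
        coverage = len(point_tokens & answer_tokens) / len(point_tokens)
        if coverage < threshold:
            omissions.append(point)
    return omissions

gold = ["enzymes lower activation energy",
        "enzymes are not consumed in the reaction"]
answer = "Enzymes speed up reactions by lowering the activation energy"
missing = find_omissions(gold, answer)
```

Here the second key point is reported as missing, mirroring how the study's marking task surfaces what a student left out rather than only what they got wrong.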

4.2.5. Advancing Essay Scoring with Large Language Models

Ref. [42] explores the application of large language models (LLMs), such as GPT-4 and Qwen-1.8B, in Automated Essay Scoring (AES) for Chinese language education. The study highlights how fine-tuning LLMs can enhance scoring accuracy and the alignment of automated scores with human evaluations. Despite the promising results, the study also addresses the challenges posed by the linguistic complexity of the Chinese language, suggesting the need for continued research on prompt generation and dataset development to further improve AES systems for non-English languages. This work expands the understanding of how generative AI can be effectively applied to language-specific educational contexts.

4.2.6. Refining Automated Feedback Mechanisms

Ref. [40] presents a Multi-Task Automated Assessment (MTAA) system designed to provide multi-dimensional, detailed feedback on students’ essays. By utilizing advanced AI techniques, like multi-task learning (MTL) with Orthogonality Constraints (OCs) and Dynamic Learning Rate Decay (DLRD), this system goes beyond traditional grading methods, offering nuanced feedback across multiple criteria. While the system demonstrates the potential of generative AI to automate and refine the assessment process, it also reveals the limitations of current models, like ChatGPT, particularly in maintaining consistency and precision for multi-dimensional essay scoring.

4.2.7. Addressing Linguistic Diversity in Essay Grading

Ref. [10] investigates the integration of first language (L1) diversity in Automated Essay Grading (AEG) systems to improve the accuracy of assessments for second language (L2) learners. Using machine learning algorithms, such as XGBoost, the study introduces 11 distinct AEG models trained on essay data from various L1 groups (e.g., Arabic, Chinese, French). This research emphasizes the importance of considering linguistic features in grading systems and demonstrates how generative AI can provide more equitable, culturally responsive assessments for diverse student populations.

4.2.8. Enhancing Essay Scoring Accuracy with Contextual Features

Ref. [31] introduces a method to improve Automated Essay Scoring (AES) for Portuguese essays by integrating contextualized feature extractors tailored to the ENEM exam. By incorporating features, like conjunctions, syntactic quantification, and entity recognition, this approach enhances the predictive performance of AES systems. The study highlights the importance of adapting automated grading methods to specific regional and linguistic contexts, demonstrating how generative AI can be used to improve essay scoring for non-English languages and specific educational assessments.

4.2.9. Improving Confidence Estimation in AES

Ref. [26] focuses on improving the reliability of Automated Essay Scoring (AES) systems by incorporating confidence estimation along with score prediction. By enhancing a BERT-based neural network model, the research demonstrates that adding confidence estimation improves the performance of AES models, offering a more accurate and cost-effective grading system. This study emphasizes the importance of integrating confidence measures into AI-based grading tools to enhance their applicability in high-stakes educational assessments.
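The paper augments a BERT-based scorer with a confidence output; as a lightweight stand-in, disagreement across an ensemble of scorers is a common confidence proxy. The linear scorers, weights, and features below are invented purely to show the mechanism:

```python
import statistics

def ensemble_score(essay_features, weight_sets):
    # Each weight set is one simple linear scorer. The spread of their
    # predictions serves as a confidence proxy: low spread means the
    # predicted score can be trusted more, so it needs no human review.
    scores = [sum(w * f for w, f in zip(ws, essay_features))
              for ws in weight_sets]
    mean = statistics.mean(scores)
    confidence = 1.0 / (1.0 + statistics.pstdev(scores))
    return mean, confidence

weights = [[0.5, 0.3], [0.45, 0.35], [0.55, 0.25]]
score_a, conf_a = ensemble_score([0.8, 0.6], weights)   # scorers agree
score_b, conf_b = ensemble_score([0.9, -0.9], weights)  # scorers diverge
agreement_gives_higher_confidence = conf_a > conf_b
```

In a high-stakes setting, essays whose confidence falls below a cutoff would be routed to human raters, which is how confidence estimation makes an AES system cheaper without sacrificing reliability.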

4.2.10. Improving AI Feedback with Error Handling

Ref. [12] investigates how students respond to false positive and false negative errors in AI-generated feedback during science essay writing. The study finds that students are more likely to correct false negative feedback than false positive feedback, which often goes unaddressed. This highlights the need for enhanced guidance to help students interpret AI-generated feedback effectively, especially when errors are involved, ensuring that AI-driven assessments optimize learning outcomes and support skill development.

4.2.11. Optimizing AI Models for Scoring Efficiency

Ref. [23] introduces a knowledge distillation method to create smaller, faster models for Automated Essay Scoring. By distilling a fine-tuned BERT model into a more efficient student model, the research shows that smaller models can outperform traditional models, achieving better accuracy with faster inference speeds. This approach highlights the potential of knowledge distillation for optimizing AI systems in educational contexts, especially when hardware constraints are a consideration.
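The standard distillation objective, KL divergence between temperature-softened teacher and student outputs, scaled by T squared, can be written directly. The logits below are illustrative; in the paper the teacher is a fine-tuned BERT and the student a smaller network:

```python
import math

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)                       # subtract max for stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in standard knowledge distillation. Minimizing
    # this pushes the small student to mimic the teacher's soft scores.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return kl * temperature ** 2

aligned = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
misaligned = distillation_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

The temperature exposes the teacher's relative preferences among score bands, not just its top choice, which is what lets a much smaller model recover most of the teacher's accuracy at a fraction of the inference cost.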

4.2.12. Enhancing Text Cohesion Assessment with AI

Ref. [16] addresses the limitations of traditional Automated Essay Scoring (AES) methods by proposing an innovative approach that combines Item Response Theory (IRT) with machine learning models to assess cohesion in essays. This method focuses on improving the evaluation of textual coherence, especially in the context of Brazilian basic education essays. The preliminary results show that the integration of IRT with machine learning can improve the scoring of essay cohesion, offering a more nuanced evaluation of student writing.

4.2.13. Optimizing Grading Efficiency with AI

Ref. [13] introduces an AI-assisted grading system designed to improve the efficiency of grading handwritten answer sheets. Using advanced AI techniques to detect question regions and highlight key phrases in student responses, the system reduces grading time by over 30%. This research demonstrates the potential of AI technologies to streamline grading processes, making them more efficient and cost effective, particularly in large-scale assessments.

4.2.14. Improving AI Scoring Accuracy Through Example Selection

Ref. [43] investigates the impact of example selection on the accuracy of AI-based Automated Essay Scoring (AES) systems using GPT-3.5 and GPT-4. The study finds that the choice of examples plays a critical role in the performance of GPT models, with optimized example selection improving accuracy and reducing biases. The research underscores the importance of careful prompt design and highlights the potential of optimizing lower-cost models, like GPT-3.5, for educational applications.
These studies collectively illustrate the impact of generative AI technologies on educational assessment and feedback mechanisms. By enhancing grading accuracy, providing more personalized feedback, and improving the scalability of educational tools, GAI is transforming how educators assess and support student learning. While there are challenges to overcome, such as improving model consistency, addressing linguistic diversity, and optimizing feedback interpretation, these advancements offer promising avenues for the future of educational technology. With continued research and development, GAI has the potential to revolutionize the way assessments are conducted, making education more efficient, personalized, and accessible for diverse learning contexts.

4.3. Advancements in Predictive Modeling for Student Performance

Generative AI techniques have enhanced knowledge tracing and student performance prediction in educational settings. These innovations leverage LLMs, machine learning algorithms, and physiological data to offer more accurate, personalized, and scalable solutions. The studies presented here demonstrate the transformative potential of AI in predicting student outcomes, providing early warnings for at-risk students, and enhancing personalized education. Table 6 summarizes key advancements in the use of GAI technologies for student performance prediction.

4.3.1. Knowledge Tracing with Large Language Models (LLMs)

A novel approach to knowledge tracing (KT) utilizing a large language model architecture with Transformer decoders has been introduced in [21]. This method analyzes historical learning interactions to predict student performance, demonstrating superior accuracy compared to traditional deep learning-based KT models. Evaluated on the EdNet dataset, the LLM-KT model highlights the transformative potential of generative AI in enhancing personalized education. The study emphasizes the scalability and adaptability of LLMs, suggesting their applicability for real-time knowledge tracing across diverse educational contexts.
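The Transformer-decoder architecture of LLM-KT is too large to sketch here, but the prediction task it learns generalizes classic Bayesian Knowledge Tracing, which fits in a few lines. The slip, guess, and learn parameters below are illustrative defaults, not values from the study:

```python
def bkt_update(p_known, correct, slip=0.1, guess=0.2, learn=0.15):
    # Classic Bayesian Knowledge Tracing posterior update: revise the
    # probability the student knows the skill given one observed answer,
    # then apply the chance of learning on this opportunity. Sequence
    # models such as LLM-KT learn richer versions of this from data.
    if correct:
        evidence = p_known * (1 - slip)
        posterior = evidence / (evidence + (1 - p_known) * guess)
    else:
        evidence = p_known * slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - guess))
    return posterior + (1 - posterior) * learn

p = 0.3  # prior probability the skill is known
for outcome in [True, True, False, True]:
    p = bkt_update(p, outcome)
```

After this mostly correct interaction history the estimated mastery rises well above the prior, which is exactly the signal a tutoring system uses to decide when to advance the student.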

4.3.2. Recurrent Neural Collaborative Filtering (RNCF) for Knowledge Tracing

Ref. [1] presents a novel method called Recurrent Neural Collaborative Filtering (RNCF), designed to improve the accuracy of student performance prediction and tailor educational content. Compared to the Deep Knowledge Tracing (DKT) method, RNCF offers richer insights by better separating student and task representations. This improvement enhances personalized learning and curriculum development. The RNCF model’s ability to cluster students and tasks into meaningful groups also offers promising applications for adaptive tutoring strategies and real-time educational systems. The paper suggests further expansion of RNCF’s use across broader datasets and additional features, such as response time and grammatical errors.

4.3.3. Data-Driven Prediction of Student Performance

The use of data-driven approaches to predict student performance, particularly in online courses, has been gaining traction in educational research. Ref. [44] proposes a decision tree-based model for predicting the final performance of students in online computer programming courses. The model, which utilizes early-stage student activity data, achieved accuracy rates of 76% at one-third of the course and 82–83% at the mid-course stage. The decision tree’s interpretability and compactness allow for easy visualization by educators, enabling early interventions and course design adjustments. Although promising, future research should incorporate more comprehensive datasets, including sequential data on student interactions, to further improve the model’s predictive accuracy.
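A key selling point of the decision-tree model is that its rules can be read directly by educators. The hand-built tree below mirrors the kind of interpretable rules an induced tree might learn from early-course activity; the features and thresholds are hypothetical, not those reported in [44]:

```python
def predict_outcome(activity):
    # A compact, human-readable decision tree over early-course activity.
    # Each branch is a rule an instructor can inspect and act on.
    if activity["assignments_submitted"] < 3:
        return "at_risk"
    if activity["avg_score"] >= 0.6:
        return "pass"
    return "at_risk" if activity["forum_posts"] < 2 else "pass"

students = [
    {"assignments_submitted": 5, "avg_score": 0.8, "forum_posts": 4},
    {"assignments_submitted": 1, "avg_score": 0.9, "forum_posts": 0},
    {"assignments_submitted": 4, "avg_score": 0.5, "forum_posts": 3},
]
labels = [predict_outcome(s) for s in students]
```

Because the whole model is three legible rules, an instructor can see why the second student was flagged (too few submissions) and intervene early, which is the practical advantage the study attributes to compact trees over black-box models.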

4.3.4. Physiological Signals for Performance Prediction

Ref. [6] explores the use of electrodermal activity (EDA) signals for predicting student grades, focusing on both tonic and phasic components of the EDA signal. The study found that a comprehensive analysis of the full EDA signal allowed for highly accurate predictions of student performance, with a mean squared error (MSE) of 0.002 and a root mean square error (RMSE) of 0.045. This approach offers the potential for identifying students affected by stress, enabling early intervention by educators. Despite its promising results, the study’s sample size and demographic limitations suggest a need for broader validation across diverse student populations. Further research could also examine the relationship between exam content and stress-induced physiological responses.

4.3.5. Algorithmic Fairness in Performance Prediction

Addressing algorithmic fairness in machine learning applications, Ref. [24] investigates a student performance prediction (SPP) system applied to a CS1 programming course. The study compares various machine learning models, including decision trees (DTs), Random Forests (RFs), and logistic regression (LR), with a focus on mitigating biases related to age, gender, and ethnicity. By employing dimensionality reduction and transfer learning techniques, the study improves model performance and interpretability, contributing valuable insights into developing equitable predictive systems in higher education.

4.3.6. Predicting Performance in Automated Essay Scoring with IRT Integration

Ref. [32] introduces a novel method integrating Item Response Theory (IRT) with Automated Essay Scoring (AES) systems to address the “nonequivalent groups design” problem in large-scale assessments. This method standardizes ability estimates across groups without the need for overlapping examinees or raters, thus improving the comparability of performance in settings such as university admissions. The integration of IRT with AES also enhances the accuracy of ability measurements, providing a more reliable and scalable solution to the test linking problem while addressing biases introduced by human raters.

4.3.7. Early Prediction in Intelligent Tutoring Systems

In [27], an innovative method for early prediction of student exercise outcomes is presented, utilizing a model called EPATT (Early Prediction using an Affect-aware Transformer and Timing information). The model integrates visual affective analysis and learning log data to predict student success in mathematics exercises within the first few seconds of engagement. By capturing affective signals, like facial expressions and timing data, the model refines its predictive accuracy and enables intelligent tutoring systems to provide real-time interventions. This research highlights the potential of affective computing to enhance student engagement and learning outcomes.
The integration of advanced AI and generative technologies in student performance prediction has enhanced the precision, adaptability, and fairness of educational assessments. From leveraging LLMs for knowledge tracing to using physiological signals for stress analysis, these innovations enable personalized and equitable learning experiences. However, challenges such as data diversity, fairness, and computational scalability remain areas for ongoing exploration. As AI continues to evolve, its potential to transform education lies in its ability to provide adaptive, data-driven solutions tailored to individual student needs.

4.4. Generative AI in Enhancing Programming Education: Tools, Methods, and Insights

As GAI technologies continue to evolve, their applications in higher education, particularly in programming education, have become increasingly prominent. These AI-driven tools enable personalized learning, automate feedback generation, and assist in task solving, thus fostering a more engaging and effective educational environment. In programming education, GAI has shown promise in enhancing key skills, such as debugging, code comprehension, and collaboration, while also offering innovative approaches to assess student performance. Below, we present several case studies that explore how generative AI is being integrated into programming education to support student learning and improve teaching outcomes.
Table 7 summarizes the key findings from studies that explore how generative AI is being applied in programming education. These studies illustrate the various ways in which GAI is enhancing student engagement, improving coding skills, and automating aspects of the learning and assessment process.

4.4.1. Enhancing Debugging Skills Through LLMs

Ref. [7] introduces an innovative educational tool that integrates LLMs into programming education to enhance students’ debugging skills. With the LLM acting as a Teaching Assistant (TA) in debugging tasks, students engage with LLM-generated buggy code, which fosters deliberate practice. This method not only streamlines the generation of training materials but also improves students’ debugging performance, as evidenced by a 12% improvement in pre- to post-test scores. The tool exemplifies how generative AI can support skill development in programming education.

4.4.2. Generative AI for Conceptual Support and Debugging

Ref. [45] investigates the role of generative AI in an introductory programming course (CS1), highlighting its use for debugging assistance and conceptual clarity. The study reveals that students primarily utilize AI tools to resolve coding issues and understand concepts, rather than generating complete solutions. This finding underscores the potential of generative AI in promoting student-centered learning and enhancing engagement by providing on-demand, efficient support in programming tasks.

4.4.3. Assessing Code Explanations with Semantic Similarity

Ref. [8] explores the use of LLMs for evaluating students’ self-explanations of code, focusing on identifying gaps or incomplete reasoning through semantic similarity analysis. The findings demonstrate that semantic similarity approaches outperform zero-shot prompting in assessing code comprehension. This highlights the utility of LLMs in generating targeted feedback, fostering adaptive learning environments in higher education, and enhancing students’ understanding of programming concepts.

4.4.4. Logic Block Analysis for Feedback in Programming Submissions

Ref. [25] presents a method for analyzing student submissions using generative AI techniques, such as abstract syntax trees (ASTs) and decision tree (DT) classifiers. The approach effectively identifies critical logic blocks, predicts pass/fail outcomes, and uncovers common coding patterns and errors. By facilitating personalized feedback, this method showcases the potential of AI in improving teaching and learning outcomes in coding education, especially for large-scale or introductory programming courses.
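The AST side of this pipeline can be sketched directly with Python's standard `ast` module: parse a submission, count its control-flow constructs, and emit a profile that a downstream decision-tree classifier could consume. The specific constructs counted here are our choice for illustration:

```python
import ast

def logic_profile(source):
    # Walk a submission's abstract syntax tree and count control-flow
    # constructs; such profiles are the kind of logic-block features a
    # classifier can use to predict pass/fail or spot common patterns.
    tree = ast.parse(source)
    counts = {"if": 0, "for": 0, "while": 0, "function": 0}
    for node in ast.walk(tree):
        if isinstance(node, ast.If):
            counts["if"] += 1
        elif isinstance(node, ast.For):
            counts["for"] += 1
        elif isinstance(node, ast.While):
            counts["while"] += 1
        elif isinstance(node, ast.FunctionDef):
            counts["function"] += 1
    return counts

submission = """
def classify(n):
    if n % 2 == 0:
        return "even"
    return "odd"

for i in range(3):
    print(classify(i))
"""
profile = logic_profile(submission)
```

Because the profile is computed from structure rather than text, two submissions with different variable names but the same logic produce the same features, which is what makes AST-based analysis robust for large introductory cohorts.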

4.4.5. ChatGPT’s Role in Programming Education

Ref. [39] examines the integration of ChatGPT into a university-level Python programming course, providing insights into its role as a supportive tool. The study highlights the tool’s effectiveness in debugging, code generation, and conceptual explanations, enhancing the learning experience over an eight-week period. Despite its benefits, the research identifies challenges, such as limited exploration of long-term interactions and students’ dependency on AI. The findings contribute to the discourse on generative AI’s integration in programming education, emphasizing the need for further optimization and best practices for its use.

4.4.6. Gamified Programming Exercises with Generative AI

Ref. [46] introduces GAMAI, a tool leveraging OpenAI’s GPT-based models to create gamified programming exercises. By combining storytelling and problem solving, GAMAI enhances student engagement while minimizing teachers’ workload. Although effective, the tool requires refinement to address minor quality issues in generated exercises. This study highlights the potential of gamification powered by generative AI to transform programming education.

4.4.7. Dynamic Feedback in Collaborative Programming

Ref. [9] explores the use of ChatGPT in an online programming exercise bot for a Cloud Computing course. By integrating reflection triggers and contextualized feedback, the tool enhances collaborative learning and supports dynamic interaction among students. The study demonstrates how LLMs can foster interactive and adaptive learning environments, contributing to improved educational outcomes in programming-focused higher education.
The integration of generative AI into programming education has demonstrated the potential to transform traditional teaching and learning practices. From enhancing students’ debugging skills and providing conceptual support to fostering collaboration and improving performance prediction, these AI tools are shaping the future of programming education. Furthermore, the adaptability and scalability of these technologies offer promising solutions for personalized learning, particularly in large-scale and resource-constrained settings. As educators continue to explore and refine AI-driven teaching tools, future research could focus on addressing challenges, such as AI dependency, fairness, and data diversity, ensuring that generative AI continues to provide equitable and effective support to all students.

4.5. Advancements in GAI for Educational Question Generation

GAI technologies are transforming educational content creation, particularly in question generation. By leveraging large and small language models (LLMs and sLMs), AI can automate the generation of questions aligned with various cognitive levels, providing scalable and personalized learning experiences. This section explores recent studies that apply GAI in educational question generation, highlighting their potential to improve content accuracy, engagement, and educational outcomes.
Table 8 summarizes the key findings from studies on generative AI in educational question generation. These studies demonstrate the potential of GAI to automate and enhance the creation of diverse types of questions, including multiple-choice and Bloom’s Taxonomy-based questions, while addressing challenges such as scalability, accuracy, and ethical considerations.

4.5.1. Large vs. Small Language Models for Educational Question Generation

Ref. [47] compares the use of large and small language models (LLMs and sLMs) for educational question generation. While LLMs like GPT offer powerful capabilities, sLMs provide a more efficient, resource-friendly, and privacy-compliant alternative. This comparison offers a balanced approach to adopting AI in education, ensuring scalability while addressing ethical concerns.

4.5.2. Reinforcement Learning for Educational Question Generation

Ref. [38] explores the application of reinforcement learning (RL) in enhancing educational question generation using the FLAN-T5 LLM. The study demonstrates improved performance by addressing biases and inconsistencies, suggesting RL’s potential to refine AI-generated educational content for more accurate and relevant question creation.

4.5.3. Improving Content Generation with Retrieval-Augmented Generation (RAG)

Ref. [30] investigates the integration of retrieval-augmented generation (RAG) with GPT to improve content accuracy in educational contexts. RAG reduces hallucinations and increases accuracy, demonstrating the potential of RAG-enhanced GPT models for personalized learning experiences in MOOCs and other educational platforms.
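The retrieval half of RAG can be sketched with a bag-of-words ranker: score candidate course documents against the query, keep the top k, and prepend them to the prompt so the model answers from retrieved text rather than from memory. A production system would use dense embeddings and an actual GPT call; the documents and scoring below are illustrative:

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity between two sparse bag-of-words vectors.
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=1):
    # Rank documents by similarity to the query and keep the top k;
    # in RAG, these grounding passages are prepended to the LLM prompt,
    # which is what reduces hallucination.
    qv = Counter(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: cosine(qv, Counter(d.lower().split())),
                    reverse=True)
    return scored[:k]

docs = ["Gradient descent updates weights against the loss gradient.",
        "Photosynthesis converts light energy into chemical energy.",
        "Backpropagation computes gradients layer by layer."]
context = retrieve("how does gradient descent update weights", docs, k=1)
prompt = "Answer using only this context:\n" + context[0]
```

The "Answer using only this context" framing is the generation half of the pattern: the model is constrained to the retrieved passage, so its output can be checked against course material.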

4.5.4. Contextualized Multiple-Choice Question Generation for Mathematics

Ref. [34] presents a method for generating contextualized multiple-choice questions (MCQs) in mathematics using AI tools, like LangChain and ChatGPT. The study highlights AI’s ability to generate high-quality questions comparable to human experts, offering a practical solution for reducing educators’ workload in question creation.

4.5.5. Evaluating LLMs for Bloom’s Taxonomy-Based Question Generation

Ref. [48] evaluates LLMs for generating questions based on Bloom’s Taxonomy, assessing their ability to cover a range of cognitive levels. The study shows that LLMs, like GPT-4, excel in generating questions with high pedagogical relevance but face challenges in higher-order cognitive skill assessments, such as analysis and synthesis.

4.5.6. GPT-4 for Bloom’s Taxonomy-Aligned MCQ Generation in Biology

Ref. [49] examines GPT-4’s effectiveness in generating multiple-choice questions (MCQs) in biology based on Bloom’s Taxonomy. The study finds that while GPT-4 produces relevant and usable questions, improvements are needed to ensure the model accurately addresses higher cognitive levels, like “Analyze” and “Evaluate”.
Generative AI offers the potential for automating and improving educational question generation, with applications ranging from personalized content creation to adaptive assessments. While the studies reviewed highlight various techniques, such as reinforcement learning and RAG, challenges remain in ensuring the accuracy, fairness, and scalability of these AI models. Ongoing research is needed to refine these tools, ensuring they meet educational standards and effectively address higher-order cognitive skills. As GAI continues to evolve, its role in education will only grow, shaping more interactive and personalized learning environments.

4.6. GAI for Educators: Enhancing Decision Making and Teaching Practices

GAI is transforming the educational landscape, offering new ways to support decision making, improve teaching practices, and enhance learning experiences. By leveraging the power of AI-driven tools, educators can streamline administrative tasks, personalize learning, and make data-informed decisions. This section explores the various applications of generative AI in education, focusing on how these technologies are empowering educators, enhancing instructional practices, and improving assessment systems.
Table 9 below summarizes the key advancements and findings from studies that demonstrate the integration of generative AI into educational settings to improve decision making, teaching practices, and assessments.

4.6.1. Enhancing Decision Making with AI-Driven Learning Analytics

Ref. [4] introduces VizChat, an open-source prototype chatbot designed to augment Learning Analytics Dashboards (LADs) by providing AI-generated, context-sensitive explanations for data visualizations. Using GPT-4V and retrieval-augmented generation (RAG), this tool enables users, especially those with limited data visualization literacy, to better interpret complex educational data. By shifting from exploratory to explanatory visualizations, VizChat improves decision making, reflective practices, and teaching effectiveness. This integration of generative AI demonstrates the potential for AI to assist educators and stakeholders in making data-driven decisions that are both personalized and actionable.

4.6.2. Improving Teacher Professional Development with AI Frameworks

Ref. [50] presents LLMAgent-CK, a framework that utilizes multi-agent large language models (LLMs) to assess teachers’ mathematical content knowledge (CK) in professional development (PD) systems. This approach allows for the evaluation of free-text responses without the need for extensive labeled datasets, overcoming traditional challenges associated with rule-based and machine learning methods. By offering reasoning for each result, LLMAgent-CK aligns AI-driven assessments with human expertise, advancing automated methods in teacher training. This framework supports the use of generative AI for enhancing professional development and continuous learning in educational contexts.

4.6.3. Automating Educational Test Item Evaluation with AI

Ref. [51] introduces a novel aspect-based semantic textual similarity method for assessing the similarity between educational test items, specifically designed for English as a Foreign Language (EFL) assessments. By leveraging LLMs, like GPT-4 and LLaMA2, this study creates a benchmark dataset to improve the accuracy of exam quality management and facilitate personalized student learning. This application highlights how generative AI can enhance educational assessments, making them more context aware and efficient by automating similarity checks that traditionally required human input.

4.6.4. Personalized Pedagogical Recommendations with AI

Ref. [52] explores the development of a Pedagogical Design Pattern (PDP) Recommender System driven by generative AI. Using LLMs and a Retrieval Augmented Generation (RAG) framework, this system helps educators select evidence-based pedagogical practices from a knowledge base, delivering personalized recommendations tailored to the classroom context. The system’s high accuracy (83%) in suggesting relevant PDPs demonstrates the potential of generative AI to improve teaching practices and assist educators, both novice and experienced, in adopting best practices for classroom engagement.

4.6.5. AI for Quality Control in Multiple-Choice Question Design

Ref. [53] introduces SAQUET, an AI-based tool designed to evaluate the quality of multiple-choice questions (MCQs) in educational assessments. By applying the Item-Writing Flaws (IWF) rubric, the toolkit leverages NLP techniques, including GPT-4 and word embeddings, to automate flaw detection across various academic domains. Achieving over 94% accuracy in identifying flaws, SAQUET addresses the limitations of traditional NLP metrics, providing educators with a reliable tool to ensure high-quality, pedagogically sound assessments.

4.6.6. Enhancing Personalized Learning with Memory Modeling

Ref. [28] examines the use of LSTM Autoencoder Collaborative Filtering (LACF) for memory modeling in online learning systems. The model, designed to handle sparse interaction data, leverages machine learning techniques to predict learners’ memory states and improve adaptive learning systems. This approach is aligned with recent advancements in personalized learning, particularly in higher education, where understanding and modeling learner memory plays a critical role in enhancing engagement and retention.

4.6.7. AI-Enhanced Peer Feedback for Collaborative Learning

Ref. [54] explores the use of a virtual agent within a peer assessment tool to provide tailored feedback based on students’ personality traits, such as Conscientiousness and Emotional Stability. The study shows how generative AI can be used to enhance collaborative learning processes, providing personalized support to students working in teams. This approach highlights the potential of AI to improve teamwork dynamics and ensure that students receive the emotional and cognitive support they need to succeed.

4.6.8. Analyzing Collaborative Learning with AI in Interdisciplinary Contexts

Ref. [55] investigates the use of GPT-4-Turbo to analyze collaborative learning in interdisciplinary contexts. By studying how students integrate physics and computing concepts, the study demonstrates the model’s ability to provide actionable feedback that supports educators in understanding complex student learning processes. This research suggests that generative AI can offer valuable insights into cross-disciplinary learning, making it an important tool for interdisciplinary STEM education.

4.6.9. Conversational AI for Engaging Biology Education

Ref. [56] discusses the development of a Conversational Tutoring System (CTS) leveraging large language models (LLMs) to enhance biology education. The system automates content authoring and orchestrates student–tutor conversations, allowing students to engage interactively with AI-driven agents. This application of generative AI offers a promising approach to improving student engagement and cognitive understanding in educational settings.

4.6.10. Ensuring Academic Integrity with AI-Generated Content Detection

Ref. [57] addresses the challenge of Unauthorized Content Generation (UCG) in higher education, proposing a refined approach to authorship verification (AV) for detecting contract cheating and unacknowledged AI use. By enhancing the Feature Vector Difference (FVD) AV method, the study offers improved transparency and interpretability in detecting AI-generated content, contributing to academic integrity efforts in educational contexts.

4.6.11. Optimizing Question Difficulty Estimation with Knowledge Graphs

Ref. [58] explores the integration of knowledge graphs (KGs) into text-based question difficulty estimation (QDE) models. By leveraging KGs to contextualize question topics, the proposed method improves the accuracy of difficulty predictions, with an 8% reduction in mean absolute error (MAE) compared to text-only models. This study demonstrates how generative AI, combined with knowledge-enhanced NLP techniques, can optimize educational assessments and contribute to more effective personalized learning.
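The contextualization step can be sketched as feature enrichment: any topic recognized in the question text pulls in its neighbors from the knowledge graph before the difficulty model sees the features. The graph, topics, and enrichment rule below are invented for illustration:

```python
# Toy knowledge graph: topic -> related concepts (hypothetical content).
knowledge_graph = {
    "derivative": ["limit", "slope", "calculus"],
    "integral": ["area", "antiderivative", "calculus"],
}

def enrich_question(question_tokens, kg):
    # Add the KG neighbors of any recognized topic, so the difficulty
    # estimator sees contextual concepts rather than only surface text.
    enriched = set(question_tokens)
    for token in question_tokens:
        enriched.update(kg.get(token, []))
    return enriched

features = enrich_question({"find", "the", "derivative"}, knowledge_graph)
```

A text-only QDE model would see three tokens here; the enriched version also knows the question touches limits and slopes, which is the extra signal the paper credits for its reduction in prediction error.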
Generative AI is revolutionizing the way educators make decisions, engage students, and assess learning outcomes. From improving teaching practices with personalized recommendations to enhancing the quality of assessments and detecting academic misconduct, the potential of generative AI to support educators is vast. However, challenges such as bias, data privacy, and scalability must be addressed to ensure that these technologies are implemented effectively across diverse educational contexts. The future of education is undoubtedly shaped by the continued integration of generative AI, which holds the promise of transforming teaching and learning in profound and lasting ways.

4.7. GAI for Learners: Enhancing Personalized Learning and Educational Experiences

GAI technologies have the potential to transform learning environments by providing personalized learning experiences, improving student engagement, and fostering critical thinking. These technologies offer new ways to support learners through tailored content generation, adaptive feedback, and dynamic interaction. The applications of GAI in higher education are vast, ranging from enhancing intelligent tutoring systems to personalizing learning paths. Table 10 below summarizes several studies that highlight how generative AI can be integrated into educational contexts, offering insights into how these tools can benefit learners, improve teaching practices, and enhance educational outcomes.

4.7.1. Improving Learner Engagement with AI-Generated Scaffoldings in ITS

Ref. [59] explores the integration of large language models (LLMs) into intelligent tutoring systems (ITSs), specifically aiming to enhance learner engagement and foster critical thinking. The proposed framework uses LLMs to generate dynamic scaffolding for educational tasks, such as hints and step-by-step guidance. These scaffoldings are designed to promote deeper understanding and independent learning, supporting the “teaching to fish” philosophy. The preliminary results suggest that LLM-generated scaffoldings improve learning outcomes, highlighting the potential of GAI in creating more human-centered, interactive learning environments.

4.7.2. Cultural Intelligence and Personalized Learning with AI Chatbots

In [22], the study investigates the use of LLM-based chatbots in educational settings to enhance personalized learning, lesson planning, and professional development. The study focuses on measuring the cultural intelligence of these chatbots using tools traditionally developed for humans. The findings emphasize the ethical and cultural considerations in applying generative AI in diverse educational contexts while also pointing to the need for future research to explore the adaptability of these chatbots across varying cultural and educational environments to ensure broader applicability and fairness.

4.7.3. Task-Based Learning and Instructional Video Enhancement with Generative AI

Ref. [60] addresses the integration of LLMs and generative AI in task-based learning, particularly for complex, hands-on tasks traditionally taught through demonstrations. The study presents a multimodal learning pipeline that allows learners to switch seamlessly between text instructions and video demonstrations. Using LangChain and GPT-4, the system aligns instructional steps with official procedure documents, creating structured and knowledge-rich learning experiences. While the method shows potential for high-stakes educational contexts, further improvements are needed, especially in object recognition capabilities and expanding its application across other instructional areas.

4.7.4. Expanding Online Learning Content with AI-Generated Educational Videos

Ref. [61] presents EDEN, a generative AI method designed to enhance online learning by expanding academic video databases. By utilizing fine-tuned LLMs and stable diffusion models, EDEN can generate or retrieve academic videos that match the style and format of existing content. The system has demonstrated impressive results in terms of content generation speed and performance. EDEN addresses the challenge of keeping educational resources relevant and up to date, ensuring that students in areas with limited resources have access to high-quality learning materials.

4.7.5. AI-Driven Conversational Systems for Healthcare Education

Ref. [62] explores the use of conversational AI in nursing education, focusing on a chatbot with a visual avatar designed to simulate courtroom testimony trials. This AI-powered tool allows nursing students to practice legal procedures and terminology in an interactive, risk-free environment. By leveraging generative AI technologies, the study shows how these systems can provide immersive, personalized training for students in healthcare education, preparing them for real-world legal and ethical scenarios.

4.7.6. Reinforcement Learning for Personalized Feedback in Math Education

Ref. [14] introduces a framework that uses reinforcement learning and human preference alignment to generate automated feedback for math education. By utilizing LLMs, like GPT-4 and Llama 2, this study demonstrates how GAI can improve student learning outcomes by providing personalized feedback that is both accurate and pedagogically aligned. The findings highlight the effectiveness of this approach in enhancing student engagement and performance, with the potential for broader application in diverse educational contexts.

4.7.7. AI-Powered Conversational Assistants for Online Classrooms

In [63], the study explores Jill Watson, a conversational AI-powered assistant designed to support online classrooms. The system uses ChatGPT to answer student queries related to course logistics and content, offering a modular design that integrates with new APIs and processes extensive documents. The study highlights how generative AI can alleviate instructor workload while providing continuous, interactive support to students. Jill Watson’s ability to enhance student engagement and facilitate effective learning in online settings underscores the growing potential of AI-driven educational tools.

4.7.8. Hybrid AI–Human Collaboration for Educational Data Coding

Ref. [64] investigates the use of LLMs, specifically ChatGPT (GPT-4), to improve the process of coding qualitative data in educational research. The study compares fully manual, fully automated, and hybrid approaches, with hybrid methods—where ChatGPT is used for initial code identification or codebook refinement—producing codebooks with improved reliability and quality. This research demonstrates how hybrid AI–human collaboration can streamline coding processes in large-scale qualitative studies, enhancing efficiency and accuracy in educational data analysis.
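One standard way to quantify the codebook reliability such comparisons rest on is Cohen's kappa, which corrects raw coder agreement for chance. The sketch below is a generic illustration with hypothetical codes assigned by a human coder and an LLM-assisted coder; it is not the evaluation protocol of [64].

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to eight student responses.
human = ["confusion", "insight", "insight", "off-task",
         "confusion", "insight", "off-task", "insight"]
llm   = ["confusion", "insight", "insight", "confusion",
         "confusion", "insight", "off-task", "insight"]
kappa = cohens_kappa(human, llm)  # substantial agreement despite one mismatch
```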

4.7.9. Personalized Vocabulary Learning with AI

Ref. [37] presents a hybrid AI recommendation system designed to improve English vocabulary acquisition. By integrating word vectors, cosine similarity, and Long Short-Term Memory (LSTM) networks, the system provides personalized recommendations based on a learner’s mastery level, forgetting curve, and word examination frequency. This system can be particularly useful in higher education, where language learning apps can use generative AI to offer tailored learning experiences that improve both engagement and efficiency.
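A minimal sketch of the recommendation idea follows, combining cosine similarity over word vectors with an Ebbinghaus-style forgetting curve. The vectors, review intervals, and scoring heuristic are all illustrative assumptions, not the system described in [37].

```python
import math

def cosine(u, v):
    """Cosine similarity between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def retention(days_since_review, strength):
    """Ebbinghaus-style forgetting curve: R = exp(-t / S)."""
    return math.exp(-days_since_review / strength)

def review_priority(word, target_vec, vecs, days, strength):
    """Heuristic: words similar to the learner's current topic AND close to
    being forgotten get the highest priority (illustrative, not the paper's
    exact scoring)."""
    sim = cosine(vecs[word], target_vec)
    return sim * (1 - retention(days[word], strength[word]))

# Hypothetical 3-d word vectors and learner state.
vecs = {"lecture": [0.9, 0.1, 0.0], "campus": [0.8, 0.3, 0.1], "ocean": [0.0, 0.2, 0.9]}
days = {"lecture": 5, "campus": 1, "ocean": 5}
strength = {"lecture": 2.0, "campus": 2.0, "ocean": 2.0}
target = [1.0, 0.2, 0.0]  # learner is studying campus-life vocabulary

ranked = sorted(vecs, key=lambda w: review_priority(w, target, vecs, days, strength),
                reverse=True)
```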

4.7.10. Intelligent Tutoring System for Advanced Mathematics

Ref. [65] discusses Xiaomai, an intelligent tutoring system (ITS) developed to assist Chinese college students in mastering advanced mathematics. The system incorporates features such as automated feedback for free-response questions and metacognitive error reflection. Although the study shows that multiple-choice questions yielded better outcomes, it highlights the potential of integrating generative AI to enhance feedback and adaptive learning capabilities in ITS, particularly in mathematics education.

4.7.11. Generative AI for Scalable Feedback Generation in Higher Education

Ref. [5] examines the use of LLMs to generate feedback in intelligent tutoring systems (ITSs). The study emphasizes the importance of grounding AI-generated feedback in educational theory and empirical research. It calls for more systematic approaches to constructing and evaluating prompts for LLMs to ensure their effectiveness in educational contexts. This work highlights the need for careful integration of generative AI with existing educational frameworks to optimize feedback generation in ITS.

4.7.12. ChatGPT’s Impact on Knowledge Comprehension

Ref. [66] analyzes the impact of generative AI, particularly ChatGPT, on university students’ knowledge comprehension. The study investigates how engagement with ChatGPT mediates the relationship between prior understanding and final academic performance. The findings underscore the importance of guiding students in effective AI usage, particularly strategies for verifying and comparing information, to improve learning outcomes and comprehension.

4.7.13. AI-Driven Personalized Learning Feedback for Large-Scale Courses

Ref. [15] presents the implementation of Prescriptive Learning Analytics (PLA) combined with explainable AI (XAI) to provide personalized feedback in large-scale university courses. The research finds that machine-generated feedback outperforms teacher-written feedback in multiple quality criteria, demonstrating the potential of AI-driven feedback systems to enhance student learning experiences in higher education.
GAI is revolutionizing higher education by enhancing personalized learning, automating feedback processes, and fostering more effective engagement between students and educators. From AI-powered conversational assistants to personalized learning recommendations and automated assessments, these technologies are contributing to more dynamic, adaptive, and accessible learning environments. While significant advancements have been made, further research is necessary to refine these systems, address challenges such as bias and scalability, and ensure equitable outcomes across diverse educational settings. As generative AI continues to evolve, its integration into higher education promises to further transform teaching and learning, providing students with more tailored, efficient, and impactful educational experiences.

4.8. Generative AI for Student Success: Early Detection and Career Readiness Prediction

GAI technologies are increasingly being applied in higher education to support student success and improve learning outcomes. One of the most promising applications of these technologies is in the early identification of at-risk students, enabling timely interventions to prevent dropout and failure. Additionally, generative AI is being used to predict career readiness, providing educators and institutions with insights into students’ preparedness for entering the workforce. Table 11 below summarizes key studies exploring how generative AI is used for both the early detection of at-risk students and career readiness prediction.

4.8.1. Multitask Models for Identifying At-Risk Students

Ref. [35] explores the use of advanced multitask AI models, including deep multitask learning, Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), to identify and classify at-risk students in the Moroccan education system. The integration of Explainable Artificial Intelligence (XAI) techniques enhances the models’ transparency by identifying the factors contributing to predictions, thus enabling timely interventions. This approach illustrates how generative AI can improve the identification of students at risk of failure, success, or dropout, offering valuable support for decision making in educational settings.

4.8.2. LLMs for Predicting Student Dropout

In [29], large language models (LLMs) are used to predict student dropout, an issue of critical concern in higher education. By combining Retrieval Augmented Generation (RAG) with Few-Shot Learning (FSL), the study enhances the predictive capabilities of LLMs, addressing challenges such as data imbalance and missing values. The authors demonstrate that LLMs, with their adaptability and contextual understanding, outperform traditional machine learning models and provide not only improved predictions but also textual analysis of student data. This research suggests that LLMs can serve as effective tools for early detection and intervention for at-risk students, ultimately contributing to better student retention rates.
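The retrieval-plus-few-shot pattern can be sketched as follows: the labeled student records most similar to the query are retrieved and prepended as in-context examples. The similarity measure (word-overlap Jaccard standing in for a real embedding search), the records, and the prompt format are all hypothetical stand-ins, not the pipeline of [29].

```python
def jaccard(a, b):
    """Toy word-overlap similarity; stands in for an embedding-based retriever."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def build_prompt(student_record, labeled_examples, k=2):
    """Retrieval-augmented few-shot prompt: the k most similar labeled
    records are included as in-context examples before the query."""
    retrieved = sorted(labeled_examples,
                       key=lambda ex: jaccard(ex["record"], student_record),
                       reverse=True)[:k]
    shots = "\n".join(f"Record: {ex['record']}\nDropout: {ex['label']}"
                      for ex in retrieved)
    return f"{shots}\nRecord: {student_record}\nDropout:"

# Hypothetical labeled records; real systems would use full student profiles.
examples = [
    {"record": "low attendance failed two modules", "label": "yes"},
    {"record": "high attendance passed all modules", "label": "no"},
    {"record": "average attendance failed one module", "label": "yes"},
]
prompt = build_prompt("low attendance failed one module", examples)
```

The assembled prompt would then be sent to the LLM, whose completion after the final "Dropout:" serves as the prediction.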

4.8.3. Learning Analytics for Student Dropout Prediction

Ref. [3] introduces LANSE, a learning analytics tool designed to predict student dropout and failure risks through machine learning algorithms. With its cloud-based architecture, LANSE collects, processes, and visualizes student data to provide weekly predictions of students at risk. This tool has proven effective in distance learning environments, helping educators optimize workload management and track student behavior. While LANSE shows promise, challenges remain, such as ensuring privacy compliance and managing real-time data processing. The study underscores the importance of continuous research to refine AI-driven solutions for higher education.

4.8.4. ACO-LSTM Model for Dropout Prediction

Ref. [36] takes a unique approach by combining Ant Colony Optimization (ACO) with Long Short-Term Memory (LSTM) networks to predict student retention in higher education. The ACO-LSTM method optimizes the learning rate for the LSTM model, improving dropout prediction accuracy. This work demonstrates how heuristic optimization techniques can enhance predictive models, contributing to better decision making regarding student retention and success. The study provides a valuable perspective on using generative AI to optimize dropout prediction in educational contexts.
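A toy version of the ACO idea is sketched below, searching a discrete set of learning-rate candidates: ants sample rates in proportion to pheromone, and pheromone is reinforced on low-loss choices. The loss function is a simple convex stand-in for "validation loss of an LSTM trained at this rate", and all parameters are illustrative assumptions rather than the configuration of [36].

```python
import random

def aco_learning_rate(candidates, loss_fn, ants=10, iterations=20, rho=0.3, seed=0):
    """Ant colony optimization over discrete learning-rate candidates.
    rho is the pheromone evaporation rate."""
    rng = random.Random(seed)
    pheromone = {lr: 1.0 for lr in candidates}
    best_lr, best_loss = None, float("inf")
    for _ in range(iterations):
        for _ in range(ants):
            total = sum(pheromone.values())
            # Sample a rate with probability proportional to its pheromone.
            lr = rng.choices(candidates,
                             weights=[pheromone[c] / total for c in candidates])[0]
            loss = loss_fn(lr)
            if loss < best_loss:
                best_lr, best_loss = lr, loss
            pheromone[lr] += 1.0 / (1.0 + loss)   # reinforce good choices
        for c in candidates:
            pheromone[c] *= (1.0 - rho)           # evaporation
    return best_lr

# Convex stand-in loss with its minimum at 0.01.
toy_loss = lambda lr: (lr - 0.01) ** 2
best = aco_learning_rate([0.001, 0.005, 0.01, 0.05, 0.1], toy_loss)
```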

4.8.5. Career Readiness Prediction Using LLMs

Ref. [67] investigates the potential of large language models (LLMs) in predicting students’ career readiness by analyzing their narrative responses. The study uses LLMs to classify students’ career-related identity statuses according to Marcia’s identity framework, which categorizes identity formation into four stages: Diffusion, Foreclosure, Moratorium, and Achievement. The study highlights the feasibility of using LLMs for predicting career preparedness, emphasizing how AI can serve as a resource-efficient tool for assessing career readiness in both secondary and post-secondary educational settings.
Generative Artificial Intelligence (GAI) is making significant strides in higher education by enhancing predictive capabilities and providing personalized solutions to support student success and career readiness. From identifying at-risk students to predicting career preparedness, GAI technologies, such as large language models (LLMs), machine learning (ML), Natural Language Processing (NLP), and reinforcement learning (RL), offer promising solutions for early intervention and tailored educational support. These tools enable institutions to make data-driven decisions that improve both academic outcomes and workforce readiness.
Figure 3 highlights the percentage contributions of LLMs, ML, NLP, and RL to different educational application areas, such as student engagement, dropout prevention, automated grading, and learning personalization. These visualizations emphasize the distinct roles each technology plays in addressing key educational challenges, particularly in fostering engagement, automating assessments, and enhancing predictive analytics.
Moreover, Figure 4 illustrates the interconnections between core GAI technologies and their respective applications in higher education. For instance, NLP tools facilitate automated grading and sentiment analysis, while reinforcement learning supports dynamic learning paths and adaptive feedback systems. Together, these technologies enable institutions to create holistic and adaptive learning environments tailored to individual student needs.
Despite the promising results demonstrated in the case studies, challenges such as data availability, model generalizability, and privacy concerns persist. These issues highlight the importance of continued research and innovation to address such limitations. As GAI technologies continue to evolve, their potential to revolutionize higher education by supporting students’ academic journeys and preparing them for successful careers becomes increasingly apparent. By effectively leveraging these tools, educators and institutions can build more inclusive, data-driven, and equitable educational ecosystems.

5. Challenges and Limitations

In the case study in Section 4.1, while GAI tools, like PBChat, demonstrate potential in diagnosing student behaviors, the over-reliance on language-based AI raises concerns about accessibility for students with disabilities. Additionally, the study on multimodal learning analytics highlights challenges in integrating diverse data sources, suggesting that scalability and data alignment remain critical hurdles. These findings align with concerns raised in the broader literature about the limitations of current AI models in handling heterogeneous datasets [10].
In this section, we explore some of the key challenges and limitations highlighted across the case studies, particularly focusing on the technical and ethical issues surrounding the use of Generative Artificial Intelligence in higher education. Despite the promising applications of GAI, several obstacles must be overcome to ensure its effective and equitable integration into educational systems. The following discussion covers critical areas such as scalability, data quality, model accuracy, user trust, and ethical concerns.

5.1. Scalability and Consistency

GAI applications in automated assessment systems, such as the case in [40], have demonstrated the potential for enhancing educational assessment. However, scalability remains a significant challenge, particularly when comparing GAI models to human raters and specialized assessment methods, like Multi-Trait and Multi-Actor models. While GAI systems can handle large volumes of data, their ability to maintain consistency and reliability across different contexts and educational settings is still a work in progress. Future work should focus on refining generative models to provide multi-dimensional feedback that is more reliable and scalable across diverse educational contexts.
The study in [26], which combines regression-based score prediction with classification-based confidence estimation, offers promising results but highlights that scalability is constrained by the current limitations in model performance and dataset size. As the educational landscape becomes more complex, scaling these models to handle diverse student populations and assessment types will be crucial.
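The two-model idea can be sketched as a triage step: a regression model proposes a score, a classifier attaches a confidence, and low-confidence cases are routed to a human grader rather than auto-reported. The threshold and records below are hypothetical, not taken from [26].

```python
def triage(predictions, confidence_threshold=0.7):
    """Split items into auto-reported scores and cases needing human review,
    based on the confidence the classification model attaches to each
    regression-predicted score (a sketch of the two-model idea)."""
    auto, needs_review = [], []
    for item in predictions:
        if item["confidence"] >= confidence_threshold:
            auto.append(item)
        else:
            needs_review.append(item)
    return auto, needs_review

# Hypothetical (score, confidence) outputs of the two models.
batch = [
    {"essay": "e1", "score": 78.5, "confidence": 0.92},
    {"essay": "e2", "score": 64.0, "confidence": 0.55},
    {"essay": "e3", "score": 88.0, "confidence": 0.81},
]
auto, review = triage(batch)
```

Raising the threshold trades throughput for reliability, which is exactly the scalability tension the study points to.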

5.2. Data Imbalances and Quality Issues

Many studies, such as [10], have noted the challenges posed by imbalanced datasets, particularly when certain student groups (e.g., those from non-majority linguistic backgrounds) are underrepresented. This lack of diversity can undermine the robustness and fairness of predictive models. Ensuring that datasets are comprehensive and inclusive is essential for creating reliable and equitable AI systems in education.
Additionally, Ref. [23] demonstrates the utility of knowledge distillation in optimizing AI models, but it also points out the limited scope of the datasets used, which were primarily focused on science and mathematics. To improve generalizability, future research should explore the performance of these models across broader educational datasets and contexts, ensuring that they can handle a variety of learning environments and subjects.

5.3. Model Accuracy and Transparency

Model accuracy is a critical concern in the deployment of AI systems in educational contexts, as seen in the case study in [12], which discusses the AWE system’s performance in providing individualized feedback. Although the system can personalize feedback, it has been found to sometimes generate false positives, leading to a breakdown in user trust. This highlights a fundamental challenge in ensuring that GAI systems are not only accurate but also transparent and explainable.
Ref. [8] also raises concerns about the evolving nature of large language models, which may cause inconsistencies and inaccuracies in their assessments over time. Given that educational assessments are high stakes for students, there is a pressing need for models that are both reliable and interpretable. Future research should aim to improve the accuracy and consistency of GAI-based assessments, as well as enhance transparency and user trust by incorporating explainability features.

5.4. Ethical Concerns

As [62] highlights, the ethical challenges of using GAI in education, particularly around issues of data privacy, consent, and AI model biases, are profound. Many AI systems require access to sensitive student data, raising concerns about how these data are collected, used, and stored. Moreover, the possibility of algorithmic biases affecting student outcomes—such as biased career readiness predictions or unfair grading practices—raises questions about fairness and equity.
In the case of [59], the potential for GAI models to “hallucinate” or generate irrelevant content is a known issue, particularly with LLMs. This can lead to misleading information being presented to students, further exacerbating concerns about the reliability of AI-generated feedback. Ethical considerations must be at the forefront of future research, especially regarding how to mitigate these risks and ensure that GAI technologies are used responsibly in educational settings.

5.5. User Trust and Engagement

While GAI systems such as the one in [16] show improvements over traditional methods, a significant gap remains in assessing the impact of these technologies on user experience. The study in [47] notes that while small language models (sLMs) show promise for educational question generation (EdQG), they do not fully match the output quality of LLMs, which can impact student engagement and learning outcomes.
Building user trust is particularly challenging when students encounter AI feedback that may be inconsistent or erroneous, as discussed in [12]. Addressing user concerns by improving model accuracy, offering clear explanations of AI decisions, and incorporating human oversight are essential strategies for increasing the effectiveness of GAI in education. Moreover, increasing user engagement with these systems requires addressing the psychological factors that influence how students perceive and interact with AI-based tools.

5.6. Practical Implementation and Real-World Applicability

The transition from theoretical models to real-world applications poses significant challenges. Ref. [37] notes that while LSTM-based predictive models can outperform baseline algorithms, their performance is still constrained by the dataset size, which affects the accuracy of predictions.
Expanding these models to handle more complex scenarios and evaluating them across diverse educational environments are necessary steps to improve their real-world applicability. Moreover, Ref. [44] suggests that future research should focus on refining the application of ensemble methods and advanced machine learning techniques to enhance prediction accuracy and robustness across varying educational contexts.

5.7. Long-Term Impact and Sustainability

Many of the studies examined, such as [30], emphasize the importance of exploring the long-term impacts of GAI models, particularly in terms of student engagement, retention, and motivation. While these models show short-term improvements, it remains unclear how they will perform over time, especially in large-scale, diverse educational settings. Further research is needed to assess the sustainability and scalability of these technologies, particularly in terms of long-term educational outcomes.
In addition, the integration of GAI in educational settings must be approached with caution to avoid over-reliance on AI tools. As noted in [45], there is a risk that students may become overly dependent on AI for problem solving, which could stifle the development of critical thinking skills. Balancing the use of AI tools with the promotion of independent learning and decision making is crucial to ensure that students benefit from GAI without diminishing their cognitive abilities.

5.8. Failures in GAI Applications

While the case studies highlight numerous successful applications of GAI, failures also deserve attention. Figure 5 presents the proportion of successful outcomes and failures observed in the analyzed GAI applications. While the majority of case studies report promising results, failure modes involving bias, limited scalability, and poor data quality remain significant concerns that require further exploration. For instance, LLM-based chatbots can sometimes produce irrelevant or biased responses, undermining their effectiveness in high-stakes educational contexts. Similarly, predictive ML models may fail to generalize across diverse student populations due to biased training datasets. These examples underscore the need for further research to address these limitations and ensure the responsible and equitable implementation of GAI technologies in higher education.

6. Future Directions in Generative AI in Education

The integration of GAI into higher education has shown great promise, yet many areas remain ripe for exploration. In this section, we discuss the potential future directions in the application of GAI in educational contexts, as well as key challenges that need to be addressed to fully realize its potential. Drawing on the case studies and challenges identified in earlier sections, we outline critical areas that require further investigation, technological refinement, and ethical consideration.

6.1. Enhancing Personalization and Adaptivity

Future research should focus on refining the adaptability of generative models to cater to the diverse learning needs of students. While personalized learning has been a significant theme in AI applications, much work remains in improving AI systems’ ability to generate truly individualized content, feedback, and learning pathways. Current models often struggle with adjusting to unique learning styles, cognitive abilities, and emotional states, which could limit their effectiveness in diverse educational environments.
Further advancements in multimodal learning systems that integrate data from various sources (e.g., learning behavior, sensor data, academic performance, emotional cues) could lead to more effective personalized experiences. Additionally, developing models that can adjust dynamically to student progress, content complexity, and pedagogical approaches will be critical for scaling GAI in educational institutions.

6.2. Ethical and Bias Considerations in Generative AI

As discussed in the case studies, issues related to bias in AI models remain a significant concern. While AI systems have the potential to democratize education, they also risk reinforcing or amplifying existing biases present in data. For instance, biased training datasets may lead to the underrepresentation of certain demographic groups, negatively impacting the performance of GAI in applications, such as career readiness prediction or dropout risk assessment.
Future work must address these biases through improved data diversity and fairness frameworks. This includes integrating fairness checks and developing tools to mitigate bias in real time during AI model operation. Moreover, there is a need for deeper investigation into the ethical implications of using AI in sensitive educational areas, such as academic integrity, student autonomy, and the potential misuse of AI-generated content.

6.3. Improving System Usability and Trustworthiness

A recurring challenge identified across case studies is the issue of user trust, particularly in relation to the accuracy of AI-generated feedback. Students and educators may be hesitant to fully engage with AI systems if they perceive the outputs as unreliable or prone to errors. Addressing this concern will require significant improvements in system transparency, accuracy, and the explainability of AI decision-making processes.
Future research should explore methods for enhancing the interpretability of AI outputs, providing more context for students and educators to understand how decisions are made. Additionally, there is a need to develop systems that can effectively communicate uncertainty or error in AI-generated assessments, ensuring that users have the necessary information to critically engage with the results and make informed decisions.

6.4. Scalable and Robust AI Models for Diverse Educational Contexts

Scalability is another critical challenge identified in the case studies. Many current AI models are limited by their reliance on specific datasets, narrow scopes of application, or complex infrastructure requirements. To make GAI viable across a wide range of educational settings—from small classrooms to large, diverse online courses—future work should focus on developing more scalable AI models that can operate across different educational contexts, disciplines, and geographical locations.
This includes designing lightweight models that can operate efficiently on a broad range of devices, such as smartphones and low-cost computers, making GAI tools more accessible to underserved educational communities. Additionally, research should explore the use of federated learning and other decentralized AI methods to support scalability while addressing concerns about data privacy and security.
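At its core, the federated approach mentioned above replaces data sharing with model-update sharing. The sketch below shows federated averaging (FedAvg) in its simplest form; the per-institution weight vectors and sample counts are hypothetical.

```python
def federated_average(client_updates):
    """FedAvg: combine per-client model weights into a global model,
    weighted by each client's number of local training samples, so raw
    student data never leaves the institution."""
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [sum(w[i] * n for w, n in client_updates) / total
            for i in range(dim)]

# Hypothetical weight vectors from three institutions (clients),
# holding 100, 50, and 50 local samples respectively.
updates = [([0.2, 0.4], 100), ([0.4, 0.0], 50), ([0.0, 0.8], 50)]
global_weights = federated_average(updates)
```

In a full system, each institution would train locally for a few epochs, send only its updated weights, and receive the averaged global model back each round.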

6.5. Long-Term Impact and Integration into Curriculum Design

While current studies have shown the potential of GAI to improve educational outcomes in the short term, more research is needed to understand the long-term impacts of GAI on student learning, behavior, and career success. Understanding the lasting effects of AI-driven educational tools will be crucial for determining the sustainability and effectiveness of these technologies in the long run.
Furthermore, integrating GAI into existing curriculum frameworks remains a complex challenge. Future research should explore how AI can be used to complement traditional teaching methods and align with established pedagogical principles. Developing best practices for the integration of GAI into curricula will ensure that these technologies are used in ways that enhance educational goals rather than disrupt them.

6.6. Interdisciplinary Collaboration for Holistic AI Development

To address these challenges and advance GAI technologies in higher education, interdisciplinary collaboration is essential. AI researchers, educators, ethicists, and policymakers must work together to build more comprehensive solutions that consider not only technical performance but also educational theory, ethical standards, and societal impact.
This collaborative approach will enable the development of generative AI systems that are not only technically advanced but also pedagogically sound and ethically responsible. Future research must prioritize the development of hybrid AI–human systems, as demonstrated in the collaborative learning environments in the case study in Section 4.6. These systems effectively leverage the computational strengths of AI while retaining critical human oversight, addressing concerns about ethical risks and pedagogical implications [52,62]. For instance, hybrid systems in the collaborative learning platforms in that case study mitigate ethical challenges associated with fully automated models and enhance transparency in decision-making processes.
Furthermore, efforts to improve explainability, as exemplified by NAM-based grading systems in the case study in Section 4.2, must be extended to other GAI applications to ensure transparency and build user trust. Studies employing explainable AI in the case study in Section 4.7 emphasize the importance of increasing interpretability to foster trust among educators and students alike. These findings highlight the necessity of cross-disciplinary partnerships and the involvement of diverse stakeholders in the design, deployment, and evaluation of generative AI tools in education. By aligning technical advancements with ethical and pedagogical frameworks, GAI technologies can evolve into systems that are both effective and equitable.

7. Conclusions

This paper has provided a comprehensive analysis of the role of Generative Artificial Intelligence in transforming higher education, focusing on a selection of case studies presented at the AIED 2024 conference. As GAI technologies, such as large language models and predictive analytics, continue to evolve, their applications in education are demonstrating considerable potential in improving student engagement, personalized learning, academic performance prediction, and retention. These advancements are not only reshaping traditional teaching and learning models but are also creating innovative ways to support diverse student needs.
However, the integration of GAI into educational settings is not without challenges. Issues such as data imbalances, model accuracy, scalability, and the ethical implications of AI use must be addressed to ensure these technologies benefit all students. While the case studies showcase promising results, the limitations identified, such as inaccuracies in feedback, biased predictions, and the need for more diverse datasets, suggest that further research is necessary to refine GAI systems for broader and more reliable applications.
Future research should focus on enhancing the fairness, transparency, and generalizability of GAI models, especially in contexts involving diverse student populations. This includes improving the scalability of models for larger datasets, enhancing AI’s ability to handle more complex student behaviors, and ensuring that ethical concerns, such as privacy and bias, are adequately addressed. Additionally, the role of GAI in supporting teachers and administrators, through tools for performance prediction and career readiness, should continue to be explored in greater depth.
This study acknowledges that the reliance on AIED 2024 conference papers, while valuable for capturing cutting-edge advancements, may introduce thematic bias due to the limited scope of this single source. Expanding the dataset to include peer-reviewed journals, multi-disciplinary conferences, and diverse geographical contexts will help mitigate this limitation and provide a more holistic understanding of GAI applications and challenges.
The selected case studies utilized diverse methodologies, encompassing both qualitative and quantitative approaches, which introduced complexity in drawing direct comparisons. While thematic categorization provided insights into GAI applications, the absence of a standardized framework for evaluating case study quality remains a limitation. Standardized metrics such as reproducibility, scalability, and evidence-based impact should be adopted in future studies to improve the comparability and reliability of the findings.
In addition, to facilitate the immediate adoption of GAI in higher education, institutions can begin with low-cost, high-impact solutions, such as educational chatbots and automated grading systems. These tools require minimal infrastructure and offer measurable benefits, including real-time feedback, reduced administrative workload, and enhanced student engagement. By starting with these scalable applications, educators and administrators can gradually expand their use of GAI technologies, integrating more advanced systems as familiarity and institutional capacity grow.
To sum up, while GAI technologies are poised to revolutionize higher education by offering more personalized, efficient, and data-driven learning experiences, their successful implementation will depend on addressing the challenges and limitations discussed in this paper. By continuing to refine these technologies, we can unlock their full potential, shaping the future of higher education in ways that benefit both students and educators.

Author Contributions

Conceptualization, W.P. and Z.W.; methodology, W.P. and Z.W.; investigation, W.P. and Z.W.; resources, W.P. and Z.W.; writing—original draft preparation, W.P. and Z.W.; writing—review and editing, W.P. and Z.W.; funding acquisition, Z.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work is partially supported by the Fundamental Research Funds for the Central Universities under grant Nos. JG2023-53, J2022-042, and PHD2023-027, the Civil Aviation Professional Project under grant No. MHJY2023027, the Sichuan Science and Technology Program under grant No. 2023YFG0308, the CAAC Academy of Flight Technology and Safety Project under grant No. FZ2022ZZ01, and the 2024 Statistical Education Reform Project under grant No. 2024JG0226.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable suggestions and constructive feedback, which have significantly improved the quality of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Moore, R.; Caines, A.; Buttery, P. Recurrent Neural Collaborative Filtering for Knowledge Tracing. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
2. Messer, M.; Shi, M.; Brown, N.C.C.; Kölling, M. Grading Documentation with Machine Learning. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
3. Cechinel, C.; Queiroga, E.M.; Primo, T.T.; dos Santos, H.L.; Ramos, V.F.C.; Munoz, R.; Mello, R.F.; Machado, M.F.B. LANSE: A Cloud-Powered Learning Analytics Platform for the Automated Identification of Students at Risk in Learning Management Systems. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
4. Yan, L.; Zhao, L.; Echeverria, V.; Jin, Y.; Alfredo, R.; Li, X.; Gašević, D.; Martinez-Maldonado, R. VizChat: Enhancing Learning Analytics Dashboards with Contextualised Explanations Using Multimodal Generative AI Chatbots. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
5. Stamper, J.; Xiao, R.; Hou, X. Enhancing LLM-Based Feedback: Insights from Intelligent Tutoring Systems and the Learning Sciences. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
6. Machado, G.M.; Soni, A. Predicting Academic Performance: A Comprehensive Electrodermal Activity Study. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
7. Ma, Q.; Shen, H.; Koedinger, K.; Wu, S.T. How to Teach Programming in the AI Era? Using LLMs as a Teachable Agent for Debugging. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
8. Banjade, R.; Oli, P.; Sajib, M.I.; Rus, V. Identifying Gaps in Students’ Explanations of Code Using LLMs. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
9. Naik, A.; Yin, J.R.; Kamath, A.; Ma, Q.; Wu, S.T.; Murray, C.; Bogart, C.; Sakr, M.; Rose, C.P. Generating Situated Reflection Triggers About Alternative Solution Paths: A Case Study of Generative AI for Computer-Supported Collaborative Learning. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
10. Hwang, H. The Role of First Language in Automated Essay Grading for Second Language Writing. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
11. Sonkar, S.; Ni, K.; Tran Lu, L.; Kincaid, K.; Hutchinson, J.S.; Baraniuk, R.G. Automated Long Answer Grading with RiceChem Dataset. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
12. Dey, I.; Gnesdilow, D.; Passonneau, R.; Puntambekar, S. Potential Pitfalls of False Positives. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
13. Sil, P.; Chaudhuri, P.; Raman, B. Can AI Assistance Aid in the Grading of Handwritten Answer Sheets? In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
14. Scarlatos, A.; Smith, D.; Woodhead, S.; Lan, A. Improving the Validity of Automatically Generated Feedback via Reinforcement Learning. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
15. Liang, Z.; Sha, L.; Tsai, Y.S.; Gašević, D.; Chen, G. Towards the Automated Generation of Readily Applicable Personalised Feedback in Education. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
16. Rosa, B.A.B.; Oliveira, H.; Mello, R.F. Prediction of Essay Cohesion in Portuguese Based on Item Response Theory in Machine Learning. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
17. Fonteles, J.; Davalos, E.; Ashwin, T.S.; Zhang, Y.; Zhou, M.; Ayalon, E.; Lane, A.; Steinberg, S.; Anton, G.; Danish, J.; et al. A First Step in Using Machine Learning Methods to Enhance Interaction Analysis for Embodied Learning Environments. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
18. Villalobos, E.; Pérez-Sanagustín, M.; Broisin, J. From Learning Actions to Dynamics: Characterizing Students’ Individual Temporal Behavior with Sequence Analysis. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
19. Condor, A.; Pardos, Z. Explainable Automatic Grading with Neural Additive Models. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
20. Chen, P.; Fan, Z.; Lu, Y.; Xu, Q. PBChat: Enhance Student’s Problem Behavior Diagnosis with Large Language Model. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
21. Zhan, B.; Guo, T.; Li, X.; Hou, M.; Liang, Q.; Gao, B.; Luo, W.; Liu, Z. Knowledge Tracing as Language Processing: A Large-Scale Autoregressive Paradigm. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
22. Blanchard, E.G.; Mohammed, P. On Cultural Intelligence in LLM-Based Chatbots: Implications for Artificial Intelligence in Education. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
23. Latif, E.; Fang, L.; Ma, P.; Zhai, X. Knowledge Distillation of LLMs for Automatic Scoring of Science Assessments. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
24. de Souza Cabral, L.; Dwan Pereira, F.; Ferreira Mello, R. Enhancing Algorithmic Fairness in Student Performance Prediction Through Unbiased and Equitable Machine Learning Models. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
25. Pan, K.W.; Jeffries, B.; Koprinska, I. Predicting Successful Programming Submissions Based on Critical Logic Blocks. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
26. Uto, M.; Takahashi, Y. Neural Automated Essay Scoring for Improved Confidence Estimation and Score Prediction Through Integrated Classification and Regression. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
27. Yu, H.; Allessio, D.A.; Rebelsky, W.; Murray, T.; Magee, J.J.; Arroyo, I.; Woolf, B.P.; Bargal, S.A.; Betke, M. Affect Behavior Prediction: Using Transformers and Timing Information to Make Early Predictions of Student Exercise Outcome. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
28. Li, T.; Bian, C.; Wang, N.; Xie, Y.; Chen, K.; Lu, W. Modeling Learner Memory Based on LSTM Autoencoder and Collaborative Filtering. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
29. Aboukacem, A.; Berrada, I.; Bergou, E.H.; Iraqi, Y.; Mekouar, L. Investigating the Predictive Potential of Large Language Models in Student Dropout Prediction. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
30. Miladi, F.; Psyché, V.; Lemire, D. Leveraging GPT-4 for Accuracy in Education: A Comparative Study on Retrieval-Augmented Generation in MOOCs. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
31. Galhardi, L.; Herculano, M.F.; Rodrigues, L.; Miranda, P.; Oliveira, H.; Cordeiro, T.; Bittencourt, I.I.; Isotani, S.; Mello, R.F. Contextual Features for Automatic Essay Scoring in Portuguese. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
32. Aramaki, K.; Uto, M. Collaborative Essay Evaluation with Human and Neural Graders Using Item Response Theory Under a Nonequivalent Groups Design. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
33. Tsutsumi, E.; Nishio, T.; Ueno, M. Deep-IRT with a Temporal Convolutional Network for Reflecting Students’ Long-Term History of Ability Data. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
34. Li, R.; Wang, Y.; Zheng, C.; Jiang, Y.H.; Jiang, B. Generating Contextualized Mathematics Multiple-Choice Questions Utilizing Large Language Models. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
35. Elbouknify, I.; Berrada, I.; Mekouar, L.; Iraqi, Y.; Bergou, E.H.; Belhabib, H.; Nail, Y.; Wardi, S. Student At-Risk Identification and Classification Through Multitask Learning: A Case Study on the Moroccan Education System. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
36. Singh, A.K.; Karthikeyan, S. Heuristic Technique to Find Optimal Learning Rate of LSTM for Predicting Student Dropout Rate. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
37. Li, J.; Xu, M.; Zhou, Y.; Zhang, R. Research on Personalized Hybrid Recommendation System for English Word Learning. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
38. Lamsiyah, S.; El Mahdaouy, A.; Nourbakhsh, A.; Schommer, C. Fine-Tuning a Large Language Model with Reinforcement Learning for Educational Question Generation. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
39. Ma, B.; Chen, L.; Konomi, S. Enhancing Programming Education with ChatGPT: A Case Study on Student Perceptions and Interactions in a Python Course. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
40. Chen, S.; Lan, Y.; Yuan, Z. A Multi-task Automated Assessment System for Essay Scoring. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
41. Sonkar, S.; Liu, N.; Mallick, D.B.; Baraniuk, R.G. Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
42. Feng, H.; Du, S.; Zhu, G.; Zou, Y.; Phua, P.B.; Feng, Y.; Zhong, H.; Shen, Z.; Liu, S. Leveraging Large Language Models for Automated Chinese Essay Scoring. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
43. Yoshida, L. The Impact of Example Selection in Few-Shot Prompting on Automated Essay Scoring Using GPT Models. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
44. Wang, Z.; Koprinska, I.; Jeffries, B. Interpretable Methods for Early Prediction of Student Performance in Programming Courses. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
45. Ghimire, A.; Edwards, J. Coding with AI: How Are Tools Like ChatGPT Being Used by Students in Foundational Programming Courses. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
46. Montella, R.; Giuseppe De Vita, C.; Mellone, G.; Ciricillo, T.; Caramiello, D.; Di Luccio, D.; Kosta, S.; Damasevicius, R.; Maskeliunas, R.; Queiros, R.; et al. GAMAI, an AI-Powered Programming Exercise Gamifier Tool. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
47. Fawzi, F.; Balan, S.; Cukurova, M.; Yilmaz, E.; Bulathwela, S. Towards Human-Like Educational Question Generation with Small Language Models. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
48. Scaria, N.; Dharani Chenna, S.; Subramani, D. Automated Educational Question Generation at Different Bloom’s Skill Levels Using Large Language Models: Strategies and Evaluation. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
49. Hwang, K.; Wang, K.; Alomair, M.; Choa, F.S.; Chen, L.K. Towards Automated Multiple Choice Question Generation and Evaluation: Aligning with Bloom’s Taxonomy. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
50. Yang, K.; Chu, Y.; Darwin, T.; Han, A.; Li, H.; Wen, H.; Copur-Gencturk, Y.; Tang, J.; Liu, H. Content Knowledge Identification with Multi-agent Large Language Models (LLMs). In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
51. Do, H.; Lee, G.G. Aspect-Based Semantic Textual Similarity for Educational Test Items. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
52. Dehbozorgi, N.; Kunuku, M.T.; Pouriyeh, S. Personalized Pedagogy Through a LLM-Based Recommender System. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
53. Moore, S.; Costello, E.; Nguyen, H.A.; Stamper, J. An Automatic Question Usability Evaluation Toolkit. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
54. Saccardi, I.; Masthoff, J. Adapting Emotional Support in Teams: Quality of Contribution, Emotional Stability and Conscientiousness. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
55. Cohn, C.; Snyder, C.; Montenegro, J.; Biswas, G. Towards a Human-in-the-Loop LLM Approach to Collaborative Discourse Analysis. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
56. Schmucker, R.; Xia, M.; Azaria, A.; Mitchell, T. Ruffle & Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
57. Oliveira, E.A.; Mohoni, M.; Rios, S. Towards Explainable Authorship Verification: An Approach to Minimise Academic Misconduct in Higher Education. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
58. Gherardi, E.; Benedetto, L.; Matera, M.; Buttery, P. Using Knowledge Graphs to Improve Question Difficulty Estimation from Text. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
59. Pian, Y.; Li, M.; Lu, Y.; Chen, P. From “Giving a Fish” to “Teaching to Fish”: Enhancing ITS Inner Loops with Large Language Models. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
60. Kwon, C.; King, J.; Carney, J.; Stamper, J. A Schema-Based Approach to the Linkage of Multimodal Learning Sources with Generative AI. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
61. Thareja, R.; Dwivedi, D.; Garg, R.; Baghel, S.; Mohania, M.; Shukla, J. EDEN: Enhanced Database Expansion in eLearning: A Method for Automated Generation of Academic Videos. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
62. Garg, A.; Chaudhury, R.; Godbole, M.; Seo, J.H. Leveraging Language Models and Audio-Driven Dynamic Facial Motion Synthesis: A New Paradigm in AI-Driven Interview Training. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
63. Taneja, K.; Maiti, P.; Kakar, S.; Guruprasad, P.; Rao, S.; Goel, A.K. Jill Watson: A Virtual Teaching Assistant Powered by ChatGPT. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
64. Barany, A.; Nasiar, N.; Porter, C.; Zambrano, A.F.; Andres, A.L.; Bright, D.; Shah, M.; Liu, X.; Gao, S.; Zhang, J.; et al. ChatGPT for Education Research: Exploring the Potential of Large Language Models for Qualitative Codebook Development. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
65. Fang, Y.; He, B.; Liu, Z.; Liu, S.; Yan, Z.; Sun, J. Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
66. Chen, L.; Li, G.; Ma, B.; Tang, C.; Okubo, F.; Shimada, A. How Do Strategies for Using ChatGPT Affect Knowledge Comprehension? In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
67. Cui, C.; Abdalla, A.; Wijaya, D.; Solberg, S.; Bargal, S.A. Large Language Models for Career Readiness Prediction. In Proceedings of the 25th International Conference, AIED 2024, Recife, Brazil, 8–12 July 2024.
Figure 1. A word cloud representing the key technologies and methods of GAI in higher education.
Figure 2. Process of paper selection.
Figure 3. Distribution of GAI technologies across applications.
Figure 4. Concept map of relationships between GAI technologies and applications.
Figure 5. Proportion of successful outcomes and failures in GAI applications.
Table 1. Related technology, representative references, and use cases.
Technology | Representative References | Use Cases
Sequence Analysis, Temporal Analysis | [18] | Applied in analyzing temporal patterns of student behavior and engagement across multiple learning sessions, helping to identify trends and predict potential performance issues or dropout risks.
Neural Additive Models (NAMs) | [19] | Utilized in building predictive models that are both accurate and interpretable, helping instructors understand the contributing factors behind student success and performance outcomes.
Explainable AI (XAI) | [15,16] | Employed to make AI-driven predictions (e.g., in automated grading or student success models) more understandable and transparent, enabling instructors to trust and act on AI recommendations.
ChatGLM2 Model, QLoRA Algorithm | [20] | Applied in diagnosing and predicting student problem behaviors, providing a personalized learning assistant via chat, and offering tailored interventions to improve student outcomes.
Multimodal Learning Analytics (MMLA) | [17] | Combines data from different learning platforms (e.g., learning management systems, student interactions) to track performance, predict engagement levels, and deliver tailored interventions.
Large Language Models (LLMs) | [21,22,23] | Used for knowledge tracing, automated essay grading, and providing personalized learning experiences through adaptive conversational agents, such as ChatGPT, enhancing student engagement and feedback.
Knowledge Tracing (KT) | [1] | Used in tracking and predicting student knowledge progression, identifying gaps in understanding, and providing timely interventions for at-risk students, often incorporated in intelligent tutoring systems.
Machine Learning (ML) | [2,10] | Applied in various educational contexts to predict student success, personalize learning pathways, and optimize learning materials based on past performance, thus enhancing student retention.
Decision Trees (DTs), Random Forests (RFs) | [24,25] | Used for predicting student success, dropout rates, and performance in assessments by analyzing historical student data, enabling early interventions for struggling students.
Transformers (BERT, RoBERTa, TinyBERT) | [26,27] | BERT-based models are applied in automated essay grading, improving the accuracy of text comprehension tasks and enhancing the evaluation of written assignments in real time.
Collaborative Filtering (CF) | [28] | Used in adaptive learning systems and recommendation engines to suggest personalized learning content, resources, and activities based on student preferences and learning behaviors.
Knowledge Graphs (KGs) | [29] | Employed to model the relationships between various learning concepts and student data, improving question difficulty estimation and enhancing dropout prediction by capturing contextual information.
Reinforcement Learning (RL) | [14] | Applied in adaptive learning environments where AI models learn from student interactions, optimizing learning paths and predicting when and how to provide interventions for improved outcomes.
Retrieval-Augmented Generation (RAG) | [29,30] | Used in dropout prediction models and personalized learning systems, where the AI model retrieves relevant learning materials to generate tailored recommendations for at-risk students.
Automated Essay Grading (AEG) | [31,32] | Implemented in AI-based grading systems to evaluate student essays efficiently, offering real-time feedback on content quality, grammar, structure, and relevance while reducing instructor workload.
Prescriptive Learning Analytics (PLA) | [15] | Combined with XAI, prescriptive analytics offer actionable insights to educators, suggesting personalized interventions to improve student outcomes, particularly for struggling learners.
Item Response Theory (IRT) | [16,33] | Applied in assessing student proficiency on standardized tests or quizzes, analyzing responses to predict overall mastery levels, and informing personalized learning interventions.
Natural Language Processing (NLP) | [11,12,13] | Used in automated feedback generation, essay grading, and student sentiment analysis, enhancing interactions with AI tools and enabling real-time academic support based on student input.
LangChain Framework | [34] | Utilized for integrating various generative AI tools (e.g., GPT-4, knowledge graphs) into a cohesive framework for personalized learning, adaptive feedback systems, and interactive student assistants.
XGBoost | [35] | Applied in predictive models for student success, dropout risk, and academic performance by analyzing complex student datasets and making highly accurate predictions.
Long Short-Term Memory (LSTM) | [36,37] | Employed in sequence prediction tasks, such as analyzing student performance trends and predicting future learning trajectories based on historical engagement and test results.
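Some of the techniques in Table 1 reduce to compact, inspectable formulas. Item Response Theory [16,33], for example, models the probability of a correct response from a student ability and an item difficulty; the sketch below shows the simplest (Rasch/1PL) form as a generic illustration, not code from any cited study.

```python
import math

def p_correct(theta, b):
    """Rasch (1PL) model: probability that a student with ability `theta`
    answers an item of difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A student whose ability equals the item difficulty has a 50% chance;
# higher ability raises the probability, harder items lower it.
print(round(p_correct(0.0, 0.0), 2))  # → 0.5
```

Fitting `theta` and `b` to observed responses is what lets IRT-based systems estimate mastery levels and calibrate question difficulty.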
Table 2. Summary of the inclusion criteria.
Criterion | Description
Originality | Novel contributions to AI in education.
Relevance | Addressing significant challenges in higher education.
Methodological Rigor | Clear research designs, reproducible results, and valid metrics.
Impact | Addressing critical gaps or advancing the field of GAI in education.
Table 3. Overview of the selected case studies.
Case Study Title | Technology Used | Application
Leveraging Generative AI for Enhancing Student Engagement and Behavior Analysis | Multimodal Learning Analytics (MMLA), PBChat | Analyzing student behavior and improving engagement
Advancing Educational Assessment with Generative AI | Large Language Models (GPT-4), NLP | Automated essay grading and personalized feedback
Advancements in Predictive Modeling for Student Performance | ML (Random Forests), LLMs | Knowledge tracing and student performance prediction
Generative AI in Enhancing Programming Education | Programming-Specific GAI Tools | Debugging, code comprehension, and automated feedback
Advancements in GAI for Educational Question Generation | LLMs, Small Language Models (sLMs) | Generating diverse educational questions
GAI for Educators: Enhancing Decision-Making and Teaching Practices | GPT, NLP | Improving assessment and streamlining administrative tasks
GAI for Learners: Enhancing Personalized Learning and Educational Experiences | Intelligent Tutoring Systems (ITSs), LLMs | Personalized learning and adaptive feedback
Generative AI for Student Success: Early Detection and Career Readiness Prediction | Multitasking AI Models, Career Readiness Models | Early identification of at-risk students and career readiness prediction
Table 4. Applications of GAI in enhancing student engagement and behavior analysis.
Study | Key Focus | Key Findings and Innovations
[18,20] | Enhancing Self-Regulated Learning with LADs and Diagnosing Student Behaviors with PBChat | LADs support SRL by providing real-time insights into learning behaviors. PBChat uses GAI for diagnosing student behaviors, offering systematic and scalable interventions.
[17] | Multimodal Learning Analytics in Mixed-Reality Educational Environments | Integration of ML and MMLA in a mixed-reality setting provides real-time insights into student behaviors, enhancing engagement in scientific learning tasks, like photosynthesis.
Table 5. Advancing educational assessment with GAI.
Study | Study Focus | Key Findings and Innovations
[19] | Grading Transparency | NAMs improve grading accuracy and offer more interpretable AI decisions, enhancing AI integration in educational systems.
[2,40] | Assessment in Programming and Feedback | GAI automates the assessment of code and documentation quality, offering personalized, real-time feedback through models like BERT and GPT.
[10,11,41,42] | Complex Grading and Linguistic Diversity | GAI improves long-answer grading and addresses linguistic diversity in essay grading, providing equitable assessments across languages.
[26,31] | Essay Scoring and Confidence Estimation | Integrating IRT and confidence estimation enhances essay scoring accuracy, making GAI more reliable in high-stakes assessments.
[12,13,23] | Grading Efficiency and Feedback Optimization | AI-assisted grading systems streamline grading processes and reduce time, improving feedback quality in large-scale assessments.
[16,43] | Feedback and Scoring with Example Selection | Careful example selection improves AI feedback accuracy and reduces biases in essay scoring systems.
Table 6. Advancements in predictive modeling for student performance.
Study | Study Focus | Key Findings and Innovations
[1,21] | Knowledge Tracing with LLMs and RNCF | LLMs and RNCF improve student performance prediction by enhancing personalization, scalability, and adapting to real-time learning data.
[6,44] | Data-Driven and Physiological Prediction | Data-driven models and electrodermal activity (EDA) signals offer accurate predictions of student performance, identifying stress and enabling early interventions.
[24] | Algorithmic Fairness in Prediction | Efforts to mitigate biases in performance prediction models contribute to more equitable and interpretable results across student demographics.
[27,32] | IRT Integration and Early Prediction | IRT integration in AES improves performance comparison, while affective computing enables early prediction of student success, refining real-time interventions.
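Several of the studies summarized above build on item response theory (IRT). As a minimal sketch of the underlying idea only, not of any cited system, the one-parameter logistic (Rasch) model scores the probability of a correct response from the gap between a student's ability and an item's difficulty, both on the same logit scale:

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """One-parameter (Rasch) IRT model: probability that a student
    with the given ability answers an item of the given difficulty
    correctly. Both parameters live on the same logit scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Illustrative values only.
student_ability = 0.8
item_difficulties = [-1.0, 0.0, 1.5]  # easy, medium, hard items

for b in item_difficulties:
    p = rasch_probability(student_ability, b)
    print(f"difficulty={b:+.1f} -> P(correct)={p:.2f}")
```

Production systems fit these parameters from response data and typically extend the model (discrimination and guessing parameters, or neural variants), but the ability-minus-difficulty core is what makes IRT-based scores comparable across items.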
Table 7. GAI in enhancing programming education.
Study | Study Focus | Key Findings and Innovations
[7,45] | Enhancing Debugging Skills and Conceptual Support | Generative AI tools improve debugging skills and provide on-demand support, fostering student-centered learning and enhancing engagement through real-time assistance.
[8,25] | Assessing Code Explanations and Logic Block Analysis | LLMs and AI techniques, such as abstract syntax trees (ASTs), are used for evaluating code explanations and providing personalized feedback, significantly improving student understanding of programming concepts.
[39,46] | ChatGPT and Gamified Exercises | ChatGPT and GPT-based models enhance learning experiences through interactive feedback, gamified programming tasks, and code generation, although further refinement is needed to optimize long-term interactions and content quality.
[9] | Dynamic Feedback in Collaborative Programming | The use of ChatGPT in collaborative programming exercises supports dynamic student interactions, improving the learning process by integrating contextualized feedback and reflection triggers.
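Table 7 cites abstract syntax trees (ASTs) as one basis for analyzing student code. As a minimal illustration independent of the cited systems, Python's built-in ast module can walk a submission's syntax tree and flag a common beginner anti-pattern (a bare except: that silently swallows errors), reporting its line number for targeted feedback:

```python
import ast

def find_bare_excepts(source: str) -> list[int]:
    """Return line numbers of bare `except:` clauses, a common
    beginner anti-pattern that hides errors during debugging."""
    tree = ast.parse(source)
    return [node.lineno for node in ast.walk(tree)
            if isinstance(node, ast.ExceptHandler) and node.type is None]

student_code = """\
try:
    result = 1 / 0
except:
    pass
"""

print(find_bare_excepts(student_code))  # -> [3]
```

Real feedback systems combine many such structural checks with LLM-generated explanations; the AST pass supplies the precise, verifiable location that the generated feedback can anchor to.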
Table 8. Advancements in generative AI for educational question generation.
Study | Study Focus | Key Findings and Innovations
[47] | LLMs vs. sLMs for Question Generation | Compares large and small language models for generating educational questions, balancing efficiency with ethical concerns.
[38] | Reinforcement Learning for Question Generation | Demonstrates reinforcement learning’s role in improving question generation by addressing biases and inconsistencies.
[30] | RAG for Enhanced Content Accuracy | Integrates retrieval-augmented generation (RAG) with GPT to improve accuracy in content generation, reducing hallucinations.
[34] | Contextualized MCQs for Mathematics | Uses AI to generate contextually relevant multiple-choice questions (MCQs) for mathematics, reducing educator workload.
[48] | Evaluating LLMs for Bloom’s Taxonomy | Assesses LLMs for creating questions aligned with Bloom’s Taxonomy, improving pedagogical relevance across cognitive levels.
[49] | GPT-4 for Bloom’s Taxonomy-Aligned MCQs | Evaluates GPT-4 in generating Bloom’s Taxonomy-based MCQs for biology, finding room for improvement in higher cognitive levels.
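Retrieval-augmented generation (RAG), cited in [30] as a hallucination-reduction technique, grounds question generation in retrieved course material. The sketch below shows only the retrieval half, using a naive bag-of-words cosine similarity as a stand-in for the embedding models real systems use; the retrieved passage would then be placed in the generation prompt so the LLM writes questions about material that actually exists:

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, passages: list[str]) -> str:
    """Return the course passage most similar to the query."""
    q = Counter(query.lower().split())
    return max(passages, key=lambda p: cosine(q, Counter(p.lower().split())))

course_material = [
    "photosynthesis converts light energy into chemical energy",
    "mitosis is the process of cell division",
]
context = retrieve("explain photosynthesis", course_material)
# In a full RAG pipeline this prompt would be sent to the LLM:
prompt = f"Using only this context, write one exam question.\nContext: {context}"
```

The design point is that generation is constrained by retrieval: a question the model cannot ground in the retrieved context is a signal of hallucination rather than of course coverage.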
Table 9. Generative AI for educators.
Study | Study Focus | Key Findings and Innovations
[4,50] | Enhancing Decision Making and Teacher Development | VizChat improves Learning Analytics Dashboards (LADs) with context-sensitive explanations (GPT-4V, RAG). LLMAgent-CK assesses teachers’ content knowledge, improving evaluation accuracy without extensive labeled datasets.
[51,52] | Automating Test Evaluation and Personalized Pedagogical Recommendations | Uses AI (LLMs, RAG) to automate exam quality control and personalize teaching practices via PDP recommender systems, improving teaching and assessment processes.
[53] | AI for Quality Control in MCQ Design | SAQUET automates flaw detection in multiple-choice questions using NLP, ensuring high-quality, pedagogically sound assessments.
[28,54] | Enhancing Personalized Learning and Peer Feedback | Uses LSTM autoencoder for memory modeling to enhance adaptive learning. AI-enhanced peer feedback tailors support based on personality traits, improving collaborative learning.
[55,56] | Analyzing Collaborative Learning and Conversational AI | GPT-4-Turbo aids in analyzing interdisciplinary collaboration in STEM. Conversational AI automates tutoring interactions, enhancing student engagement in biology education.
[57,58] | Ensuring Academic Integrity and Optimizing Question Difficulty | Enhanced authorship verification (AV) detects AI-generated content. Knowledge graphs (KGs) improve question difficulty prediction, contributing to personalized learning and fair assessments.
Table 10. GAI for learners.
Study | Study Focus | Key Findings and Innovations
[22,59] | Learner Engagement and Cultural Intelligence | LLMs used in ITS to generate dynamic scaffolding for enhanced engagement and critical thinking. Chatbots with cultural intelligence tailored personalized learning experiences.
[60,61] | Task-Based Learning and Video Enhancement | Integrated text- and video-based learning for complex tasks. EDEN system expands online content with AI-generated videos, ensuring high-quality resources.
[62] | Conversational AI for Healthcare Education | AI chatbot simulating courtroom scenarios for nursing students, providing immersive and personalized training.
[14,15] | Reinforcement Learning for Feedback | Applied reinforcement learning to provide personalized feedback, improving engagement and performance. PLA with XAI for scalable feedback in large courses.
[63,64] | AI-Powered Conversational Assistants | Jill Watson AI assistant supports online classrooms, increasing engagement. Hybrid AI–human collaboration for more efficient educational data coding.
[37,65] | Personalized Learning and ITS | AI system for personalized vocabulary learning and ITS for advanced math, improving feedback and learning outcomes.
[5,66] | Scalable Feedback and Knowledge Comprehension | LLMs generate scalable feedback in ITS. ChatGPT’s impact on students’ knowledge comprehension through effective AI use.
Table 11. GAI for student success.
Study | Study Focus | Key Findings and Innovations
[3,29,35] | At-Risk Student Detection and Dropout Prediction | Multitasking AI models and LLMs predict at-risk students and dropout risks with high accuracy using data analytics and AI-enhanced learning models.
[36] | Dropout Prediction using ACO-LSTM | ACO-LSTM approach improves dropout prediction accuracy by optimizing the LSTM model’s learning rate, offering better retention predictions.
[67] | Career Readiness Prediction using LLMs | LLMs predict career readiness based on students’ narrative responses, analyzing identity status and preparing students for workforce entry.
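The at-risk and dropout predictors summarized in Table 11 are, at their core, classifiers over engagement features. The sketch below is entirely synthetic: toy data, two invented features (attendance and assignment completion rates), and plain logistic regression trained by gradient descent rather than the multitask or ACO-LSTM models the cited studies actually use. It illustrates only the basic shape of the task:

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic students: (attendance rate, assignment completion rate).
# Label 1 = dropped out. Illustration only, not real data.
X = [(0.9, 0.95), (0.85, 0.8), (0.4, 0.3), (0.5, 0.2)]
y = [0, 0, 1, 1]

w = [0.0, 0.0]
b = 0.0
lr = 0.5
for _ in range(2000):  # stochastic gradient descent on log-loss
    for (x1, x2), label in zip(X, y):
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        err = p - label  # gradient of log-loss w.r.t. the logit
        w[0] -= lr * err * x1
        w[1] -= lr * err * x2
        b -= lr * err

def dropout_risk(attendance: float, completion: float) -> float:
    """Predicted probability of dropout for a new student."""
    return sigmoid(w[0] * attendance + w[1] * completion + b)

print(f"low engagement:  {dropout_risk(0.3, 0.25):.2f}")
print(f"high engagement: {dropout_risk(0.95, 0.9):.2f}")
```

The studies' contribution lies precisely in what this sketch omits: richer temporal features, sequence models for evolving behavior, and fairness auditing across demographics, all of which matter before such risk scores can drive real interventions.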
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Pang, W.; Wei, Z. Shaping the Future of Higher Education: A Technology Usage Study on Generative AI Innovations. Information 2025, 16, 95. https://doi.org/10.3390/info16020095
