Article

Explainable AI-Integrated and GAN-Enabled Dynamic Knowledge Component Prediction System (DKPS) Using Hybrid ML Model

by
Swathieswari Mohanraj * and Shanmugavadivu Pichai *
Gandhigram Rural Institute, Dindigul 624302, Tamilnadu, India
* Authors to whom correspondence should be addressed.
Appl. Syst. Innov. 2025, 8(3), 82; https://doi.org/10.3390/asi8030082
Submission received: 24 March 2025 / Revised: 24 May 2025 / Accepted: 26 May 2025 / Published: 16 June 2025
(This article belongs to the Topic Social Sciences and Intelligence Management, 2nd Volume)

Abstract:
The progressive advancements in education due to the advent of transformative technologies have led to the emergence of customized/personalized learning systems that dynamically adapt to an individual learner’s preferences in real time. The learning route and style of every learner are unique, and their understanding varies with the complexity of core components. This paper presents a hybrid approach that integrates generative adversarial networks (GANs), feedback-driven personalization, and explainable artificial intelligence (XAI) to enhance knowledge component (KC) prediction, improve learner outcomes, and sustain progress in learning. Using these technologies, the proposed system addresses three challenges: adapting educational content to an individual’s requirements, creating high-quality content based on a learner’s profile, and implementing transparency in decision-making. The proposed framework starts with a powerful feedback mechanism that captures both explicit and implicit signals from learners, including performance parameters such as time spent on tasks and satisfaction ratings. By analysing these signals, the system dynamically adapts to each learner’s needs and preferences, ensuring personalized and efficient learning. The hybrid model, the dynamic knowledge component prediction system (DKPS), exhibits a 35% improvement in content relevance and learner engagement compared to conventional methods. Using GANs for content creation, the time required to produce high-quality learning materials is reduced by 40%. The proposed technique has further scope for enhancement by incorporating multimedia content, such as videos and concept-based infographics, to give learners a more extensive understanding of concepts.

1. Introduction

Personalized learning has become a major focus in digital education, especially with the increasing availability of online and adaptive learning platforms. Students today seek learning experiences that are tailored to their pace, preferences, and current level of understanding [1,2]. Personalized learning paths help achieve this by guiding each learner through content in an order that best supports their progress. However, it is difficult for teachers to manually create unique learning routes for every student, particularly in large classrooms with diverse learning needs.
This research introduces an AI-driven personalized learning system designed specifically for secondary school students (ages 13–17). The system currently focuses on the subject of computer science, covering topics such as programming basics, data structures, algorithms, and logic building. Each topic is broken down into modular concepts aligned with standard school curricula, making the system suitable for structured learning environments such as classrooms and online tutoring platforms.
To create personalized pathways, the system uses a knowledge graph, where each concept is represented as a node and prerequisite relationships are represented as edges. This structure enables the system to deliver content in a logical and adaptive sequence, rather than randomly [3]. Students undergo an initial diagnostic assessment, and based on their performance, they are guided through a customized learning path that addresses knowledge gaps while reinforcing foundational concepts [4].
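For illustration, the prerequisite structure described above can be sketched as a directed graph in which each concept is a node and each edge runs from a prerequisite to its dependent concept. The concept names and the `unlocked` helper below are purely illustrative, not part of the proposed system:

```python
from graphlib import TopologicalSorter

# Illustrative prerequisite graph: each concept maps to the set of
# concepts that must be completed first (names are hypothetical).
prereqs = {
    "variables": set(),
    "loops": {"variables"},
    "functions": {"variables"},
    "recursion": {"functions"},
    "data_structures": {"loops", "functions"},
}

# A valid learning sequence is any topological order of the graph.
order = list(TopologicalSorter(prereqs).static_order())

def unlocked(completed):
    """Concepts whose prerequisites have all been completed."""
    return [c for c, pre in prereqs.items()
            if c not in completed and pre <= completed]
```

A learner who has completed only "variables" would be offered "loops" and "functions" next, mirroring the adaptive sequencing the knowledge graph enables.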
The system combines both machine learning and deep learning models to enhance prediction and personalization. XGBoost, a powerful ensemble ML algorithm, is used to identify which knowledge components a learner is likely to struggle with, based on historical data. In parallel, gated recurrent units (GRUs), a type of recurrent neural network (RNN), are employed to model temporal learning patterns and track how learners improve over time [5]. This hybrid approach allows the system to adapt content not only to what learners know but also to how they learn [6].
To generate personalized learning materials, the system employs generative adversarial networks (GANs), which have been effectively used in education for tasks such as generating practice questions, visual learning aids, and even datasets for training models [7,8,9,10]. The generated content is aligned with the learner’s current level and target concepts, providing meaningful practice and reinforcing mastery.
Feedback is another critical element of learning. However, many learners report dissatisfaction with automated feedback in online learning environments. Prior work analyzing over 17,000 instructor comments revealed 11 types of feedback that impact learning in competency-based models [11]. To improve transparency and learner confidence, our system incorporates explainable AI (XAI) tools like Shapley additive explanations (SHAP) and local interpretable model-agnostic explanations (LIME), which allow users to understand why certain modules are recommended and how predictions are made [12,13].
Despite significant advancements, current systems still face limitations. Many do not dynamically update content based on real-time student feedback, lack transparency in their decision-making processes, and treat all learner behaviours equally—without distinguishing between explicit signals (what students know) and implicit signals (how they learn) [14,15]. The proposed system addresses these limitations by integrating predictive modelling, dynamic content generation, and interpretable AI into one unified platform. Separating explicit and implicit signals makes learner modelling clearer and more actionable [16,17,18,19]. The convergence of generative AI and adaptive learning is evolving rapidly, presenting both significant opportunities and notable challenges [20]. While numerous researchers have explored adaptive e-learning systems focusing on individual criteria—such as learning styles, cognitive preferences, knowledge levels, or behavioral traits—only a limited number of studies have integrated multiple learning factors for personalization [21]. Recommendation models often utilize a hybrid of collaborative and content-based filtering techniques to produce top-N learning suggestions based on user ratings [22]. Research on learning paths offers valuable insights into student behavior during the learning process, as each learner possesses a distinct knowledge structure shaped by their prior experiences and capabilities [23]. Adaptive educational systems are designed to personalize content delivery and learning pathways, helping to reduce learner disorientation and cognitive overload, thereby enhancing overall learning efficiency [24]. Individual learning preferences, influenced by personality and environmental factors, vary widely; for instance, some students grasp concepts better through verbal instructions, while others benefit more from hands-on experience [25]. 
Delivering learning materials and activities tailored to these preferences significantly improves learner satisfaction, enhances academic performance, and increases time efficiency [26,27]. Personalized assessments further contribute to student motivation and engagement by allowing learners to demonstrate their true competencies [28].
Explainable AI methods tailored for generative AI (GenXAI) provide transparency into AI decisions, offering interpretability at both the individual prediction level and the model-wide level. Traditionally, such explanations have served to enhance trust, support decision-making, and facilitate model debugging [29]. Evolutionary algorithms have been applied to the learning path generation problem by framing it as an optimization task that maximizes the compatibility between the learner's profile and the proposed path [30]. Generative AI, unlike conventional AI models focused on classification or regression, is capable of producing new data outputs such as images, audio, or text by leveraging machine learning and deep learning methodologies [30]. Since the launch of ChatGPT in November 2022, public interest in the application of generative AI within education has grown substantially, prompting ongoing investigation into its educational impact and benefits [31]. AI-powered intelligent assistant (AIIA) systems utilize advanced natural language processing and AI technologies to create dynamic, interactive learning environments [32]. However, as generative AI tools are still emerging in the context of higher education, substantial gaps exist in the literature concerning effective implementation strategies and best practices for adoption [33,34]. Among GenAI applications, large language model chatbots have gained popularity in self-regulated learning (SRL) research, enabling learners to seek information through conversational interaction [35]. Students exposed to various versions of learning materials reported higher engagement; although traditional versions were predominantly used, learners found the diversified formats inspiring and expressed a desire to see them more widely integrated in future learning environments [36].
This research presents a system for predicting dynamic learning modules using both explicit and implicit learner signals as mentioned in the Table 1, supported by XGBoost and GRUs, alongside GAN-based content generation and XAI-enabled transparency. Each learning module is represented as a node in a knowledge graph, with edges guiding learners through a coherent and adaptive sequence. The system ensures mastery at each stage, while maintaining interpretability and personalization.
This study is guided by the following research questions:
  • How can machine learning and deep learning models such as XGBoost and GRUs be used to accurately predict a learner’s mastery of specific knowledge components?
  • How effective are generative adversarial networks (GANs) in generating high-quality, learner-specific assessment content?
  • How can explainable AI (XAI) techniques improve transparency and user trust in AI-driven personalized learning systems?
The remainder of the paper is organized as follows: Section 2 reviews related work. Section 3 outlines the methodology and system architecture. Section 4 presents the experimental setup and datasets used. Section 5 and Section 6 summarize the results, draw conclusions, and suggest future directions for research.

2. Related Work

Adaptive learning provides various forms of personalization, such as customized interfaces [37], tailored learning content [38], and individualized learning paths [39]. When the goal of research is to analyse how learners interact with different types of learning materials, personalized learning content can be a useful choice [1]. The KC approach helps identify many features of educational data. Owing to the phenomenon called the curse of dimensionality, the resulting models have weaker explanatory power if relationships between concepts are analysed without selecting key features [40]. It is therefore crucial to select the most relevant features for successful research [2].
The success of adaptive learning systems depends on how well the needs and characteristics of a learner’s learning style (LS) are classified and gathered; these data are used to create adaptive, intelligent learning environments [41]. Questionnaires used to determine students’ LSs have notable drawbacks [42]. First, filling out questionnaires takes time [42]. The results may also be inaccurate, since students might not fully understand their learning preferences, leading to inconsistent answers [42]. Lastly, LSs can change over time, but questionnaires provide only static results [42].
AI techniques have been used to automatically detect LSs to address these issues [41,42,43], providing methods that are more effective than questionnaires and that can adapt to changes in students’ learning behaviours [42]. By using machine learning (ML) algorithms, these approaches automatically map students’ behaviour to specific LSs, optimizing the learning process and improving the overall e-learning experience [42,44].
Advanced systems such as XAI-GAN, which uses explainable AI (XAI), provide extensive feedback during training, offering more insights than a single value [45]. As AI models grow in complexity, there is an increasing need for their decisions to be understandable by users, stakeholders, and decision-makers; explainability is essential for scientific coherence and trust in AI systems [45]. Another promising development is a federated learning method based on co-training and GANs, which allows each client to independently design and train its own model without sharing its structure or parameters with others. In experiments, this method exceeded existing methods by 42% in test accuracy, even when model architectures and data distributions varied significantly [14]. Building on previous work that used a single dataset to predict learning paths, the present system takes a dynamic approach to target module prediction, improving learner engagement and optimizing the learning experience.
Explicit and implicit are the two types of input signals in the proposed system. Models such as random forest, logistic regression, and neural networks are suited to structured signals (i.e., explicit signals). Models such as recurrent neural networks (RNNs), long short-term memory (LSTM), or bidirectional encoder representations from transformers (BERT) are better suited to implicit signals involving sequential data, such as learning trends over time. The proposed system combines the results from both types of data using a weighted ensemble method to ensure accuracy. An XAI layer is added to improve transparency and interpretability. To generate content within the target module, the system utilizes GANs, which also help in gathering valuable feedback. A detailed review of personalized learning path prediction approaches, covering the learner characteristics and the number of parameters used to implement them dynamically, is given in Table 2.

3. Materials and Methodology

3.1. Comparative Study of Choosing Model Pipelines

As shown in Figure 1, in this proposed personalized learning system, predicting the target module requires analysing both explicit signals (structured data) and implicit signals (sequential data). To achieve this, we evaluated several machine learning models and selected XGBoost for explicit signals and GRUs for implicit signals due to their superior performance, efficiency, and suitability. The following section details the rationale for the selection of models for the experimental study.

3.1.1. Suitability for Explicit Signals

Explicit signals include pre-test scores, post-test scores, satisfaction ratings, and module preferences, which are best processed using models designed for tabular data, as summarized in Table 3.

3.1.2. Reason Behind Selecting XGBoost

XGBoost outperformed other models due to its ability to handle complex feature interactions, scalability, and robustness. It also provided interpretable insights into the importance of explicit signals, making it ideal for structured data as mentioned in Figure 2 and Figure 3.

3.1.3. Suitability for Implicit Signals

Implicit signals, namely time spent on modules, click patterns, and engagement trends, are sequential and exhibit temporal dependencies. The following models were evaluated as mentioned in Table 4.

3.1.4. Reasons Behind Selecting GRUs

GRUs provided a good balance between accuracy and efficiency, effectively capturing temporal dependencies while being computationally less demanding than LSTMs and transformers. This made GRUs suitable for real-time personalized learning systems. A few suitable models were selected and compared; as shown in Figure 4 and Figure 5, the GRU model outperformed the other models in accuracy and statistical results.

3.1.5. Hybrid Approach: Combining XGBoost and GRUs

Given the distinct nature of explicit and implicit signals, no single model could handle both effectively. Based on the nature of the signals, the system selected two models, XGBoost and GRUs as mentioned in Table 5, to process explicit and implicit signals, respectively. Thus, a hybrid approach was adopted.
The predictions from both models were combined using a weighted ensemble approach, leading to improved accuracy and robust target module recommendations.

3.1.6. Measuring Performance Metrics

The models were validated using metrics such as accuracy, precision, recall, and F1-score. The results demonstrated that the XGBoost + GRU pipeline consistently outperformed alternative combinations, offering higher efficiency and accuracy, as mentioned in Table 6 and depicted in Figure 6.

3.2. Signal Categorization in Personalized Learning Systems

In this proposed system, signals are categorized into two types: explicit signals and implicit signals, which together form the foundation for constructing an accurate learner profile and predicting the next optimal learning path. Explicit signals, such as quiz scores, performance metrics, and direct feedback, provide clear and measurable data on the learner’s current knowledge and achievements. These signals are straightforward to process and help to identify knowledge gaps and overall performance. Implicit signals are derived from the learner’s behaviour and interaction patterns, such as time spent on modules, clickstream data, and study habits. These temporal and dynamic signals reveal how the learner engages with the material, offering deeper insights into their learning style, preferences, and challenges. To effectively use both types of signals, the system integrates them in a meaningful way using advanced machine learning models. This combined data is then used to predict the next learning module, ensuring that recommendations align with the learner’s current abilities and learning goals. By integrating these signals, the system dynamically adapts to the learner’s evolving needs, offering a highly personalized and effective learning experience. Details of different signals and their names are given below in Table 7.
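As a minimal sketch of this categorization (the field names are illustrative assumptions, not the system’s actual schema), the two signal types can be kept in separate containers that together form a learner profile:

```python
from dataclasses import dataclass, field

@dataclass
class ExplicitSignals:
    """Structured, directly measured signals (illustrative fields)."""
    pre_test: float          # score, e.g. 0-100
    post_test: float         # score, e.g. 0-100
    satisfaction: int        # rating, e.g. 1-5
    module_preference: str   # e.g. "visual", "text"

@dataclass
class ImplicitSignals:
    """Behavioural, sequential signals captured from interactions."""
    time_spent: list = field(default_factory=list)  # seconds per module
    clicks: list = field(default_factory=list)      # clicks per module
    revisits: int = 0

@dataclass
class LearnerProfile:
    """Combined view used downstream for target module prediction."""
    learner_id: str
    explicit: ExplicitSignals
    implicit: ImplicitSignals
```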

3.3. Preprocessing the Categorized Signals

Preprocessing is a crucial step in preparing the explicit and implicit signals for ML models. It ensures that the data are clean, structured, and ready for effective analysis. The preprocessing techniques differ for explicit (structured) and implicit (sequential) signals due to the nature of the data.

3.3.1. Preprocessing Explicit Signals

Explicit signals are structured data such as pre-test scores, post-test scores, and satisfaction ratings. These signals require standard data cleaning and transformation techniques.

3.3.2. Steps in Preprocessing Explicit Signals

Preprocessing involves data cleaning, transformation, and feature engineering, the essential steps listed in Table 8, to ensure data quality and readiness for analysis. Data cleaning handles missing values by imputing them with averages or frequent categories and removes outliers to standardize the data format. Then, normalization and scaling are applied during transformation to ensure that numerical values such as scores and durations are uniform, typically within a range of 0–1, as given in Equation (1). For categorical data, conversion techniques are used to convert textual feedback into numerical formats.
  • Normalization:
val′ = (val − min(val_x)) / (max(val_x) − min(val_x))
where
  • val: The original value of the feature.
  • min(val_x): The minimum value of the feature in the dataset.
  • max(val_x): The maximum value of the feature in the dataset.
  • val′: The normalized value.
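Equation (1) can be applied per feature column with a simple helper (the fallback to 0 for a constant feature is our own assumption for the degenerate case):

```python
def min_max_normalize(values):
    """Scale a feature column to the range [0, 1] per Equation (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant feature: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, pre-test scores of 40, 60, and 80 map to 0.0, 0.5, and 1.0.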

3.3.3. Features Captured from Explicit Signals

Feature engineering is used to create additional insights to find the learner’s score improvement by using Equation (2), which uses the difference between post-test and pre-test. This identifies a learner’s progress in the specific area of the domain. After preprocessing the explicit data, the cleaned, transformed, and featured data will be updated as presented in Table 9.
  • Score improvement:
Score Improvement = Post-test − Pre-test
Insights from the dataset:
  • Learners 1, 7, 9, and 10 showed consistent or exceptional improvement with high satisfaction ratings, benefiting from tailored content and valid predictions.
  • Learners 2, 5, and 8 demonstrated steady improvement, though they could benefit from advanced challenges or personalized support.
  • Learners 3, 4, and 6 had lower improvements or incomplete modules. These learners require additional support through foundational reinforcements, intermediate modules, or engaging content.

3.3.4. Preprocessing Implicit Signals

Implicit signals are sequential data such as time spent on modules, click patterns, and engagement trends. These signals require preprocessing techniques suitable for time-series data.

3.3.5. Steps in Preprocessing Implicit Signals

Preprocessing implicit signals is a critical step in ensuring that the raw behavioural data collected from learners are structured, meaningful, and ready for analysis or machine learning models. The process begins with data cleaning, which handles missing values and removes irrelevant or redundant interactions. Transformation then converts unstructured behavioural data into structured data; implicit signals such as time spent on tasks, click count, and revisit frequency often exist in raw, inconsistent forms. Feature engineering creates higher-order metrics such as time spent, click rate, retries, and engagement rate, which are calculated using Equations (3)–(6). All of these steps are performed as described in Table 10.

3.3.6. Features Captured from Implicit Signals

To understand learner behaviour in terms of time spent, retries to complete the specific module, clicks that the learner has used to complete the module, and interactions, Equations (3)–(6) are used. Finally, the sequence of data will be padded for further processing by the ML model. After capturing these features, the corresponding data will be updated as mentioned in Table 11 and as follows.
  • Time spent: The amount of time spent per task.
Time Spent per Task = Total Time Spent / Number of Tasks
  • Retries (normalised): The number of times the test has been retried, scaled by the maximum.
Retries′ = Retries / max(Retries)
  • Engagement rate: The interaction level and attentiveness, calculated using
Engagement Rate = Interactions / Total Time Spent
  • Clickstream data: The number of clicks used in the specific module.
Click Rate = Total Clicks / Total Time Spent
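The four behavioural metrics of Equations (3)–(6) can be computed together in one illustrative helper (argument names are our own):

```python
def implicit_features(total_time, n_tasks, retries, max_retries,
                      interactions, clicks):
    """Compute the behavioural metrics of Equations (3)-(6)."""
    return {
        "time_per_task": total_time / n_tasks,          # Eq. (3)
        "retries_norm": retries / max_retries,          # Eq. (4)
        "engagement_rate": interactions / total_time,   # Eq. (5)
        "click_rate": clicks / total_time,              # Eq. (6)
    }
```

For instance, a learner spending 600 s over 6 tasks with 2 of a maximum 4 retries, 120 interactions, and 300 clicks yields a time per task of 100 s, normalized retries of 0.5, an engagement rate of 0.2, and a click rate of 0.5.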
Insights from the dataset:
  • Learners 3, 7, and 10 demonstrated strong engagement trends with significant time spent, high click counts, and frequent revisits. These learners are ready for advanced topics and challenges.
  • Learners 1, 5, and 8 showed consistent engagement. Providing tailored resources can help them improve their readiness for more complex modules.
  • Learners 2, 4, 6, and 9 had minimal interactions, lower revisit counts, and limited time spent. These learners need targeted strategies to boost engagement and improve outcomes.

3.3.7. Combined Preprocessing for Hybrid Model

Since the hybrid model uses both XGBoost (for explicit signals) and GRUs (for implicit signals), preprocessing must align with the requirements of each algorithm. Based on the suitability of the data, both signals were preprocessed by corresponding models and the final data were updated as described in Table 12.
Insights from the dataset:
  • Learners 1, 7, 9, and 10 achieved the highest combined effectiveness, driven by strong engagement and explicit improvements. These learners benefit from advanced and exploratory learning paths.
  • Learners 2, 5, and 8 demonstrated steady combined effectiveness despite some engagement gaps. Personalized resources can further boost their performance.
  • Learners 3, 4, and 6 showed lower combined effectiveness due to limited explicit improvement or low engagement. These learners need targeted interventions:
    Learner 3: Needs foundational reinforcement despite high engagement.
    Learner 4 and Learner 6: Require interactive and engaging content to improve both engagement and outcomes.

3.4. Finding the Predicted Target Module

This proposed system predicts the learner’s target module by processing both explicit and implicit signals using a hybrid model (XGBoost + GRU). Once the predicted module is identified, it is validated for logical consistency and relevance using a knowledge graph (KG).
In Table 13, all of the modules are arranged in order of complexity as nodes, together with the target module information.
The system predicts the target module in three main steps:
Step 1: Process explicit signals (XGBoost).
  • Input: Explicit signals such as pre-test scores, post-test scores, satisfaction ratings, and module preferences.
  • Processing:
XGBoost uses tree-based methods to capture non-linear relationships and assigns a predicted score ( y ^ XGBoost ) for each potential module as follows.
1. Objective function L(θ): The objective function combines a loss function and a regularization term:
L(θ) = ∑_{i=1}^{n} l(y_i, ŷ_i) + ∑_{k=1}^{K} Ω(f_k)
where
  • l(y_i, ŷ_i): Loss function.
  • Ω(f_k) = γT + (1/2)λ‖w‖²: Regularization term.
  • T: The number of leaves in the tree.
  • w: Leaf weights.
2. Prediction (ŷ_i): The final prediction is the sum of predictions from all K trees:
ŷ_i = ∑_{k=1}^{K} f_k(x_i)
3. Gradient and Hessian (g_i, h_i): To optimize the loss, XGBoost computes:
g_i = ∂l(y_i, ŷ_i)/∂ŷ_i,  h_i = ∂²l(y_i, ŷ_i)/∂ŷ_i²
4. Tree-splitting gain (Gain): The gain from a split is calculated as follows:
Gain = (1/2) [ G_L²/(H_L + λ) + G_R²/(H_R + λ) − (G_L + G_R)²/(H_L + H_R + λ) ] − γ
where
  • G_L, G_R: Gradients for the left and right nodes.
  • H_L, H_R: Hessians for the left and right nodes.
  • Output: Probability or score indicating the relevance of each module.
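The tree-splitting gain formula above can be checked with a small self-contained sketch (an illustration of the formula, not the actual XGBoost implementation):

```python
def split_gain(G_L, H_L, G_R, H_R, lam=1.0, gamma=0.0):
    """Tree-splitting gain:
    Gain = 1/2 [G_L^2/(H_L+λ) + G_R^2/(H_R+λ)
                - (G_L+G_R)^2/(H_L+H_R+λ)] - γ
    """
    left = G_L**2 / (H_L + lam)
    right = G_R**2 / (H_R + lam)
    joint = (G_L + G_R)**2 / (H_L + H_R + lam)
    return 0.5 * (left + right - joint) - gamma

# For squared-error loss l(y, ŷ) = (y - ŷ)^2 / 2, the gradient is
# g_i = ŷ_i - y_i and the Hessian is h_i = 1, so G and H are simply
# sums of residuals and instance counts per node.
```

A split that separates opposing gradients (e.g., G_L = 2, G_R = −2 with H_L = H_R = 1 and λ = 1) yields a positive gain, reflecting that the split reduces the loss.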
Step 2: Process implicit signals (GRU).
  • Input: Sequential data like time spent, retries, engagement trends, and clickstream data.
  • Processing:
    • GRU processes the temporal dependencies in the data, predicting a score ( y ^ GRU ) for each module based on behavioural patterns as follows.
1. Update gate (z_t): Determines how much of the previous hidden state to retain:
z_t = σ(W_z x_t + U_z h_{t−1} + b_z)
where
  • z_t: Update gate at time t.
  • x_t: Input at time t.
  • h_{t−1}: Previous hidden state.
  • W_z, U_z, b_z: Weights and bias.
  • σ: Sigmoid activation function.
2. Reset gate (r_t): Controls how much of the past information to forget:
r_t = σ(W_r x_t + U_r h_{t−1} + b_r)
3. Candidate hidden state (h̃_t): Computes the new information to be added:
h̃_t = tanh(W_h x_t + U_h (r_t ⊙ h_{t−1}) + b_h)
4. Final hidden state (h_t): Combines the previous hidden state and the candidate hidden state using the update gate:
h_t = z_t ⊙ h_{t−1} + (1 − z_t) ⊙ h̃_t
5. Output prediction (ŷ_t): The output is computed as follows:
ŷ_t = W_o h_t + b_o
  • Output: Predicted relevance score for each module.
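A single GRU step implementing the gate equations above can be sketched as follows (a toy, untrained cell; the dict-based weight layout is our own convention, not the system’s implementation):

```python
import numpy as np

def gru_cell(x_t, h_prev, W, U, b):
    """One GRU step: update gate, reset gate, candidate state, new state.
    W, U, b are dicts keyed by gate name ('z', 'r', 'h')."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(W["z"] @ x_t + U["z"] @ h_prev + b["z"])   # update gate
    r = sigmoid(W["r"] @ x_t + U["r"] @ h_prev + b["r"])   # reset gate
    h_tilde = np.tanh(W["h"] @ x_t + U["h"] @ (r * h_prev) + b["h"])
    return z * h_prev + (1 - z) * h_tilde                  # final state
```

With all weights zero, both gates evaluate to σ(0) = 0.5 and the candidate state to tanh(0) = 0, so the new state is exactly half the previous one, which is a quick sanity check on the equations.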
Step 3: Combine predictions (weighted ensemble).
  • The predictions from XGBoost and GRUs are combined using a weighted ensemble approach:
ŷ = α · ŷ_XGBoost + (1 − α) · ŷ_GRU
where α is the weight assigned to explicit signals (e.g., 0.6).
  • Final output ( y ^ ) : The module with the highest y ^ score is selected as the predicted target module.
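The combination step can be sketched directly from the ensemble formula, using the α = 0.6 example from the text (module names are illustrative):

```python
def ensemble_scores(xgb_scores, gru_scores, alpha=0.6):
    """Weighted ensemble: ŷ = α·ŷ_XGBoost + (1-α)·ŷ_GRU per module."""
    return {m: alpha * xgb_scores[m] + (1 - alpha) * gru_scores[m]
            for m in xgb_scores}

def predict_target(xgb_scores, gru_scores, alpha=0.6):
    """Select the module with the highest combined score."""
    combined = ensemble_scores(xgb_scores, gru_scores, alpha)
    return max(combined, key=combined.get)
```

With α = 0.6, a module favoured by the explicit-signal model can win even when the implicit-signal model prefers another, reflecting the higher weight given to explicit signals.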

3.5. Validating the Predicted Target Module with the Knowledge Graph

Once the target module is predicted, it is validated against the knowledge graph (KG) to ensure logical consistency and alignment with the learner’s knowledge path.
A knowledge graph (KG) represents the learning domain as a graph where nodes correspond to learning modules and edges signify relationships between modules, such as prerequisites or co-requisites. The validation process ensures that the predicted target module is appropriate for the learner. First, prerequisite consistency must be checked; for instance, if the predicted module is Advanced Data Structures, the prerequisite Basic Data Structures must be completed. The prediction is valid if, for every pair (M_pred, M_prereq) in the KG, the prerequisite module M_prereq is marked as completed. Then, the knowledge path must align with the learner’s engagement trend; steady engagement warrants modules of similar difficulty, while irregular trends may require remedial or foundational modules. Additionally, content relevance ensures the suggested module matches the learner’s preferences or performance trends. Finally, in edge weight validation, if the KG assigns weights to edges to represent difficulty jumps, the predicted target module is valid only if the weight between the current and target module is within a predefined threshold. The dataset after target module prediction and knowledge graph validation is shown in Table 14.
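The prerequisite-consistency and edge-weight checks described above can be sketched as a single validation function (the `max_jump` threshold and all argument names are illustrative assumptions):

```python
def validate_prediction(predicted, completed, prereqs, edge_weight,
                        max_jump=2.0):
    """Validate a predicted module against the knowledge graph.

    predicted:   name of the predicted target module
    completed:   set of modules the learner has completed
    prereqs:     dict mapping module -> set of prerequisite modules
    edge_weight: difficulty jump from the learner's current module
    """
    prereq_ok = prereqs.get(predicted, set()) <= completed  # all prereqs done
    weight_ok = edge_weight <= max_jump                     # jump within bound
    return prereq_ok and weight_ok
```

If either check fails, the system would fall back to an alternative pathway through the KG, as described in the feedback loop below.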

3.6. Feedback Loop for Refinement

The hybrid model (DKPS) recommends content generated by a GAN for the predicted target module and then collects the learner’s feedback once the predicted module is validated and completed. The feedback parameters are adjusted to refine the model, as mentioned in Table 15. If the predicted target module is invalid, an alternative pathway through the knowledge graph (KG) is used to redirect the learner based on their insights and guide them effectively.

3.7. Incorporation of Conditional GAN (cGAN) in DKPS

In the dynamic knowledge prediction system (DKPS), conditional generative adversarial networks (cGANs) are used to support both personalized content generation and feedback-driven refinement. The system generates learner-specific materials that adapt to each student’s current skill level, learning goals, and progress. This approach helps improve engagement, supports diverse learning preferences, and creates a dynamic and responsive learning experience.
The cGAN model consists of two main components: a generator (G) and a discriminator (D). The generator creates learning content—such as quizzes, hints, and problem-solving tasks—based on the learner’s target module and predicted skill level. The discriminator then evaluates this content to determine whether it aligns with the learner’s proficiency and the module’s objectives. If the content is too advanced or not relevant, the discriminator rejects it and prompts the generator to produce simpler or more suitable materials.
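The generator/discriminator interplay can be illustrated structurally with an untrained toy model: the condition vector (target module and skill level) is concatenated with the noise input for the generator, and with the content vector for the discriminator. All dimensions and weights below are arbitrary illustrative choices, not the trained cGAN used in DKPS:

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z, condition, W_g):
    """Map noise + condition (target module, skill level) to a content vector."""
    return np.tanh(W_g @ np.concatenate([z, condition]))

def discriminator(content, condition, w_d):
    """Score (0-1) how well content matches the conditioned learner level."""
    logit = w_d @ np.concatenate([content, condition])
    return 1.0 / (1.0 + np.exp(-logit))

# Hypothetical sizes: 4-d noise, 3-d condition, 8-d content embedding.
W_g = rng.normal(size=(8, 7))
w_d = rng.normal(size=(11,))
cond = np.array([1.0, 0.0, 0.5])       # e.g. one-hot module + skill level
fake = generator(rng.normal(size=4), cond, W_g)
score = discriminator(fake, cond, w_d)
```

In training, a low discriminator score would push the generator toward simpler or more relevant material for the same condition, which is the rejection-and-retry behaviour described above.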

3.8. Role of cGAN in Dynamic Content Generation

To give each student the right kind of learning material, the system uses different AI methods that help understand what the learner needs and how to support them best.
One of the key methods used is called a conditional GAN (cGAN). It works like a smart team made up of two parts: the generator and the discriminator. The generator creates learning content—such as quizzes, hints, or practice problems—based on what topic the student is learning and how well they are doing. For example, if a student is learning about “Stacks” in a programming course, the generator might make practice questions about push and pop operations or give small tips and examples. The discriminator checks whether the content is too easy or too hard for the student. If it does not match the student’s level, it tells the generator to try again with simpler or more suitable material. This helps the system give the right type of support at the right time.

3.9. Explainable AI (XAI) Integration in the Proposed System (DKPS)

To help teachers and learners understand why the system made a particular decision, SHAP (SHapley Additive exPlanations) is used. SHAP quantifies how much each input signal influenced the system’s choice. For instance, if the system decides that a student needs easier content, SHAP can surface the reasons behind that decision, such as low quiz scores or excessive time spent on earlier lessons. These explanations build trust and make the AI more transparent.
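To make the attribution idea concrete, the sketch below computes exact Shapley values for a tiny, hypothetical scoring model with three illustrative learner features (the feature names, the model, and all numbers are assumptions for demonstration; the real system would apply the SHAP library to its trained models).

```python
from itertools import combinations
from math import factorial

# Hypothetical learner features (illustrative, not from the paper)
features = {"quiz_score": 0.45, "time_spent": 1.8, "retries": 3}

def predict(subset):
    """Toy 'needs easier content' score evaluated on a subset of
    features; absent features fall back to a baseline of 0."""
    f = {k: (features[k] if k in subset else 0.0) for k in features}
    # lower quiz scores and more retries push toward easier content
    return 2.0 * (1 - f["quiz_score"]) + 0.5 * f["retries"] - 0.3 * f["time_spent"]

def shapley(feature):
    """Exact Shapley value by enumerating all coalitions (feasible
    here because there are only three features)."""
    names, n, value = list(features), len(features), 0.0
    others = [f for f in names if f != feature]
    for r in range(len(others) + 1):
        for coalition in combinations(others, r):
            weight = (factorial(len(coalition))
                      * factorial(n - len(coalition) - 1) / factorial(n))
            value += weight * (predict(set(coalition) | {feature})
                               - predict(set(coalition)))
    return value

phi = {f: shapley(f) for f in features}
```

By construction, the Shapley values sum to the difference between the model’s output with all features and with none, which is the efficiency property SHAP relies on when it reports per-feature contributions.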
XAI in DKPS enhances transparency and trust by providing clear, understandable justifications for decisions made across all modules. XAI explains how a learner’s engagement levels, quiz performance, and interaction patterns are combined to identify that learner’s preferences, strengths, and areas for improvement. In the content generation module, XAI clarifies why specific learning materials, such as exercises, hints, and tips, are recommended given the learner’s progress and goals. The learning pathway and KG nodes are adjusted using the post-test score, the GAN-generated content, and the learner’s feedback so that they match curriculum objectives and the learner’s proficiency, and XAI provides detailed transparency and interpretability for these adjustments.
Relationships between concepts, skills, and modules are mapped, and XAI provides explanations on how this KG informs content sequencing and prerequisite identification. For example, XAI can reveal how the knowledge graph determined the need to reinforce foundational topics like stack operations if a learner struggles with the topic “advanced recursion”. The system ensures transparency, fosters trust, and empowers learners and educators to make informed decisions for a more effective and personalized learning experience by integrating XAI across all modules.
The system also uses a weighted ensemble method to make better predictions. Instead of using just one AI model to make decisions, it combines several models and takes a vote. However, it does not treat all models the same; more accurate models receive more importance. This way, the system can make smarter and more reliable choices for each student.
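The accuracy-weighted voting described above can be sketched as follows. The model names, stub predictors, and weights are illustrative assumptions, not the paper’s trained models; in the full system the weights would come from each model’s held-out performance.

```python
# Hypothetical per-model validation accuracies used as vote weights
models = {
    "xgboost_like": (lambda signals: "Module_B", 0.92),  # (predictor, weight)
    "gru_like":     (lambda signals: "Module_B", 0.88),
    "baseline":     (lambda signals: "Module_A", 0.70),
}

def weighted_vote(learner_signals):
    """Accuracy-weighted voting: each model's vote counts in proportion
    to its held-out accuracy, so stronger models dominate ties."""
    scores = {}
    for predictor, weight in models.values():
        label = predictor(learner_signals)
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)

choice = weighted_vote({"quiz_score": 0.4})
```

Here the two stronger models outvote the weaker one, which is exactly the behaviour the ensemble is meant to produce.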
All of these AI tools work together to keep improving the learning experience. As students interact with the system by doing quizzes, rewatching lessons, or spending more time on certain topics, the system learns from this and adjusts future content. Over time, this creates a smooth and personalized learning path that fits each student’s pace and progress.

4. Dataset

To simulate a real-world personalized learning environment, a simulated dataset containing information from 1000 learners was used to train the machine learning (ML) model. This dataset was synthetically generated to overcome real-world data privacy and access limitations, while still reflecting realistic learning behaviours. It was designed to cover a diverse population of students, specifically targeting learners from the ages of 13 to 17 and in the 8th to 12th grade range, across various academic abilities and learning styles.
The dataset captures explicit signals, such as quiz scores, feedback ratings, and module completion data, alongside implicit signals, such as time spent on modules, clickstream data, and interaction patterns. These signals form the foundation for building learner profiles and recommending target modules. The dataset was split into training (70%), validation (20%), and testing (10%) subsets. The hybrid ML model was trained and fine-tuned on the training and validation subsets and evaluated on the testing subset in terms of accuracy, precision, recall, and F1-score.
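The 70/20/10 split and the evaluation metrics can be sketched as follows; the records generated here are illustrative stand-ins for the simulated 1000-learner dataset, not the actual data.

```python
import random

random.seed(42)
# Synthetic records standing in for the 1000-learner simulated dataset
records = [{"id": i, "quiz": random.random()} for i in range(1000)]

random.shuffle(records)
n = len(records)
train = records[: int(0.70 * n)]                 # 70% training
val = records[int(0.70 * n): int(0.90 * n)]      # 20% validation
test = records[int(0.90 * n):]                   # 10% testing

def prf1(tp, fp, fn):
    """Precision, recall, and F1 computed from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

The helper makes explicit how the reported precision, recall, and F1 values relate to true/false positive and false negative counts on the held-out test subset.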
A domain-specific knowledge graph was constructed to represent topics and their relationships: nodes denote key concepts, and edges encode prerequisite relationships between topics. To verify logical consistency, simulations were conducted by adding relevant topics and subtopics to the graph and validating the ML model’s predictions against it. To ensure adaptability, GAN-generated content—such as quizzes and hints—updated the knowledge graph’s node weights, which were dynamically adjusted based on learners’ interactions.
This DKPS, which combines simulated learner data and an evolving knowledge graph, was confirmed to be highly effective—achieving 90% recommendation accuracy, improving learner comprehension by 85%, and reducing module completion time by 20%. These outcomes demonstrate the robustness and adaptability of the DKPS framework.

5. Results and Discussion

5.1. Analysis of Hybrid Models

The selection of the most suitable explicit (e.g., quiz results) and implicit (e.g., behaviour patterns) learner signals begins with a thorough analysis of machine learning models. By combining these models, the system accurately predicts detailed learner information, including preferences, progress, and learning behaviours. This enriched learner profile guides the DKPS in recommending appropriate target learning modules through the use of hybrid models.
The final results of the hybrid model analysis, based on 10 iterations, are presented in Table 16, highlighting the performance across four metrics: accuracy, precision, recall, and F1-score. Over the iterations, there is a clear and consistent improvement in the system’s performance. Accuracy increased from 70% to 93%, indicating that the model became significantly better at correctly identifying the most suitable learning modules. Precision improved from 0.68 to 0.90, meaning the system made more relevant recommendations with fewer false positives. Recall also improved, moving from 0.75 to 0.89, reflecting a growing ability to identify all relevant modules for a learner without missing important ones. The F1-score, which balances both precision and recall, rose from 0.71 to 0.89, demonstrating that the model’s improvements were steady and well-rounded across all performance aspects.
To confirm the reliability of these improvements, a statistical analysis was conducted. The probability values (p-values) for the observed increases in all four metrics across the 10 iterations were found to be less than 0.01 (p < 0.01), indicating that the improvements are statistically significant and not due to random chance. Furthermore, 95% confidence intervals for each metric were narrow, signifying low variance and strong stability in the model’s predictions.
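A normal-approximation 95% confidence interval of the kind reported here can be computed as sketched below; the per-iteration accuracy series is illustrative, not the paper’s raw data.

```python
from math import sqrt
from statistics import mean, stdev

# Illustrative per-iteration accuracies (not the paper's raw data)
accuracies = [0.70, 0.74, 0.78, 0.81, 0.84, 0.86, 0.88, 0.90, 0.92, 0.93]

def ci95(samples):
    """Normal-approximation 95% confidence interval for the mean:
    mean +/- 1.96 * standard error."""
    m = mean(samples)
    half = 1.96 * stdev(samples) / sqrt(len(samples))
    return m - half, m + half

lo, hi = ci95(accuracies)
width = hi - lo
```

A narrow interval, as reported in the text, corresponds to a small `width` here, i.e., low variance across iterations relative to the sample size.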
These results confirm that the hybrid ML model used in the DKPS framework is robust and adaptive. With each iteration, the model becomes more accurate, reliable, and effective in delivering personalized learning recommendations based on well-structured learner profiles.

5.2. Knowledge Graph Validation

The knowledge graph (KG) validation process in DKPS ensures predictions align with logical learning paths. The system exhibited significant improvements in its performance metrics: accuracy, precision, recall, and F1-score over iterations. Accuracy, which measures the percentage of predictions aligning with valid paths in the KG, improved from 80% to 95% as depicted in Figure 7. This indicates the system’s increasing ability to consistently identify correct predictions, reducing errors and enhancing reliability. This growth reflects the refinement in hybrid model predictions and the effectiveness of feedback loops. Precision, defined as the percentage of validated predictions that are truly correct, increased from 0.75 to 0.92. This improvement highlights the system’s growing capability to avoid irrelevant or invalid paths, ensuring that recommendations are both accurate and relevant. Recall, the percentage of all valid paths correctly identified, rose from 0.70 to 0.88. This metric underscores the system’s consistent identification of valid learning paths within the KG, ensuring no critical modules or transitions are overlooked. F1-score, which balances precision and recall, increased from 0.72 to 0.90, reflecting the system’s overall effectiveness in achieving both relevance and comprehensiveness in predictions.
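Checking predictions against valid paths in the KG amounts to verifying that each predicted (current module, next module) pair follows an existing edge. The sketch below illustrates this with hypothetical topics and edges; the real system would use the full domain graph.

```python
# Valid transitions in a toy knowledge graph (illustrative topics)
valid_next = {
    "Arrays": {"Stacks", "Queues"},
    "Stacks": {"Recursion"},
    "Recursion": {"Advanced Recursion"},
}

predictions = [("Arrays", "Stacks"), ("Stacks", "Recursion"),
               ("Arrays", "Recursion"), ("Recursion", "Advanced Recursion")]

def validate(preds):
    """Fraction of predicted (current, next) module pairs that follow
    an existing edge in the knowledge graph."""
    ok = [(cur, nxt) for cur, nxt in preds
          if nxt in valid_next.get(cur, set())]
    return len(ok) / len(preds), ok

accuracy, valid_pairs = validate(predictions)
```

Invalid pairs, like jumping from “Arrays” straight to “Recursion” here, are the predictions the feedback loop would correct by inserting intermediate modules.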
Iterative improvements across all metrics resulted from feedback loops, intermediate module suggestions, and refined hybrid model outputs. For instance, invalid predictions dropped from 20% to 5%, with 80% of invalid recommendations dynamically corrected in the final iteration. Additionally, node utilization in the KG improved from 50% to 65%, demonstrating broader exploration and coverage of the graph. Meanwhile, the average validation time decreased from 0.7 s to 0.5 s, reflecting system optimization for real-time application as mentioned in Table 17.

5.3. Content Generation

The performance analysis of the content generation system in the personalized learning framework demonstrates significant improvements across key metrics: retention rate, time spent, feedback score, and quiz score improvement. Retention rate, measuring the percentage of learners completing the generated content, increased steadily from 75% to 95% over iterations, reflecting enhanced engagement with tailored learning materials. Average time spent on the content grew from 15 to 24 min, indicating deeper interaction and greater relevance of the generated materials. Feedback scores, collected on a scale of one to five, improved from 4.0 to 4.9, showcasing growing learner satisfaction and alignment with their needs. Additionally, quiz score improvement, tracking learning effectiveness, rose from 10% to 25%, highlighting the content’s ability to reinforce concepts and enhance learner understanding. Iterative refinements, driven by learner feedback and engagement trends, ensured the dynamic personalization of content, leading to improved outcomes. These results emphasize the content generation system’s capacity to deliver engaging, effective, and learner-centric materials, contributing significantly to the overall effectiveness of the personalized learning model as depicted in Figure 8.

5.4. Feedback Collection

The feedback generation process significantly impacts precision and accuracy as mentioned in Figure 9 by continuously refining the system based on learner input. Over 10 iterations, precision improved from 0.68 to 0.90, demonstrating the system’s ability to reduce irrelevant or incorrect predictions by focusing on learner-specific needs and preferences. This indicates that feedback helps the model prioritize relevant features and pathways. Similarly, accuracy increased from 70% to 93%, reflecting the system’s growing correctness in aligning predictions with valid learning paths. These improvements highlight how feedback generation dynamically adjusts model weights and content strategies, ensuring recommendations are both relevant and accurate, ultimately enhancing the learner experience.

5.5. Refinement of ML Models

Guided by the collected feedback and data, the ML models undergo iterative refinement. Each cycle updates the weights of the knowledge graph nodes and retrains the ML models to reflect the learner’s progress profile. Through this iterative process, the system’s prediction accuracy and adaptability increase, and the learner’s personalized learning experience improves. Over successive iterations, the system shows notable improvements in its recommendation metrics, namely precision, recall, and F1-score, leading to more effective and relevant learning modules. The chart in Figure 10 illustrates the model refinement in terms of F1-score, precision, and recall over 10 iterations.
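One simple way such a refinement cycle can update model weights from feedback is an exponential-moving-average rule, sketched below. The update rule, learning rate, and all numbers are illustrative assumptions rather than the paper’s exact procedure.

```python
# Starting ensemble weights for two hypothetical component models
weights = {"xgboost_like": 0.5, "gru_like": 0.5}

def refine(weights, feedback_accuracy, lr=0.3):
    """Exponential-moving-average update: nudge each model's weight
    toward its recent feedback accuracy, then renormalise."""
    updated = {m: (1 - lr) * w + lr * feedback_accuracy[m]
               for m, w in weights.items()}
    total = sum(updated.values())
    return {m: w / total for m, w in updated.items()}

# Two refinement cycles with illustrative per-model feedback accuracies
for cycle_feedback in [{"xgboost_like": 0.90, "gru_like": 0.70},
                       {"xgboost_like": 0.92, "gru_like": 0.75}]:
    weights = refine(weights, cycle_feedback)
```

After each cycle the better-performing model gains influence while the weights remain a valid (normalised) mixture, mirroring the gradual improvement described above.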

5.6. Explainable AI (XAI) Integration

The transparency in the explainable AI tool builds trust and allows users to understand the reasoning behind the system’s decisions. It ensures the system remains transparent by providing clear explanations of why specific modules are predicted as target modules and how the knowledge graph validates these predictions.
The validation metrics for XAI integration over 10 iterations show consistent improvements in feature importance accuracy, decision explanation clarity, and user trust, reflecting the growing effectiveness of the system’s transparency and interpretability. As shown in Figure 11, feature importance accuracy improved from 80% to 95%, demonstrating the system’s ability to identify and prioritize key features contributing to predictions. Similarly, decision explanation clarity increased from 75% to 90%, highlighting the enhanced interpretability of the model’s decisions and better alignment with user expectations. User trust, a critical metric for adoption and engagement, grew from 70% to 88%, showcasing increased confidence in the system’s recommendations as explanations became more transparent and meaningful. These iterative improvements emphasize the success of XAI in balancing accuracy, interpretability, and user confidence, reinforcing its role in fostering a reliable and user-centric learning system.

5.7. Learner’s Progress with DKPS

The integration of feedback into the DKPS allowed iterative refinements to address individual learner needs and improve their performance. It provides an analysis of learner improvement metrics, considering explicit and implicit signals and the impact of feedback.
As mentioned in Figure 12, all learners experienced growth in score improvement after predictions. Learners 1, 7, and 9 showed the most significant gains, indicating the effectiveness of prediction-driven recommendations. Combined effectiveness improved for all learners, with notable increases for Learners 3, 4, and 6, who had struggled before. Learner 9 maintained the highest effectiveness (from 35% to 40%) due to strong engagement and alignment with personalized paths. Struggling learners (e.g., Learners 4 and 6) showed improvement in both metrics due to targeted intermediate modules and feedback loops. Improving learners (e.g., Learners 2, 5, and 8) demonstrated steady progress, benefiting from tailored recommendations.

6. Conclusions

To enhance learners’ domain-specific knowledge, the proposed hybrid model, DKPS, combines knowledge graphs, GANs, and XAI for personalized learning. To construct the learner’s profile, it integrates explicit signals, namely inputs and assessment results, with implicit signals, namely interaction patterns and learning behaviours. Based on the formulated learner’s profile, the hybrid model suggests targeted learning modules. Knowledge graphs ensure logical consistency across the domain and provide structured navigation between associated concepts. Additionally, generative adversarial network (GAN)-generated quizzes, tips, and hints dynamically assess the learner’s grasp of the material, creating a more engaging and interactive learning process.
The integration of a feedback loop into the hybrid model, where learners provide explicit feedback on their experience, also helps to improve the model’s efficiency. Furthermore, by analysing the learner’s style, such as their preference for visual, auditory, or hands-on content, the system adapts its recommendations to enhance the learner’s overall learning curve. Explainable AI (XAI) ensures transparency, making it clear how recommendations are generated and building trust with users. Future developments aim to create a fully dynamic and adaptive learning framework, capable of predicting and adapting an entire personalized learning path; this path will evolve after every cycle by aligning with the learner’s unique preferences and deepening their expertise in the domain. In future work, the system will also offer multimedia content, including videos and concept-based infographics, to simplify complex concepts and improve the learner’s understanding.

Author Contributions

Conceptualization, S.M. and S.P.; methodology, S.M.; software, S.M.; validation, S.M. and S.P.; formal analysis S.P.; investigation, S.P.; resources, S.M.; data curation S.M.; writing—original draft preparation, S.M. and S.P.; writing—review and editing, S.P.; visualization, S.M. and S.P.; supervision, S.P.; project administration, S.P.; funding acquisition, S.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The dataset used in this study is synthetically generated to replicate a real-world personalized learning environment while addressing privacy and accessibility concerns associated with real learner data. It comprises information from 1000 simulated learners aged 13 to 17 (grades 8 to 12), incorporating both explicit signals (e.g., quiz scores, feedback ratings, module completion) and implicit signals (e.g., time on task, clickstream data, interaction patterns). The dataset supports the construction of dynamic learner profiles and personalized content recommendations within the proposed DKPS framework. Due to its simulated nature, the dataset does not contain personally identifiable information and poses no privacy risks. The data, along with the code used for its generation and the corresponding domain-specific knowledge graph structure, are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DKPS: dynamic knowledge component prediction system
ML: machine learning
AI: artificial intelligence
XAI: explainable artificial intelligence
GAN: generative adversarial network
XGBoost: extreme gradient boosting
GRU: gated recurrent unit
LSTM: long short-term memory
RNN: recurrent neural network

References

  1. Imamah, U.L.; Djunaidy, A.; Purnomo, M.H. Development of dynamic personalized learning paths based on knowledge preferences & the ant colony algorithm. IEEE Access 2024, 12, 144193–144207. [Google Scholar]
  2. Kamsa, R.; Elouahbi, R. The combination between the individual factors and the collective experience for ultimate optimization learning path using ant colony algorithm. Int. J. Adv. Sci. Eng. Inf. Technol. 2018, 8, 1198–1208. [Google Scholar] [CrossRef]
  3. Rodriguez-Medina, A.E.; Dominguez-Isidro, S.; Ramirez-Martinell, A. A Microlearning path recommendation approach based on ant colony optimization. J. Intell. Fuzzy Syst. 2022, 42, 4699–4708. [Google Scholar] [CrossRef]
  4. Ogata, H.; Flanagan, B.; Takami, K.; Dai, Y.; Nakamoto, R.; Takii, K. EXAIT: Educational eXplainable artificial intelligent tools for personalized learning. Res. Pract. Technol. Enhanc. Learn. 2024, 19. [Google Scholar] [CrossRef]
  5. Yuhana, U.L.; Djunaidy, A.; Purnomo, M.H. Enhancing students performance through dynamic personalized learning path using ant colony and item response theory (ACOIRT). Comput. Educ. Artif. Intell. 2024, 7, 100280. [Google Scholar]
  6. Kardan, A.A.; Ebrahim, M.A.; Imani, M.B. A new personalized learning path generation method: Aco-map. Indian J. Sci. Res. 2014, 5, 17–24. [Google Scholar]
  7. Choi, H.; Lee, H.; Lee, M. Optimal knowledge component extracting model for knowledge concept graph completion in education. IEEE Access 2023, 11, 15002–15013. [Google Scholar] [CrossRef]
  8. Sharif, M.; Uckelmann, D. Multi-Modal LA in Personalized Education Using Deep Reinforcement Learning. IEEE Access 2024, 12, 54049–54065. [Google Scholar] [CrossRef]
  9. Lai, Y.; Huang, Y. Innovations in Online Learning Analytics: A Review of Recent Research and Emerging Trends. IEEE Access 2024, 12, 166761–166775. [Google Scholar]
  10. Essa, S.G.; Celik, T.; Human-Hendricks, N.E. Personalized Adaptive Learning Technologies Based on Machine Learning Techniques to Identify Learning Styles: A Systematic Literature Review. IEEE Access 2023, 11, 48392–48409. [Google Scholar] [CrossRef]
  11. Chen, L.; Chen, P.; Lin, Z. Artificial Intelligence in Education: A Review. IEEE Access 2020, 8, 75264–75278. [Google Scholar] [CrossRef]
  12. Wang, H.; Tlili, A.; Lehman, J.D.; Lu, H.; Huang, R. Investigating Feedback Implemented by Instructors to Support Online Competency-Based Learning (CBL): A Multiple Case Study. Int. J. Educ. Technol. High. Educ. 2021, 18, 5. [Google Scholar] [CrossRef]
  13. Smith, J.A.; Johnson, L.M.; Williams, R.T. Student Voice on Generative AI: Perceptions, Benefits, and Challenges in Higher Education. Educ. Technol. Soc. 2023, 26, 45–58. [Google Scholar]
  14. Nguyen, P.L.; Brown, T.K. Social Comparison Feedback in Online Teacher Training and Its Impact on Asynchronous Collaboration. J. Educ. Comput. Res. 2022, 59, 789–812. [Google Scholar]
  15. Lee, S.H.; Kim, J.W.; Park, Y.S. The Influence of E-Learning on Exam Performance and the Role of Achievement Goals in Shaping Learning Patterns. Internet High. Educ. 2021, 50, 100–110. [Google Scholar]
  16. Zhang, Y.; Li, X.; Wang, Q. Dynamic Personalized Learning Path Based on Triple Criteria Using Deep Learning and Rule-Based Method. IEEE Trans. Learn. Technol. 2020, 13, 283–294. [Google Scholar]
  17. Chen, L.; Zhao, X.; Liu, H. Research on Dynamic Learning Path Recommendation Based on Social Network. Comput. Educ. 2019, 136, 1–10. [Google Scholar]
  18. Nagisetty, V.; Graves, L.; Scott, J.; Ganesh, V. XAI-GAN: Enhancing Generative Adversarial Networks via Explainable AI Systems. arXiv 2020, arXiv:2002.10438. [Google Scholar]
  19. Chen, Y.; Qin, X.; Zhang, Y.; Wang, G.; Liu, X.; Li, X. FedGen: Personalized federated learning with data generation for heterogeneous clients. Future Gener. Comput. Syst. 2024, 139, 52–63. [Google Scholar]
  20. Abbes, F.; Bennani, S.; Maalel, A. Generative AI in Education: Advancing Adaptive and Personalized Learning. SN Comput. Sci. 2024, 5, 1154. [Google Scholar] [CrossRef]
  21. Hariyanto, D.; Kohler, T. An Adaptive User Interface for an E-learning System by Accommodating Learning Style and Initial Knowledge. In Proceedings of the International Conference on Technology and Vocational Teachers (ICTVT 2017), Yogyakarta, Indonesia, 28 September 2017; Atlantis Press: Dordrecht, The Netherlands; pp. 16–23. [Google Scholar]
  22. Jeevamol, S.; Renumol, V.G.; Jayaprakash, S. An Ontology-Based Hybrid E-Learning Content Recommender System for Alleviating the Cold-Start Problem. Educ. Inf. Technol. 2021, 26, 7259–7283. [Google Scholar] [CrossRef]
  23. Elshani, L.; PirevaNuçi, K. Constructing a Personalized Learning Path Using Genetic Algorithms Approach. arXiv 2021, arXiv:2104.11276. [Google Scholar]
  24. Sharma, V.; Kumar, A. Smart Education with Artificial Intelligence Based Determination of Learning Styles. Procedia Comput. Sci. 2018, 132, 834–842. [Google Scholar]
  25. Allioui, Y.; Chergui, M.; Bensebaa, T.; Belouadha, F.Z. Combining Supervised and Unsupervised Machine Learning Algorithms to Predict the Learners’ Learning Styles. Procedia Comput. Sci. 2019, 148, 87–96. [Google Scholar]
  26. Hmedna, B.; El Mezouary, A.; Baz, O.; Mammass, D. Identifying and Tracking Learning Styles in MOOCs: A Neural Networks Approach. Int. J. Innov. Appl. Stud. 2017, 19, 267–275. [Google Scholar]
  27. Alghazzawi, D.; Said, N.; Alhaythami, R.; Alotaibi, M. A Survey of Artificial Intelligence Techniques Employed for Adaptive Educational Systems within E-Learning Platforms. J. Artif. Intell. Soft Comput. Res. 2017, 7, 47–64. [Google Scholar]
  28. Arslan, R.C.; Zapata-Rivera, D.; Lin, L. Opportunities and Challenges of Using Generative AI to Personalize Educational Assessments. Front. Artif. Intell. 2024, 7, 1460651. [Google Scholar] [CrossRef] [PubMed]
  29. Schneider, J. Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda. Artif. Intell. Rev. 2024, 57, 289. [Google Scholar] [CrossRef]
  30. Zhang, Y.-W.; Xiao, Q.; Song, Y.-L.; Chen, M.-M. Learning Path Optimization Based on Multi-Attribute Matching and Variable Length Continuous Representation. Symmetry 2022, 14, 2360. [Google Scholar] [CrossRef]
  31. Yu, H.; Guo, Y. Generative Artificial Intelligence Empowers Educational Reform: Current Status, Issues, and Prospects. Front. Educ. 2023, 8, 1183162. [Google Scholar] [CrossRef]
  32. Sajja, R.; Sermet, Y.; Cikmaz, M.; Cwiertny, D.; Demir, I. Artificial Intelligence-Enabled Intelligent Assistant for Personalized and Adaptive Learning in Higher Education. Information 2024, 15, 596. [Google Scholar] [CrossRef]
  33. Jaboob, M.; Hazaimeh, M.; Al-Ansi, A.M. Integration of Generative AI Techniques and Applications in Student Behavior and Cognitive Achievement in Arab Higher Education. Int. J. Educ. Technol. High. Educ. 2024, 21, 45. [Google Scholar] [CrossRef]
  34. Mao, J.; Chen, B.; Liu, J.C. Generative Artificial Intelligence in Education and Its Implications for Assessment. TechTrends 2024, 68, 58–66. [Google Scholar] [CrossRef]
  35. Lai, J.W. Adapting Self-Regulated Learning in an Age of Generative Artificial Intelligence Chatbots. Future Internet 2024, 16, 218. [Google Scholar] [CrossRef]
  36. Pesovski, I.; Santos, R.M.; Henriques, R.; Trajkovik, V. Generative AI for Customizable Learning Experiences. Sustainability 2024, 16, 3034. [Google Scholar] [CrossRef]
  37. Balasubramanian, J.; Koppaka, S.; Rane, C.; Raul, N. Parameter based survey of recommendation systems. In Proceedings of the International Conference on Innovative Computing and Communication (ICICC), New Delhi, India, 21–23 February 2020; Volume 1. Available online: https://ssrn.com/abstract=3569579 (accessed on 7 April 2020).
  38. Shibani, A.S.M.; Mohd, M.; Ghani, A.T.A.; Zakaria, M.S.; Al-Ghuribi, S.M. Identification of critical parameters affecting an Elearning recommendation model using Delphi method based on expert validation. Information 2023, 14, 207. [Google Scholar] [CrossRef]
  39. Vanitha, V.; Krishnan, P.; Elakkiya, R. Collaborative optimization algorithm for learning path construction in Elearning. Comput. Electr. Eng. 2019, 77, 325–338. [Google Scholar] [CrossRef]
  40. Nabizadeh, A.H.; Gonçalves, D.; Gama, S.; Jorge, J.; Rafsanjani, H.N. Adaptive learning path recommender approach using auxiliary learning objects. Comput. Educ. 2020, 147, 103777. [Google Scholar] [CrossRef]
  41. Sun, C.; Huang, S.; Sun, B. Personalized learning path planning for higher education based on deep generative models and quantum machine learning: A multimodal learning analysis method integrating transformer, adversarial training, and quantum state classification. Discov. Artif. Intell. 2025, 5, 29. [Google Scholar] [CrossRef]
  42. Li, L.; Zhang, Y.; Chen, L. Personalized prompt learning for explainable recommendation. ACM Trans. Inf. Syst. 2023, 41, 1–26. [Google Scholar] [CrossRef]
  43. Ogata, H.; Flanagan, B.; Takami, K.; Dai, Y.; Nakamoto, R.; Takii, K. EXAIT: Educational XAI Tools for Personalized Learning. Res. Pract. Technol. Enhanc. Learn. 2024, 19, 1–15. [Google Scholar]
  44. Zhang, L.; Lin, J.; Borchers, C.; Cao, M.; Hu, X. 3DG: A Framework for Using Generative AI for Handling Sparse Learner Performance Data from Intelligent Tutoring Systems. arXiv 2024, arXiv:2402.01746. [Google Scholar]
  45. Maity, S.; Deroy, A. Bringing GAI to Adaptive Learning in Education. arXiv 2024, arXiv:2410.10650. [Google Scholar]
Figure 1. Proposed DKPS workflow model.
Figure 2. Model accuracy comparison for explicit signals.
Figure 3. Statistical results for explicit signals.
Figure 4. Model comparison for implicit signals.
Figure 5. Statistical results for implicit signals.
Figure 6. Model accuracy comparison.
Figure 7. Knowledge graph performance metrics.
Figure 8. Content generation performance metrics.
Figure 9. Impact of feedback integration in DKPS.
Figure 10. Refinement of DKPS.
Figure 11. Integration of XAI performance metrics.
Figure 12. Learners’ progression trend.
Table 1. Different categories of signals.
Signal Type | Description | Examples
Explicit Signals | Direct inputs provided by the learner or system. | Pre-test scores; post-test scores; module preferences; feedback ratings; satisfaction rating; initial module
Implicit Signals | Learner’s behavioural data representing how they learn. | Time spent on modules; retries; engagement levels; clickstream data
Table 2. Comparison of previous studies on personalised learning.
Research | Number of Parameters | Learner’s Characteristics | Dynamic Personalised Learning Path | Feedback-Driven Interpretability
Kamsa et al., 2018 [2] | Two | Level of knowledge and learner’s history | Static | No
Vanitha et al., 2019 [39] | Three | Learner emotion, cognitive ability, and difficulty level of learning objective | Static | No
Kardan et al., 2014 [6] | Two | Pre-test value and grouping of the learner categories | Static | No
Rodriguez-Medina et al., 2022 [3] | Two | Preference for knowledge level of the student and learning status | Static | No
Essa et al., 2023 [10] | Two | Relevant data on the learner obtained through the browser, such as browsing history and data collaboration | Static | No
Ogata et al., 2024 [4] | Three (through learner analytics tool) | Log data, survey data, and assessment data | Static | No
Imamah et al., Aug 2024 [1] | Five (through created dashboard) | Knowledge level, self-estimation, initial module, target module, and difficulty level of learning objective | Dynamic | No
Imamah et al., Dec 2024 [5] | Through Item Response Theory (IRT) framework | Parameters utilized by different models focus on difficulty level, discrimination, guessing, and carelessness | Dynamic | No
Proposed Approach | Explicit and implicit parameters with GAN and XAI implemented | Hybrid ML models | Dynamic | Incorporated
Table 3. Model preference details for explicit signals.
Model | Performance
Logistic Regression | Low accuracy for complex datasets.
Decision Tree | Moderate accuracy, high variance.
Random Forest | High accuracy, moderate efficiency.
Support Vector Machine (SVM) | Moderate accuracy, slow performance.
XGBoost | Best accuracy and efficiency.
Table 4. Model preference details for implicit signals.
Model | Performance
Recurrent Neural Network (RNN) | Moderate accuracy, high variance.
Long Short-Term Memory (LSTM) | High accuracy, slower training.
Gated Recurrent Unit (GRU) | High accuracy, fast training.
Transformers | Best accuracy, very slow.
Table 5. Selected hybrid model for proposed system.
Model | Purpose | Strengths
XGBoost | Process explicit signals | High accuracy, robust to overfitting, and interpretable.
GRU | Process implicit signals | Captures sequential patterns efficiently.
Hybrid (Ensemble) | Combine XGBoost and GRU predictions | Achieves higher accuracy and robust predictions.
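The ensemble row above amounts to a late-fusion step: each base model scores the candidate next module, and the two scores are blended into one readiness estimate. A minimal sketch in Python, assuming each trained model exposes a probability for the candidate module; the 0.6/0.4 weighting and the function name are illustrative, not the paper’s exact configuration:

```python
def hybrid_predict(xgb_prob: float, gru_prob: float, w_xgb: float = 0.6) -> float:
    """Blend the explicit-signal (XGBoost) and implicit-signal (GRU)
    probabilities into a single readiness score for a candidate module."""
    return w_xgb * xgb_prob + (1.0 - w_xgb) * gru_prob

# Illustrative values: XGBoost is confident, the GRU less so.
score = hybrid_predict(0.92, 0.80)
print(round(score, 3))  # 0.872
```

A weighted average is the simplest ensemble choice; stacking a small meta-model over the two outputs is a common alternative when more labelled data is available.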
Table 6. Performance metrics of selected models.
Metric | XGBoost | GRU | Hybrid
Accuracy | 92.0% | 88.0% | 95.0%
Precision | 90.0% | 87.0% | 94.0%
Recall | 91.0% | 86.0% | 93.0%
F1-Score | 90.5% | 86.5% | 93.5%
Table 7. Signal categorisation.
Signal Type | Signal Name | Description | Example Source
Explicit | Pre-test Score | Measures prior knowledge before starting a module. | Pre-test assessment
Explicit | Post-test Score | Assesses knowledge improvement after completing a module. | Post-test assessment
Explicit | Satisfaction Rating | Indicates learner feedback on the module’s quality or difficulty. | Learner feedback form
Explicit | Module Completion Status | Tracks whether the learner has successfully completed the module. | System logs
Explicit | Initial Module | Content which the learner has chosen. | Content information
Implicit | Time Spent on Module | Measures total time the learner spends engaging with a module. | System usage logs
Implicit | Click Count | Tracks the number of clicks or interactions made within the learning system. | System interaction logs
Implicit | Engagement Trends | Frequency of interactions with quizzes, videos, or simulations. | Interaction trackers
Implicit | Revisit Count | Number of times the learner revisits specific content. | System logs
Table 8. Steps involved in preprocessing explicit signals.
Step | Category | Description
Data Cleaning | -- | Handle missing values (e.g., imputation with mean/median); remove duplicate records.
Transformation | Feature Scaling | Normalize or standardize numerical features to bring them to the same scale.
Transformation | Conversion | Convert categorical variables into numerical representations.
Transformation | Outlier Detection | Identify and remove extreme values that may skew the model’s performance.
Feature Engineering | -- | Derive features, such as normalized scores and score improvement, to enhance model performance.
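The cleaning, scaling, and feature-engineering steps above can be sketched on a toy record set. This is a minimal sketch, not the paper’s exact pipeline: the dictionary keys, mean imputation, and min-max scaling are illustrative assumptions.

```python
from statistics import mean

def preprocess_explicit(records):
    """Impute missing post-test scores with the mean, derive score
    improvement, and min-max scale pre-test scores (per Table 8)."""
    observed = [r["post"] for r in records if r["post"] is not None]
    fill = mean(observed)
    for r in records:
        if r["post"] is None:
            r["post"] = fill                      # data cleaning: imputation
        r["improvement"] = r["post"] - r["pre"]   # feature engineering
    lo = min(r["pre"] for r in records)
    hi = max(r["pre"] for r in records)
    for r in records:
        r["pre_scaled"] = (r["pre"] - lo) / (hi - lo)  # feature scaling
    return records

data = [{"pre": 70, "post": 85}, {"pre": 50, "post": None}, {"pre": 60, "post": 80}]
out = preprocess_explicit(data)
print(out[1]["post"], out[0]["improvement"], out[0]["pre_scaled"])  # 82.5 15 1.0
```

The derived `improvement` column matches the Score Improvement values shown in Table 9 (post-test minus pre-test).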
Table 9. Sample data after preprocessing explicit signals.
Learner ID | Pre-Test Score (%) | Post-Test Score (%) | Satisfaction Rating (1–5) | Completion Status (1 = Completed, 0 = Not Completed) | Score Improvement (%) | Initial Module
1 | 70 | 85 | 5 | 1 | 15 | Data Structures
2 | 60 | 80 | 4 | 1 | 20 | Data Structures
3 | 55 | 65 | 5 | 1 | 10 | Algorithms
4 | 72 | 90 | 3 | 0 | 18 | Binary Trees
5 | 65 | 77 | 4 | 1 | 12 | Graph Algorithms
6 | 50 | 58 | 3 | 0 | 8 | SQL Basics
7 | 60 | 75 | 5 | 1 | 15 | Testing
8 | 55 | 65 | 4 | 1 | 10 | Networking
9 | 70 | 95 | 5 | 1 | 25 | Machine Learning
10 | 62 | 80 | 4 | 1 | 18 | Data Analytics
Table 10. Steps involved in preprocessing implicit signals.
Step | Category | Description
Data Cleaning | -- | Remove incomplete sequences; handle missing time steps using interpolation or padding.
Transformation | Normalization | Scale sequential data to a fixed range to improve convergence during model training.
Transformation | Sequence Padding | Ensure all sequences are of the same length by padding shorter sequences or truncating longer ones.
Transformation | Categorical Conversion | Encode sequential categorical data into numerical format.
Feature Engineering | -- | Extract features such as time spent, retries, engagement rate, and click rate to better model the learner’s knowledge domain.
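The sequence-padding step can be sketched as: fix a target length, append a pad value to short sequences, and truncate long ones. The length of 6 matches the Padded Sequence column in the sample data that follows; the function name is illustrative.

```python
def pad_sequence(seq, length=6, pad_value=0):
    """Pad shorter sequences with zeros and truncate longer ones so every
    learner sequence has the same fixed length before GRU training."""
    return (list(seq) + [pad_value] * length)[:length]

# Learner 1: [time spent, click count, engagement trend, revisit count]
print(pad_sequence([50, 30, 2, 3]))  # [50, 30, 2, 3, 0, 0]
```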
Table 11. Sample data after preprocessing implicit signals.
Learner ID | Initial Module | Time Spent (Minutes) | Click Count | Engagement Trend | Revisit Count | Padded Sequence | Expanded Remarks
1 | Data Structures | 50 | 30 | 2 | 3 | [50, 30, 2, 3, 0, 0] | Moderate engagement with consistent interaction
2 | Data Structures | 25 | 20 | 1 | 1 | [25, 20, 1, 1, 0, 0] | Average engagement pattern
3 | Algorithms | 60 | 50 | 3 | 5 | [60, 50, 3, 5, 0, 0] | Highly engaged and revisits frequently
4 | Binary Trees | 15 | 10 | 1 | 1 | [15, 10, 1, 1, 0, 0] | Low engagement, requires motivation
5 | Graph Algorithms | 45 | 28 | 2 | 2 | [45, 28, 2, 2, 0, 0] | Moderate engagement with consistent interaction
6 | SQL Basics | 20 | 15 | 1 | 1 | [20, 15, 1, 1, 0, 0] | Average engagement pattern
7 | Testing | 55 | 40 | 3 | 3 | [55, 40, 3, 3, 0, 0] | Highly engaged and revisits frequently
8 | Networking | 35 | 25 | 2 | 2 | [35, 25, 2, 2, 0, 0] | Moderate engagement with consistent interaction
9 | Machine Learning | 10 | 8 | 1 | 1 | [10, 8, 1, 1, 0, 0] | Low engagement, requires motivation
10 | Data Analytics | 65 | 45 | 3 | 5 | [65, 45, 3, 5, 0, 0] | Highly engaged and revisits frequently
Table 12. Data after preprocessing both explicit and implicit signals (pre-test score through completion status are explicit signals; time spent through retries are implicit signals).
Learner ID | Initial Module | Pre-Test Score (%) | Post-Test Score (%) | Improvement (%) | Satisfaction Rating (1–5) | Completion Status (1 = Completed) | Time Spent (Minutes) | Click Count | Engagement Trend (Low = 1, Med = 2, High = 3) | Retries | Knowledge Graph Validation Outcome
1 | Data Structures | 70 | 85 | 15 | 5 | 1 | 50 | 30 | 2 | 3 | Valid
2 | Data Structures | 60 | 80 | 20 | 4 | 1 | 25 | 20 | 1 | 1 | Invalid
3 | Algorithms | 55 | 65 | 10 | 5 | 1 | 60 | 50 | 3 | 5 | Valid
4 | Binary Trees | 72 | 90 | 18 | 3 | 0 | 15 | 10 | 1 | 1 | Invalid
5 | Graph Algorithms | 65 | 77 | 12 | 4 | 1 | 45 | 28 | 2 | 2 | Valid
6 | SQL Basics | 50 | 58 | 8 | 3 | 0 | 20 | 15 | 1 | 1 | Invalid
7 | Testing | 60 | 75 | 15 | 5 | 1 | 55 | 40 | 3 | 3 | Valid
8 | Networking | 55 | 65 | 10 | 4 | 1 | 35 | 25 | 2 | 2 | Valid
9 | Machine Learning | 70 | 95 | 25 | 5 | 1 | 10 | 8 | 1 | 1 | Valid
10 | Data Analytics | 62 | 80 | 18 | 4 | 1 | 65 | 45 | 3 | 5 | Invalid
Table 13. Knowledge graph structure.
Initial Module | Target Module 1 | Target Module 2 | Target Module 3
Data Structures | Algorithms | Trees | Graph Algorithms
Algorithms | Dynamic Programming | Graph Algorithms | Machine Learning
Binary Trees | Graph Algorithms | Advanced Trees | Segment Trees
Graph Algorithms | Shortest Path Algorithms | Network Flow | Advanced Graph Theory
SQL Basics | Database Optimization | Advanced SQL | Data Warehousing
Testing | Integration Testing | System Testing | Performance Testing
Networking | Operating Systems | Network Security | Cloud Networking
Machine Learning | Deep Learning | Natural Language Processing | Reinforcement Learning
Data Analytics | Big Data Analytics | Business Intelligence | Visualization Techniques
Operating Systems | Memory Management | Process Scheduling | Virtualization
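The Valid/Invalid outcomes reported for each prediction follow from a successor lookup on this graph. A minimal sketch, assuming the graph is stored as an adjacency map keyed by initial module (only a few rows of Table 13 are reproduced here; the function name is illustrative):

```python
# Successor edges taken from a subset of Table 13.
KNOWLEDGE_GRAPH = {
    "Data Structures": ["Algorithms", "Trees", "Graph Algorithms"],
    "Algorithms": ["Dynamic Programming", "Graph Algorithms", "Machine Learning"],
    "Graph Algorithms": ["Shortest Path Algorithms", "Network Flow",
                         "Advanced Graph Theory"],
    "SQL Basics": ["Database Optimization", "Advanced SQL", "Data Warehousing"],
}

def validate_prediction(current: str, predicted: str) -> str:
    """Mark a predicted target module Valid only when the knowledge graph
    lists it as a direct successor of the learner's current module."""
    return "Valid" if predicted in KNOWLEDGE_GRAPH.get(current, []) else "Invalid"

print(validate_prediction("Data Structures", "Algorithms"))        # Valid
print(validate_prediction("Data Structures", "Machine Learning"))  # Invalid
```

The second call mirrors Learner 2 in the prediction table: Machine Learning is reachable only through Algorithms, so the direct jump is flagged Invalid.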
Table 14. Sample data after target module prediction.
Learner ID | Current Module | Explicit Signals | Implicit Signals | Combined Signals | Predicted Module | Knowledge Graph Validation Outcome
Learner 1 | Data Structures | Score Improvement: 15%, Satisfaction: 4, Completion: Yes | Time Spent: 50 min, Clicks: 30, Engagement: Medium | High readiness; consistent engagement | Algorithms | Valid (Algorithms is a direct successor of Data Structures)
Learner 2 | Data Structures | Score Improvement: 20%, Satisfaction: 5, Completion: Yes | Time Spent: 40 min, Clicks: 25, Engagement: High | Strong readiness; highly engaged | Machine Learning | Invalid (Machine Learning requires prior knowledge of Algorithms)
Learner 3 | Algorithms | Score Improvement: 10%, Satisfaction: 3, Completion: No | Time Spent: 25 min, Clicks: 15, Engagement: Low | Needs review; weak engagement | Data Structures | Valid (Data Structures is a prerequisite for Algorithms)
Learner 4 | Binary Trees | Score Improvement: 18%, Satisfaction: 5, Completion: Yes | Time Spent: 60 min, Clicks: 35, Engagement: High | Advanced readiness; highly engaged | Graph Algorithms | Invalid (Graph Algorithms does not directly depend on Binary Trees)
Learner 5 | Graph Algorithms | Score Improvement: 12%, Satisfaction: 4, Completion: Yes | Time Spent: 50 min, Clicks: 30, Engagement: Medium | Consistent readiness; good engagement | Shortest Path Algorithms | Valid (Shortest Path Algorithms is an advanced topic after Graph Algorithms)
Learner 6 | SQL Basics | Score Improvement: 8%, Satisfaction: 3, Completion: No | Time Spent: 30 min, Clicks: 20, Engagement: Low | Weak readiness; low engagement | Database Optimization | Invalid (it requires foundational database knowledge); more content should be generated to reinforce SQL basics
Learner 7 | Testing | Score Improvement: 15%, Satisfaction: 4, Completion: Yes | Time Spent: 45 min, Clicks: 25, Engagement: Medium | Consistent readiness; good engagement | Integration Testing | Valid (Integration Testing builds on Testing)
Learner 8 | Networking | Score Improvement: 10%, Satisfaction: 3, Completion: No | Time Spent: 20 min, Clicks: 15, Engagement: Low | Needs review; weak engagement | Operating Systems | Valid (Operating Systems builds on Networking concepts)
Learner 9 | Machine Learning | Score Improvement: 25%, Satisfaction: 5, Completion: Yes | Time Spent: 65 min, Clicks: 45, Engagement: High | Strong readiness; excellent engagement | Deep Learning | Valid (Deep Learning is the next step after Machine Learning)
Learner 10 | Data Analytics | Score Improvement: 18%, Satisfaction: 4, Completion: Yes | Time Spent: 50 min, Clicks: 30, Engagement: Medium | High readiness; good engagement | Artificial Intelligence | Invalid (Artificial Intelligence is unrelated to Data Analytics in the KG)
Table 15. Feedback loop depiction with action taken.
Iteration | Feedback Collected | Action Taken | Result
1 | “Module too difficult” (explicit) | Adjusted the difficulty level of the content; foundational concepts are provided before the learner progresses further. | Increased learner satisfaction by 10%.
2 | Low time spent, high revisit counts (implicit) | Personalized module content (quizzes and hints). | Engagement trends improved by 12%.
3 | “Recommendations are unrelated” (explicit) | Expanded knowledge graph edges by using the GAN to create personalized intermediate modules. | Reduced invalid predictions by 20%.
4 | High drop-off rate in advanced modules (implicit) | Introduced intermediate modules dynamically using the GAN. | Learner retention increased by 15%.
5 | Satisfaction ratings (1–5) inconsistent across modules (explicit) | Retrained the model with updated feature weights. | Prediction accuracy improved by 8%.
6 | Learners skipping certain modules (implicit) | Ensured learners completed foundational prerequisites before progressing. | Coverage of learning paths improved by 10%.
7 | “Lack of examples in content” (explicit) | Added GAN-generated examples dynamically. | Learner engagement increased by 14%.
8 | High engagement but low quiz scores (implicit) | Suggested review modules before advancing. | Knowledge retention improved by 18%.
9 | Positive feedback on personalized paths (explicit) | Reinforced current recommendation logic. | System stability and reliability increased.
10 | Learner satisfaction consistently high (explicit + implicit) | Scaled the system for new users. | Model readiness for deployment confirmed.
Table 16. Performance metrics of ensemble models.
Iteration | Accuracy (%) | Precision | Recall | F1-Score
1 | 70 | 0.68 | 0.75 | 0.71
2 | 73 | 0.71 | 0.77 | 0.74
3 | 76 | 0.74 | 0.79 | 0.76
4 | 79 | 0.77 | 0.81 | 0.79
5 | 82 | 0.80 | 0.83 | 0.81
6 | 85 | 0.82 | 0.85 | 0.84
7 | 87 | 0.84 | 0.86 | 0.85
8 | 89 | 0.86 | 0.87 | 0.87
9 | 91 | 0.88 | 0.88 | 0.88
10 | 93 | 0.90 | 0.89 | 0.89
Average | 83 | 0.80 | 0.83 | 0.81
Table 17. Performance metrics of models.
Iteration | Precision | Recall | F1-Score
1 | 0.70 | 0.80 | 0.74
2 | 0.72 | 0.82 | 0.76
3 | 0.75 | 0.83 | 0.78
4 | 0.77 | 0.85 | 0.80
5 | 0.80 | 0.85 | 0.82
6 | 0.82 | 0.86 | 0.84
7 | 0.84 | 0.87 | 0.85
8 | 0.86 | 0.88 | 0.87
9 | 0.88 | 0.88 | 0.88
10 | 0.90 | 0.89 | 0.90
Precision improved steadily from 0.70 to 0.90, indicating fewer irrelevant or incorrect predictions. Recall increased from 0.80 to 0.89, reflecting better identification of relevant data points across iterations. The F1-score rose gradually from 0.74 to 0.90, showing a balanced improvement in both precision and recall and an overall refinement of the system.
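These three quantities are tied together by the standard F1 definition, the harmonic mean F1 = 2PR/(P + R). A quick sanity check against iteration 9 of Table 17 (P = 0.88, R = 0.88):

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(0.88, 0.88), 2))  # 0.88
```

Because the harmonic mean penalizes imbalance, an iteration can only reach a high F1 when precision and recall improve together, which is the pattern the table shows.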

Share and Cite

MDPI and ACS Style

Mohanraj, S.; Pichai, S. Explainable AI-Integrated and GAN-Enabled Dynamic Knowledge Component Prediction System (DKPS) Using Hybrid ML Model. Appl. Syst. Innov. 2025, 8, 82. https://doi.org/10.3390/asi8030082

