Computers
  • Article
  • Open Access

9 December 2025

Enhancing Student Engagement and Performance Through Personalized Study Plans in Online Learning: A Proof-of-Concept Pilot Study

1 Department of Information Technology, University of Moratuwa, Karubedda, Moratuwa 10400, Sri Lanka
2 University of Colombo School of Computing, University of Colombo, UCSC Building Complex, Reid Avenue, Colombo 00700, Sri Lanka
* Authors to whom correspondence should be addressed.
This article belongs to the Section Human–Computer Interactions

Abstract

This study examines how interaction data from Learning Management Systems (LMSs) can be leveraged to predict student performance and enhance academic outcomes through personalized study plans tailored to individual learning styles. The research followed three phases: (i) analyzing the relationship between engagement and performance, (ii) developing predictive models for academic outcomes, and (iii) generating customized study plan recommendations. Clustering analysis identified three distinct learner profiles—high-engagement–high-performance, low-engagement–high-performance, and low-engagement–low-performance—with no cases of high-engagement–low-performance, underscoring the pivotal role of engagement in academic success. Among clustering approaches, K-Means produced the most precise grouping. For prediction, Support Vector Machines (SVMs) achieved the highest accuracy (68.8%) in classifying students across 11 grade categories, supported by oversampling techniques to address class imbalance. Personalized study plans, derived using K-Nearest Neighbor (KNN) classifiers, significantly improved student performance in controlled experiments. To the best of our knowledge, this represents a novel attempt in this context to align predictive modeling with the full grading structure of undergraduate programs. These findings highlight the potential of integrating LMS data with machine learning to foster engagement and improve learning outcomes. Future work will focus on expanding datasets, refining predictive accuracy, and incorporating additional personalization features to strengthen adaptive learning.

1. Introduction

Over the past two decades, eLearning has progressively evolved into a widely adopted educational paradigm, with its growth accelerating significantly due to the global shift necessitated by the COVID-19 pandemic [1,2,3,4,5]. In response, educational institutions have increasingly embraced online platforms, with Learning Management Systems (LMSs) emerging as a central component in facilitating virtual learning. These platforms support a variety of instructional activities, including resource distribution, asynchronous and synchronous communication, collaborative learning, and assessments [6].
Beyond supplementing traditional face-to-face learning, LMSs now serve as the backbone of fully online education environments, offering unprecedented opportunities for self-directed and flexible learning. Consequently, they generate vast amounts of interaction data that reflect student engagement, learning behaviors, and performance patterns. This data-rich environment provides a fertile ground for data-driven educational interventions aimed at improving learning outcomes [7].
Despite these advancements, online learning environments still pose considerable challenges—particularly the lack of face-to-face interaction, which can result in learner isolation, diminished motivation, and inconsistent engagement [8]. Maintaining student interest and ensuring meaningful interaction with course content remain persistent obstacles [9]. Numerous studies have highlighted that student engagement is a key determinant of academic success, strongly correlated with learning outcomes and student satisfaction [10,11]. These insights are further supported by recent analyses of virtual learning environments, which emphasize the importance of persistence and consistency in engagement as strong predictors of learning performance [12,13]. Accordingly, numerous successful approaches have been proposed for predicting student performance in online learning environments [14,15]. Despite these developments, technological mechanisms to proactively support engagement and personalization remain underexplored. There is an increasing need for systems that not only predict student performance but also provide actionable, personalized recommendations to enhance learning outcomes—particularly in large-scale online learning environments. As AI continues to be integrated into educational platforms, customized learning experiences are expected to become a fundamental requirement in next-generation Learning Management Systems (LMSs) [16].
In this study, we aim to address this gap by investigating how LMS interaction data can be harnessed to support online learners through personalized study plan recommendations. We conceptualize engagement as the extent of interaction with LMS activities, and performance as the measurable outcome of assessments. The overarching goal is to explore whether personalized, data-driven study plans aligned with individual learning styles can significantly improve student outcomes. This study is guided by the following research questions:
RQ1:
What is the nature of the relationship between student engagement with LMS activities and academic performance?
RQ2:
How accurately can student performance be predicted based on LMS interaction patterns?
RQ3:
How effective are personalized study plans, aligned with learning styles, in enhancing student performance in online learning environments?
The remainder of this paper is organized as follows. Section 2 reviews related work on student engagement, performance prediction, and personalized learning in online education. Section 3 presents the research methodology, including data collection, preprocessing, and model development. Section 4 presents the results of the study: the clustering analysis used to explore the relationship between student engagement and academic performance, the predictive modeling techniques employed to forecast student outcomes, and the personalized study plan recommendation system, together with the evaluation of its effectiveness through a controlled experiment. Section 5 discusses key findings, implications, and limitations, while Section 6 concludes this study and outlines directions for future research.

3. Research Design

The research was carried out in three distinct phases: (1) identifying the relationships among learner engagement, interaction behavior, and academic performance; (2) predicting student performance based on these behavioral indicators; and (3) recommending personalized study plans based on individual learning styles. The overall research design concept, highlighting the key artifacts and deliverables associated with each stage of the research process, is depicted in Figure 1.
Figure 1. Research Design.
The three components of the research are highly interconnected and heavily reliant on a unified dataset comprising student interaction data and performance records. During the cluster analysis phase, the relationship between student engagement and academic performance was examined using both interaction and performance data. In the student performance prediction phase, engagement and performance metrics were used as inputs to a predictive model, which estimated expected academic outcomes based on current behavioral patterns. Additionally, to facilitate personalized study plan recommendations, individual learning styles were collected and systematically mapped to the corresponding interaction and performance data.
The evaluation of the study plan recommendation was conducted through a controlled experiment. Participants were randomly assigned to either an experimental group or a control group. Prior to the commencement of the experiment, the study procedure was clearly explained to all participants, and informed consent was obtained. Subjects in the experimental group were provided with personalized study plans, while those in the control group did not receive any intervention. To assess the effectiveness of the proposed study plan, the actual academic grades achieved by participants in both groups were compared against their respective predicted grades. The impact of the personalized study plan on academic performance was thus evaluated. Further details of the evaluation procedure are presented in the study plan recommendation section.

3.1. Data Collection and Preprocessing

This study required data from three key perspectives: (1) how students interacted with the learning platform, (2) their academic performance, and (3) their individual learning styles.
Participants (N = 846) were selected from the Faculty of Information Technology at the University of Moratuwa, Sri Lanka, using a convenience sampling method. The sample included undergraduates from three consecutive academic years—students in their second, third, and final years. From the wide range of courses offered through the Moodle LMS, six course modules were chosen for this study: Data Mining and Data Warehousing, Human–Computer Interaction, Software Engineering, Object-Oriented Analysis and Design, Management Information Systems, and Database Management Systems. These courses were selected because they posed similar levels of difficulty, helping ensure that students had a fairly uniform learning experience.
  • Interaction Data: The interaction data were extracted from Moodle’s system logs. These logs recorded every action performed on the LMS—not only by students, but also by instructors and administrators. Since the focus was on student behavior, the first step was to filter out only the student interactions. The data were then cleaned and processed to highlight meaningful learning activities. For each student, the number of interactions with each learning activity was counted using a custom program written in Python 3.10.6 for this purpose (a minimal sketch of this counting step is shown after this list). The processed data were saved as a CSV file for use in the later stages of the research.
  • Performance Data: The students’ academic performance was recorded based on their final grades for the selected course modules.
  • Learning Style Data: To understand each student’s preferred learning style, the VARK questionnaire was used. This tool helped classify students into four categories—Visual, Auditory, Reading/Writing, and Kinesthetic learners. The responses collected through Google Forms were first exported in CSV format. Since the raw data did not directly provide learning style scores, preprocessing was carried out to compute each student’s VARK profile. This step was automated using a Python script, which analyzed the questionnaire items, calculated the scores, and produced a dataset, which was subsequently linked to individual student index numbers.
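The custom counting program itself is not reported in the paper. The following pandas sketch illustrates the interaction-counting step described in the Interaction Data bullet above, under the assumption of hypothetical column names (student_id, user_role, component) for the exported Moodle log; real exports may use different headers.

```python
import pandas as pd

# Exported Moodle event log; column names here are assumptions, not the actual export schema.
logs = pd.read_csv("moodle_logs.csv")

# Keep only student-generated events, dropping instructor and administrator actions.
students = logs[logs["user_role"] == "student"]

# Count interactions per student and per activity component (e.g., quiz, forum, resource).
counts = (
    students.groupby(["student_id", "component"])
            .size()
            .unstack(fill_value=0)      # one column per activity type
            .reset_index()
)

counts.to_csv("interaction_counts.csv", index=False)
```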

Data Anonymization, Cleaning, and Transformation

To protect the privacy of participants and prevent direct identification of individual records, each student was assigned a unique identification number. This anonymized ID replaced the original index number and was also used when completing the VARK learning style questionnaire. The mapping between students’ learning styles and their interaction and performance data was established through these anonymized IDs.
Before proceeding to model development and cluster analysis, the combined dataset underwent several preprocessing and enhancement steps. First, missing values in the interaction data—where students had not performed certain actions—were replaced with zero (0), signifying no activity or engagement with the corresponding item. In cases where performance data were missing, the mean value of the respective feature column was imputed using the Weka software. Mean imputation is a simple and widely used method that enhances the clarity and reproducibility of analyses, particularly in exploratory or large-scale studies. By filling in missing values instead of discarding cases, this approach preserves the full sample, ensuring that statistical power is maintained throughout the analysis.
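The study performed mean imputation in Weka; the pandas sketch below illustrates an equivalent treatment under the same rules (zeros for absent interactions, column means for absent performance scores). The file name and column names are illustrative, not taken from the study's materials.

```python
import pandas as pd

df = pd.read_csv("combined_dataset.csv")          # hypothetical merged file

# Missing interaction counts mean the student never performed the action: fill with 0.
interaction_cols = ["course_views", "quiz_attempts", "forum_posts"]   # illustrative names
df[interaction_cols] = df[interaction_cols].fillna(0)

# Missing performance values are replaced with the column mean
# (the study used Weka; this line is a pandas equivalent of mean imputation).
df["final_marks"] = df["final_marks"].fillna(df["final_marks"].mean())
```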
To bring together behavioral and learning style information, the interaction records from two modules and two batches were combined into a single dataset. This dataset was then merged with the VARK results by matching the anonymized ID assigned to each student. The final, integrated dataset offers a clear view of both interaction frequencies and VARK scores for each student.
Finally, using Moodle’s event log fields, such as event context, component, and event name, a new dataset was built to support the modeling work in this study. Out of all possible events, twelve features were selected to represent key aspects of student interaction behavior. A detailed description of these features is provided in Table 1. The resulting dataset carried a total of 1709 records representing 846 students. As the anonymized student ID did not contribute to the prediction of academic performance, this attribute was excluded from the final dataset.
Table 1. Features extracted from LMS interaction logs.
To address potential outliers, a quantile-based filtering technique was applied using the quantile method in the pandas library. Specifically, values falling below the 0.001 quantile or above the 0.999 quantile within each feature column were considered outliers and were subsequently removed.
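A minimal sketch of this quantile-based filtering with the pandas quantile method is shown below, assuming the features have already been merged into a single DataFrame; the file name is illustrative.

```python
import pandas as pd

df = pd.read_csv("interaction_features.csv")      # hypothetical feature file
feature_cols = df.select_dtypes("number").columns

# Keep only rows whose values lie within the [0.001, 0.999] quantile band for every feature.
lower = df[feature_cols].quantile(0.001)
upper = df[feature_cols].quantile(0.999)
inliers = ((df[feature_cols] >= lower) & (df[feature_cols] <= upper)).all(axis=1)
df_filtered = df[inliers]
```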
Even after outlier removal, the dataset could still contain inconsistencies in scale or distribution that might affect the performance of machine learning models. Therefore, additional data transformation techniques were applied to produce a more symmetrically distributed dataset. These preprocessing and transformation steps were executed using the pandas and Scikit-learn libraries in Google Colab, resulting in a clean, well-structured dataset suitable for modeling.

3.2. Selection of Tools and Technologies

This study adopted a structured approach to explore the relationship between student engagement and academic performance in online learning environments. A combination of clustering algorithms, machine learning prediction models, and personalized recommendation techniques was employed. The selection of each method was grounded in its suitability for the dataset and its alignment with this study’s objectives. Together, these techniques enabled a comprehensive analysis of learner behavior, accurate performance forecasting, and the development of tailored study support strategies. The modeling techniques and tools applied in each component of the study, along with their selection rationale and key characteristics, are summarized in Table 2.
Table 2. Overview of the modeling techniques used in this study.

4. Results

4.1. Relationship Between Student Engagement and Academic Performance

Student engagement and academic performance are both highly variable and context-dependent factors. Given the absence of predefined classes or optimal clustering criteria, an exploratory cluster analysis was conducted to uncover patterns in the data. The objective was to identify natural groupings of students based on their interactions with the Learning Management System (LMS) and examine how these groupings relate to academic outcomes.
To determine the optimal number of clusters (K), the elbow method was employed by evaluating the within-cluster sum of squares (WCSS) across K values ranging from 1 to 10. A pronounced bend at K = 2 in Figure 2 indicated a significant reduction in WCSS, suggesting that a two-cluster solution best fits the data. The smaller bends at K = 3 and K = 6 hint at possible variations in learner behavior where a three-cluster model might capture a moderate engagement group, while a six-cluster model could reveal finer sub-profiles, such as video-focused or quiz-oriented learners. They were not considered substantial enough to warrant deviation from the two-cluster solution. Although not explored in this study, these alternative structures suggest promising directions for future research to better understand and support diverse engagement patterns.
Figure 2. Application of elbow method.
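The elbow analysis can be reproduced with Scikit-learn by computing the within-cluster sum of squares (the inertia_ attribute of a fitted K-Means model) for K = 1 to 10. The sketch below assumes X is the preprocessed interaction-feature matrix; the random seed and plotting choices are illustrative.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

wcss = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)   # X: scaled feature matrix
    wcss.append(km.inertia_)                                       # within-cluster sum of squares

plt.plot(range(1, 11), wcss, marker="o")
plt.xlabel("Number of clusters (K)")
plt.ylabel("WCSS")
plt.show()
```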
Accordingly, four clustering algorithms were applied with K = 2: K-Means clustering (KM), Agglomerative Hierarchical Clustering (AHC), Gaussian Mixture Model (GMM), and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The models were evaluated using three standard cluster evaluation metrics:
  • Silhouette Score: Assesses how similar an object is to its own cluster compared to other clusters. Higher values indicate better-defined clusters.
  • Calinski–Harabasz Index (CH Index): Measures the ratio of between-cluster dispersion to within-cluster dispersion. Higher values suggest better performance.
  • Davies–Bouldin Index (DB Index): Evaluates the average similarity between each cluster and its most similar one. Lower values are preferable.
Each model was compared pairwise to determine the best-performing approach. Initially, AHC was compared to the GMM. AHC outperformed the GMM in both the Silhouette Score and CH Index, indicating denser and better-separated clusters, despite having a slightly higher DB Index. Consequently, AHC was selected to proceed to the next comparison stage against DBSCAN. The results of each comparison cycle are shown in Table 3.
Table 3. Aggregated evaluation metrics for clustering models.
DBSCAN achieved the highest Silhouette Score, suggesting well-separated clusters. However, it performed poorly on the CH and DB Indices, making its overall performance inconsistent. Given this, both AHC and DBSCAN were compared to K-Means for final selection.
K-Means surpassed both AHC and DBSCAN across all metrics: it recorded the second-highest Silhouette Score (0.219318), the highest CH Index (244.933881), and the lowest DB Index (1.954872). These results reflect K-Means’ effectiveness in generating distinct and compact clusters with minimal overlap, making it the most suitable algorithm for this dataset.
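A compact sketch of this comparison is shown below: the four algorithms are fitted with Scikit-learn and each labeling is scored with the three metrics listed above. The DBSCAN parameters (eps, min_samples) are illustrative, not the values used in the study, and X is the same scaled feature matrix as before.

```python
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.mixture import GaussianMixture
from sklearn.metrics import (silhouette_score,
                             calinski_harabasz_score,
                             davies_bouldin_score)

models = {
    "KMeans": KMeans(n_clusters=2, n_init=10, random_state=42),
    "AHC": AgglomerativeClustering(n_clusters=2),
    "GMM": GaussianMixture(n_components=2, random_state=42),
    "DBSCAN": DBSCAN(eps=0.5, min_samples=5),     # illustrative parameters
}

for name, model in models.items():
    labels = model.fit_predict(X)                 # X: scaled feature matrix
    if len(set(labels)) < 2:                      # DBSCAN may collapse to a single cluster
        print(f"{name}: fewer than two clusters, metrics skipped")
        continue
    print(name,
          silhouette_score(X, labels),
          calinski_harabasz_score(X, labels),
          davies_bouldin_score(X, labels))
```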
To identify patterns of student engagement early in the course, we applied K-Means clustering to interaction data and assignment scores collected during the first two weeks of instruction. The K-Means algorithm was applied with K = 2, chosen based on the elbow method and silhouette analysis. The clustering revealed two distinct student groups:
  • Cluster 1: Low-engagement group with minimal LMS activity and content interaction, with low or no forum engagement.
  • Cluster 2: High-engagement group with high frequency of LMS logins and resource views, and active participation in discussion forums.
Interestingly, students in the high-engagement cluster demonstrated higher assignment scores during the first two weeks, whereas the low-engagement cluster consistently exhibited lower performance on early assignments.
The identified cluster labels were subsequently employed to examine the association between student engagement and long-term academic performance. For this purpose, final exam grades were consolidated into two categories—high performance and low performance—according to the grade classification presented in Table 4. Grades classified as Excellent and Good (i.e., A+, A, A−, B+, and B) were grouped under the high-performance category, while all remaining grades were categorized as low performance.
Table 4. Result label encoding.
Following the analysis, only three distinct relationships between student engagement and academic performance were identified: (1) high engagement–high performance, (2) low engagement–high performance, and (3) low engagement–low performance.
Accordingly, only three types of students could be categorized based on the intersection of their engagement level and academic outcomes. The distribution of these student groups within the dataset is presented in Table 5.
Table 5. Student distribution in engagement–performance clusters.
Notably, no students with high engagement and low performance were observed in the entire dataset. This absence is a remarkable finding in the context of online education. It reinforces the critical role of sustained student engagement as a key success factor in improving academic outcomes in digital learning environments. At the same time, it is important to further assess the depth and quality of learning, moving beyond reliance on purely quantitative engagement indicators.

4.2. Performance Prediction Model

A student performance prediction model was developed using interaction log data from the Moodle learning management system, along with students’ academic performance data. The dataset, which had already undergone cleaning, scaling, and balancing, included the “result” column in categorical format. Since most machine learning algorithms require numerical input, the categorical grades in the result column were converted to numeric values using the label encoding technique, as outlined in Table 4. Each grade was assigned an integer value from 10 (A+) to 0 (F, I, P, N).
To ensure the features were on a comparable scale, standardization was applied using the StandardScaler function from the Scikit-learn library. This process centers the dataset by transforming each feature to have a zero mean and unit standard deviation. Mathematically, standardization is defined as
x′ = (x − μ) / σ
where x′ is the transformed value, x is the original value, μ is the mean of the feature, and σ is the standard deviation. This transformation is crucial for improving the stability and convergence of machine learning models.
An initial examination of the dataset revealed a significant class imbalance among the target categories in the result column. Such imbalances can adversely affect model training by biasing predictions toward the majority class and reducing sensitivity to minority classes. To address this issue, we employed the Synthetic Minority Over-sampling Technique (SMOTE), a widely used oversampling method for managing imbalanced datasets.
The SMOTE works by generating synthetic examples for the minority class based on the feature space similarities between existing minority instances. This approach improves the balance between classes without simply duplicating existing data, thereby helping machine learning models generalize better. The application of the SMOTE was conducted after feature scaling and label encoding, ensuring that the synthetic data aligned with the original data distribution. The distribution of classes before and after the application of SMOTE is shown in Figure 3.
Figure 3. Class distribution: (a) before SMOTE, and (b) after SMOTE.
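A minimal sketch of this preprocessing chain (label encoding per Table 4, standardization with StandardScaler, and SMOTE oversampling via the imbalanced-learn library) is shown below. The exact grade labels in the mapping are assumed rather than copied from the study's code, and df denotes the cleaned dataset with a categorical "result" column.

```python
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import SMOTE

# Grade-to-integer mapping following Table 4 (A+ = 10 ... failing/incomplete = 0);
# the intermediate grade labels are assumptions, Table 4 is the authoritative source.
grade_map = {"A+": 10, "A": 9, "A-": 8, "B+": 7, "B": 6, "B-": 5,
             "C+": 4, "C": 3, "C-": 2, "D": 1, "F": 0}

y = df["result"].map(grade_map)
X = df.drop(columns=["result"])

# Standardize features to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Oversample minority grade classes with SMOTE.
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X_scaled, y)
```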
Principal Component Analysis (PCA) is a widely used dimensionality reduction technique that identifies patterns in data by analyzing the correlation among features and emphasizing variance. PCA enables reduction of the feature space while retaining the most significant information. In this study, PCA was used to determine the optimal number of components that can be retained without a substantial loss of information.
The dataset included data from five different course modules, each containing varying numbers of submission records. Since these variations could contribute disproportionately to the total variance and potentially bias the model, the number of submissions was normalized to a scale of 10, rather than treating each module submission as a distinct feature. This normalization ensured consistency across modules and improved the effectiveness of PCA.
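Component selection can be sketched by inspecting PCA's cumulative explained-variance ratio, as below. The 95% retention threshold is illustrative rather than the criterion reported in the study, and X_bal is the scaled (and, in this configuration, SMOTE-balanced) feature matrix from the previous step.

```python
import numpy as np
from sklearn.decomposition import PCA

# Fit PCA and inspect the cumulative explained variance of the components.
pca = PCA().fit(X_bal)
cumvar = np.cumsum(pca.explained_variance_ratio_)

# Retain the smallest number of components explaining, e.g., 95% of the variance (illustrative).
n_components = int(np.argmax(cumvar >= 0.95)) + 1
X_reduced = PCA(n_components=n_components).fit_transform(X_bal)
```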
Before training, the dataset was split into training and test subsets using the train-test-split() function from the Scikit-learn library. Machine learning models were trained on the training set, and their performance was evaluated on the test set. The model achieving the highest classification accuracy was selected as the optimal model for downstream system development.
To assess the impact of oversampling and dimensionality reduction, each machine learning model was tested with and without applying SMOTE and PCA techniques. The resulting accuracies for all combinations are summarized in Table 6.
Table 6. Performance prediction accuracies by machine learning model under different preprocessing conditions.
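A hedged sketch of this experimental grid is shown below, looping over the SMOTE and PCA options for a single model (SVM). Applying SMOTE only to the training split, as done here, is common practice to avoid leakage, although the paper does not detail this step; the test size, seed, and variance threshold are illustrative.

```python
from itertools import product
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from imblearn.over_sampling import SMOTE

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42)

for use_smote, use_pca in product([False, True], repeat=2):
    Xtr, ytr, Xte = X_train, y_train, X_test
    if use_smote:                                  # oversample the training split only
        Xtr, ytr = SMOTE(random_state=42).fit_resample(Xtr, ytr)
    if use_pca:
        pca = PCA(n_components=0.95).fit(Xtr)      # keep 95% of variance (illustrative)
        Xtr, Xte = pca.transform(Xtr), pca.transform(Xte)
    clf = SVC().fit(Xtr, ytr)
    print(f"SMOTE={use_smote}, PCA={use_pca}:",
          accuracy_score(y_test, clf.predict(Xte)))
```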
While accuracy provides a general measure of how many labels were correctly predicted by the model, it does not offer insights into class-specific performance. In imbalanced datasets, high accuracy may be misleading if certain classes dominate the predictions while others are neglected. Therefore, to achieve a more comprehensive evaluation of the classification model, a confusion matrix and a classification report were generated, as presented in Table 7.
Table 7. Classification performance metrics by model with and without SMOTE and PCA.
These metrics allow for detailed analysis of precision, recall, and F1-score for each class, helping to identify which classes were well predicted and which were underrepresented in the model’s output.
The performance of the classification models under different preprocessing conditions is summarized in Table 7. Among the tested models, the Support Vector Machine (SVM) achieved the best overall performance, with the highest accuracy (0.69) and average precision (0.73) when trained with the SMOTE and without PCA. This suggests that the SVM benefits significantly from oversampling techniques that address class imbalance, although the application of PCA slightly reduced its performance. Random Forest also showed consistently strong results across all conditions, with a slight improvement in accuracy and precision when the SMOTE was applied without PCA (0.66 and 0.65, respectively), indicating that this model is robust to variations in preprocessing.
As shown in Figure 4, Classes 3 and 10 underperform compared to the others, with F1 scores below 0.5. Mid-range classes (4 to 9) perform reasonably well, and Classes 0 and 1 also perform well despite the limited number of cases in each class.
Figure 4. SVM performance: (a) evaluation metrics, and (b) confusion matrix.
In contrast, Softmax Regression exhibited the lowest performance across all metrics. Interestingly, it performed slightly better without the SMOTE and PCA, achieving 0.50 in both accuracy and average precision, while the SMOTE tended to degrade its results. XGBoost showed an opposite trend compared to the SVM; its best performance was observed without the SMOTE but with PCA (accuracy of 0.58 and precision of 0.63). The application of the SMOTE negatively impacted XGBoost’s performance across all evaluated metrics.
Overall, the results highlight that the effectiveness of preprocessing techniques, such as the SMOTE and PCA, varies by model. While the SVM and Random Forest benefited from oversampling, PCA did not consistently improve performance. These findings underscore the importance of model-specific tuning when handling imbalanced and high-dimensional educational datasets.

4.3. Study Plan Recommendation

The study plan recommender system was designed to utilize multiple input sources, including student interaction data from the Learning Management System (LMS), predicted academic performance, and each student’s learning style. Interaction data were collected via the LMS, while predicted grades were passed from the previously trained performance prediction model. To determine students’ learning styles, the VARK questionnaire was administered to the same cohort of students for whom LMS interaction and performance data were available.
The integration of these data sources was performed in two stages using the student identification number as the common key. First, the interaction data collected for individual course modules were manually combined into a single dataset. This manual preprocessing step was required due to the one-time nature of the data and the non-repetitive structure of each module. Second, learning style data collected through Google Form submissions were decoded from the exported CSV format. Each student’s VARK scores were derived by analyzing their questionnaire responses using a custom Python script. The script computed the individual scores for Visual, Aural, Read/Write, and Kinesthetic modalities and generated a dataset mapping each score to the student’s identification number. In cases where a student was enrolled in multiple modules, their corresponding VARK scores were mapped to each related interaction record.
Since the recommender system is expected to predict study plan outputs as interaction patterns, a well-prepared training dataset was crucial. The output labels were structured for binary classification, indicating the presence or absence of recommended patterns. Given that the recommender system performs multi-label binary classification, additional data transformations were required. The student’s desired grade, provided as text, was encoded numerically using the scale defined in Table 4, where A+ corresponds to 10 and failing or incomplete grades to 0.
Furthermore, student interaction features, which varied across numeric ranges, were binarized to standardize their influence in the model. Each interaction metric was transformed into a binary variable depending on whether it exceeded the mean value across the dataset. This ensured that the model could clearly identify whether a particular interaction was meaningfully expressed by the student.
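A short pandas sketch of this mean-threshold binarization is shown below; the feature names are illustrative, and df is assumed to hold the merged recommender dataset.

```python
# Binarize each interaction feature: 1 if the student's count exceeds the dataset mean, else 0.
# df: merged recommender dataset assumed loaded with pandas; column names are illustrative.
interaction_cols = ["course_views", "quiz_attempts", "forum_posts"]
X_bin = (df[interaction_cols] > df[interaction_cols].mean()).astype(int)
```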
To identify the most effective model for generating personalized study plans, two machine learning algorithms were tested: the K-Nearest Neighbors (KNNs) classifier and a Convolutional Neural Network (CNN). Both models were evaluated using the Hamming loss metric, which is appropriate for assessing performance in multi-label binary classification tasks. The evaluation results are summarized in Table 8.
Table 8. Model performance based on Hamming loss for study plan recommendation. Lower values indicate better multi-label classification.
The results clearly indicate that the KNN classifier outperforms the CNN model, achieving significantly lower Hamming loss values across the dataset. Due to its superior performance and lower computational complexity, the KNN model was selected as the underlying engine for the implementation of the study plan recommender system.
While the KNN outperformed the CNN in terms of Hamming loss, this may reflect the nature of the dataset and feature space. The KNN is well suited for structured, tabular data with limited samples, whereas CNNs generally require larger datasets and spatially structured inputs to perform optimally. Future studies could revisit this comparison with expanded datasets and additional feature engineering to test whether deep models capture further nuances.
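A minimal Scikit-learn sketch of the multi-label KNN setup and its Hamming-loss evaluation is shown below. X_plan and Y_plan are assumed to hold the binarized inputs and the multi-label study-plan targets described above, and n_neighbors = 5 is illustrative rather than the study's tuned value.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.multioutput import MultiOutputClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import hamming_loss

# X_plan: binarized interaction features plus the encoded desired grade (assumed prepared earlier)
# Y_plan: multi-label targets, one binary column per recommended study-plan pattern
X_tr, X_te, Y_tr, Y_te = train_test_split(X_plan, Y_plan, test_size=0.2, random_state=42)

knn = MultiOutputClassifier(KNeighborsClassifier(n_neighbors=5))
knn.fit(X_tr, Y_tr)

print("Hamming loss:", hamming_loss(Y_te, knn.predict(X_te)))
```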
The study plan recommendations are generated using the trained KNN model. The process outlines the full pipeline—from accepting new input data to writing the final recommendations into a structured CSV file. A sample output of a CSV file, containing recommended study plans for several students, is displayed in Figure 5.
Figure 5. Recommended study plan.
To evaluate the effectiveness of the proposed study plan recommender system, a controlled experiment was conducted using a custom-hosted Moodle platform. The platform simulated a semester-long learning program for an HTML course, with participation from 250 undergraduate students of the Faculty of Information Technology, University of Moratuwa. To incorporate individual learning preferences into the system, the VARK questionnaire was distributed among all participating students to collect data on their learning styles.
From this cohort, a subset of 30 students was selected and randomly divided into two groups. The experimental group received personalized study plan recommendations following their performances in Quiz 1 and Quiz 2, while the control group continued learning without any system-generated study guidance. Both groups proceeded through the remainder of the course under these conditions.
At the end of the course period, a final quiz was administered to all 30 students to assess learning outcomes. The predicted results, based on the machine learning performance model, and the actual quiz scores were recorded for each student in both groups. The difference between actual and predicted results was computed using the encoded grade values defined earlier (Table 4), with point deviation calculated as the difference between actual and predicted grade values.
The results for the control group (students without recommendations) and the experimental group (students with study plan recommendations) are presented in Table 9 and Table 10, respectively. The comparison aims to measure the impact of the study plan recommender system on academic improvement over the baseline predicted outcomes.
Table 9. Comparison of predicted and actual results for the control group (no recommendations).
Table 10. Comparison of predicted and actual results for the experiment group (with personalized study plans).
An analysis of the performance deviation among the control group revealed that 66.67% of the students experienced a decline in performance compared to their predicted grades. Only 6.67% of students showed an improvement, while 26.67% maintained their originally predicted performance level. These findings suggest that, in the absence of personalized study plan recommendations, the majority of students failed to meet their expected academic outcomes, with a notable tendency toward performance decline.
The results from the experiment group—students who received personalized study plan recommendations—demonstrate a significant improvement in academic performance. As shown in Table 10, 60% of the students achieved higher actual grades than their predicted results, while 26.67% maintained the same performance level. Only 13.33% of the students exhibited a slight decline in performance.
These findings contrast sharply with the control group, in which the majority of students experienced a decline in academic performance. In contrast, a significantly higher proportion of students in the experiment group demonstrated improved outcomes. This observation is further supported by the comparative analysis illustrated in Figure 6, where the performance deviations between the control and experiment groups are visually contrasted.
Figure 6. Comparison of performance outcomes between control and experiment groups.
The high percentage of performance improvement in the experiment group strongly suggests that the study plan recommender system had a positive impact on student learning outcomes, particularly benefiting those students who were previously predicted to perform at lower levels.

4.3.1. Statistical Analysis of Intervention Effectiveness

To evaluate the effectiveness of the personalized study plan recommendations, an independent two-sample t-test (Welch’s t-test) was conducted. This test compared the deviation between predicted performance and actual final quiz scores for two groups:
  • Control group (n = 15): Students who did not receive personalized study plan recommendations.
  • Experimental group (n = 15): Students who received recommendations aligned with their learning styles.
The performance difference was calculated as
difference = final quiz score (grade points) − predicted grade points.
A positive difference indicates the student outperformed the prediction; a negative difference indicates underperformance.
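The Welch's t-test on these differences can be computed with SciPy. The sketch below defines the test as a function so that the study's per-student data are not reproduced; the caller supplies the two arrays of 15 differences each.

```python
from scipy import stats

def welch_test(control_diff, experiment_diff):
    """Welch's independent two-sample t-test on grade-point differences
    (actual minus predicted). Arrays are supplied by the caller; no study
    data are reproduced here."""
    t_stat, p_value = stats.ttest_ind(experiment_diff, control_diff, equal_var=False)
    return t_stat, p_value

# Usage with two arrays of 15 differences each (one per group):
# t, p = welch_test(control_diff, experiment_diff)
# print(f"t = {t:.3f}, p = {p:.4f}")
```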

4.3.2. Descriptive Statistics and t-Test Results

Table 11 summarizes the key statistics for each group. Note that the t-statistic and p-value describe the statistical relationship between the two groups and are therefore not specific to either one individually.
Table 11. Summary of t-test comparing student performance differences between groups. T-statistic and p-value apply to the comparison across groups.
Since the p-value is less than 0.05, the result is statistically significant. We conclude that the personalized study plan recommendations led to a meaningful improvement in student performance.

4.3.3. Box Plot Interpretation

Figure 7 illustrates a box plot comparing the distribution of performance differences between the two groups. Students who received personalized recommendations exhibited more consistent achievement levels and frequently surpassed their predicted grades. In contrast, those without recommendations tended to underperform and showed greater variability in outcomes. These patterns are consistent with the statistical analyses, reinforcing the hypothesis that tailored study support contributes to improved learning performance. The observed improvements further reflect the system’s primary objective—delivering personalized academic guidance based on individual learning styles and interaction behaviors—highlighting its potential effectiveness in authentic educational settings.
Figure 7. Box plot showing distribution of score differences (final quiz score - predicted score). Students with recommendations had higher medians and tighter distributions.

5. Discussion

This study was designed to examine how data-driven methods can be used to improve student success in online learning environments. The research addressed three core questions: (1) the relationship between LMS engagement and academic performance, (2) the feasibility of predicting student performance from LMS data, and (3) the impact of personalized study plans tailored to learning styles. The findings corresponding to each research question are summarized below.

5.1. RQ1: What Is the Nature of the Relationship Between Student Engagement with LMS Activities and Academic Performance?

The analysis revealed a strong correlation between students’ engagement patterns and their academic outcomes. K-Means clustering identified two clear engagement clusters: high and low engagement. These clusters were not only distinct in terms of LMS interaction metrics but also mirrored differences in formative assessment scores. The high engagement cluster consistently achieved better assignment marks.
When final exam grades were added to the dataset, three distinct engagement–performance profiles emerged: (1) high engagement–high performance, (2) low engagement–high performance, and (3) low engagement–low performance. Notably, no students were found in the high engagement–low performance category, suggesting that consistent interaction with LMS content is a strong predictor of academic success. However, the presence of low-engagement–high-performance students suggests that other latent factors (e.g., prior knowledge, external support) may also influence outcomes.

5.2. RQ2: How Accurately Can Student Performance Be Predicted Based on LMS Interaction Patterns?

Student performance prediction was approached as a multi-class classification task using 11 grade categories. Among the models tested, the Support Vector Machine (SVM) achieved the highest accuracy at 68.8%, especially when the dataset was enhanced with the SMOTE to mitigate class imbalance. This is a strong result considering the complexity of the task.
The application of dimensionality reduction (PCA) did not improve model performance and was, in some cases, detrimental. This likely stems from the already limited number of features in the dataset. Thus, while data balancing techniques proved beneficial, further dimensionality reduction is not recommended for similarly constrained datasets. Future models may benefit from richer interaction data, potentially including time-on-task, clickstream sequences, or quiz-level behavior.

5.3. RQ3: How Effective Are Personalized Study Plans, Aligned with Learning Styles, in Enhancing Student Performance in Online Learning Environments?

The effectiveness of personalized study plans was evaluated through a controlled experiment. Students in the experimental group received customized recommendations based on predicted performance and VARK-assessed learning styles, while the control group did not. The experimental group showed significantly better performance: 60% improved their grades compared to only 6.7% in the control group. Only 13.3% of the experimental group saw a performance decline, compared to 66.7% in the control group.
Further analysis showed that students predicted to achieve mid-level grades (C+ to B+) benefited the most from recommendations, with nearly all of them improving. High-performing students (A− and above) maintained their performance or improved slightly, with minimal drop-off. These findings suggest that personalized study plans are particularly effective for students at risk of underperformance. These can be characterized as "on-the-margin" students, who are at greater risk of failing due to inadequate engagement with learning resources, in contrast to high-performing students who generally maintain consistent study behavior.
Figure 6 visualizes these group-level differences, reinforcing the positive impact of the study plan recommender system. However, with a small sample size (n = 15 per group), the minimum detectable effect size is relatively large, making it difficult to identify subtle effects. To capture smaller but potentially meaningful impacts of the system’s recommendations, a larger sample size would be required.

6. Conclusions and Future Work

This research demonstrates that LMS interaction data can be effectively used to (1) detect meaningful student engagement patterns, (2) predict academic performance at a fine-grained level with reasonable accuracy, and (3) recommend personalized study plans that enhance learning outcomes.
Nevertheless, several areas remain for improvement. In the present study, all LMS interactions were treated equally, without accounting for the varying pedagogical value of different activities. Future work should explore weighting schemes that differentiate between activity types (e.g., forum participation, quiz attempts, or resource views) to capture their relative contribution to learning outcomes. Similarly, tailoring predictive models to specific courses or subject domains may improve accuracy by reflecting discipline-specific learning behaviors.
Beyond simply weighting activities, more advanced feature engineering could play an important role in improving predictive performance. For example, examining when students engage—such as time-of-day activity, short study bursts versus longer, sustained sessions, or recurring weekly patterns—offers deeper insights than raw frequency counts alone. Similarly, analyzing session-level behaviors, the order in which activities are completed, and how engagement shifts over time can uncover subtle learning trajectories and even provide early signs of disengagement.
From a practical perspective, these richer features can help educators and learning platforms move beyond static predictions to more proactive interventions. For instance, identifying irregular study rhythms may highlight students who struggle to maintain consistency, while detecting repeated short bursts without sustained engagement could flag surface-level learning. By capturing these temporal and behavioral nuances, predictive models can not only become more accurate but also more actionable, supporting timely feedback, tailored study recommendations, and ultimately more effective learning experiences.
Although this study successfully mapped students’ learning styles to their interaction logs and leveraged these mappings for learning style–based recommendations, it did not explicitly investigate the direct linkage between learning styles and the structure of the recommended study plans. This omission leaves an important gap in the personalization pipeline. Once the holistic effort in fine-grained performance prediction is consolidated, future research must prioritize addressing this gap to fully realize the potential of adaptive, learning style-driven study plan recommendations.
Expanding the dataset to include larger and more diverse cohorts, along with incorporating real-time interaction features, will further improve model robustness and generalizability. Finally, integrating learner feedback into the study plan recommender system could enable adaptive, hybrid recommendation models that dynamically balance predictive analytics with student preferences, ultimately fostering deeper engagement and improved academic success in online learning environments.

Author Contributions

Conceptualization, I.K.; methodology, I.K.; software, R.A.A.S., V.C.S.V., and P.S.; validation, R.A.A.S., V.C.S.V., P.S., and I.K.; formal analysis, R.A.A.S., V.C.S.V., P.S., and I.K.; writing—original draft preparation, I.K.; writing—review and editing, A.S.A.; supervision, A.S.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study was conducted in accordance with the ethical principles outlined in the Belmont Report, including respect for persons, beneficence, and justice. Ethical approval for this research was obtained from the Research Ethics Committee of the University of Moratuwa (Approval ID: EDN/2023/014), prior to data collection. All procedures involving human participants complied with institutional and national research ethics guidelines. Informed consent was obtained from all participants before participation.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author. The data are not publicly available due to institutional data protection policies.

Acknowledgments

The authors would like to express their sincere gratitude to the staff and students of the Department of Information Technology, University of Moratuwa, for their valuable support during the data collection and evaluation phases of this research. This manuscript was language-edited with the assistance of OpenAI’s ChatGPT 5, which was used to improve grammar, phrasing, and clarity. All content, analyses, and interpretations are the sole work of the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Calvo-Flores, M.D.; Galindo, E.G.; Jiménez, M.P.; Piñeiro, O.P. Predicting students’ marks from Moodle logs using neural network models. Curr. Dev. Technol.-Assist. Educ. 2006, 1, 586–590. [Google Scholar]
  2. Picciano, A.G.; Seaman, J.; Allen, I.E. Educational transformation through online learning: To be or not to be. J. Asynchronous Learn. Netw. 2010, 14, 17–35. [Google Scholar] [CrossRef]
  3. Pardo, A.; Han, F.; Ellis, R.A. Combining university student self-regulated learning indicators and engagement with online learning events to predict academic performance. IEEE Trans. Learn. Technol. 2016, 10, 82–92. [Google Scholar] [CrossRef]
  4. López-Zambrano, J.; Lara, J.A.; Romero, C. Towards portability of models for predicting students’ final performance in university courses starting from Moodle logs. Appl. Sci. 2020, 10, 354. [Google Scholar] [CrossRef]
  5. García-Morales, V.J.; Garrido-Moreno, A.; Martín-Rojas, R. The transformation of higher education after the COVID disruption: Emerging challenges in an online learning scenario. Front. Psychol. 2021, 12, 616059. [Google Scholar] [CrossRef]
  6. Romero, C.; Espejo, P.G.; Zafra, A.; Romero, J.R.; Ventura, S. Web usage mining for predicting final marks of students that use Moodle courses. Comput. Appl. Eng. Educ. 2013, 21, 135–146. [Google Scholar] [CrossRef]
  7. Moubayed, A.; Injadat, M.; Shami, A.; Lutfiyya, H. Student engagement level in an e-learning environment: Clustering using k-means. Am. J. Distance Educ. 2020, 34, 137–156. [Google Scholar] [CrossRef]
  8. Nielson, K.B. Self-study with language learning software in the workplace: What happens? Lang. Learn. Technol. 2011, 15, 110–129. [Google Scholar] [CrossRef]
  9. Moubayed, A.; Injadat, M.; Shami, A.; Lutfiyya, H. Relationship between student engagement and performance in e-learning environment using association rules. In Proceedings of the 2018 IEEE World Engineering Education Conference (EDUNINE), Buenos Aires, Argentina, 11–14 March 2018; pp. 1–6. [Google Scholar]
  10. Azevedo, R.; Moos, D.C.; Witherspoon, A.M.; Chauncey, A. Issues in the Measurement of Cognitive and Metacognitive Regulatory Processes Used During Hypermedia Learning. In Proceedings of the AAAI Fall Symposium: Cognitive and Metacognitive Educational Systems, Arlington, VA, USA, 5–7 November 2009. [Google Scholar]
  11. Gray, J.A.; DiLoreto, M. The effects of student engagement, student satisfaction, and perceived learning in online learning environments. Int. J. Educ. Leadersh. Prep. 2016, 11, n1. [Google Scholar]
  12. Goh, T.T. Learning management system log analytics: The role of persistence and consistency of engagement behaviour on academic success. J. Comput. Educ. 2025, 1–24. [Google Scholar] [CrossRef]
  13. Johnston, L.J.; Griffin, J.E.; Manolopoulou, I.; Jendoubi, T. Uncovering Student Engagement Patterns in Moodle with Interpretable Machine Learning. arXiv 2024, arXiv:2412.11826. [Google Scholar] [CrossRef]
  14. Yuan, J.; Qiu, X.; Wu, J.; Guo, J.; Li, W.; Wang, Y.G. Integrating behavior analysis with machine learning to predict online learning performance: A scientometric review and empirical study. arXiv 2024, arXiv:2406.11847. [Google Scholar]
  15. Hubbard, K.; Amponsah, S. Feature Engineering on LMS Data to Optimize Student Performance Prediction. arXiv 2025, arXiv:2504.02916. [Google Scholar] [CrossRef]
  16. Alotaibi, N.S. The impact of AI and LMS integration on the future of higher education: Opportunities, challenges, and strategies for transformation. Sustainability 2024, 16, 10357. [Google Scholar] [CrossRef]
  17. Heilporn, G.; Lakhal, S.; Bélisle, M.; St-Onge, C. Student engagement: A multidimensional measurement scale applied to blended course modalities at the university level. Mesure Éval. Éduc. 2020, 43, 1–31. [Google Scholar] [CrossRef]
  18. Lin, L.; Wang, F. Research on the relationship between learning engagement and learning performance in online learning. In Proceedings of the 15th International Conference on Education Technology and Computers, Barcelona, Spain, 26–28 September 2023; pp. 201–206. [Google Scholar]
  19. Lawson, H.A.; Lawson, M.A. Student engagement and disengagement as a collective action problem. Educ. Sci. 2020, 10, 212. [Google Scholar] [CrossRef]
  20. Wang, J.; Yu, Y. Machine learning approach to student performance prediction of online learning. PLoS ONE 2025, 20, e0299018. [Google Scholar] [CrossRef]
  21. Wang, Y.; Cao, Y.; Gong, S.; Wang, Z.; Li, N.; Ai, L. Interaction and learning engagement in online learning: The mediating roles of online learning self-efficacy and academic emotions. Learn. Individ. Differ. 2022, 94, 102128. [Google Scholar] [CrossRef]
  22. Saltmarsh, J.; Zlotkowski, E.; Hollander, E. Indicators of engagement. In Learning to Serve: Promoting Civil Society Through Service Learning; Springer: Boston, MA, USA, 2001; pp. 285–302. [Google Scholar]
  23. Conijn, R.; Snijders, C.; Kleingeld, A.; Matzat, U. Predicting student performance from LMS data: A comparison of 17 blended courses using Moodle LMS. IEEE Trans. Learn. Technol. 2016, 10, 17–29. [Google Scholar] [CrossRef]
  24. Cordovilla, A.; Emiliano, F.; Peña, M. Comparative analysis of machine learning models for predicting student success in online programming courses: A study based on LMS data and external factors. Mathematics 2024, 12, 3272. [Google Scholar] [CrossRef]
  25. Du Plooy, E.; Casteleijn, D.; Franzsen, D. Personalized Adaptive Learning in Higher Education: A Scoping Review of Key Characteristics and Impact. Heliyon 2024, 10, e20987. [Google Scholar] [CrossRef] [PubMed]
  26. Gao, Y. The Potential of Adaptive Learning Systems to Enhance Learning Outcomes: A Meta-Analysis. Ph.D. Thesis, University of Alberta, Edmonton, AB, Canada, 2023. [Google Scholar]
  27. Gkontzis, A.F.; Panagiotakopoulos, C.T.; Kotsiantis, S.; Verykios, V.S. Measuring engagement to assess performance of students in distance learning. In Proceedings of the 2018 9th International Conference on Information, Intelligence, Systems and Applications (IISA), Zakynthos, Greece, 23–25 July 2018; pp. 1–7. [Google Scholar]
  28. Mogus, A.M.; Djurdjevic, I.; Suvak, N. The impact of student activity in a virtual learning environment on their final mark. Act. Learn. High. Educ. 2012, 13, 177–189. [Google Scholar] [CrossRef]
  29. Corrigan, O.; Smeaton, A.F.; Glynn, M.; Smyth, S. Using educational analytics to improve test performance. In Proceedings of the European Conference on Technology Enhanced Learning, Toledo, Spain, 15–18 September 2015; pp. 42–55. [Google Scholar]
  30. Hussain, M.; Zhu, W.; Zhang, W.; Abidi, S.M.R. Student engagement predictions in an e-learning System and their impact on student course assessment scores. Comput. Intell. Neurosci. 2018, 2018, 6347186. [Google Scholar] [CrossRef] [PubMed]
  31. Yadav, S.K.; Pal, S. Data mining: A prediction for performance improvement of engineering students using classification. arXiv 2012, arXiv:1203.3832. [Google Scholar] [CrossRef]
  32. Kedia, P.; Mishra, L. Exploring the factors influencing the effectiveness of online learning: A study on college students. Soc. Sci. Humanit. Open 2023, 8, 100559. [Google Scholar] [CrossRef]
  33. Alsumaidaie, M.S.I.; Nafea, A.A.; Mukhlif, A.A.; Jalal, R.D.; AL-Ani, M.M. Intelligent System for Student Performance Prediction Using Machine Learning. Baghdad Sci. J. 2024, 21, 3877–3891. [Google Scholar] [CrossRef]
  34. Ahmed, E. Student performance prediction using machine learning algorithms. Appl. Comput. Intell. Soft Comput. 2024, 2024, 4067721. [Google Scholar] [CrossRef]
  35. Chen, H.; Li, X.; Kumar, S. A Multidimensional Time-Series Model for Early-Risk Prediction in Online Learning. Comput. Educ. 2024, 205, 104553. [Google Scholar]
  36. Altabrawee, H.; Ali, O.A.J.; Ajmi, S.Q. Predicting students’ performance using machine learning techniques. J. Univ. Babylon Pure Appl. Sci. 2019, 27, 194–205. [Google Scholar] [CrossRef]
  37. Kolluru, V.; Mungara, S.; Chintakunta, A.N. Adaptive learning systems: Harnessing AI for customized educational experiences. Int. J. Comput. Sci. Inf. Technol. 2018, 6, 10–5121. [Google Scholar] [CrossRef]
  38. Bimba, A.T.; Idris, N.; Al-Hunaiyyan, A.; Mahmud, R.B.; Shuib, N.L.B.M. Adaptive feedback in computer-based learning environments: A review. Adapt. Behav. 2017, 25, 217–234. [Google Scholar] [CrossRef]
  39. Kabudi, T.; Pappas, I.; Olsen, D.H. AI-enabled adaptive learning systems: A systematic mapping of the literature. Comput. Educ. Artif. Intell. 2021, 2, 100017. [Google Scholar] [CrossRef]
  40. Hocine, N. A Systematic Literature Review of Adaptive Learning Systems Based on the Assessment of Collaboration Quality. In Proceedings of the 17th International Conference on Computer Supported Education (CSEDU 2025), Porto, Portugal, 1–3 April 2025. [Google Scholar]
  41. Toto, G.A.; Limone, P. Motivation, stress and impact of online teaching on Italian teachers during COVID-19. Computers 2021, 10, 75. [Google Scholar] [CrossRef]
  42. Strielkowski, W.; Grebennikova, V.; Lisovskiy, A.; Rakhimova, G.; Vasileva, T. AI-driven adaptive learning for sustainable educational transformation. Sustain. Dev. 2025, 33, 1921–1947. [Google Scholar] [CrossRef]
  43. Akavova, A.; Temirkhanova, Z.; Lorsanova, Z. Adaptive learning and artificial intelligence in the educational space. E3S Web Conf. 2023, 451, 06011. [Google Scholar] [CrossRef]
  44. Jain, A.K.; Dubes, R.C. Algorithms for Clustering Data; Prentice-Hall, Inc.: Englewood Cliffs, NJ, USA, 1988. [Google Scholar]
  45. Ikotun, A.M.; Ezugwu, A.E.; Abualigah, L.; Abuhaija, B.; Heming, J. K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data. Inf. Sci. 2023, 622, 178–210. [Google Scholar] [CrossRef]
  46. Sasirekha, K.; Baby, P. Agglomerative hierarchical clustering algorithm-a. Int. J. Sci. Res. Publ. 2013, 83, 83. [Google Scholar]
  47. Murtagh, F.; Contreras, P. Algorithms for hierarchical clustering: An overview. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2012, 2, 86–97. [Google Scholar] [CrossRef]
  48. Murtagh, F.; Contreras, P. Algorithms for hierarchical clustering: An overview, II. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2017, 7, e1219. [Google Scholar] [CrossRef]
  49. Khan, K.; Rehman, S.U.; Aziz, K.; Fong, S.; Sarasvady, S. DBSCAN: Past, present and future. In Proceedings of the Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2014), Bangalore, India, 17–19 February 2014; pp. 232–238. [Google Scholar]
  50. Deng, D. DBSCAN clustering algorithm based on density. In Proceedings of the 2020 7th International Forum on Electrical Engineering and Automation (IFEEA), Hefei, China, 25–27 September 2020; pp. 949–953. [Google Scholar]
  51. Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. Their Appl. 1998, 13, 18–28. [Google Scholar] [CrossRef]
  52. Ali, J.; Khan, R.; Ahmad, N.; Maqsood, I. Random forests and decision trees. Int. J. Comput. Sci. Issues (IJCSI) 2012, 9, 272. [Google Scholar]
  53. Xu, X.; Huang, S.L. Maximal correlation regression. IEEE Access 2020, 8, 26591–26601. [Google Scholar] [CrossRef]
  54. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  55. Dreiseitl, S.; Ohno-Machado, L. Logistic regression and artificial neural network classification models: A methodology review. J. Biomed. Inform. 2002, 35, 352–359. [Google Scholar] [CrossRef]
  56. Koren, Y.; Rendle, S.; Bell, R. Advances in Collaborative Filtering. In Recommender Systems Handbook; Ricci, F., Rokach, L., Shapira, B., Eds.; Springer: New York, NY, USA, 2022; pp. 91–142. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
