1. Introduction
Predicting student performance and dropout is a complex, multi-layered challenge facing educational institutions and academic systems. It is not a simple computational task but one that requires considering interrelated academic, educational, socio-economic, demographic, and linguistic factors that reflect how students behave in their educational environment. Early predictions based on predictive analytics and statistics enable academic guidance departments in universities to support students along their educational pathways by scheduling balanced timetables, selecting suitable courses, reducing confusion, and offering academic support. Furthermore, reducing student dropout benefits not only students but also universities and academic institutions, as it helps them conserve their resources. Student dropout means extended program durations, which lead to financial losses, administrative burdens, and wasted teaching and advising hours. By analyzing large and diverse datasets, modern algorithms enable educational systems to predict dropout before it occurs, thus investigating the problem, diagnosing risks, and providing support to address it. Moreover, these large volumes of data encompass behavioral, academic, social, and economic features that cannot be analyzed manually, so Machine Learning (ML) [1] models can detect early warnings and inform planned changes to administrative procedures, academic and psychological guidance manuals, or even course plans. The current study proposes a hybrid model that couples the Holistic Swarm Optimization (HSO) [2] algorithm with the Random Forest (RF) [3] classifier. The hybrid model adopts a multi-objective optimization framework that simultaneously maximizes the macro F1-score [4], regulates model complexity [5], and reduces inter-class performance disparity [6], thereby overcoming the drawbacks of conventional optimization strategies that prioritize predictive accuracy alone. The proposed method achieves more balanced and equitable predictive performance by explicitly accounting for fairness across the different student outcome categories. Additionally, the framework uses ensemble-based probability dispersion to incorporate uncertainty-aware prediction [7,8], making it possible to identify high-risk students with different degrees of confidence. This capacity facilitates more informed and effective academic interventions, especially in early warning systems for dropout risk and student performance. Integrating the accurate HSO algorithm with the reliable RF classifier tunes the hyperparameters and thereby improves prediction accuracy.
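To make the multi-objective formulation concrete, the sketch below shows one plausible scalarized fitness for tuning RF hyperparameters; the weights w1, w2, w3, the complexity proxy, and the disparity measure are illustrative assumptions rather than the exact objective used in this study.

```python
# Minimal sketch of a scalarized multi-objective fitness for RF hyperparameter
# tuning: reward macro F1, penalize model complexity and inter-class disparity.
# The weights and the complexity/disparity proxies are assumptions for illustration.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_predict

def rf_fitness(params, X, y, w1=1.0, w2=0.1, w3=0.5):
    """Higher is better."""
    clf = RandomForestClassifier(
        n_estimators=int(params["n_estimators"]),
        max_depth=int(params["max_depth"]),
        min_samples_leaf=int(params["min_samples_leaf"]),
        random_state=42,
    )
    y_pred = cross_val_predict(clf, X, y, cv=5)
    macro_f1 = f1_score(y, y_pred, average="macro")
    per_class_f1 = f1_score(y, y_pred, average=None)
    disparity = per_class_f1.max() - per_class_f1.min()        # inter-class performance gap
    complexity = params["n_estimators"] * params["max_depth"]  # rough size proxy
    return w1 * macro_f1 - w2 * complexity / 1e4 - w3 * disparity
```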
Student dropout from university programs can have several causes, whose impact could be substantially reduced through monitoring and student guidance. Universities place special importance on the student guidance process, as it is expected to detect students at risk of dropping out or of completing their programs poorly and becoming uncompetitive with their colleagues in the workplace. The main problem is the lack of data reported on those students; university guidance offices often learn about these cases very late, leading to student failure and wasted university resources. Moreover, the available student dropout datasets suffer from imbalance stemming from the low number of dropouts compared to successful students. The Predict Students’ Dropout and Academic Success dataset, a standard dataset from the UCI repository, is widely utilized in the literature. This dataset suffers from imbalance, which biases prediction models in favor of successful students and causes them to fail to detect students at risk of dropout (the minority instances) [9]. The following literature review surveys dropout prediction research to identify the latest techniques and remaining research gaps.
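As a quick illustration of this imbalance, the snippet below inspects the class distribution of the UCI dataset; the ucimlrepo package and the dataset name used here are assumptions about how the data is fetched, not the preprocessing pipeline of this study.

```python
# Hedged sketch: inspect the class distribution of the UCI dropout dataset.
# Assumes the ucimlrepo package and the dataset's listed name; the actual study
# may load and preprocess the data differently.
from ucimlrepo import fetch_ucirepo

dataset = fetch_ucirepo(name="Predict Students' Dropout and Academic Success")
y = dataset.data.targets.squeeze()      # single target column: Graduate / Dropout / Enrolled
print(y.value_counts(normalize=True))   # class proportions reveal the majority-class skew
```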
Prediction models based on machine learning algorithms interest researchers because they can be trained efficiently on existing data and produce models that automatically classify new instances. For example, Lykourentzou et al. [10] exploited the capabilities of Neural Networks (NNs) in a hybrid with Support Vector Machines (SVMs) and ensemble-based fuzzy ARTMAP to detect student dropout from e-learning courses; in comparative experiments, their model outperformed traditional classification algorithms. Similarly, Yukselturk et al. [11] used strong predictive machine learning algorithms such as Decision Trees (DT), Random Forests (RF), and SVMs to classify dropouts in online program data, demonstrating these algorithms’ ability to predict student dropout from the factors represented in the dataset features. In traditional university programs, RF was investigated by Dekker et al. [12] and Zhao et al. [13] for detecting students at risk of dropping out; these models achieved accuracies between 75% and 80% on institutionally prepared datasets. Niyogisubizo et al. [14] presented a stacking ensemble combining RF, XGBoost, Gradient Boosting (GB), and NNs to predict student dropout using data from 2016 to 2020 at Constantine the Philosopher University in Nitra; their model achieved higher accuracy and AUC, helping investigators identify students at risk of university dropout. Martins et al. [15] evaluated ML algorithms and boosting algorithms, and their experimental results showed that boosting achieved better classification performance on a dataset from a higher education institution. Following the release of the UCI dropout dataset, Villar et al. [16] evaluated RF, SVM, NN, and ensemble models; their results indicated that tree-based methods performed better on imbalanced data, and they reported that socioeconomic and parental features strongly affect student success or failure during university study. Kok et al. [17] and Rebelo et al. [18] analyzed Learning Management System (LMS) Moodle activity logs using gradient boosting, attention-based Recurrent Neural Networks (RNNs), and time-dependent features; they reported that behavioral features such as login frequency, submissions, and forum activity are strongly related to student success and hence provide early signs of dropout risk. Tamada et al. [19] presented a systematic review of ML algorithms that support advisors of virtual-learning students and provide early warning of students at risk, improving the student retention process. Gardner et al. [20] analyzed cross-institutional prediction models; their experimental results showed that transferred models generalize well, approximating the performance of locally trained models. Vaarma et al. [21] demonstrated that prediction models trained on LMS data generalize better than those dependent on demographic and pre-admission features, suggesting that behavioral features are more transferable than social and economic ones.
In the context of hybrid models, Xiong et al. [22] developed an educational prediction model using CNNs and RNNs; their hybrid model surpassed traditional ML methods in dropout-prediction precision, but at the cost of interpretability. Although RF is a strong and robust machine learning algorithm owing to its interpretability and resistance to overfitting, metaheuristic optimization algorithms can be integrated with it to fine-tune its hyperparameters. In RF optimization, metaheuristic algorithms such as the Gray Wolf Optimizer (GWO) [23], Particle Swarm Optimization (PSO) [24], and the Artificial Fish Swarm Algorithm (AFSA) [25] have been introduced in several fields. For example, Radhi et al. [26] optimized RF to improve its performance in medical diagnostic systems; developed during the COVID-19 pandemic, their predictive model generalizes the idea of tuning RF hyperparameters with metaheuristic optimization for classifying imbalanced data. In a similar context, Khalidou et al. [27] optimized RF with PSO in a heart-disease prediction model. Such results encourage researchers to combine swarm intelligence with RF for hyperparameter tuning and to apply these capabilities to the early prediction of student dropout. Khaseeb et al. [28] improved prediction performance by employing GWO for feature selection on high-dimensional datasets. In the security field, the MOO-PSO algorithm [29] was used to optimize RF and XGBoost classifiers, improving accuracy, convergence speed, and model complexity. In bioinformatics, AFSA [30] was introduced to optimize RF parameters based on a reduced set of relevant features, and the comparisons showed that the proposed hybrid outperforms traditional models. Shafiey et al. [31] likewise applied PSO and Genetic Algorithms (GA) to hyperparameter tuning and concluded that optimization improved performance over state-of-the-art prediction methods.
Since traditional machine learning models usually provide point estimates without expressing the confidence of their predictions, uncertainty-aware prediction offers a valuable alternative, especially in educational data mining and learning analytics. In high-stakes situations such as the early identification of at-risk students, where poor choices can have a detrimental impact on academic progress, such deterministic outputs may be misleading. Recent research has therefore used uncertainty estimation methods to quantify prediction confidence and overcome this limitation.
In order to enable more dependable decision-making in ambiguous classification scenarios, Kornaev et al. [7] proposed a multivariate, multi-view classification framework that explicitly models prediction uncertainty by combining information from multiple data perspectives. By examining consistency across perspectives instead of depending solely on single-model confidence scores, their method enhances awareness of uncertainty. For COVID-19 X-ray image classification, Gour and Jain [8] presented an uncertainty-aware convolutional neural network that incorporates uncertainty estimation to differentiate between high-confidence and low-confidence predictions. By lowering the risk of overconfident misclassifications, their approach improved diagnostic reliability and demonstrated the importance of uncertainty-aware models in critical decision-support systems.
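In the same spirit, a minimal sketch of ensemble-based uncertainty for an RF classifier is given below; it assumes entropy of the tree-averaged class probabilities as the dispersion measure, which is one plausible reading of the approach rather than this study's exact formulation.

```python
# Hedged sketch: ensemble-based uncertainty for a fitted RandomForestClassifier.
# Predictive entropy of the tree-averaged class probabilities flags predictions
# that should be treated with low confidence. The aggregation and the review
# threshold are illustrative assumptions.
import numpy as np

def predictive_entropy(rf, X):
    # Stack per-tree class probabilities, then average across the ensemble.
    tree_probs = np.stack([tree.predict_proba(X) for tree in rf.estimators_])
    mean_probs = tree_probs.mean(axis=0)
    # Shannon entropy of the averaged distribution; higher means less confident.
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)
    return mean_probs, entropy

# Usage (hypothetical names): route the most ambiguous cases to human review.
# probs, H = predictive_entropy(fitted_rf, X_test)
# needs_review = H > np.quantile(H, 0.9)
```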
HSO is a metaphor-based optimization algorithm recently introduced by Akbari et al. [2]. Unlike traditional swarm methods such as PSO, AFSA, and GWO, which simulate metaphor-driven behaviors of particular species such as bird flocks, HSO guides the search movement using the fitness distribution of the entire population and thereby obtains a global view of the search space. This research therefore introduces HSO as the optimization algorithm that improves RF by tuning its hyperparameters, and the resulting prediction model is used to predict student dropout from university programs within a multi-objective and uncertainty-aware hybrid optimization framework. As HSO is a novel metaphor-based algorithm that has not previously been used in educational prediction models, this research leverages its exploration and exploitation capabilities to tune RF hyperparameters for identifying students at risk of dropout, saving both student time and university resources.
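Since the original HSO update equations are not reproduced here, the following is only a minimal population-based sketch of how such a search could tune RF hyperparameters: candidates drift toward a fitness-weighted population centroid, mimicking the whole-population guidance described above, and each candidate is scored with a fitness such as the rf_fitness sketch given earlier. The bounds, step size, and population settings are illustrative assumptions.

```python
# Illustrative population-based tuning loop (NOT the published HSO equations):
# candidates move toward a fitness-weighted centroid of the whole population,
# echoing HSO's population-level guidance, with small random perturbations.
import numpy as np

BOUNDS = {"n_estimators": (50, 500), "max_depth": (3, 30), "min_samples_leaf": (1, 10)}

def tune_rf(X, y, fitness, pop_size=20, iters=30, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    keys = list(BOUNDS)
    lo = np.array([BOUNDS[k][0] for k in keys], dtype=float)
    hi = np.array([BOUNDS[k][1] for k in keys], dtype=float)
    pop = rng.uniform(lo, hi, size=(pop_size, len(keys)))   # random initial candidates
    for _ in range(iters):
        scores = np.array([fitness(dict(zip(keys, p)), X, y) for p in pop])
        weights = np.exp(scores - scores.max())             # fitness-weighted distribution
        centroid = (weights[:, None] * pop).sum(axis=0) / weights.sum()
        moved = pop + step * (centroid - pop)               # drift toward good regions
        moved += rng.normal(0.0, 0.05, pop.shape) * (hi - lo)  # exploration noise
        pop = np.clip(moved, lo, hi)
    scores = np.array([fitness(dict(zip(keys, p)), X, y) for p in pop])
    best = pop[int(np.argmax(scores))]
    return {k: int(round(v)) for k, v in zip(keys, best)}

# Usage (hypothetical): best_params = tune_rf(X_train, y_train, rf_fitness)
```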
The following is a summary of this study’s primary contributions:
A robust multi-objective optimization framework is proposed by combining HSO with the RF classifier; predictive performance, model complexity, and inter-class fairness are jointly optimized to address the challenges of imbalanced multiclass educational datasets.
An uncertainty-aware prediction mechanism based on ensemble probability dispersion is incorporated into the HSO-RF framework, enabling the identification of high-risk students with varying degrees of confidence and supporting more reliable and actionable academic intervention strategies.
A thorough evaluation on a real-world educational dataset shows that the proposed framework consistently outperforms classical machine learning and ensemble baselines, achieving an improved macro F1-score, competitive overall accuracy, and more balanced performance across minority and majority outcome classes.
Convergence analysis and consistent feature-importance behavior confirm the efficacy of HSO in navigating complex, plateau-rich hyperparameter search spaces and support the robustness and stability of the optimized model.
Despite the efficacy of current machine learning models for student dropout prediction, most methods optimize predictive accuracy as a single objective and produce deterministic forecasts without accounting for model confidence. These limitations hinder practical decision-making and reduce reliability on highly imbalanced educational datasets. To address these shortcomings, this paper proposes a multi-objective, uncertainty-aware hybrid optimization approach that combines Holistic Swarm Optimization with Random Forest classification. The proposed method quantifies prediction uncertainty to support reliable academic interventions while simultaneously balancing predictive performance, model complexity, and inter-class fairness. The remainder of the paper is structured as follows:
Section 2 presents materials and methods.
Section 3 encompasses experiments and results.
Section 4 elaborates on the discussion, and
Section 5 concludes the main findings of the paper.
4. Discussion
The study’s findings show that incorporating uncertainty-aware prediction and multi-objective optimization into an RF framework significantly improves the reliability and robustness of multiclass student outcome prediction. The suggested HSO-RF framework clearly balances predictive performance, model complexity, and inter-class fairness, in contrast to traditional methods that maximize predictive accuracy alone. This produces more dependable results when there is a significant class imbalance. Despite being numerically modest, the observed increases in macro and weighted F1-scores are especially significant in educational datasets where minority groups, like dropout instances, are frequently underrepresented and more challenging to predict.
Achieving balanced performance across student outcome categories depends largely on the multi-objective design. The proposed framework reduces the tendency of ensemble models to overfit the majority classes by penalizing excessive model complexity and class-wise performance disparity during optimization. Improved class-wise F1-scores and smaller performance gaps between the graduate and dropout categories are clear indications of this effect. Such balanced behavior is crucial for early warning systems, where overlooking minority high-risk students can compromise the usefulness of predictive analytics.
In addition to improving performance, uncertainty-aware prediction is a major improvement over deterministic student risk models. Because some student profiles are inherently ambiguous, the entropy-based uncertainty distributions show substantial variation in prediction confidence. Higher predictive entropy is closely correlated with misclassification risk, as evidenced by the distinct divergence between the uncertainty levels of correctly and incorrectly classified examples. This result provides empirical support for the efficacy of the proposed uncertainty modeling approach and underscores its importance for practical deployment, since decision-makers must distinguish between trustworthy alerts and ambiguous cases that require further investigation.
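A minimal sketch of the check behind this observation follows; the fitted classifier and held-out arrays are hypothetical names, and the median comparison is only one convenient way to summarize the divergence.

```python
# Hedged sketch: compare predictive entropy for correctly vs. incorrectly
# classified students (rf, X_test, y_test are assumed, hypothetical objects).
import numpy as np

def entropy_by_correctness(rf, X_test, y_test):
    probs = rf.predict_proba(X_test)                    # tree-averaged class probabilities
    H = -np.sum(probs * np.log(probs + 1e-12), axis=1)  # predictive entropy per student
    correct = rf.predict(X_test) == y_test
    # The reported pattern implies the second value should be noticeably larger.
    return np.median(H[correct]), np.median(H[~correct])
```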
From an applied standpoint, uncertainty-aware outputs allow for more nuanced academic interventions. High-confidence predictions can trigger immediate support actions, whereas high-uncertainty cases may warrant deferred decisions, additional data collection, or human review. This capability, which emphasizes transparency, reliability, and risk-aware decision-making in sensitive domains such as education, is consistent with emerging principles of trustworthy and responsible artificial intelligence.
The robustness of the proposed framework is further supported by the stability of the tuned hyperparameters and the steady convergence behavior of the HSO algorithm. These findings suggest that the holistic swarm-based search avoids premature convergence while successfully navigating intricate, non-convex hyperparameter landscapes. Additionally, the model’s explanatory value is strengthened by the semantic interpretation of influential features, which shows that academic, administrative, and socioeconomic factors collectively capture latent aspects of student motivation, resilience, and institutional constraints.
Despite these advantages, certain limitations should be acknowledged. Because only one publicly available dataset was used for the experiments, generalizability across universities with different student demographics or academic structures may be limited. Furthermore, although entropy-based uncertainty offers useful confidence estimates, it does not fully differentiate between aleatoric and epistemic uncertainty. Future research directions include cross-institutional validation, longitudinal analysis, and the incorporation of more sophisticated uncertainty decomposition approaches to address these limitations.
Overall, by fusing robust optimization, confidence-aware prediction, and useful interpretability, the suggested multi-objective and uncertainty-aware HSO-RF framework pushes the boundaries of educational data mining. The findings show that increasing model credibility is just as important as increasing accuracy, especially when predictive algorithms are meant to guide student support plans and high-stakes academic decisions.
5. Conclusions
To improve multiclass prediction of student academic performance and dropout, this study proposed a reliable, multi-objective, and uncertainty-aware predictive framework that combines the Random Forest (RF) classifier with Holistic Swarm Optimization (HSO). In contrast to traditional methods that maximize predictive accuracy in isolation, the proposed methodology simultaneously balances the macro F1-score, model complexity, and inter-class performance fairness, making it especially suitable for severely imbalanced educational datasets.

Extensive experiments on the publicly available Predict Students’ Dropout and Academic Success dataset show that the proposed HSO-RF framework outperforms standard machine learning models and improves upon the baseline RF configuration by roughly 1–2% in macro and weighted F1-score, achieving a weighted F1-score of 76.0% and an overall accuracy of 77.74%. More importantly, the optimized model reduces bias in favor of the majority “Graduate” class by performing better on the minority classes, particularly in identifying dropout-prone students. In the context of imbalanced multiclass prediction, where increases in macro F1-score directly translate into improved reliability for underrepresented and high-risk student groups, these improvements, while quantitatively modest, are noteworthy.

Beyond predictive performance, the incorporation of uncertainty-aware prediction allows the proposed approach to quantify confidence in individual student risk assessments. By differentiating between high-confidence and high-uncertainty predictions, this capability gives academic advisers actionable insights that promote better-informed and more focused intervention strategies. The observed stability of the tuned hyperparameters and feature-importance rankings further supports the robustness and interpretability of the proposed method.

From a broader perspective, the findings demonstrate how intricate relationships among administrative, socioeconomic, macroeconomic, and academic factors influence educational attainment. Together, these factors serve not only as computational predictors but also as significant markers of student motivation, resilience, and institutional constraints. By combining robust optimization, uncertainty estimation, and semantic interpretation, the proposed framework advances the development of reliable and ethical educational analytics.
Future research will focus on extending the framework to longitudinal and cross-institutional datasets, adding further explainability and fairness criteria, and investigating multimodal educational data sources such as behavioral traces and learning management system logs. Overall, the proposed multi-objective and uncertainty-aware HSO-RF framework provides a dependable and practical decision-support tool for early warning systems, with strong potential to enhance data-driven academic planning and student retention strategies.