MDPI - Publisher of Open Access Journals

29 pages, 1167 KB

Open Access Article

The Learning Style Decoder: FSLSM-Guided Behavior Mapping Meets Deep Neural Prediction in LMS Settings

by Athanasios Angeioplastis, John Aliprantis, Markos Konstantakis, Dimitrios Varsamis and Alkiviadis Tsimpiris

Computers 2025, 14(9), 377; https://doi.org/10.3390/computers14090377 - 8 Sep 2025

Viewed by 332

Abstract

Personalized learning environments increasingly rely on learner modeling techniques that integrate both explicit and implicit data sources. This study introduces a hybrid profiling methodology that combines psychometric data from an extended Felder–Silverman Learning Style Model (FSLSM) questionnaire with behavioral analytics derived from Moodle Learning Management System interaction logs. A structured mapping process was employed to associate over 200 unique log event types with FSLSM cognitive dimensions, enabling dynamic, behavior-driven learner profiles. Experiments were conducted across three datasets: a university dataset from the International Hellenic University, a public dataset from Kaggle, and a combined dataset totaling over 7 million log entries. Deep learning models, including a Sequential Neural Network, a BiLSTM, and a pretrained MLSTM-FCN, were trained to predict student performance across regression and classification tasks. Results indicate moderate predictive validity: binary classification achieved practical, albeit imperfect, accuracy, while three-class and regression tasks performed close to baseline levels. These findings highlight both the potential and the current constraints of log-based learner modeling. The contribution of this work lies in providing a reproducible integration framework and pipeline that can be applied across datasets, offering a realistic foundation for further exploration of scalable, data-driven personalization. Full article
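As a rough illustration of the mapping step described in the abstract above, the following sketch turns a list of log events into per-dimension proportions. The event names and their assignment to FSLSM poles are hypothetical placeholders, not the authors' 200-event mapping; only the four FSLSM dimension names (processing, perception, input, understanding) come from the model itself.

```python
from collections import Counter, defaultdict

# Hypothetical event names and pole assignments, for illustration only.
EVENT_TO_POLE = {
    "forum_post_created": ("processing", "active"),
    "resource_viewed": ("processing", "reflective"),
    "video_played": ("input", "visual"),
    "page_text_viewed": ("input", "verbal"),
}

def build_profile(events, mapping=EVENT_TO_POLE):
    """Return, per FSLSM dimension, the share of mapped events at each pole."""
    # Count only events that the mapping knows about; others are ignored.
    counts = Counter(mapping[e] for e in events if e in mapping)
    totals = defaultdict(int)
    for (dimension, _pole), n in counts.items():
        totals[dimension] += n
    profile = defaultdict(dict)
    for (dimension, pole), n in counts.items():
        profile[dimension][pole] = n / totals[dimension]
    return dict(profile)

# Hypothetical log for one learner.
log = [
    "forum_post_created",
    "forum_post_created",
    "resource_viewed",
    "video_played",
    "quiz_unknown_event",
]
profile = build_profile(log)
print(profile)
```

Normalizing within each dimension keeps very active learners comparable with less active ones; whether the original pipeline normalizes in this way is not stated in the abstract.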

(This article belongs to the Special Issue Transformative Approaches in Education: Harnessing AI, Augmented Reality, and Virtual Reality for Innovative Teaching and Learning)

16 pages, 949 KB

Open Access Article

Predicting the Cognitive and Social–Emotional Development of Minority Children in Early Education: A Data Science Approach

by Danail Brezov, Nadia Koltcheva and Desislava Stoyanova

AppliedMath 2025, 5(3), 113; https://doi.org/10.3390/appliedmath5030113 - 1 Sep 2025

Viewed by 784

Abstract

Our study tracks the development of 105 Roma children between 3 and 5 (median age: 51 months), enrolled in an NGO-aided developmental program. Each child undergoes pre- and post-assessment based on the Developmental Assessment of Young Children (DAYC), a standard tool used to track the progress in early childhood development and detect delays. Data are gathered from three sources, teacher, parent/caregiver and specialist, covering four developmental domains and adaptive behavior scale. There are subjective biases; however, in the post-assessment, the teachers’ and parents’ evaluations converge. The test results confirm significant improvement in all areas (p < 0.0001), with the highest being in cognitive skills (32.2%) and the lowest being in physical development (14.4%). We also apply machine learning methods to impute missing data and predict the likely future progress for a given student in the program based on the initial input, while also evaluating the influence of environmental factors. Our weighted ensemble regression models are coupled with principal component analysis (PCA) and yield average coefficients of determination R² ≈ 0.7 for the features of interest. Also, we perform k-means clustering in the plane cognitive vs. social–emotional progress and consider the classification problem of predicting the group in which a given student would eventually be assigned to, with a weighted F1-score of 0.83 and a macro-averaged area under the curve (AUC) of 0.94. This could be useful in practice for the optimized formation of study groups. We explore classification as a means of imputing missing categorical data too, e.g., education, employment or marital status of the parents. Our algorithms provide solutions with the F1-score ranging from 0.92 to 0.97 and, respectively, an AUC between 0.99 and 1. Full article

23 pages, 7524 KB

Open Access Article

Analyzing Visual Attention in Virtual Crime Scene Investigations Using Eye-Tracking and VR: Insights for Cognitive Modeling

by Wen-Chao Yang, Chih-Hung Shih, Jiajun Jiang, Sergio Pallas Enguita and Chung-Hao Chen

Electronics 2025, 14(16), 3265; https://doi.org/10.3390/electronics14163265 - 17 Aug 2025

Viewed by 493

Abstract

Understanding human perceptual strategies in high-stakes environments, such as crime scene investigations, is essential for developing cognitive models that reflect expert decision-making. This study presents an immersive experimental framework that utilizes virtual reality (VR) and eye-tracking technologies to capture and analyze visual attention during simulated forensic tasks. A 360° panoramic crime scene, constructed using the Nikon KeyMission 360 camera, was integrated into a VR system with HTC Vive and Tobii Pro eye-tracking components. A total of 46 undergraduate students aged 19 to 24 (23 from the National University of Singapore and 23 from the Central Police University in Taiwan) participated in the study, generating over 2.6 million gaze samples (IRB No. 23-095-B). The collected eye-tracking data were analyzed using statistical summarization, temporal alignment techniques (Earth Mover’s Distance and Needleman-Wunsch algorithms), and machine learning models, including K-means clustering, random forest regression, and support vector machines (SVMs). Clustering achieved a classification accuracy of 78.26%, revealing distinct visual behavior patterns across participant groups. Proficiency prediction models reached optimal performance with a random forest regression (R² = 0.7034), highlighting scan-path variability and fixation regularity as key predictive features. These findings demonstrate that eye-tracking metrics, particularly sequence-alignment-based features, can effectively capture differences linked to both experiential training and cultural context. Beyond its immediate forensic relevance, the study contributes a structured methodology for encoding visual attention strategies into analyzable formats, offering valuable insights for cognitive modeling, training systems, and human-centered design in future perceptual intelligence applications. Furthermore, our work advances the development of autonomous vehicles by modeling how humans visually interpret complex and potentially hazardous environments. By examining expert and novice gaze patterns during simulated forensic investigations, we provide insights that can inform the design of autonomous systems required to make rapid, safety-critical decisions in similarly unstructured settings. The extraction of human-like visual attention strategies not only enhances scene understanding, anomaly detection, and risk assessment in autonomous driving scenarios, but also supports accelerated learning of response patterns for rare, dangerous, or otherwise exceptional conditions, enabling autonomous driving systems to better anticipate and manage unexpected real-world challenges. Full article
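The abstract above names the Needleman-Wunsch algorithm as one of its temporal alignment techniques. A minimal sketch of the global alignment score is shown below, assuming scan paths are encoded as strings of area-of-interest (AOI) labels. The encoding, the example paths, and the scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions, not the study's settings.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]

# Hypothetical scan paths: each letter stands for one AOI visited in order.
expert_path = "ABCD"
novice_path = "ABD"
print(needleman_wunsch(expert_path, novice_path))
```

A higher score means two participants visited similar regions in a similar order, which is the kind of sequence-alignment-based feature the abstract refers to.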

(This article belongs to the Special Issue Autonomous and Connected Vehicles)

23 pages, 2710 KB

Open Access Article

Non-Semantic Multimodal Fusion for Predicting Segment Access Frequency in Lecture Archives

by Ruozhu Sheng, Jinghong Li and Shinobu Hasegawa

Educ. Sci. 2025, 15(8), 978; https://doi.org/10.3390/educsci15080978 - 30 Jul 2025

Viewed by 508

Abstract

This study proposes a non-semantic multimodal approach to predict segment access frequency (SAF) in lecture archives. Such archives, widely used as supplementary resources in modern education, often consist of long, unedited recordings that are difficult to navigate and review efficiently. The predicted SAF, an indicator of student viewing behavior, serves as a practical proxy for student engagement. The increasing volume of recorded material renders manual editing and annotation impractical, making the automatic identification of high-SAF segments crucial for improving accessibility and supporting targeted content review. The approach focuses on lecture archives from a real-world blended learning context, characterized by resource constraints such as no specialized hardware and limited student numbers. The model integrates multimodal features from instructor’s actions (via OpenPose and optical flow), audio spectrograms, and slide page progression—a selection of features that makes the approach applicable regardless of lecture language. The model was evaluated on 665 labeled one-minute segments from one such course. Experiments show that the best-performing model achieves a Pearson correlation of 0.5143 in 7-fold cross-validation and 61.05% average accuracy in a downstream three-class classification task. These results demonstrate the system’s capacity to enhance lecture archives by automatically identifying key segments, which aids students in efficient, targeted review and provides instructors with valuable data for pedagogical feedback. Full article
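The headline metric in the abstract above is a Pearson correlation between predicted and observed segment access frequency. For readers who want to see what that number measures, here is a small sketch of the coefficient computed from scratch. The two value lists are made-up illustration data, not results from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)

# Made-up access counts per one-minute segment (observed vs. predicted).
observed = [3, 10, 7, 1, 5]
predicted = [4, 8, 6, 2, 6]
print(round(pearson_r(observed, predicted), 4))
```

A value near 0.5, as reported in the abstract, indicates a moderate linear relationship: predictions rank segments in roughly the right order without matching the observed frequencies closely.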

(This article belongs to the Special Issue Artificial Intelligence and Blended Learning: Challenges, Opportunities, and Future Directions)

31 pages, 2148 KB

Open Access Article

Supporting Reflective AI Use in Education: A Fuzzy-Explainable Model for Identifying Cognitive Risk Profiles

by Gabriel Marín Díaz

Educ. Sci. 2025, 15(7), 923; https://doi.org/10.3390/educsci15070923 - 18 Jul 2025

Viewed by 1043

Abstract

Generative AI tools are becoming increasingly common in education. They make many tasks easier, but they also raise questions about how students interact with information and whether their ability to think critically might be affected. Although these tools are now part of many learning processes, we still do not fully understand how they influence cognitive behavior or digital maturity. This study proposes a model to help identify different user profiles based on how they engage with AI in educational contexts. The approach combines fuzzy clustering, the Analytic Hierarchy Process (AHP), and explainable AI techniques (SHAP and LIME). It focuses on five dimensions: how AI is used, how users verify information, the cognitive effort involved, decision-making strategies, and reflective behavior. The model was tested on data from 1273 users, revealing three main types of profiles, from users who are highly dependent on automation to more autonomous and critical users. The classification was validated with XGBoost, achieving over 99% accuracy. The explainability analysis helped us understand what factors most influenced each profile. Overall, this framework offers practical insight for educators and institutions looking to promote more responsible and thoughtful use of AI in learning. Full article

(This article belongs to the Special Issue Generative AI in Education: Current Trends and Future Directions)

14 pages, 662 KB

Open Access Article

Changes in Body Mass Index Among Korean Adolescents Before and After COVID-19: A Comparative Study of Annual and Regional Trends

by Seong Jun Ha

Int. J. Environ. Res. Public Health 2025, 22(7), 1136; https://doi.org/10.3390/ijerph22071136 - 18 Jul 2025

Viewed by 499

Abstract

This study aimed to longitudinally analyze changes in body mass index (BMI) among Korean middle and high school students before and after the COVID-19 pandemic. Data were obtained from the national-level Physical Activity Promotion System (PAPS), collected between 2018 and 2024. A total of 171,705 adolescents aged 13 to 18 were included in the analysis (86,542 males and 85,163 females), with a mean age of 15.2 years (SD = 1.68). Time-series analysis and two-way analysis of variance (ANOVA) were conducted to examine differences in BMI by year, sex, region (capital vs. non-capital), and urban–rural classification. The results indicated a significant increase in BMI during the pandemic period (2020–2022), peaking in 2022, followed by a gradual decline thereafter. Notably, male students and those living in rural or non-capital areas consistently exhibited higher BMI levels, suggesting structural disparities in access to physical activity opportunities and health resources. This study employed the Socio-Ecological Model and the Health Equity Framework as theoretical lenses to interpret BMI changes not merely as individual behavioral outcomes but as consequences shaped by environmental and policy-level determinants. The findings underscore the need for equity-based interventions in physical education and health policy to mitigate adolescent health inequalities during future public health crises. Full article

(This article belongs to the Special Issue Advances in Primary Health Care and Community Health)

23 pages, 2960 KB

Open Access Article

Exploring Information Interaction Preferences in an LLM-Assisted Learning Environment with a Topic Modeling Framework

by Yiming Taclis Luo, Ting Liu, Patrick Cheong-Iao Pang, Zhuo Wang and Ka Ian Chan

Appl. Sci. 2025, 15(13), 7515; https://doi.org/10.3390/app15137515 - 4 Jul 2025

Cited by 2 | Viewed by 1151

Abstract

Large Language Models (LLMs) are driving a revolution in the way we access information, yet little work has explored people’s information interaction preferences in LLM environments. In this study, we designed a comprehensive analysis framework to evaluate students’ prompt texts during a professional academic writing task. The framework includes a dimensionality reduction and classification method, three topic modeling approaches, namely BERTopic, BoW-LDA, and TF-IDF-NMF, and a set of evaluation criteria. These criteria assess both the semantic quality of topic content and the structural quality of clustering. Using this framework, we analyzed 288 prompt texts to identify key topics that reflect students’ information interaction behaviors. The results showed that students with low academic performance tend to focus on structural clarity and task execution, including task inquiry, format specifications, and methodological search, indicating that their interaction mode is instruction-oriented. In contrast, students with high academic performance interact with LLMs not only for basic task completion but also for knowledge integration and the pursuit of novel ideas. This is reflected in more complex topic levels and diverse, innovative keywords, and it suggests that they have stronger self-planning and self-regulation abilities. This study provides a new approach to studying the interaction between students and LLMs in engineering education by using natural language processing to analyze prompts, contributing to the exploration of how students at different performance levels use LLMs in professional academic writing. Full article

(This article belongs to the Special Issue Applications of Natural Language Processing to Data Science)

8 pages, 1063 KB

Open Access Proceeding Paper

Predicting Student Success in English Tests Using Artificial Intelligence Algorithm

by Thao-Trang Huynh-Cam, Dat Tan Truong, Long-Sheng Chen, Tzu-Chuen Lu and Venkateswarlu Nalluri

Eng. Proc. 2025, 98(1), 19; https://doi.org/10.3390/engproc2025098019 - 20 Jun 2025

Viewed by 483

Abstract

In Vietnam, English proficiency is a graduation requirement and offers students great opportunities to win scholarships and improve their employability after graduation. Universities in the Mekong Delta region (MDR) often face challenges in fostering students’ English proficiency despite the continuous assistance offered. Although students have taken online supplementary courses (OSC) delivered through e-learning systems to support their formal English classes for several years, students’ success in English tests after such supplementary courses, and the predictors of that success, remain unknown. Therefore, we developed a model to predict students’ success in English final tests based on behaviors and grades in OSC using logistic regression (LR) and classification and regression tree (CART) classifiers. A total of 109 OSC students at a target university in the MDR participated in this study, and the results showed that CART (area under the curve (AUC) = 89.3%) was slightly better than LR. The outcomes of this study contribute to students’ success in English tests and to the effectiveness of online supplementary courses for English improvement. Full article

(This article belongs to the Proceedings of 2024 4th International Conference on Social Sciences and Intelligence Management (SSIM 2024))

31 pages, 5232 KB

Open Access Article

A Comparative Evaluation of Machine Learning Methods for Predicting Student Outcomes in Coding Courses

by Zakaria Soufiane Hafdi and Said El Kafhali

AppliedMath 2025, 5(2), 75; https://doi.org/10.3390/appliedmath5020075 - 18 Jun 2025

Viewed by 895

Abstract

Artificial intelligence (AI) has found applications across diverse sectors in recent years, significantly enhancing operational efficiencies and user experiences. Educational data mining (EDM) has emerged as a pivotal AI application to transform educational environments by optimizing learning processes and identifying at-risk students. This study leverages EDM within the context of a Moroccan university (Hassan First University, Settat, Morocco) to augment educational quality and improve learning. We introduce a novel “Hybrid approach” that synthesizes students’ historical academic records and their in-class behavioral data, provided by instructors, to predict student performance in initial coding courses. Utilizing a range of machine learning (ML) algorithms, our research applies multi-classification, data augmentation, and binary classification techniques to evaluate student outcomes effectively. The key performance metrics, accuracy, precision, recall, and F1-score, are calculated to assess the efficacy of classification. Our results highlight the robustness of the long short-term memory (LSTM) algorithm, which, along with a support vector machine (SVM), achieved the highest accuracy of 94% and an F1-score of 0.87, indicating high efficacy in predicting student success at the onset of learning coding. Furthermore, the study proposes a comprehensive framework that can be integrated into learning management systems (LMSs) to accommodate generational shifts in student populations, evolving university pedagogies, and varied teaching methodologies. This framework aims to support educational institutions in adapting to changing educational dynamics while ensuring high-quality, tailored learning experiences for students. Full article
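The abstract above reports accuracy together with precision, recall, and F1-score. The sketch below computes all four for a binary task from first principles, which may help explain how an accuracy figure can sit noticeably above the F1-score when the positive class is the minority. The label vectors are invented for illustration and do not come from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Invented labels: 1 = at-risk student (minority class), 0 = not at risk.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case the many correctly classified negatives lift accuracy, while F1 depends only on how the positive class is handled.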

26 pages, 2575 KB

Open Access Article

Comparing the Effectiveness of Machine Learning and Deep Learning Models in Student Credit Scoring: A Case Study in Vietnam

by Nguyen Thi Hong Thuy, Nguyen Thi Vinh Ha, Nguyen Nam Trung, Vu Thi Thanh Binh, Nguyen Thu Hang and Vu The Binh

Risks 2025, 13(5), 99; https://doi.org/10.3390/risks13050099 - 20 May 2025

Cited by 3 | Viewed by 2716

Abstract

In emerging markets like Vietnam, where student borrowers often lack traditional credit histories, accurately predicting loan eligibility remains a critical yet underexplored challenge. While machine learning and deep learning techniques have shown promise in credit scoring, their comparative performance in the context of student loans has not been thoroughly investigated. This study aims to evaluate and compare the predictive effectiveness of four supervised learning models (Random Forest, Gradient Boosting, Support Vector Machine, and a Deep Neural Network implemented with PyTorch version 2.6.0) in forecasting student credit eligibility. Primary data were collected from 1024 university students through structured surveys covering academic, financial, and personal variables. The models were trained and tested on the same dataset and evaluated using a comprehensive set of classification and regression metrics. The findings reveal that each model exhibits distinct strengths. The Deep Neural Network achieved the highest classification accuracy (85.55%), while Random Forest demonstrated robust performance, particularly in providing balanced results across classification metrics. Gradient Boosting was effective in recall-oriented tasks, and Support Vector Machine demonstrated strong precision for the positive class, although its recall was lower than that of the other models. The study highlights the importance of aligning model selection with specific application goals, such as prioritizing accuracy, recall, or interpretability. It offers practical implications for financial institutions and universities in developing machine learning and deep learning tools for student loan eligibility prediction. Future research should consider longitudinal data, behavioral factors, and hybrid modeling approaches to further optimize predictive performance in educational finance. Full article

21 pages, 4365 KB

Open Access Article

Teaching Artificial Intelligence and Machine Learning in Secondary Education: A Robotics-Based Approach

by Georgios Karalekas, Stavros Vologiannidis and John Kalomiros

Appl. Sci. 2025, 15(8), 4570; https://doi.org/10.3390/app15084570 - 21 Apr 2025

Cited by 1 | Viewed by 2130

Abstract

The rapid advancement of Artificial Intelligence (AI) and Machine Learning (ML) highlights the need for innovative, engaging educational approaches in secondary education. This study presents the design and classroom implementation of a robotics-based lesson aimed at introducing core AI and ML concepts to ninth-grade students without prior programming experience. The intervention employed two low-cost, 3D-printed robots, each used to illustrate a different aspect of intelligent behavior: (1) rule-based automation, (2) supervised learning using image classification, and (3) reinforcement learning. The lesson was compared with a previous implementation of similar content delivered through software-only activities. Data were collected through classroom observation and student–teacher discussions. The results indicated increased student engagement and enthusiasm in the robotics-based version, as well as improved conceptual understanding. The approach required no specialized hardware or instructor expertise, making it easily adaptable for broader use in school settings. Full article

(This article belongs to the Special Issue ICT in Education, 2nd Edition)

13 pages, 1475 KB

Open Access Article

Prevalence of Hypertension in Adolescents: Differences Between 2016 ESH and 2017 AAP Guidelines

by Caterina Carollo, Luigi Peritore, Alessandra Sorce, Emanuele Cirafici, Miriam Bennici, Luca Tortorici, Riccardo Polosa, Giuseppe Mulè and Giulio Geraci

J. Clin. Med. 2025, 14(6), 1911; https://doi.org/10.3390/jcm14061911 - 12 Mar 2025

Viewed by 859

Abstract

Introduction: The American Academy of Pediatrics (AAP) published in 2017 new guidelines for the screening and management of hypertension in children containing different nomograms compared to the European guidelines, leading to a reclassification of blood pressure values, the consequences of which are still little investigated. The aim of our study was to evaluate the prevalence of high blood pressure values estimated with both the most recent American and European guidelines and to analyze the relationship of blood pressure increases with lifestyles and potentially risky behaviors in a school population in Western Sicily. Methods: On the occasion of the XV World Hypertension Day, blood pressure values of 1301 students aged between 13 and 18 were measured. Two questionnaires were administered, one relating to anamnestic data and anthropometric parameters and a second aimed at investigating lifestyle. For the diagnosis of increased blood pressure, both ESH and AAP criteria were considered. Results: The prevalence of elevated blood pressure was 7.5% according to ESH criteria and nearly twice as high using AAP criteria, with a more pronounced discrepancy in females. Individuals with elevated blood pressure were younger, exhibited higher body weight and BMI, and had an increased prevalence of overweight and obesity. Classification based on ESH criteria revealed higher alcohol and drug consumption among normotensive individuals. AAP criteria identified a higher proportion of males and greater height in the hypertensive group. Systolic blood pressure correlated significantly with height, weight, and BMI, with stronger associations in males, while diastolic pressure correlated with weight and BMI. Conclusions: To the best of our knowledge, our study is the only one to analyze the prevalence of increased blood pressure and its relationship with lifestyle factors and anthropometric data in adolescence in our region. 
Our study confirms that elevated blood pressure is common in adolescence, with higher prevalence using the 2017 AAP criteria than ESH guidelines. Full article

(This article belongs to the Special Issue Hypertension in Adults: Current Updates on Diagnosis, Treatment and Management)

29 pages, 4066 KB

Open Access Article

SAPEx-D: A Comprehensive Dataset for Predictive Analytics in Personalized Education Using Machine Learning

by Muhammad Adnan Aslam, Fiza Murtaza, Muhammad Ehatisham Ul Haq, Amanullah Yasin and Numan Ali

Data 2025, 10(3), 27; https://doi.org/10.3390/data10030027 - 20 Feb 2025

Cited by 4 | Viewed by 2016

Abstract

Education is crucial for leading a productive life and obtaining necessary resources. Higher education institutions are progressively incorporating artificial intelligence into conventional teaching methods as a result of innovations in technology. Because a high academic record raises a university’s ranking and improves students’ career prospects, predicting learning success has been a central focus in education. Modern schools face challenges in both analyzing performance and providing high-quality instruction, while students must maintain high academic standards, balance life and academics, and adjust to technology. In this study, we present a comprehensive dataset, SAPEx-D (Student Academic Performance Exploration), designed to predict student performance, encompassing a wide array of personal, familial, academic, and behavioral factors. Our data collection effort at Air University, Islamabad, Pakistan, involved both online and paper questionnaires completed by students across multiple departments, ensuring diverse representation. After meticulous preprocessing to remove duplicates and entries with significant missing values, we retained 494 valid responses. The dataset includes detailed attributes such as demographic information, parental education and occupation, study habits, reading frequencies, and transportation modes. To facilitate robust analysis, we encoded ordinal attributes using label encoding and nominal attributes using one-hot encoding, expanding our dataset from 38 to 88 attributes. Feature scaling with a normalization technique was performed to standardize the range and distribution of the data. Our analysis revealed that factors such as degree major, parental education, reading frequency, and scholarship type significantly influence student performance. The machine learning models applied to this dataset, including Gradient Boosting and Random Forest, demonstrated high accuracy and robustness, underscoring the dataset’s potential for insightful academic performance prediction. In terms of model performance, Gradient Boosting achieved an accuracy of 68.7% and an F1-score of 68% for the eight-class classification task. For the three-class classification, Random Forest outperformed other models, reaching an accuracy of 80.8% and an F1-score of 78%. These findings highlight the importance of comprehensive data in understanding and predicting academic outcomes, paving the way for more personalized and effective educational strategies.
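The preprocessing described in this abstract (label encoding for ordinal attributes, one-hot encoding for nominal attributes, then feature scaling) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the column names, category levels, and toy records are hypothetical, and min-max scaling is assumed because the abstract only says "a normalization technique".

```python
# Hypothetical toy records; names and values are illustrative, not from SAPEx-D.
records = [
    {"reading_frequency": "never", "transport": "bus", "study_hours": 2.0},
    {"reading_frequency": "often", "transport": "car", "study_hours": 6.0},
    {"reading_frequency": "sometimes", "transport": "bus", "study_hours": 4.0},
]

# Assumed schema: ordinal columns carry an explicit level order.
ORDINAL_LEVELS = {"reading_frequency": ["never", "sometimes", "often"]}
NOMINAL_COLUMNS = ["transport"]
NUMERIC_COLUMNS = ["study_hours"]


def encode(rows):
    """Label-encode ordinal columns and one-hot encode nominal columns."""
    categories = {c: sorted({r[c] for r in rows}) for c in NOMINAL_COLUMNS}
    encoded = []
    for row in rows:
        out = {}
        for col, levels in ORDINAL_LEVELS.items():
            # The integer rank preserves the order of the levels.
            out[col] = float(levels.index(row[col]))
        for col in NOMINAL_COLUMNS:
            # One binary indicator per observed category.
            for cat in categories[col]:
                out[f"{col}_{cat}"] = 1.0 if row[col] == cat else 0.0
        for col in NUMERIC_COLUMNS:
            out[col] = float(row[col])
        encoded.append(out)
    return encoded


def min_max_scale(rows):
    """Rescale every feature to [0, 1]; constant features map to 0."""
    scaled = [dict(r) for r in rows]
    for key in rows[0]:
        values = [r[key] for r in rows]
        low, span = min(values), max(values) - min(values)
        for r in scaled:
            r[key] = 0.0 if span == 0 else (r[key] - low) / span
    return scaled


features = min_max_scale(encode(records))
```

One-hot encoding is what widens the table: each nominal column becomes one column per category, which is consistent with the reported growth from 38 to 88 attributes.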

(This article belongs to the Special Issue Data Mining and Computational Intelligence for E-Learning and Education—3rd Edition)

23 pages, 1615 KB

Open Access Article

Enhancing Student Academic Success Prediction Through Ensemble Learning and Image-Based Behavioral Data Transformation

by Shuai Zhao, Dongbo Zhou, Huan Wang, Di Chen and Lin Yu

Appl. Sci. 2025, 15(3), 1231; https://doi.org/10.3390/app15031231 - 25 Jan 2025

Cited by 2 | Viewed by 1800

Abstract

Predicting student academic success is a significant task in the field of educational data analysis, offering insights for personalized learning interventions. However, the existing research faces challenges such as imbalanced datasets, inefficient feature transformation methods, and limited exploration of data integration. This research introduces an innovative method for predicting student performance by transforming one-dimensional student online learning behavior data into two-dimensional images using four distinct text-to-image encoding methods: Pixel Representation (PR), Sine Wave Transformation (SWT), Recurrence Plot (RP), and Gramian Angular Field (GAF). We evaluated the transformed images using CNN and FCN individually as well as an ensemble network, EnCF. Additionally, traditional machine learning methods, such as Random Forest, Naive Bayes, AdaBoost, Decision Tree, SVM, Logistic Regression, Extra Trees, K-Nearest Neighbors, Gradient Boosting, and Stochastic Gradient Descent, were employed on the raw, untransformed data with the SMOTE method for comparison. The experimental results demonstrated that the Recurrence Plot (RP) method outperformed other transformation techniques when using CNN and achieved the highest classification accuracy of 0.9528 under the EnCF ensemble framework. Furthermore, the deep learning approaches consistently achieved better results than traditional machine learning, underscoring the advantages of image-based data transformation combined with advanced ensemble learning approaches.
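Two of the encodings named in this abstract, the Recurrence Plot and the Gramian Angular Field, have standard textbook definitions that turn a one-dimensional series into a square matrix. The sketch below shows those definitions in plain Python. It is an illustration under assumptions, not the paper's implementation: the thresholded (binary) recurrence plot, the summation variant of the GAF, the `eps` value, and the sample series are all choices made here for the example.

```python
import math


def recurrence_plot(series, eps):
    """Binary recurrence plot: cell (i, j) is 1 when |x_i - x_j| <= eps."""
    return [[1 if abs(a - b) <= eps else 0 for b in series] for a in series]


def gramian_angular_summation_field(series):
    """Summation GAF: rescale to [-1, 1], map to angles, take cos(phi_i + phi_j)."""
    low, high = min(series), max(series)
    if high == low:
        raise ValueError("series must contain at least two distinct values")
    scaled = [2 * (x - low) / (high - low) - 1 for x in series]
    # Clamp before acos to guard against floating-point drift outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


# Hypothetical activity counts for one student over four time steps.
clicks = [0.0, 2.0, 4.0, 2.0]
rp = recurrence_plot(clicks, eps=0.5)
gasf = gramian_angular_summation_field(clicks)
```

Both functions return an n-by-n symmetric matrix for a series of length n, which is the two-dimensional input an image classifier such as a CNN expects.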

22 pages, 3214 KB

Open Access Article

Cheating Detection in Online Exams Using Deep Learning and Machine Learning

by Bahaddin Erdem and Murat Karabatak

Appl. Sci. 2025, 15(1), 400; https://doi.org/10.3390/app15010400 - 3 Jan 2025

Cited by 5 | Viewed by 7165

Abstract

This study aims to identify the best deep learning and machine learning models for detecting unethical behavior patterns of learners, using distance education exam data from an educational institution. One hundred twenty-nine online exam data records were analyzed under three different scenarios to reveal the best model performance in regression and classification. For regression and classification, a deep neural network (DNN) was used as the deep learning algorithm, and support vector machine (SVM), decision trees (DTs), k-nearest neighbor (KNN), random forest (RF), logistic regression (LR), and extreme gradient boosting (XGBoost) were used as the machine learning algorithms. In the regression analysis conducted for Scenario-1, the model we proposed to detect “cheating” behavior, one of the unethical learner behaviors, was a 5-layer DNN model with a test performance success of 80.9%. In the binary classification analysis for Scenario-2, students who “copied” were identified with an accuracy rate of 96.9% by the 10-layer DNN model we proposed. In the triple classification analysis for Scenario-3, the XGBoost model had the highest accuracy rate, 97.7%, for students who “cheated”, and the highest performance in all other metric values. In addition, the explanatory methods SHAP and LIME were applied to the XGBoost model, one of the best-performing models, and the attributes affecting the model and their percentages were reported. This study shows that applying the most appropriate layer functions and performance-enhancing parameter selection can be effective in estimating complex problems and target values that cannot be solved using classical mathematical models. The proposed models can provide educational institutions with a roadmap and insight in evaluating online examination practices and ensuring academic integrity. Future researchers may need more data sets and different analyses to improve the performance of the established models.

(This article belongs to the Topic Software Engineering and Applications)

Search Results (73)
