Arnold and Pistilli [23] | April 2012 | EWS | Course Signals EWS, used in courses at Purdue University (USA), with a special focus on first-year students. | Students’ risk of failing the course. | Obtained from the institutional LMS (Blackboard Vista): demographic information, performance and effort indicators, prior academic history. | “Student Success Algorithm”, producing a single score by weighting all input parameters [24]. |
Chen [21] | May 2013 | Predictor | A group of 38 freshman students from a Taiwanese university. | Final grade in the course. | Quality and quantity of the notes taken by students, both during and after lectures. | Hierarchical regression analysis. |
Krumm et al. [26] | March 2014 | EWS | Student Explorer EWS, targeting STEM students at Maryland and California universities (USA). | Students’ risk of failing a course. | Performance and effort indicators from the institutional LMS. | Weighted aggregation of input data. |
Waddington and Nam [27] | March 2014 | EWS | An extension of Student Explorer EWS. Tested with 8762 students enrolled in a chemistry course. | Final grade in a course. | Data used by the basic version of Student Explorer, plus use of academic resources in the LMS. | Multinomial logistic regression. |
Brown et al. [28] | April 2016 | EWS | An extension of Student Explorer EWS. Tested with 556 students belonging to various first-year courses. | | Data used by the basic version of Student Explorer, plus contextual information such as size of cohorts and specific STEM field of the degree. | Event history analysis. |
Schuck [16] | February 2017 | Predictor | Over 1000 higher education institutions in the USA. | Graduation rate, i.e. fraction of students that finish their degree within the intended number of years. | Crime and violence indicators in and around campus, provided by the US Department of Education and the National Center for Education Statistics. | Multivariate least squares regression. |
Brown et al. [29] | March 2017 | EWS | An extension of Student Explorer EWS. Tested with 2169 students in a Statistics course. | | Data used by the basic version of Student Explorer, plus type of interventions performed on struggling students. | Event history analysis. |
Akhtar et al. [32] | June 2017 | EWS | Laboratory sessions of computer-aided design courses at University of Surrey (England), with a sample size of 331 students. | Students’ risk of failing the course. | Class attendance, location and neighbors within the lab, and time spent doing exercises. | ANOVA, Pearson correlation, linear regression. |
Cohen [35] | October 2017 | EWS | A sample of 362 students of mathematics and statistics at an Israeli university. | Chance of dropping out of the course. | LMS activity: type, timing, and frequency of actions performed. | Mann–Whitney U test, used to show an association between low student activity and a higher dropout chance. |
Ornelas and Ordonez [10] | October 2017 | Predictor | 13 courses at Rio Salado Community College (USA), with a sample size of around 8700 students. | Students’ chance of obtaining a passing grade. | Engagement and performance indicators from the institutional LMS. | Naïve Bayesian classification. |
Thompson et al. [11] | February 2018 | Predictor | An introductory biology course at Northern Kentucky University (USA), including 413 students. | Students’ chance of passing the course. | Results from Lawson’s Classroom Test of Scientific Reasoning and ACT Mathematics Test, taken before the start of the course. | Logistic regression. |
Benablo et al. [12] | February 2018 | Predictor | 100 Information Technologies and Computer Science students in the Philippines. | Identification of underperforming students. | Student age, gender, academic standing and procrastination indicators: time spent using social networks and playing online games. | SVM, KNN, RF. SVM had the best performance. |
Brown et al. [30] | March 2018 | EWS | An extension of Student Explorer EWS. Tested with 987 students in an introductory programming course. | Students’ risk of failing a course. | Data used by the basic version of Student Explorer, plus difficulty estimations of concurrent courses. | Binary logistic regression. |
Howard et al. [33] | April 2018 | EWS | 136 students in a Practical Statistics course at University College Dublin (Ireland). | Final grade in the course. | Results of weekly tests, as well as demographic information and access to online course resources. | RF, BART, XGBoost, PCR, SVM, NN, Splines, KNN. BART had the best performance. |
Hirose [15] | July 2018 | Predictor | Around 1100 calculus and algebra students in Japan. | Classification of students into “successful” and “not successful” categories. Estimation of students’ abilities using item response theory. | Results of weekly multiple-choice tests. | KNN classifier. |
Tsiakmaki et al. [17] | July 2018 | Predictor | 592 Business Administration students in Greece. | Final grade in second semester courses. | Final scores of first semester subjects. | Linear regression, RF, instance-based regression, M5, SVM, GP, bootstrap aggregating. RF had the best performance. |
Amirkhan and Kofman [22] | July 2018 | Predictor | 600 freshman students at a major public university in the USA. | Prediction of performance and dropout probability. | Stress indicators obtained from mid-semester surveys, as well as demographic information. | Structural equation modeling, path analysis. |
Trussel and Burke-Smalley [20] | November 2018 | Predictor | 1919 business students at a public university in Tennessee (USA). | Cumulative GPA at the end of the degree program and academic retention. | Demographic and socioeconomic attributes, pre-college performance. | OLS regression, logistic regression. |
Umer et al. [13] | November 2018 | Predictor | 99 students enrolled in an introductory mathematics module at an Australian university. | Earliest possible reliable identification of students at risk of failing the course. | Assignment results in a continuous assessment model, as well as LMS log data. | RF, Naïve Bayes, KNN, and LDA. RF had the best performance. |
Wang et al. [34] | November 2018 | EWS | 1712 students from Hangzhou Normal University (China). | Risk assessment of students regarding dropout and delays in graduation. | Grades, attendance and engagement indicators, as well as records from the university library and dorm in order to monitor student habits. | Decision tree, artificial neural network, Naïve Bayes. Naïve Bayes had the best performance. |
Gutiérrez et al. [31] | December 2018 | EWS | Learning Analytics Dashboard for Advisors (LADA) EWS, deployed in two universities: a European one and a Latin American one. | Students’ chance of passing a course. | Student grades, courses booked by a student, number of credits per course. | Multilevel clustering. |
Adekitan and Salau [18] | February 2019 | Predictor | 1841 engineering students at a Nigerian higher education institution. | Final grade point average (GPA) over a five-year program. | Cumulative GPA over the first three years of the degree. | Classifiers: NN, RF, decision tree, Naïve Bayes, tree ensemble, and logistic regression. Logistic regression had the best performance. Additionally, linear and quadratic regression models were tested. |
Plak et al. [37] | March 2019 | EWS | EWS deployed at Vrije Universiteit in Amsterdam (Netherlands), tested with 758 students. | Identification of low-performing students. | Progress indicators, such as grades or obtained credits. | Generalized additive model. |
Kostopoulos et al. [14] | April 2019 | Predictor | 1073 students in an introductory informatics module at a Greek open university. | Identification of students at risk of failing a course. | Student demographics, academic achievements, and LMS activity indicators. The data were divided into two views in order to use a co-training method. | Custom co-training method, using combinations of KNN, Extra Tree, RF, GBC, and NB as underlying classifiers. |
Akcapinar et al. [36] | May 2019 | EWS | 90 students in an Elementary Informatics course at an Asian university. | Identification of students at risk of failing the course. | Data from the e-book management system BookRoll: book navigation, page highlighting and note taking. | Comparison of 13 different algorithms. RF had the best performance when using raw data. However, NB outperformed the rest when using categorical data. |
Jovanovic et al. [19] | June 2019 | Predictor | First-year engineering course at an Australian university using the flipped classroom model. Tested during three consecutive years, with between 290 and 486 students each year. | Final grade in the course. | Indicators of regularity and performance related to pre-class activities. These activities included videos with multiple-choice questions as well as problem sequences. | Multiple linear regression. |
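
Two of the EWS rows above (Course Signals and the basic Student Explorer) describe their technique as a weighted aggregation of input indicators into a single score. The following is a minimal sketch of that general idea only, not the algorithm of either system. The indicator names, the weights, the assumption that every indicator is scaled to [0, 1] with higher values being better, and the two thresholds are all hypothetical and chosen purely for illustration.

```python
from typing import Dict

# Hypothetical weights: the table does not report the values used by any system.
WEIGHTS: Dict[str, float] = {
    "performance": 0.5,
    "effort": 0.3,
    "prior_history": 0.2,
}


def risk_score(indicators: Dict[str, float],
               weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine indicators into one risk score in [0, 1].

    Assumes each indicator is already scaled to [0, 1], where higher
    means better standing. The result is inverted so higher means riskier.
    """
    total_weight = sum(weights.values())
    achievement = sum(weights[name] * indicators[name] for name in weights)
    return 1.0 - achievement / total_weight


def risk_level(score: float, medium: float = 0.4, high: float = 0.7) -> str:
    """Map a risk score to a coarse label using hypothetical thresholds."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


# Example student with weak performance and average effort and history.
student = {"performance": 0.2, "effort": 0.5, "prior_history": 0.5}
label = risk_level(risk_score(student))
```

A deployed system would calibrate weights and thresholds against historical outcomes; the fixed values here only make the aggregation step concrete.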