# Customized Rule-Based Model to Identify At-Risk Students and Propose Rational Remedial Actions


## Abstract


## 1. Introduction

- Propose a customized instructor rule-based model to identify at-risk students in order to take appropriate remedial actions.
- Propose a warning system for instructors to identify at-risk students and offer timely intervention using a visualization approach.

## 2. Literature Review

## 3. Methodology

#### 3.1. Data Collection and Dataset Description

#### 3.2. Data Pre-Processing

Symbol | Description |
---|---|
${C}_{i}$ | name of the $i$th predefined checkpoint |
${g}_{i,j}$ | grade of the $j$th student at checkpoint ${C}_{i}$ |
$\max\left({g}_{{C}_{i}}\right)$ | maximum possible grade for checkpoint ${C}_{i}$ |
$m$ | number of students |
$n$ | number of checkpoints in the course |
$i,j$ | indices, $i=\overline{1,n}$, $j=\overline{1,m}$ |
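The notation above suggests that raw grades are rescaled by the maximum possible grade at each checkpoint, which would yield the performance values in $[0, 1]$ used later in the risk conditions. A minimal Python sketch of that step follows; the formula $x = g / \max(g_{C_i})$ is an assumption inferred from the notation, and the function name and the sample grades are hypothetical.

```python
def normalize_grades(grades, max_grades):
    """Rescale raw checkpoint grades to performance levels in [0, 1].

    grades:     dict mapping a student ID to a list of raw grades,
                one per checkpoint (hypothetical layout).
    max_grades: list with the maximum possible grade of each checkpoint.
    Assumption: performance = raw grade / maximum possible grade.
    """
    normalized = {}
    for student, row in grades.items():
        if len(row) != len(max_grades):
            raise ValueError(f"{student}: expected {len(max_grades)} grades")
        normalized[student] = [g / g_max for g, g_max in zip(row, max_grades)]
    return normalized


# Hypothetical data: two checkpoints graded out of 20 and 10 points.
raw = {"Student_1": [17, 7], "Student_2": [20, 5]}
performance = normalize_grades(raw, max_grades=[20, 10])
print(performance)
```

Working on the rescaled values makes checkpoints with different point totals directly comparable against a single threshold.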

#### 3.3. Data Exploratory Analysis

## 4. Customized Rule-Based Model to Identify At-Risk Students

#### Visualization

## 5. Discussion and Future Work

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning |
---|---|
CSV | Comma-Separated Values |
HW | Homework assignment |
JSON | JavaScript Object Notation |
KNN | K-Nearest Neighbors |
MAE | Mean Absolute Error |
ML | Machine Learning |
MT | Mid-Term Exam |
RA | Remedial Action |
RBM | Rule-Based Model |
$RF$ | Risk Flag |
Qz | Quiz |


**Figure 1.** Pearson product-moment correlation coefficient computed for each attribute in the data set.

**Figure 2.** Distribution of each single feature and pairwise relationships between attributes (HWs, Qzs, MT, Final, Total).

**Figure 3.** Heat map of students' performance levels, visually indicating students whose performance level is below 70%.

**Figure 5.** Sequential model design employed to continuously identify at-risk students and propose remedial actions to close the loop.

**Figure 8.** Dependency between the total grade in the course and the number of remedial actions invoked, based on the parameters reported in the paper.

Student ID | ${C}_{1}$ | ${C}_{2}$ | ${C}_{3}$ | … | ${C}_{n}$ |
---|---|---|---|---|---|
$\mathrm{Student}_{1}$ | ${g}_{1,1}$ | ${g}_{1,2}$ | ${g}_{1,3}$ | … | ${g}_{1,n}$ |
$\mathrm{Student}_{2}$ | ${g}_{2,1}$ | ${g}_{2,2}$ | ${g}_{2,3}$ | … | ${g}_{2,n}$ |
$\mathrm{Student}_{3}$ | ${g}_{3,1}$ | ${g}_{3,2}$ | ${g}_{3,3}$ | … | ${g}_{3,n}$ |
… | … | … | … | … | … |
$\mathrm{Student}_{m}$ | ${g}_{m,1}$ | ${g}_{m,2}$ | ${g}_{m,3}$ | … | ${g}_{m,n}$ |
Max grade | $\max\left({g}_{{C}_{1}}\right)$ | $\max\left({g}_{{C}_{2}}\right)$ | $\max\left({g}_{{C}_{3}}\right)$ | … | $\max\left({g}_{{C}_{n}}\right)$ |

**Table 2.** Comparison of the students’ performance at the checkpoints with regard to their performance level at the total grade.

Variable | Total | High Risk | Low Risk | p-Value |
---|---|---|---|---|
Number of students | 218 | 99 (45.41%) | 119 (54.59%) | |
Gender | | | | 0.13328 |
Female | 172 (78.9%) | 83 (83.84%) | 89 (74.79%) | |
Male | 46 (21.1%) | 16 (16.16%) | 30 (25.21%) | |
HW1 | 0.89 [0.86–1.0] | 0.84 ± 0.2 | 0.93 ± 0.08 | $3.75974\times 10^{-8}$ |
Qz1 | 0.76 [0.65–0.91] | 0.67 ± 0.18 | 0.84 ± 0.18 | $5.48951\times 10^{-14}$ |
HW2 | 0.87 [0.8–0.95] | 0.8 ± 0.2 | 0.92 ± 0.08 | $2.68999\times 10^{-10}$ |
Qz2 | 0.75 [0.63–0.95] | 0.6 ± 0.25 | 0.88 ± 0.12 | $1.24295\times 10^{-19}$ |
MT | 0.75 [0.64–0.89] | 0.61 ± 0.13 | 0.87 ± 0.1 | $8.1207\times 10^{-31}$ |
Qz3 | 0.79 [0.69–0.96] | 0.65 ± 0.23 | 0.9 ± 0.14 | $3.70452\times 10^{-21}$ |
HW3 | 0.85 [0.82–1.0] | 0.81 ± 0.25 | 0.89 ± 0.13 | 0.0190915 |
Qz4 | 0.74 [0.6–0.94] | 0.58 ± 0.24 | 0.88 ± 0.13 | $2.69329\times 10^{-22}$ |
Qz5 | 0.67 [0.52–0.9] | 0.55 ± 0.27 | 0.77 ± 0.27 | $6.04346\times 10^{-12}$ |
HW4 | 0.89 [0.9–1.0] | 0.84 ± 0.25 | 0.94 ± 0.13 | $6.81128\times 10^{-5}$ |
Qz6 | 0.77 [0.67–0.92] | 0.63 ± 0.18 | 0.88 ± 0.11 | $2.11131\times 10^{-25}$ |
HWs | 0.88 [0.86–0.95] | 0.82 ± 0.18 | 0.92 ± 0.07 | $5.93254\times 10^{-10}$ |
Qzs | 0.79 [0.7–0.92] | 0.66 ± 0.15 | 0.9 ± 0.08 | $4.84271\times 10^{-29}$ |
Final | 0.58 [0.42–0.78] | 0.37 ± 0.13 | 0.75 ± 0.16 | $1.43249\times 10^{-33}$ |
Total | 0.72 [0.61–0.86] | 0.58 ± 0.11 | 0.84 ± 0.09 | $2.54565\times 10^{-37}$ |

Performance Range | $a$ Value |
---|---|
<20% | 1.5 |
$(20\%;30\%)$ | 1.4 |
$(30\%;40\%)$ | 1.3 |
$(40\%;50\%)$ | 1.2 |
$(50\%;60\%)$ | 1.1 |
$(60\%;70\%)$ | 1.0 |
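The mapping from performance range to the multiplier $a$ can be written as a small lookup. The sketch below is illustrative only: the table lists open intervals, so the treatment of boundary values is an assumption (lower bounds are taken as inclusive, and 70% is folded into the last range), and the names `UPPER_BOUNDS`, `A_VALUES`, and `a_value` are hypothetical.

```python
from bisect import bisect_right

# Upper bounds of the first five performance ranges, as fractions of 1.
UPPER_BOUNDS = [0.2, 0.3, 0.4, 0.5, 0.6]
# Multiplier a for each range, from "<20%" up to "(60%;70%)".
A_VALUES = [1.5, 1.4, 1.3, 1.2, 1.1, 1.0]


def a_value(x):
    """Return the multiplier a for a normalized performance level x.

    Assumption: a value lying exactly on a boundary belongs to the
    higher range (e.g. x = 0.5 gives a = 1.1).
    """
    if not 0.0 <= x <= 0.7:
        raise ValueError("the table only covers performance levels in [0, 0.7]")
    return A_VALUES[bisect_right(UPPER_BOUNDS, x)]


print(a_value(0.0), a_value(0.5), a_value(0.65))
```

Under these assumptions, the lower a student scores at a checkpoint, the larger the multiplier applied to that checkpoint's weight.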

Checkpoint | $RF$ Value, Calculated at Each Checkpoint | Risk Condition |
---|---|---|
HW1 | $RF_{HW1}=RF_{0}=0$ | $x^{HW1}=0.85>0.7$ |
Qz1 | $RF_{Qz1}=RF_{HW1}+a\cdot W_{Qz}=0+1\times 0.3=0.3$ | $x^{Qz1}=0.7\le 0.7$ |
HW2 | $RF_{HW2}=RF_{Qz1}=0.3$ | $x^{HW2}=0.95>0.7$ |
Qz2 | $RF_{Qz2}=RF_{HW2}+a\cdot W_{Qz}=0.3+1\times 0.3=0.6$ | $x^{Qz2}=0.7\le 0.7$ |
MT | $RF_{MT}=RF_{Qz2}=0.6$ | $x^{MT}=0.78>0.7$ |
Qz3 | $RF_{Qz3}=RF_{MT}+a\cdot W_{Qz}=0.6+1\times 0.3=0.9$ | $x^{Qz3}=0.7\le 0.7$ |
HW3 | $RF_{HW3}=RF_{Qz3}=0.9$ | $x^{HW3}=0.9>0.7$ |
Qz4 * | $RF_{Qz4}=RF_{HW3}+a\cdot W_{Qz}=0.9+1.1\times 0.3=1.23$ | $x^{Qz4}=0.5\le 0.7$ |
Qz5 | $RF_{Qz5}=RF_{Qz4}+a\cdot W_{Qz}=1.23-1+1.5\times 0.3=0.68$ | $x^{Qz5}=0\le 0.7$ |
HW4 | $RF_{HW4}=RF_{Qz5}=0.68$ | $x^{HW4}=0.9>0.7$ |
Qz6 | $RF_{Qz6}=RF_{HW4}+a\cdot W_{Qz}=0.68+1\times 0.3=0.98$ | $x^{Qz6}=0.65\le 0.7$ |
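The worked example in the table can be replayed with a short Python sketch. Several rules are inferred from the rows rather than stated in this excerpt, so treat them as assumptions: a checkpoint is at risk when $x \le 0.7$; the lower bound of each performance range is inclusive (which matches $a = 1.1$ at $x = 0.5$); and once $RF$ reaches 1, a remedial action is raised and 1 is subtracted at the next checkpoint, as the Qz5 row suggests. Only $W_{Qz} = 0.3$ appears in the table, so no weights are supplied for HW or MT. All function and variable names are illustrative.

```python
THRESHOLD = 0.7           # at-risk when performance x <= THRESHOLD (assumed)
WEIGHTS = {"Qz": 0.3}     # only the quiz weight appears in the table

# (exclusive upper bound, multiplier a); boundary handling is assumed.
A_TABLE = [(0.2, 1.5), (0.3, 1.4), (0.4, 1.3), (0.5, 1.2), (0.6, 1.1)]


def a_value(x):
    """Multiplier a for an at-risk performance level x (x <= THRESHOLD)."""
    for upper, a in A_TABLE:
        if x < upper:
            return a
    return 1.0


def risk_flag_trace(checkpoints):
    """Return (name, RF rounded to 2 decimals, remedial action due) per checkpoint."""
    rf = 0.0
    trace = []
    for name, kind, x in checkpoints:
        if rf >= 1.0:
            rf -= 1.0     # a remedial action was raised at the previous step
        if x <= THRESHOLD:
            rf += a_value(x) * WEIGHTS[kind]
        trace.append((name, round(rf, 2), rf >= 1.0))
    return trace


# Performance values taken from the "Risk Condition" column of the table.
example = [
    ("HW1", "HW", 0.85), ("Qz1", "Qz", 0.7), ("HW2", "HW", 0.95),
    ("Qz2", "Qz", 0.7), ("MT", "MT", 0.78), ("Qz3", "Qz", 0.7),
    ("HW3", "HW", 0.9), ("Qz4", "Qz", 0.5), ("Qz5", "Qz", 0.0),
    ("HW4", "HW", 0.9), ("Qz6", "Qz", 0.65),
]
trace = risk_flag_trace(example)
for name, rf, action_due in trace:
    print(f"{name}: RF = {rf}" + ("  (remedial action)" if action_due else ""))
```

Under these assumptions the trace reproduces the $RF$ column of the table, with the only remedial action raised at Qz4.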

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Albreiki, B.; Habuza, T.; Shuqfa, Z.; Serhani, M.A.; Zaki, N.; Harous, S.
Customized Rule-Based Model to Identify At-Risk Students and Propose Rational Remedial Actions. *Big Data Cogn. Comput.* **2021**, *5*, 71.
https://doi.org/10.3390/bdcc5040071
