Data Balancing Techniques for Predicting Student Dropout Using Machine Learning
Round 1
Reviewer 1 Report
I was happy to review the paper in more depth because the subject matter is interesting, and the submission is worth publishing. Here are a few minor comments:
In this study, different data balancing techniques were used to improve the prediction accuracy of ethnic minority populations, and logistic regression, random forest and multilayer perceptron were used for testing. Finally, the original (unbalanced) data was compared with the data obtained using sampling techniques to see whether the prediction results improved. The overall classification performance was satisfactory, but the accuracy was not high and there were certain limitations. Overall, this article has novel ideas and clear context.
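To illustrate what balancing by sampling means here, the following is a minimal sketch of random oversampling. It is an assumption chosen for simplicity, not necessarily one of the techniques used in the manuscript; the function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows of this class in the original (unbalanced) data.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Hypothetical toy data: 1 = dropout (minority), 0 = enrolled (majority).
rows = [[0.9], [0.8], [0.7], [0.6], [0.2]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In a comparison such as the one described above, resampling would normally be applied to the training split only, so that the test data keeps the original class distribution.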
Author Response
Please see the attachment
Author Response File: Author Response.docx
Reviewer 2 Report
Dear authors,
I'd like to congratulate you and your team on your excellent research work in your paper submitted for publication in this prestigious journal. The topic is very interesting, and I enjoyed it. I would like to thank you for your efforts in presenting your research work in such a professional manner. However, before your work is recommended or accepted, a few comments must be addressed to improve the quality of your work as well as for future publication in this reputable journal. I have the following observations, questions, and comments that may help to improve your work. The authors must address the following points in detail.
1. In the abstract, please include 2-3 specific quantitative achievements from the findings of this study in the context of the environment by combining the research objectives and problems. Please limit your abstract to 250 words. Check the spelling throughout; many words are misspelt or appear to have been written in haste.
2. The introduction section needs a few more sentences to strengthen the article, and please include the research problem, objective, and novelty in the last paragraph of the Introduction section.
3. Include a few more sentences at the beginning of the introduction explaining your paper's contribution to the environment, climate change impact, and sustainability, as well as your attempts to deal with or present solutions to a specific problem/s and your unique contribution with this research paper.
4. Please also present the methodology section in a concise graphical format.
5. The literature review section is very weak; please revise it.
6. Please present your literature review in the form of a SmartArt chart.
7. Just after the Methodology, please mention the societal benefits of your research in terms of evaluating its key determinant.
8. In 500-750 words, explain research problems, solutions, and the theoretical contribution of your study in the "Results" section.
9. Please include graphical presentations of your findings.
10. Describe why you placed this study in a separate section of "Policy Suggestions" just before the section of "Conclusions."
https://doi.org/10.1016/j.emj.2023.01.004
I think all of the above studies will make this study more relevant in bridging the gap with the literature.
Looking forward to your revised submission.
Author Response
Please see the attachment
Author Response File: Author Response.docx
Reviewer 3 Report
This paper addresses the issue of data balancing when using machine learning algorithms for predicting student dropout.
My concerns are as follows:
This is a topic very well studied in the literature and, in that sense, the paper lacks originality.
Although the authors support the paper with a high number of relevant references, they do not cite more recent works that address the same type of problem.
It is not clear from the literature that the statement in lines 72-73 about MLP can be made so affirmatively, and therefore it needs to be further substantiated.
Concerning the data and methods, the datasets used and the pre-processing performed must be described in more detail. It would also be interesting to include boosting methods.
In the evaluation performed, it would also be important to consider the most common performance metrics, such as ROC, AUC, Precision, or Recall.
Tables 3 and 4 must be transformed into the confusion matrices that the authors say they use to evaluate the models.
The introduction must end with the organization of the rest of the paper.
Review how Figures and Tables are referenced in the text so that it follows the journal's instructions.
The link for the first dataset is not available.
Figures 3-7 add no value to the manuscript and should be deleted.
I suggest merging Table 1 with Table 2.
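As an illustration of the confusion matrix and the Precision and Recall metrics requested above, the following is a minimal sketch, assuming binary labels in which 1 denotes dropout. The function names and the toy predictions are hypothetical and are not taken from the manuscript.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall(tp, fp, fn):
    """Precision = tp / (tp + fp); recall = tp / (tp + fn).
    Returns 0.0 for a metric whose denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical imbalanced toy example: 3 dropouts among 8 students.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision, recall = precision_recall(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

In this toy case the accuracy is 5/8 while the recall on the dropout class is only 1/3, which is why accuracy alone can be misleading on unbalanced data.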
Author Response
Please see the attachment
Author Response File: Author Response.docx
Round 2
Reviewer 3 Report
Thanks for the revisions made to the manuscript.
All my concerns were taken care of.