Article
Peer-Review Record

Analysis of Student Dropout Risk in Higher Education Using Proportional Hazards Model and Based on Entry Characteristics

by Liga Paura 1,*, Irina Arhipova 1, Gatis Vitols 1 and Sandra Sproge 2
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 27 May 2025 / Revised: 4 July 2025 / Accepted: 5 July 2025 / Published: 8 July 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript focuses on an important and timely issue: predicting student dropout in higher education using survival analysis techniques, specifically the Cox proportional hazards model. The authors use administrative and academic data from one of Latvia’s largest universities to examine how students’ entry characteristics—such as gender, faculty, study priority, and funding source—are related to their likelihood of dropping out during their studies. Considering the growing emphasis on student retention in performance-based higher education systems, the topic is both relevant and potentially impactful.

However, while the topic and general approach are appropriate, the manuscript in its current form has several major weaknesses. These problems are not just cosmetic—they affect the scientific credibility and usefulness of the study. The key issues relate to the quality of writing, the lack of depth in the literature review, incomplete execution of the statistical methods, and poor presentation of results.

The most obvious issue is the language. The manuscript contains many grammatical and stylistic errors that make it hard to read and understand. For example, phrases like “students drop out is finance source” and “hire are the differences between faculties” are grammatically incorrect and unclear. The expression “students at will” is also confusing and needs to be replaced with clearer terminology like “voluntary dropout.” These problems occur throughout the paper and significantly reduce its readability and professionalism. A full language revision by a native English speaker or a professional editor is necessary.

The literature review is broad but shallow. Although the authors cite many international studies—from Spain, Italy, South Korea, Chile, and others—they mostly summarize these studies without critically analyzing them or connecting them to each other. More importantly, the review lacks any reference to major theoretical models of student dropout, such as those developed by Tinto, Bean, or Spady. These models are widely used and would help the authors build a stronger framework for interpreting their results. The authors also don’t explain how findings from other countries relate to the Latvian context or how certain variables were selected for inclusion in their model.

Methodologically, the use of the Cox model is generally suitable for this kind of time-to-event data. However, the authors do not show that they tested the proportional hazards assumption, which is a key requirement for this model. They also do not provide any diagnostic plots—such as Schoenfeld residuals or log-minus-log plots—that would allow readers to assess the model’s validity. This omission casts doubt on the reliability of their hazard ratio estimates.
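For reference, the assumption in question can be written out as follows. This is a standard textbook formulation, not the notation of the manuscript; the symbols h_0, S_0, β, x and the risk set R(t_i) are introduced here for illustration only.

```latex
% Cox proportional hazards model (illustrative notation, not taken from the manuscript)
h(t \mid x) = h_0(t)\,\exp(\beta^{\top} x)

% Hazard ratio between two covariate profiles: does not depend on t under the PH assumption
\frac{h(t \mid x_1)}{h(t \mid x_2)} = \exp\!\big(\beta^{\top}(x_1 - x_2)\big)

% Log-minus-log transform: curves for different groups differ only by a vertical shift
\log\!\big(-\log S(t \mid x)\big) = \log\!\big(-\log S_0(t)\big) + \beta^{\top} x

% Schoenfeld residual for covariate k at the i-th event time t_i, with risk set R(t_i)
r_{ik} = x_{ik} -
  \frac{\sum_{j \in R(t_i)} x_{jk}\,\exp(\hat{\beta}^{\top} x_j)}
       {\sum_{j \in R(t_i)} \exp(\hat{\beta}^{\top} x_j)}
```

If the assumption holds, log-minus-log curves for the compared groups should be roughly parallel, and the Schoenfeld residuals should show no systematic trend over time.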

Another concern is the limited range of predictors used in the model. The variables included—gender, faculty, funding, and admission priority—are relevant, but the authors ignore other available variables that could strengthen the analysis. For example, secondary school grades (SM) and the weighted average mark (WAM) at university are described and shown to have a moderate correlation (r = 0.359), but they are not included in the Cox model. This is a missed opportunity, since prior academic performance is a well-known predictor of student outcomes.

The performance of the model itself is also weak. The reported concordance index (C-index) is only 0.56, which suggests the model is only slightly better than random guessing. Despite this, the authors do not discuss the limitations of their model or compare it to other approaches such as machine learning methods or more flexible survival models.
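To make the scale of the C-index concrete, the following is a minimal sketch of Harrell's concordance index in plain Python. The function name, the toy data, and the simplified handling of tied times (tied pairs are skipped) are assumptions made for illustration; this is not the authors' implementation.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs whose risk ordering matches outcome order.

    times       -- observed follow-up times
    events      -- 1 if dropout was observed, 0 if censored
    risk_scores -- higher score means higher predicted dropout risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Simplification: skip tied times; a pair is comparable only
        # if the shorter time ends in an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: months of follow-up and dropout indicators.
times = [6, 12, 18, 24, 30]
events = [1, 1, 0, 1, 0]

perfect = concordance_index(times, events, [5, 4, 3, 2, 1])
uninformative = concordance_index(times, events, [1, 1, 1, 1, 1])
reversed_order = concordance_index(times, events, [1, 2, 3, 4, 5])
```

A model that assigns the same risk to everyone scores 0.5, which is why a value of 0.56 indicates only weak discrimination.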

The interpretation of results is also quite limited. For example, the authors find that students from the Veterinary Medicine faculty have a lower dropout risk (HR = 0.67), but they do not provide any detailed explanation for this, other than suggesting higher motivation. Similarly, the higher dropout risk in engineering and IT is not discussed in the context of known challenges in STEM education. Overall, the discussion is mostly descriptive and lacks critical reflection.

The figures and visualizations need major improvement. Several graphs are poorly labeled, with missing axis titles, unclear legends, and cluttered lines. For instance, the survival curves in Figure 3 and hazard plots in Figure 4 are difficult to interpret without reading the text closely. Better figure design would help communicate the findings more effectively.

In summary, this manuscript deals with an important topic and uses an appropriate general method, but it falls short in writing quality, theoretical framing, statistical rigor, and presentation. These weaknesses significantly limit the scientific contribution of the paper. To improve, the authors would need to revise the text thoroughly, include key theoretical models, validate and expand their statistical analysis, and enhance how they interpret and present the results.

Because of these substantial issues, I recommend rejection of the manuscript in its current form.

Author Response

Thank you very much for taking the time to review this manuscript. Please find the detailed responses in the attached file and the corresponding revisions/corrections highlighted in the re-submitted files.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

Dear Authors,

Overall, this is an exciting paper and a useful contribution to the literature. I have some suggestions that would make this paper more robust.

  • The paper could be better structured by clearly dividing it into standard sections such as Introduction, Related Work, and Research Questions (or research objectives/proposed model). Improving the organization and explicitly stating the research aims would significantly enhance the readability and coherence of the paper.

For example:

The content from Page 2, Line 42 to Page 3, Line 104 would be more appropriately placed under a “Related Research” section, leaving the remaining content in the “Introduction”. Relocating this content would improve the logical flow and align the paper's structure more closely with academic conventions.

  • The abstract and title of the paper introduce the topic “Predicting Student Dropout Risk in Higher Education by Proportional Hazards Model Based on Entry Characteristics.” However, the main body of the paper does not effectively connect the authors’ contributions to the core concept of prediction. Instead, the focus is largely on reporting statistical analyses to identify factors associated with student dropout, rather than developing or evaluating a predictive model. This creates a disconnect between the title, abstract, and the actual content. The authors should clearly articulate the specific research problem and objectives—preferably through well-defined research questions or an explicit problem statement. It is important to clarify whether the aim is to develop a predictive tool, explore explanatory relationships, or achieve both.
  • Line 175: A reference to the “proportional hazards model” could be added when it is first introduced in the paper.
  • Line 175: The sentence “Survival and hazard probability, log-rank test, and proportional hazards model were used for student dropout data analysis” could be improved by briefly explaining how these methods contribute to the development or performance of the prediction model. Clarifying the role of each technique would help establish a stronger link between the analysis and the predictive objective of the study.
  • A “Proposed Methodology” section could have been added to outline and discuss the overall research strategy and approach.
  • Lines 177–178: The sentence “Survival probability S(t) – the probability that a students will survive from beginning of the study (t=0) to at any given specified time (t=6 moths, t=24 months) was calculated” could benefit from a more detailed explanation of how the survival probability and hazard h(t) were calculated.
  • Line 189: Equation numbers appear to be missing—these should be added for clarity and reference in the body of the text.
  • Lines 212–214: The content in this section is unclear and needs to be rewritten for better clarity and coherence.

  • Figure 1: The caption is vague and lacks sufficient explanation. Additionally, the figure is not discussed or referenced adequately in the main body of the paper.

  • Figure 2: Add captions for panels (a) and (b).

  • Line 286: The authors write “Based on the Proportional hazard model results”. Please provide a clear reference to the specific results or section where these results are presented.

  • Performance evaluation of the proposed model: The process of training and testing the model, including the methodology for data splitting (e.g., train/test split or cross-validation), should be documented. Additionally, details on the evaluation metrics used and how these metrics reflect the model’s predictive performance should be explicitly reported. This ensures transparency and enables readers to assess the robustness and generalizability of the proposed model.
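As an illustration of the level of detail requested for Lines 177–178, the survival probability S(t) can be estimated with the Kaplan-Meier product-limit estimator. The sketch below assumes that estimator; the record does not state which one the authors used, and the function names and toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed dropout."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Observed dropouts at time t (censored students are excluded here).
        dropouts = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        # Everyone whose follow-up ends at t leaves the risk set afterwards.
        leaving = sum(1 for ti in times if ti == t)
        if dropouts:
            survival *= 1.0 - dropouts / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Step-function lookup: S(t) is the last estimate at or before t, else 1.0."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
    return s


# Hypothetical toy cohort: months until dropout (event=1) or censoring (event=0).
times = [6, 6, 12, 18, 24, 24, 36, 36]
events = [1, 0, 1, 1, 1, 0, 0, 0]

curve = kaplan_meier(times, events)
s_6 = survival_at(curve, 6)    # probability of still being enrolled after 6 months
s_24 = survival_at(curve, 24)  # probability of still being enrolled after 24 months
```

In this toy cohort one of eight students drops out at 6 months, so S(6) = 7/8; each later dropout multiplies the estimate by the fraction of at-risk students who remain.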

Comments on the Quality of English Language

The results section contains weak written expression, which makes interpretation difficult in places. The authors should improve clarity and ensure that all findings are clearly explained and aligned with the research questions.

Author Response

Thank you very much for taking the time to review this manuscript. Please find the detailed responses in the attached file and the corresponding revisions/corrections highlighted in the re-submitted files.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

1. Abstract: Some specific analysis results and data could be added.

2. Introduction: It is suggested that the core motivation of the paper be highlighted. Related work could be introduced in a dedicated section.

3. It is suggested that the presentation of some figures (for example, Figure 3) be improved so that readers can understand them more easily.

4. Do the diagonal lines in Figures 3 to 5 have any special meaning?

5. Considering that the main contribution of this paper is the dataset, do the authors plan to provide a link to it?

6. The authors mainly analyze the data through visualization. It is suggested that some statistical or algorithmic analyses be conducted, such as classifying students or predicting their status from certain features. The following are some references:

[1] Machine learning model (RG-DMML) and ensemble algorithm for prediction of students’ retention and graduation in education

[2] Prediction of students' adaptability using explainable AI in educational machine learning models

[3] Sentiment analysis of online course evaluation based on a new ensemble deep learning mode: evidence from Chinese

7. Summarize the limitations and future plans of this paper.

Comments on the Quality of English Language

N/A

Author Response

Thank you very much for taking the time to review this manuscript. Please find the detailed responses in the attached file and the corresponding revisions/corrections highlighted in the re-submitted files.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have made meaningful improvements to the manuscript, effectively addressing the majority of previous concerns. The language has been thoroughly edited and is now clear and readable. The use of the Cox proportional hazards model is well justified, with appropriate assumption testing and improved variable selection. Including secondary school marks (SM) strengthens the model, while excluding WAM is logically explained. Figures have been clarified and limitations are acknowledged.

The addition of a conceptual section and the inclusion of Tinto’s model are welcome steps. However, the theoretical framework remains loosely connected to the findings. Most importantly, the interpretation of results, though improved, still lacks depth. For example, differences in dropout risk across faculties are reported but not fully explained through known educational challenges - particularly in STEM fields. Likewise, the connection between key results and the principles of academic or social integration (central to Tinto’s model) is not developed.

These issues do not require major restructuring, but a more reflective interpretation of findings - especially in light of existing theory and literature - would notably enhance the manuscript’s value.

The manuscript is now methodologically and structurally sound. I recommend acceptance after minor revisions focusing on:

  • stronger theoretical linkage between results and Tinto’s framework,
  • deeper interpretation of faculty-specific dropout patterns,
  • brief discussion of possible interactions among predictors.

Author Response

Thank you very much for taking the time to review our manuscript. Please find the detailed responses below and the corresponding revisions highlighted (yellow) in the re-submitted files.

Comment 1: Stronger theoretical linkage between results and Tinto’s framework.

Response 1: Text was added to the manuscript (pp. 7 and 9).

Comment 2: Deeper interpretation of faculty-specific dropout patterns.

Response 2: Text was added to the Conclusions section of the manuscript (p. 14).

Comment 3: Brief discussion of possible interactions among predictors.

Response 3: Text was added to the Conclusions section of the manuscript (p. 14).

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have responded constructively to the previous comments and improved the manuscript accordingly.

Author Response

Thank you very much for taking the time to review our manuscript.
