Exploring Risk Factors for Crash Severity During Thailand’s Holiday Travel: Machine Learning Exploration Compared to Heterogeneity Modeling
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The manuscript examines the factors influencing the severity of road accidents during major holiday periods in Thailand (2017–2019) and offers a valuable comparison between traditional econometric modeling and machine learning approaches for crash analysis. The study is relevant and potentially impactful. However, several areas require clarification or refinement to strengthen the manuscript.
Major Comments
• The description of the machine learning methodology requires further detail. Key aspects such as the train–test split, cross-validation strategy, hyperparameter tuning, and the approach used to address class imbalance are not clearly specified. Providing this information is essential to ensure the reproducibility of the study.
• The choice of random parameters and heterogeneity terms should be better justified. The authors are encouraged to explain the criteria used to identify which variables were treated as random and how heterogeneity in both means and variances was assessed.
• Data preprocessing procedures are not sufficiently described. The manuscript should include information on how missing data were handled, whether normalization or scaling techniques were applied, and how outliers were managed, in order to improve transparency.
• The connection between results and conclusions could be strengthened. While the conclusions are generally supported by the findings, explicitly referring to key numerical results and further discussing the policy implications would make this section more robust.
• The rationale for merging severe and fatal injuries should be expanded. Although the low frequency of fatal cases is mentioned, a more detailed justification would improve the methodological clarity of this choice.
• A broader set of performance metrics for the machine learning models is recommended. In addition to accuracy, reporting precision, recall, F1-score, and/or confusion matrices would allow for a more comprehensive evaluation of model performance.
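To illustrate the metrics requested in the last comment, the following is a minimal Python sketch, not the authors' evaluation code. The three severity labels and the toy predictions are hypothetical; they are chosen only to show how a classifier that always predicts the majority class can reach a high accuracy while its macro F1-score stays low.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_metrics(matrix, labels):
    """Precision, recall and F1 for each class, computed from the matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                    # true class i, predicted otherwise
        fp = sum(row[i] for row in matrix) - tp     # predicted class i, true otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical, imbalanced toy labels (NOT the manuscript's data).
labels = ["minor", "moderate", "severe_fatal"]
y_true = ["minor"] * 8 + ["moderate"] * 1 + ["severe_fatal"] * 1
y_pred = ["minor"] * 10  # a model that always predicts the majority class

cm = confusion_matrix(y_true, y_pred, labels)
scores = per_class_metrics(cm, labels)
accuracy = sum(cm[i][i] for i in range(len(labels))) / len(y_true)
macro_f1 = sum(m["f1"] for m in scores.values()) / len(labels)

print("accuracy:", accuracy)
print("macro F1:", round(macro_f1, 3))
```

On these toy labels the accuracy is 0.8, yet recall for both minority classes is zero, which is the kind of pattern that accuracy alone would hide.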
Minor Comments
• Please expand all acronyms at their first occurrence.
• The English language could be improved in several sections to enhance clarity and readability.
Comments on the Quality of English Language
The manuscript contains frequent repetitions, overly long paragraphs, and sections that are more verbose than necessary, particularly in the Introduction, Literature Review, and Methods. There are also language and stylistic issues (grammar, phrasing, consistency) that detract from readability. So, the paper would benefit from substantial editing for conciseness, removal of redundancy, and language polishing to reach a high journal standard.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This manuscript investigates crash injury severity during Thailand’s major holiday periods using a comparative framework that combines machine-learning models (MLP, AdaBoost, Random Forest, XGBoost) with an econometric Random Parameters Ordered Logit Model with heterogeneity in means (RPOLHM). The topic is timely and policy-relevant, particularly given the elevated crash risks during festival travel periods. The dataset is substantial (8,346 crashes), the modeling effort is extensive, and the results are generally well presented.
- Abstract
Including one quantitative performance metric, such as the XGBoost validation accuracy of 83.8%, would strengthen the abstract.
The abstract could also mention how the machine learning and econometric approaches complement each other in terms of prediction versus interpretation.
- Terminology and Consistency
Note: RPOLHM should be used uniformly in the manuscript. RPOLHMV has been referred to in certain parts, although heterogeneity of variances was not identified.
Use a consistent classification of the severity of injury, especially when defining “severe/fatal injury.”
- Methodology
In Section 3.5 (Model Evaluation), could you briefly describe:
Whether the issue of class imbalance was addressed, and if so, how.
The training and validation split for machine learning models.
Section 3.4 should state explicitly whether the hyperparameters were left at their default values or tuned with an optimization technique such as grid search.
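As an illustration of the two items requested for Section 3.5, here is a minimal Python sketch of a stratified training/validation split followed by inverse-frequency class weights. It is one possible way to handle class imbalance, not a description of what the authors did; the 80/20 ratio, the seed, and the toy label counts are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) with class proportions roughly preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one sample of every class in the validation set.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * c) for label, c in counts.items()}


# Hypothetical severity labels (NOT the manuscript's class distribution).
y = ["minor"] * 80 + ["moderate"] * 15 + ["severe_fatal"] * 5

train_idx, test_idx = stratified_split(y, test_fraction=0.2)
weights = balanced_class_weights([y[i] for i in train_idx])

print("validation class counts:", Counter(y[i] for i in test_idx))
print("training class weights:", weights)
```

The weights are computed on the training portion only, so that no information from the validation set enters model fitting. Reporting the split ratio, the random seed, and the weighting or resampling scheme in this way would be enough for a reader to reproduce the setup.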
- Results
For Table 4, please clarify whether the reported AUC is a macro-averaged multi-class AUC.
A brief explanation of how to interpret the summary score in Table 5 is recommended for non-technical audiences.
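For reference, one common reading of a "macro" multi-class AUC is the unweighted mean of the one-vs-rest AUCs of the individual classes. The Python sketch below computes it under that assumption; whether Table 4 follows this definition is precisely what the comment on Table 4 asks the authors to confirm. The labels and predicted probabilities are hypothetical.

```python
def binary_auc(scores, positives):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, is_pos in zip(scores, positives) if is_pos]
    neg = [s for s, is_pos in zip(scores, positives) if not is_pos]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, proba, labels):
    """Unweighted mean of one-vs-rest AUCs; also returns the per-class values."""
    per_class = {}
    for k, label in enumerate(labels):
        scores = [row[k] for row in proba]           # predicted probability of class k
        positives = [t == label for t in y_true]     # class k versus all other classes
        per_class[label] = binary_auc(scores, positives)
    macro = sum(per_class.values()) / len(labels)
    return macro, per_class


# Hypothetical labels and predicted probabilities (NOT the manuscript's results).
labels = ["minor", "moderate", "severe_fatal"]
y_true = ["minor", "minor", "moderate", "severe_fatal"]
proba = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.3, 0.4, 0.3],
    [0.6, 0.1, 0.3],
]

macro, per_class = macro_ovr_auc(y_true, proba, labels)
print("per-class AUC:", per_class)
print("macro AUC:", round(macro, 3))
```

Because every class contributes equally to the macro average, a rare class such as severe/fatal injury influences the score as much as the most frequent class; a prevalence-weighted average would behave differently, which is why the averaging scheme should be stated next to the table.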
- Figures
For Figure 2 (Variable Importance for XGBoost), please indicate whether importance is calculated by gain, frequency, or cover.
A minor increase in the size of the fonts would be beneficial for legibility.
- Discussion
In Section 4.3.1, consider briefly addressing why this study finds adverse weather associated with lower injury severity, whereas other studies report the inverse relationship.
The discussion of pre-festival crashes could be developed further by referring to enforcement efforts or behavioral responses to the holidays.
- Language & Style
Light English language editing to correct minor grammatical errors and remove some redundancies in Sections 1 and 2 would be beneficial.
There is no need for restructuring in the manuscript.
The paper is suitable for publication after moderate revision, primarily to strengthen methodological clarity, interpretation rigor, and policy relevance.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The additional explanations and clarifications provided by the authors in response to my comments have significantly improved the quality of the paper. So, the manuscript could be accepted in the present form.
Comments on the Quality of English Language
The authors have also improved the English, eliminating repetitions and improving the clarity of the presentation.

