Open Access Article

Peer-Review Record

An Interpretable Machine Learning-Based Hurdle Model for Zero-Inflated Road Crash Frequency Data Analysis: Real-World Assessment and Validation

Appl. Sci. 2024, 14(23), 10790; https://doi.org/10.3390/app142310790

by Moataz Bellah Ben Khedher^1,2 and Dukgeun Yun^1,2,*

Reviewer 1: César De Santos-Berbel

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Reviewer 4: Anonymous

Submission received: 20 August 2024 / Revised: 12 November 2024 / Accepted: 16 November 2024 / Published: 21 November 2024

(This article belongs to the Section Transportation and Future Mobility)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript describes research conducted to devise an analytical framework utilizing machine learning techniques to assess and interpret crash frequency data characterized by a high incidence of zero occurrences.

The mathematical rationale behind the construction of the procedure seems to be adequate. However, the research approach has two important flaws.

First, zero-inflated models for accident prediction account for the excess of zeros in count data by assuming that the random process of accident occurrence can be split into two states: one where accidents have no chance of occurring, and another where accidents occur according to a standard count distribution. However, there are several reasons why zero-inflated models may not be appropriate for predicting accident frequency. Zero-inflated models assume that a portion of the network analyzed is safe because accidents are highly unlikely, which is not a realistic assumption unless no vehicle drives through that segment. All road segments have some level of risk, and categorizing them into completely safe and unsafe might oversimplify the actual risk factors involved. See:

https://doi.org/10.1016/J.AAP.2004.02.004.

In addition, there are overfitting issues. Zero-inflated models can sometimes provide a statistically superior fit to data but may not generalize well to new or unseen data. This happens because these models might overfit the noise in the data, especially in complex datasets with high variability in accident occurrences. See:

https://doi.org/10.1016/j.aap.2009.07.012.

Second, the sections considered are very short, with a mean length of only 130 m. While a rather short length such as 160 m – 300 m is adequate to examine the effect of geometric design variables, a substantial proportion of the sample used by the authors seems extremely short. That is the most likely reason why there are so many zero counts.

Furthermore, validating the proposed two-stage model only by comparing its results against traditional methods such as the Poisson Hurdle Model and the Negative Binomial Hurdle Model is not enough. The authors should have validated the results by dividing the sample into at least two subsamples (training and test) in order to verify that no overfitting occurs.

Although the authors claim that the model is interpretable and transparent, there is no information about the contribution of each variable to the occurrence of accidents. Therefore, whoever does the analysis will not be able to understand or make decisions on how to improve the network.

In my view, for the manuscript to be published, the sample must be arranged so that no segment shorter than 150 m is included, the use of zero-inflated models should be discarded, and the final model should be interpretable to the extent that some metric shows the effect of each of the variables considered.

Author Response

Comments 1: First, zero-inflated models for accident prediction account for the excess of zeros in count data by assuming that the random process of accident occurrence can be split into two states: one where accidents have no chance of occurring, and another where accidents occur according to a standard count distribution. However, there are several reasons why zero-inflated models may not be appropriate for predicting accident frequency. Zero-inflated models assume that a portion of the network analyzed is safe because accidents are highly unlikely, which is not a realistic assumption unless no vehicle drives through that segment. All road segments have some level of risk, and categorizing them into completely safe and unsafe might oversimplify the actual risk factors involved. See:

https://doi.org/10.1016/J.AAP.2004.02.004.

https://doi.org/10.1016/j.aap.2009.07.012.

Response 1: Thank you for your thoughtful feedback regarding the use of zero-inflated models and the critiques raised by previous research. We appreciate the opportunity to clarify our approach and how it addresses these concerns.

Our study employs a machine learning-driven hurdle model, which offers a novel perspective that differs from traditional dual-state models. Unlike models that rely on strict assumptions about the data-generating process, our approach is data-driven and adaptable, allowing it to capture more complex, non-linear relationships inherent in crash frequency data. This flexibility enables the model to account for a wider range of explanatory variables and address challenges such as over-dispersion and class imbalance through tailored loss functions.
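For readers comparing the two model families discussed in this exchange, the standard textbook forms can be written side by side. The notation below ($\pi$, $\pi_0$, $\lambda$, $f$) is assumed for this illustration and is not taken from the manuscript; $f(\cdot;\lambda)$ stands for a base count distribution such as the Poisson.

```latex
% Zero-inflated (dual-state) model: a zero comes either from a latent
% "always zero" state (probability \pi) or from the count distribution.
\begin{align}
P_{\mathrm{ZI}}(Y = 0) &= \pi + (1 - \pi)\, f(0;\lambda), \\
P_{\mathrm{ZI}}(Y = k) &= (1 - \pi)\, f(k;\lambda), \qquad k = 1, 2, \dots
\end{align}

% Hurdle model: one binary stage decides zero vs. positive; positive
% counts then follow a zero-truncated count distribution.
\begin{align}
P_{\mathrm{H}}(Y = 0) &= \pi_0, \\
P_{\mathrm{H}}(Y = k) &= (1 - \pi_0)\, \frac{f(k;\lambda)}{1 - f(0;\lambda)}, \qquad k = 1, 2, \dots
\end{align}
```

In the hurdle form every zero is produced by the single binary stage, so the model does not posit a latent always-zero state. According to the responses in this record, the authors fit the two stages (classification and regression) with machine learning models rather than with the parametric forms shown here.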

Furthermore, the use of machine learning offers significant advantages in terms of both model accuracy and interpretability. Recent advancements, such as SHAP (Shapley Additive exPlanations), allow for clear insights into the contribution of individual variables, addressing concerns about model transparency. Additionally, our rigorous cross-validation process ensures that the model is robust and generalizable across different scenarios, reducing the risk of overfitting.

While we acknowledge that some critiques are well-founded in the context of traditional dual-state models, we believe that the integration of machine learning techniques provides a promising alternative. By leveraging machine learning's strengths, our research aims to develop a more accurate and interpretable model for crash frequency analysis, ultimately contributing to more effective road safety interventions.

We hope this clarification addresses the concerns raised and provides a clearer understanding of how machine learning enhances our approach.

Comments 2: Second, the sections considered are very short, with a mean length of only 130 m. While a rather short length such as 160 m – 300 m is adequate to examine the effect of geometric design variables, a substantial proportion of the sample used by the authors seems extremely short. That is the most likely reason why there are so many zero counts.

Response 2: Thank you for highlighting the issue regarding the section lengths. We acknowledge that the mean segment length in our study is relatively short, at 130 meters on average. This segmentation was predefined in the available dataset, and shorter segments were included to capture the high-resolution variability of road geometry and crash occurrence, which is particularly important in areas like intersections, sharp curves, or other critical points where crashes are more likely to occur. We recognize that shorter sections may contribute to a higher number of zero counts, as these segments often have lower traffic volumes or minimal geometric variation. However, this is precisely why the hurdle model was employed, as it effectively handles the excess zero counts while still modeling the crash frequency on segments where crashes are more likely to occur. Additionally, shorter segments are crucial for studies focusing on localized design features and their impact on crash risk. While longer segment lengths (160 m–300 m) may be appropriate in other contexts, we believe that including shorter sections provides a more detailed understanding of road safety risks in this study.

It is important to note that this is not a case study but a validation of the proposed model. In future research, we plan to test the model with datasets containing longer segments to further assess its generalizability and robustness.

Comments 3: Furthermore, validating the proposed two-stage model only by comparing its results against traditional methods such as the Poisson Hurdle Model and the Negative Binomial Hurdle Model is not enough. The authors should have validated the results by dividing the sample into at least two subsamples (training and test) in order to verify that no overfitting occurs.

Response 3: Thank you for your valuable comment regarding the validation process. We would like to clarify that the sub-sample testing approach (training and testing data) was followed for each sub-stage of the model, both the classification stage and the regression stage. This ensured that overfitting was minimized and the model’s performance was validated at these critical stages.

We believe this approach provides a robust validation framework, ensuring that both components of the model generalize well to unseen data.
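A minimal sketch of what a per-stage train/test split can look like for a two-stage hurdle setup is given below. It is an illustration under stated assumptions, not the authors' code: the 25% test fraction, the seed, the function names `split_indices` and `stage_subsets`, and the toy crash counts are all hypothetical, and the record does not state the split ratio actually used.

```python
import random


def split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into train and test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def stage_subsets(counts, indices):
    """Build the targets for both hurdle stages from one set of indices.

    Stage 1 (classification): 1 if the segment has any crash, else 0.
    Stage 2 (regression): crash counts, restricted to segments with
    at least one crash.
    """
    stage1 = [(i, 1 if counts[i] > 0 else 0) for i in indices]
    stage2 = [(i, counts[i]) for i in indices if counts[i] > 0]
    return stage1, stage2


# Toy crash counts per segment (made-up values for illustration only).
counts = [0, 0, 1, 0, 3, 0, 0, 2, 0, 0, 0, 1, 0, 0, 5, 0, 0, 0, 1, 0]

# Split once, then derive both stages from the same split so that no
# test segment leaks into either stage's training data.
train_idx, test_idx = split_indices(len(counts), test_fraction=0.25, seed=42)
train_s1, train_s2 = stage_subsets(counts, train_idx)
test_s1, test_s2 = stage_subsets(counts, test_idx)
```

Deriving both stages from one split is a design choice made for this sketch: it keeps the held-out segments unseen by the classifier and the regressor alike.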

Comments 4: Although the authors claim that the model is interpretable and transparent, there is no information about the contribution of each variable to the occurrence of accidents. Therefore, whoever does the analysis will not be able to understand or make decisions on how to improve the network.

Response 4: Thank you for your feedback regarding the model's interpretability. While the model does not provide traditional numerical metrics to quantify the contribution of each variable, we have used SHapley Additive exPlanations (SHAP) and Partial Dependence Plots (PDP) to provide insights into how each variable influences crash occurrence. PDP plots, in particular, offer a useful tool to help readers understand how changes in specific variables affect crash risks, which can guide informed decisions about road safety interventions.

In future work, we plan to incorporate additional techniques to further enhance the interpretability of our model, such as Individual Conditional Expectation (ICE) plots and Accumulated Local Effects (ALE) plots, which can provide more granular insights alongside SHAP. These methods will improve transparency and help stakeholders better understand the underlying factors driving crash risks.
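To make the idea behind a partial dependence plot concrete, the following sketch computes a one-feature partial dependence curve by hand. It assumes a generic `predict` callable; the toy model, the feature names `aadt_thousands` and `curve_radius_km`, and all values are hypothetical stand-ins and do not come from the manuscript.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value.

    For every grid value, the feature is overwritten in all rows while
    the other features keep their observed values.
    """
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)
            modified[feature] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_predict(row):
    # Hypothetical stand-in for a fitted crash model: risk rises with
    # traffic volume and falls as the curve radius grows.
    return 0.01 * row["aadt_thousands"] + 0.5 / row["curve_radius_km"]


rows = [
    {"aadt_thousands": 10.0, "curve_radius_km": 0.5},
    {"aadt_thousands": 20.0, "curve_radius_km": 1.0},
    {"aadt_thousands": 30.0, "curve_radius_km": 2.0},
]
grid = [0.25, 0.5, 1.0, 2.0]
pd_curve = partial_dependence(toy_predict, rows, "curve_radius_km", grid)
```

Plotting `pd_curve` against `grid` gives the PDP for that feature. An ICE plot, mentioned above as future work, would keep the per-row predictions instead of averaging them.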

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

Abstract: The abstract is well written.

Introduction: The introduction mentions the limitations of traditional models but could benefit from a clearer explanation of how the proposed model addresses these limitations in a novel way.

Literature Review: The literature review needs to be expanded and organized by categorizing the studies into different themes, such as traditional models, machine learning approaches, and hybrid models. Recent references need to be added to ensure the literature review is up-to-date.

Proposed Methodology: The methodology is well written, though perhaps too detailed, since it occupies more than 6 pages.

Empirical Assessment using Real World Data:

How did you determine the boundaries of road sections? (Lines 431 to 433: “Following the necessary data preprocessing, which included the removal of redundant and erroneous entries, the finalized dataset consists of 8,636 road segments and encompasses a total of 3,182 crashes recorded over the specified period.”)

In Table 1, the minimum section length is 50 m. Is this too small a section for analysis?

In Table 2, the number of traffic lanes ranges from 1 to 4. That means that different models need to be developed for each type of road: one-lane road, two-lane road, and multi-lane road. For example, on one-lane and two-lane roads "Median Width" does not influence traffic accidents, since a median does not exist on these roads. Or, for example, on motorways with 4 lanes the curve radius does not influence the occurrence of traffic accidents, since curve radii are large on motorways. This means that a list of influencing factors must be developed for each type of road.

Figures 7 and 8 show only the significance of the influencing factors on crash frequency, which is nothing new. The authors did not develop their own traffic accident prediction model. Traditional crash prediction models (Poisson and Negative Binomial models) usually result in an equation that can predict the number of traffic accidents per road section as a function of influencing parameters. This is not the case here. So your approach is not comparable to existing models. The only result of this paper is a list of the influencing factors on traffic crash frequency, which is nothing new.

Conclusions: Limitations of the paper are missing.

Author Response

Summary

Thank you very much for taking the time to review this manuscript. Please find the detailed responses below to address all the comments.

Point-by-point response to Comments and Suggestions for Authors

Comments 1: The introduction mentions the limitations of traditional models but could benefit from a clearer explanation of how the proposed model addresses these limitations in a novel way.

Response 1: We appreciate the reviewer's suggestion for clarification in the introduction. In response, we revised the introduction to more clearly highlight how our proposed machine learning-based hurdle model overcomes the limitations of traditional models. Please check lines 77 to 84 in the revised manuscript.

Comments 2: The literature review needs to be expanded and organized by categorizing the studies into different themes, such as traditional models, machine learning approaches, and hybrid models.

Response 2: Thank you for the insightful suggestion. In response, we have reorganized the literature review by creating two distinct subsections:

- Traditional Models: This section discusses the use of Poisson, Negative Binomial, and other count-based models traditionally employed in crash frequency analysis.
- Machine Learning Approaches: This section highlights the more recent application of machine learning methods, which have gained traction in addressing the limitations of traditional models.

Although we did not add a third subsection on hybrid models, the restructuring provides clearer distinctions between conventional statistical models and modern machine learning techniques.

Comments 3: How did you determine the boundaries of road sections?

Response 3: We would like to clarify that we did not determine the section lengths ourselves, as we did not have access to the original, granular data. The dataset we obtained from the government research institute contained pre-defined road segments after initial analysis. Our preprocessing focused on refining this dataset by removing incomplete records, redundant data, and outliers. According to the description provided by the data source, the segmentation was based on key factors such as geometric consistency, curvature, slope, and traffic volume. This segmentation ensures that each section represents relatively homogeneous conditions.

Comments 4: In Table 1, the minimum section length is 50 m. Is this too small a section for analysis?

Response 4: We appreciate the reviewer’s observation regarding the minimum section length of 50 meters in Table 1. While 50 meters may appear short for certain types of analysis, this segment length was predefined in the dataset provided to us, which reflects changes in key road characteristics such as geometry, traffic volume, and other contextual factors. These shorter segments are essential in capturing high-resolution data, particularly in areas where sudden changes in road design or conditions, such as intersections or curves, may influence crash risk. It is important to note that similar segment lengths have been employed in other studies focusing on detailed geometric features and localized risk factors. Moreover, our two-stage hurdle model is designed to handle zero-inflation, which often occurs in shorter segments. However, we recognize that segment length could influence model performance; we therefore mention it among the limitations and future work in the conclusion section. Please refer to line 1068 in the revised manuscript.
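The link between segment length and zero counts raised in this exchange can be illustrated with back-of-envelope arithmetic. The sketch below uses the dataset totals quoted earlier in this record (8,636 segments, 3,182 crashes, mean length 130 m) and assumes, purely for illustration, a homogeneous Poisson process in which the expected crash count scales linearly with segment length. Neither assumption is made in the manuscript, and real crash data are typically overdispersed, so observed zero shares can be higher than these figures.

```python
import math

# Totals quoted in this review record.
segments = 8636
crashes = 3182
mean_length_m = 130.0

# Average number of crashes per segment over the study period.
mean_crashes_per_segment = crashes / segments


def poisson_zero_share(mean_count):
    """Probability of observing zero events under a Poisson distribution."""
    return math.exp(-mean_count)


def zero_share_at_length(length_m):
    """Expected zero share if the mean count scaled linearly with length.

    Linear scaling with length is an illustrative assumption only.
    """
    scaled_mean = mean_crashes_per_segment * (length_m / mean_length_m)
    return poisson_zero_share(scaled_mean)


zero_share_130 = zero_share_at_length(130.0)  # at the reported mean length
zero_share_260 = zero_share_at_length(260.0)  # hypothetical doubled length
```

Under these simplifying assumptions, roughly 69% of segments would record no crash at the reported mean length, falling to about 48% if segments were twice as long.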

Comments 5: In Table 2, the number of traffic lanes ranges from 1 to 4. That means that different models need to be developed for each type of road: one-lane road, two-lane road, and multi-lane road. For example, on one-lane and two-lane roads "Median Width" does not influence traffic accidents, since a median does not exist on these roads. Or, for example, on motorways with 4 lanes the curve radius does not influence the occurrence of traffic accidents, since curve radii are large on motorways. This means that a list of influencing factors must be developed for each type of road.

Response 5: We appreciate the reviewer’s observation regarding the need for different models based on the number of traffic lanes. As our approach is driven by machine learning, it allows the model to learn patterns directly from the data without needing to predefine different models for each road type. The data used in this study was intended to demonstrate the versatility and capabilities of our model across various road conditions, including different lane configurations.

While we did not explicitly separate the analysis by road type in this demonstration, our machine learning framework is flexible enough to adapt to such distinctions in future studies. Depending on the case study and the specific objectives, separating road types and tailoring the analysis to different roadway characteristics could indeed enhance the precision of the model.

Comments 6: Figures 7 and 8 show only the significance of the influencing factors on crash frequency, which is nothing new. The authors did not develop their own traffic accident prediction model. Traditional crash prediction models (Poisson and Negative Binomial models) usually result in an equation that can predict the number of traffic accidents per road section as a function of influencing parameters. This is not the case here. So your approach is not comparable to existing models. The only result of this paper is a list of the influencing factors on traffic crash frequency, which is nothing new.

Response 6: We appreciate the reviewer’s feedback regarding Figures 7 and 8 and the comparison to traditional crash prediction models. We acknowledge that traditional models, such as Poisson and Negative Binomial, typically result in explicit mathematical equations that predict crash counts based on influencing factors. However, our approach, leveraging machine learning techniques, takes a different, data-driven path to achieve the same goal.

Rather than producing a single predictive equation, our model captures complex, non-linear relationships between crash frequency and influencing factors that are often not possible to represent in a simple equation. Machine learning models like CatBoost provide more flexible and adaptive predictions by learning patterns directly from the data, allowing for greater accuracy when dealing with high-dimensional and highly variable crash data. The SHAP analysis presented in Figures 7 and 8, while focusing on the importance of influencing factors, goes beyond simple parameter estimates to provide interpretable insights into how these factors contribute to crash frequency in a non-linear manner, which is a key advantage over traditional methods.
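As background for the SHAP analysis mentioned here, the sketch below computes exact Shapley values for a tiny model by averaging marginal contributions over all feature orderings, with absent features set to a single baseline. It illustrates the underlying definition only: the SHAP library relies on faster model-specific and sampling-based estimators, and the toy model, feature names, and values are hypothetical rather than taken from the manuscript.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values via all feature orderings.

    Features start at their baseline values and are switched to the
    instance values one at a time; each switch's change in prediction
    is credited to that feature and averaged over all orderings.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: totals[name] / len(orderings) for name in names}


def toy_predict(x):
    # Hypothetical model with an additive term and an interaction term.
    return 0.2 * x["lanes"] + 0.1 * x["aadt_thousands"] * x["curvature"]


instance = {"lanes": 2.0, "aadt_thousands": 10.0, "curvature": 0.5}
baseline = {"lanes": 1.0, "aadt_thousands": 5.0, "curvature": 0.0}
phi = shapley_values(toy_predict, instance, baseline)
```

The values in `phi` sum to the difference between the prediction for `instance` and the prediction for `baseline`, which is the additivity property that makes per-variable contributions comparable. The cost grows factorially with the number of features, so this brute-force form is only practical for toy cases.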

While this paper does not provide a direct comparison to traditional equation-based models, we believe the flexibility and accuracy of our approach demonstrate its strength in complex datasets. That said, we acknowledge that some readers may prefer explicit predictive formulas. We will also consider incorporating additional comparisons with traditional models to highlight the practical advantages of our approach in future work.

Comments 7: Limitations of the paper are missing.

Response 7: Thank you for the insightful comment regarding the absence of limitations in the manuscript. To address this, we have included a paragraph in the conclusion section (lines 1068 to 1083 in the revised manuscript) that outlines several key limitations. Specifically, we discuss the dependency of the model's performance on the South Korean dataset, which may limit its generalizability to other regions with different road characteristics. We also acknowledge the potential impact of the 50-meter road segment lengths on the high incidence of zero crashes, as well as the need to account for temporal dynamics, infrastructure changes, and evolving traffic regulations. Additionally, we highlight the possibility of biases in the crash data, such as underreporting, which could affect model fairness. These limitations provide a balanced perspective on the study’s findings and guide future research directions.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

The perspective the authors choose for investigating the road safety issue is valuable and innovative. It deals with a recurrent problem in the crash analysis. Even if currently, several research manuscripts are dealing with this topic, the proposed one finds a specific spot in this research field.

ABSTRACT + INTRODUCTION + Literature Review

The abstract summarizes all the key points of the research well. It can be improved in style. Be careful with typos.

The introduction section covers in detail all the main aspects and clearly highlights the research question and goals. The literature review follows all the stages covered in the introduction, citing great works about the topic.

Some minor suggestions are listed below:

- Try to avoid typos. From the abstract onward there are several typos, such as cut words (em-ployes) and similar issues.

- L 29-36 try to rephrase these sentences because they are so common in literature about this topic and they are really close to others I have recently read. The concepts are crucial, just rephrase the sentences to give more subjectivity to this introduction.

- L 52 The Highway Safety Manual can be cited, being a milestone in road safety analysis. It can also be considered in the literature review as a benchmark.

Methodology

The proposed mathematics is accurate and well explained.

Some minor suggestions, as follows:

- Improve the quality and readability of Figure 1

Empirical assessment

- Try to justify more specifically why 5 years of crashes can be sufficient for your models.

- L 419-430 Explain the sample size at the beginning, along with the location of data collection. I suggest adding a figure with a map where the investigated roads are highlighted. Otherwise it is difficult to grasp this important part.

- The road typologies (urban-rural, divided-undivided, ...) and features must be highlighted to understand the type of dataset available. The tables are not enough at this moment.

- Try to improve quality and readability of Figure 3 and to comment some crucial steps of the figure in the text. It can help readers.

- Results and discussion must be a different chapter of the manuscript. They cannot be embedded as a subparagraph.

- Try to enrich some explanations about the results with one-to-one comparisons between your results and existing results in the cited literature.

Comments on the Quality of English Language

It is quite good.

Author Response

Summary

Thank you very much for taking the time to review this manuscript. Please find the detailed responses below to address all the comments.

Point-by-point response to Comments and Suggestions for Authors

Comments 1: Try to avoid typos. From the abstract onward there are several typos, such as cut words (em-ployes) and similar issues.

Response 1: We appreciate the reviewer’s attention to detail regarding the presence of typos. We have thoroughly reviewed the manuscript from the abstract onward to correct any cut words (e.g., "employes", “implications”) or other similar issues.

Comments 2: L 29-36 try to rephrase these sentences because they are so common in literature about this topic and they are really close to others I have recently read. The concepts are crucial, just rephrase the sentences to give more subjectivity to this introduction.

Response 2: Thank you for the valuable feedback regarding the phrasing of the sentences in lines 29-36. We have rephrased this section to ensure the language is more unique and subjective while retaining the crucial concepts. Please refer to the revised manuscript for the updated text (line 29 to 37).

Comments 3: L52 The Highway Safety Manual can be cited, being a milestone in road safety analysis. It can also be considered in the literature review as a benchmark.

Response 3: Thank you for the suggestion to cite the Highway Safety Manual. We have cited it in line 56 of the revised manuscript.

Comments 4: Improve the quality and readability of Figure 1

Response 4: Thank you for your comment regarding the quality of Figure 1. In response, we have replaced the original figure with a higher-resolution version to enhance clarity and readability. Please refer to the updated figure in the revised manuscript.

Comments 5: L 419-430 Explain the sample size at the beginning, along with the location of data collection. I suggest adding a figure with a map where the investigated roads are highlighted. Otherwise it is difficult to grasp this important part.

Response 5: Thank you for your thoughtful suggestion. We would like to clarify that the sample size is already explained in the original manuscript. Please check from line 441 to 449 in the revised manuscript. As for the data location, since the purpose of the data used in this study is to validate the proposed model rather than conduct a case study, the data is not tied to a specific geographic location. The data we obtained from the source only includes anonymized segment IDs without specific geographic information. Therefore, it is not possible to provide location details or include a map in the current study. Nonetheless, we greatly appreciate your suggestion and will certainly consider it for future studies that involve specific case locations.

Comments 6: The road typologies (urban-rural, divided-undivided, ...) and features must be highlighted to understand the type of dataset available. The tables are not enough at this moment.

Response 6: Thank you for your suggestion. The road typologies were briefly explained in lines 454 to 467 of the revised manuscript. Since this study is not focused on a specific case study but rather on model validation, we chose not to delve into these details extensively. However, we believe the current explanation provides sufficient context for understanding the dataset in the context of the study's goals. We appreciate your feedback and will consider expanding this section in future work if more detailed road typology information becomes relevant.

Comments 7: Try to improve quality and readability of Figure 3 and to comment some crucial steps of the figure in the text.

Response 7: Thank you for your feedback. In response, we have replaced Figure 3 with a higher-resolution and larger version to improve its quality and readability. Additionally, we have commented on some of the crucial steps illustrated in the figure within the text to provide further clarity. Please refer to the revised manuscript for these updates.

Comments 8: Results and discussion must be a different chapter of the manuscript. They cannot be embedded as a subparagraph.

Response 8: Thank you for your comment. We acknowledge the suggestion to separate the Results and Discussion into different chapters. We have placed the Discussion chapter before the Conclusions, from line 976 to line 1030.

Author Response File: Author Response.docx

Reviewer 4 Report

Comments and Suggestions for Authors

Dear Authors,

The research findings are of a high standard. Nevertheless, certain elements of the paper require further refinement.

In the introduction, the authors present the problem statement and the background of the research in an effective manner. Furthermore, the authors delineated the research aims and objectives of the paper. Nevertheless, the authors could provide greater precision by including the specific type and algorithm of machine learning to be employed in the analysis.

The literature review is well structured and provides a comprehensive overview of the subject matter. Nevertheless, a concluding section in which the authors discuss the relationship between these LT and the research would be beneficial.

Section 3 should include a detailed account of the steps taken by the authors to perform the analysis. Nevertheless, it would be beneficial to provide a correlation between the figure that illustrates the framework and the corresponding sub-section presented in the text. Indeed, the figure could illustrate the steps involved in the analysis, with the text subsequently delineating these steps in a systematic manner.

The purpose of Section 4 is unclear. If it is intended to illustrate the implications of the framework, it should be incorporated into the discussion section. Alternatively, if the section is intended to describe the data, it should be included in the results and defined as a case study. This section requires further definition or introduction to enhance comprehension of the paper.

The results section is well presented; however, the authors could provide additional analysis of the results. Additionally, the results section is primarily focused on the performance of the algorithm but lacks information on the results related to the data. It is advisable to provide more details and analysis of the case study, thereby enhancing the robustness and interest of the research for the readers.

Furthermore, the conclusion should present some future research directions to enhance the results or address the limitations.

Best regards;

Comments on the Quality of English Language

Upon reviewing the manuscript, moderate editing of the English language is necessary throughout the paper to enhance readability and coherence. Addressing these linguistic aspects will significantly contribute to the manuscript's quality and readability.

Author Response

Summary

We would like to express our sincere gratitude to the reviewer for the insightful comments and constructive feedback, which have greatly contributed to improving the quality and clarity of our manuscript. We have carefully addressed each of the comments, making specific revisions throughout the manuscript to enhance the precision, organization, and comprehensibility of our study. All revisions have been highlighted in yellow in the revised manuscript. Please find our detailed responses below.

Point-by-point response to Comments and Suggestions for Authors

Comments 1: In the introduction, the authors present the problem statement and the background of the research in an effective manner. Furthermore, the authors delineated the research aims and objectives of the paper. Nevertheless, the authors could provide greater precision by including the specific type and algorithm of machine learning to be employed in the analysis.

Response 1: We appreciate the reviewer’s positive feedback on the introduction and the suggestion to enhance precision by specifying the type of machine learning algorithm used. In response, we have updated the objectives section in the introduction to explicitly mention the use of the CatBoost algorithm. This adjustment highlights CatBoost’s suitability for handling categorical data and capturing complex, non-linear relationships, which are critical for accurately modeling zero-inflated crash frequency data. We believe this addition clarifies the machine learning approach employed in our study and aligns with the reviewer’s recommendation.

Please refer to the text from Line 86 to Line 93 in the revised manuscript.

Comments 2: The literature review is well structured and provides a comprehensive overview of the subject matter. Nevertheless, a concluding section in which the authors discuss the relationship between these LT and the research would be beneficial.

Response 2: We appreciate the reviewer’s positive feedback on the structure and comprehensiveness of the literature review. In response to the suggestion, we have added a concluding subsection titled "Addressing Limitations in Crash Analysis Models" to enhance the connection between existing research and our study. This new subsection synthesizes the limitations of traditional models in handling zero-inflated and over-dispersed crash data, as well as the interpretability challenges of many machine learning methods. It also outlines how our proposed CatBoost-based hurdle model addresses these challenges by combining flexibility in modeling complex relationships with enhanced interpretability through SHapley Additive exPlanations (SHAP).

Please refer to the text from Line 155 to Line 180 in the revised manuscript.

Comments 3: Section 3 should include a detailed account of the steps taken by the authors to perform the analysis. Nevertheless, it would be beneficial to provide a correlation between the figure that illustrates the framework and the corresponding sub-section presented in the text. Indeed, the figure could illustrate the steps involved in the analysis, with the text subsequently delineating these steps in a systematic manner.

Response 3: Thank you for your insightful suggestion to clarify the steps in our analysis and correlate them with the figure illustrating the framework. In response, we have added a detailed, step-by-step breakdown at the end of Section 3.1 that systematically describes each phase of the methodology as shown in Figure 1. The updated description now includes:

Data Preprocessing – Cleans and prepares the dataset for further analysis.
Data Transformation – Separates the data into binary and non-zero subsets tailored for classification and regression.
Binary Classification Stage – Predicts the likelihood of crash occurrence for each road segment.
Crash Frequency Prediction (Regression Stage) – Estimates crash frequency for segments with predicted crashes.
Model Evaluation – Assesses model performance using task-specific metrics for classification and regression.
Model Interpretability – Provides insights into feature contributions to inform road safety interventions.

We believe these additions provide readers with a clear, structured guide to each analytical step, improving the comprehensiveness and clarity of Section 3 and aligning each step with Figure 1 as suggested.

Please refer to the text from Line 273 to Line 371 in the revised manuscript.
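
The data transformation and two-stage prediction listed above can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' implementation: the function names `hurdle_split` and `hurdle_predict`, the 0.5 threshold, and the toy values are hypothetical, and the stage outputs are passed in as plain numbers rather than produced by CatBoost models.

```python
from typing import List, Sequence, Tuple


def hurdle_split(
    features: Sequence[Sequence[float]], counts: Sequence[int]
) -> Tuple[List[int], List[Sequence[float]], List[int]]:
    """Data transformation step: build the two training subsets.

    Returns binary labels (crash / no crash) for every segment, plus the
    features and counts of the segments with at least one crash.
    """
    binary_labels = [1 if c > 0 else 0 for c in counts]
    positive = [(x, c) for x, c in zip(features, counts) if c > 0]
    positive_features = [x for x, _ in positive]
    positive_counts = [c for _, c in positive]
    return binary_labels, positive_features, positive_counts


def hurdle_predict(
    p_crash: Sequence[float],
    positive_estimate: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the manuscript
) -> List[float]:
    """Combine both stages: use the regression estimate only where the
    classifier predicts a crash, and predict zero otherwise."""
    return [
        est if p >= threshold else 0.0
        for p, est in zip(p_crash, positive_estimate)
    ]


# Hypothetical toy data: one feature per road segment.
features = [[0.1], [0.5], [0.9], [0.3]]
counts = [0, 2, 0, 1]
labels, pos_features, pos_counts = hurdle_split(features, counts)
combined = hurdle_predict([0.2, 0.8, 0.5], [3.0, 2.0, 1.5])
```

In this sketch the classifier is trained on `labels` for all segments, while the regressor sees only `pos_features` and `pos_counts`, which is what separates a hurdle structure from a single count model.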

Comments 4: The purpose of Section 4 is unclear. If it is intended to illustrate the implications of the framework, it should be incorporated into the discussion section. Alternatively, if the section is intended to describe the data, it should be included in the results and defined as a case study. This section requires further definition or introduction to enhance comprehension of the paper. The results section is well presented; however, the authors could provide additional analysis of the results. Additionally, the results section is primarily focused on the performance of the algorithm but lacks information on the results related to the data. It is advisable to provide more details and analysis of the case study, thereby enhancing the robustness and interest of the research for the readers.

Response 4: We appreciate the reviewer’s feedback on clarifying the purpose of Section 4. As indicated by the section title, the primary purpose of Section 4 is to empirically assess the proposed model using real-world data, validating its effectiveness in handling crash frequency data. This section is intended as a demonstration of the model’s capability to address the challenges associated with zero-inflated and over-dispersed crash data, rather than providing an in-depth analysis of the data itself. A comprehensive case study and further data-driven insights will be explored in future work, once additional data has been collected to support a broader investigation.

To enhance clarity, we have revised the introduction to Section 4 to better communicate its purpose as a validation of the model in a practical context. The revised introduction emphasizes that this section aims to showcase the model's performance and adaptability, with a focus on demonstrating its accuracy and robustness rather than detailed analysis of specific data insights.

We hope this adjustment clarifies the intent of Section 4 and aligns with the reviewer’s expectations for comprehensibility and purpose within the manuscript.

Please refer to the text from Line 555 to Line 569 in the revised manuscript.

Comments 5: Furthermore, the conclusion should present some future research directions to enhance the results or address the limitations.

Response 5: Thank you for your valuable feedback on the Conclusion section. In response, we have expanded the Conclusion to include future research directions aimed at addressing the model’s limitations and enhancing its application. These additions provide a clear pathway for extending the model's applicability and robustness, addressing both current limitations and potential enhancements.

We believe these changes clarify the trajectory for future research and enhance the paper’s relevance to the field of crash frequency modeling.

Please refer to the text from Line 1278 to Line 1300 in the revised manuscript.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have conscientiously written responses to the problems identified by this reviewer. However, these problems were not matters of clarity in the manuscript; rather, as the reviewer's template indicates, they concern a lack of scientific soundness and missing controls in the experiments. The authors have not even attempted to acknowledge the limitations of the research but have given recalcitrant rebuttals instead.

Regardless of the adjusting method, a Zero-inflated model assumes that certain network segments are completely safe, which is not a realistic assumption unless these segments have no traffic. All road segments have some level of risk, and categorizing some of them as completely safe oversimplifies the actual risk factors involved.
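
For reference, the assumption at issue can be written out for the Poisson case. This is a textbook sketch rather than the manuscript's formulation: the symbols $\pi$, $\pi_0$, and $\lambda$ are assumed here for illustration, and the manuscript replaces the parametric components with CatBoost stages.

```latex
% Zero-inflated Poisson: a segment is in a "structural zero" state with
% probability \pi, otherwise counts follow Poisson(\lambda).
\begin{align*}
P(Y = 0) &= \pi + (1 - \pi)\, e^{-\lambda}, \\
P(Y = k) &= (1 - \pi)\, \frac{e^{-\lambda} \lambda^{k}}{k!}, \qquad k \ge 1.
\end{align*}

% Hurdle Poisson: all zeros come from one binary process with probability
% \pi_0, and positive counts follow a zero-truncated Poisson(\lambda).
\begin{align*}
P(Y = 0) &= \pi_0, \\
P(Y = k) &= (1 - \pi_0)\, \frac{e^{-\lambda} \lambda^{k}}{k!\,\bigl(1 - e^{-\lambda}\bigr)}, \qquad k \ge 1.
\end{align*}
```

In the zero-inflated form, zeros arise from two sources (the structural state and the count process), which is where the notion of an inherently safe segment enters. In the hurdle form, every zero comes from the single binary stage.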

The problem of segments that are too short for safety analysis has also not been solved, nor has the network screening been adequately justified.

The authors did not validate the results by dividing the sample into at least two subsamples (training and test) to verify that no overfitting occurs. This measure would have mitigated the abovementioned issue of the Zero-inflated models.

The authors have failed to prove the interpretability and transparency of the model since there is still no information about the contribution of each variable to the occurrence of accidents. As a result, decisions on how to improve the network cannot be based on the results of the analysis.
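
To make concrete what a per-variable contribution looks like, the sketch below computes exact Shapley values for a toy function by averaging marginal contributions over all feature orderings. It is an illustration under assumptions only: the function `shapley_values`, the toy model, and the all-zero baseline are hypothetical, and this brute-force calculation is not the SHAP library's tree-based algorithm nor the authors' analysis.

```python
from itertools import permutations
from typing import Callable, List, Sequence


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values of `model` at `x` relative to `baseline`.

    Features not yet "switched on" keep their baseline value. The cost is
    factorial in the number of features, so this is only for tiny examples.
    """
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]  # switch feature i to its actual value
            value = model(current)
            totals[i] += value - previous  # marginal contribution of i
            previous = value
    return [t / len(orderings) for t in totals]


def toy_model(v: Sequence[float]) -> float:
    """Hypothetical model: one additive term and one interaction term."""
    return 2.0 * v[0] + v[1] * v[2]


contributions = shapley_values(toy_model, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
```

The contributions sum to the difference between the prediction at `x` and at the baseline, and the interaction term is shared equally between the two features involved.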

Author Response

We sincerely thank you for your continued time and effort in reviewing our manuscript. We acknowledge that our initial responses did not fully address the limitations of our research. We understand the importance of transparently discussing limitations to provide a balanced and honest portrayal of our study's findings and implications. In response to your feedback, we have made significant revisions to the Conclusion section of our manuscript:

Added a limitations paragraph (Lines 1068 to 1095):
- Content: This new section outlines the primary limitations of our study, including data generalizability, segment lengths, validation processes, temporal dynamics, and potential data biases.
- Purpose: It provides a balanced view of our research by acknowledging constraints and sets the stage for future research directions.
Revised the Final Paragraph of the Conclusion (Lines 1096 to 1134):
- Content: This revision includes a continuous prose paragraph detailing future work that directly addresses the limitations you highlighted. It discusses plans to utilize longer road segments, enhance validation techniques, incorporate temporal dynamics, expand data sources, improve model interpretability, address data biases, generalize to diverse regions, and explore advanced machine learning frameworks.
- Purpose: It demonstrates our commitment to addressing the identified limitations and improving the scientific soundness of our model in future studies.

We believe these additions provide a more nuanced and transparent understanding of our research, aligning with your valuable feedback.

Reviewer 2 Report

Comments and Suggestions for Authors

I have no additional requests for manuscript improvement.

Author Response

Thank you very much for taking the time to review our revised manuscript. We appreciate your thorough evaluation and are pleased to hear that the revisions have satisfactorily addressed your previous concerns. We are grateful for your constructive comments.

Reviewer 4 Report

Comments and Suggestions for Authors

Dear Authors,

The authors have implemented the recommended improvements from the previous review. Nevertheless, a few modifications are necessary.

The introduction provides an explanation of the research aim and outlines the methodology employed by the authors to achieve it. Nevertheless, it would be beneficial to provide an overview of the paper's structure, indicating for each section what the readers can expect to find therein.

It is still unclear how the previous suggestions, which related the works defined in the LT with the research, were understood. The section designated as 2.3 did not address the aforementioned statement. Moreover, the authors address the limitations of crash analysis in that section. This is appropriate for the discussion, as it should be included in the final section of the paper. Accordingly, the authors should rephrase this section by incorporating relevant works that address the limitations outlined in the aforementioned section.

Furthermore, the defined steps in the explanation of the proposed methods should be consistent with the nomenclature and names depicted in Figure 1. This will assist the readers in comprehending the methodology. It should be noted that only step 2 is aligned with the figure; the remaining steps lack clear graphical representation.

Figure 3 is crucial in providing readers with a comprehensive understanding of the results and the subsequent analysis. It is advisable to use the same steps and names in the framework explanation as in Figure 1 and Figure 3.

The conclusion section is particularly extensive in terms of content. It would be preferable to relocate some of the conclusions to the discussion section. For example, the limitations could be incorporated into the discussion to serve as a point of reference for future research.

Comments on the Quality of English Language

A review of the manuscript reveals the necessity for minor editing of the English language throughout the paper in order to enhance readability and coherence.

Author Response

Summary

We sincerely appreciate the reviewer’s valuable comments and thoughtful feedback, which have significantly helped us refine and enhance the quality and clarity of our manuscript. We have thoroughly addressed each point and implemented revisions across the manuscript to improve its accuracy and readability. All changes are highlighted in yellow in the revised version. Please see our detailed responses to each comment below.

Point-by-point response to Comments and Suggestions for Authors

Comments 1: The introduction provides an explanation of the research aim and outlines the methodology employed by the authors to achieve it. Nevertheless, it would be beneficial to provide an overview of the paper's structure, indicating for each section what the readers can expect to find therein.

Response 1: We appreciate the reviewer’s suggestion to enhance the Introduction with a structural overview of the paper. To address this, we have added a brief summary at the end of the Introduction, outlining the content of each section. This addition provides readers with a clear roadmap, indicating what they can expect in each part of the manuscript.

Please refer to the text from Line 94 to Line 104 in the revised manuscript.

Comments 2: It is still unclear how the previous suggestions, which related the works defined in the LT with the research, were understood. The section designated as 2.3 did not address the aforementioned statement. Moreover, the authors address the limitations of crash analysis in that section. This is appropriate for the discussion, as it should be included in the final section of the paper. Accordingly, the authors should rephrase this section by incorporating relevant works that address the limitations outlined in the aforementioned section.

Response 2: We appreciate the reviewer’s insight and the opportunity to clarify the intent of Section 2.3. This section was specifically added to address the limitations present in prior literature and to illustrate how our current research aims to overcome these challenges. The limitations discussed in Section 2.3 focus solely on issues associated with traditional methods and models in crash analysis, rather than limitations of our own study. By including this discussion in the literature review, we aim to provide readers with a clear understanding of the research gap and the motivation behind our proposed approach.

We believe that moving this section to the Discussion or Conclusion would shift the focus, as those sections are intended to address the limitations of our current research and provide a critical reflection on our study’s findings. We believe that leaving Section 2.3 in its current place allows for a more logical progression, from identifying challenges in existing work to demonstrating how our methodology addresses these issues. We hope this reasoning is clear and that the purpose of Section 2.3 aligns with the reviewer's expectations.

We have also stated the limitations of this study at the end of the Conclusion, from Line 1225 to Line 1238.

Comments 3: Furthermore, the defined steps in the explanation of the proposed methods should be consistent with the nomenclature and names depicted in Figure 1. This will assist the readers in comprehending the methodology. It should be noted that only step 2 is aligned with the figure; the remaining steps lack clear graphical representation.

Response 3: We appreciate the reviewer’s valuable feedback regarding the need for consistency between the explanation of the proposed methods and the nomenclature used in Figure 1 and Figure 3. We have revised Figures 1 and 3 to ensure that the steps and nomenclature align consistently with the explanation provided in the methodology section. Following this, we also revised the text to reflect the updated terminology and structure used in the figures. These adjustments were made to enhance clarity and help readers more easily comprehend each step of our proposed framework. We believe the methodology and results are now presented in a more coherent and accessible manner.

Please refer to the text from Line 316 to Line 355 in the revised manuscript.

Comments 4: The conclusion section is particularly extensive in terms of content. It would be preferable to relocate some of the conclusions to the discussion section. For example, the limitations could be incorporated into the discussion to serve as a point of reference for future research.

Response 4: Thank you for your valuable feedback on the Conclusion section. In response, we have added a paragraph in the Discussion to address the limitations and potential areas for future research. Additionally, we revised the Conclusion section by removing redundant text to create a more concise and focused summary of our findings. We believe these changes improve the clarity and organization of the manuscript, aligning it more closely with the reviewer’s suggestions.

Please refer to the text from Line 1163 to Line 1185 and the text from Line 1225 to Line 1238 in the revised manuscript.

Author Response File: Author Response.docx

Round 3

Reviewer 1 Report

Comments and Suggestions for Authors

I would like to thank the authors once again for their thoughtful responses to the reviewer's comments. While I sincerely appreciate their efforts, there remain significant concerns regarding the scientific soundness and control of the experiments. The issues raised are not merely matters of clarity but fundamental deficiencies in the research design and methodology. The authors' attempts to make the document worthy of publication fail to address the underlying limitations pointed out in the previous review. The proposed future lines of research must be carried out now if the manuscript is to be published; otherwise, the contribution to science is negligible.

The assumption of completely safe network segments in Zero-inflated models is unrealistic and oversimplifies the actual risk factors involved.

The issue of segments that are too short for safety analysis remains unresolved.

The network screening process requires a more robust justification.

The lack of validation through a training-test split increases the risk of overfitting and undermines the reliability of your results.

The model's interpretability and transparency are essential for informed decision-making. The current lack of information about variable contributions prevents meaningful insights into network improvement strategies.

Author Response

1. Summary

We would like to express our sincere gratitude to the reviewer for the insightful comments and constructive feedback, which have greatly contributed to improving the quality and clarity of our manuscript. We understand the importance of transparently discussing limitations to provide a balanced and honest portrayal of our study's findings and implications. In response to your feedback, we have made revisions to our manuscript: We have carefully addressed each of the comments, making specific revisions throughout the manuscript to enhance the precision, organization, and comprehensibility of our study. All revisions have been highlighted in yellow in the revised manuscript. Please find our detailed responses below.

2. Point-by-point response to Comments and Suggestions for Authors

Comments 1: The assumption of completely safe network segments in Zero-inflated models is unrealistic and oversimplifies the actual risk factors involved. The issue of segments that are too short for safety analysis remains unresolved. The network screening process requires a more robust justification.

Response 1: Thank you for your thoughtful feedback regarding the use of zero-inflated models. Our study employs a machine learning-driven hurdle model, which offers a novel perspective that differs from traditional dual-state models. Unlike models that rely on strict assumptions about the data-generating process, our approach is data-driven and adaptable, allowing it to capture more complex, non-linear relationships inherent in crash frequency data. This flexibility enables the model to account for a wider range of explanatory variables and address challenges such as over-dispersion and class imbalance through tailored loss functions. Furthermore, the use of machine learning offers significant advantages in terms of both model accuracy and interpretability.

Recent advancements, such as SHAP (Shapley Additive exPlanations), allow for clear insights into the contribution of individual variables, addressing concerns about model transparency. Additionally, our rigorous cross-validation process ensures that the model is robust and generalizable across different scenarios, reducing the risk of overfitting. While we acknowledge that some critiques are well-founded in the context of traditional dual-state models, we believe that the integration of machine learning techniques provides a promising alternative. By leveraging machine learning's strengths, our research aims to develop a more accurate and interpretable model for crash frequency analysis, ultimately contributing to more effective road safety interventions. We hope this clarification addresses the concerns raised and provides a clearer understanding of how machine learning enhances our approach.

We appreciate the reviewer’s positive feedback on the introduction and the suggestion to enhance precision by specifying the type of machine learning algorithm used. In response, we mentioned the limitation in the Conclusion.

Please refer to the text from Line 1211 to Line 1268 in the revised manuscript.

Comments 2: The network screening process requires a more robust justification.

Response 2: Thank you for your feedback regarding the model's interpretability. While the model does not provide traditional numerical metrics to quantify the contribution of each variable, we have used SHapley Additive exPlanations (SHAP) and Partial Dependence Plots (PDP) to provide insights into how each variable influences crash occurrence. PDP plots, in particular, offer a useful tool to help readers understand how changes in specific variables affect crash risks, which can guide informed decisions about road safety interventions.

In future work, we plan to incorporate additional techniques to further enhance the interpretability of our model, such as Individual Conditional Expectation (ICE) plots and Accumulated Local Effects (ALE) plots, which can provide more granular insights alongside SHAP. These methods will improve transparency and help stakeholders better understand the underlying factors driving crash risks.

Please refer to the text from Line 273 to Line 371 in the revised manuscript.

Comments 3: The lack of validation through a training-test split increases the risk of overfitting and undermines the reliability of your results.

Response 3: Thank you for your valuable comment regarding the validation process. We would like to clarify that the sub-sample testing approach (training and testing data) was followed for each sub-stage of the model, both the classification stage and the regression stage. This ensured that overfitting was minimized and the model’s performance was validated at these critical stages. We believe this approach provides a robust validation framework, ensuring that both components of the model generalize well to unseen data.

Please refer to the text from Line 555 to Line 569 in the revised manuscript.
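
A training and testing split applied to both stages, as described in Response 3, might look like the following. This is a minimal sketch under assumptions: the function `stratified_split`, the 50% test fraction, the seed, and the toy counts are hypothetical, and stratifying on zero versus non-zero counts is one reasonable choice for zero-inflated data rather than the procedure reported in the manuscript.

```python
import random
from typing import List, Sequence, Tuple


def stratified_split(
    counts: Sequence[int], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx), stratified on zero vs. non-zero counts
    so that both subsets keep roughly the same share of crash segments."""
    rng = random.Random(seed)
    train_idx: List[int] = []
    test_idx: List[int] = []
    for is_positive in (False, True):
        group = [i for i, c in enumerate(counts) if (c > 0) == is_positive]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test_idx.extend(group[:n_test])
        train_idx.extend(group[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical crash counts for ten road segments (mostly zeros).
counts = [0, 0, 0, 0, 0, 0, 0, 0, 2, 1]
train_idx, test_idx = stratified_split(counts, test_fraction=0.5, seed=42)

# Classification stage: every segment in each subset is used.
# Regression stage: only segments with at least one crash are used.
train_positive = [i for i in train_idx if counts[i] > 0]
test_positive = [i for i in test_idx if counts[i] > 0]
```

Splitting once and deriving the regression subsets from the same indices keeps test segments out of both training stages.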

Author Response File: Author Response.docx
