Multi-vehicle (MV) crashes impose substantial costs on society and remain a serious traffic safety issue. A better understanding of crash injury severity can help transportation engineers identify critical contributing factors and develop effective countermeasures. However, few studies have used machine learning methods to predict the injury severity of MV crashes, and few have considered temporal stability in MV crash analysis. To bridge these knowledge gaps, two kinds of models were employed in this research to identify the critical contributing factors and to predict MV crash injury severity: a random parameters logit (RPL) model with heterogeneity in the means and variances, and Random Forest (RF). Three years (2016–2018) of MV crash data from Washington, United States, extracted from the Highway Safety Information System (HSIS), were used for the crash injury-severity analysis. In addition, a series of likelihood ratio tests was conducted to assess temporal stability between different years. Four indicators were employed to measure the prediction performance of the selected models, and four categories of crash-related characteristics were investigated based on the RPL model. The results showed that the machine learning-based models outperformed the statistical models when overall accuracy was the evaluation indicator. However, the statistical models predicted better than the machine learning models when crash costs were considered. Temporal instability was present between the 2016 and 2017 MV data. The effects of significant factors were elaborated based on the RPL model with heterogeneity in the means and variances.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.