Accident Impact Prediction Based on a Deep Convolutional and Recurrent Neural Network Model
Abstract
1. Introduction
- A real-world setup for post-accident impact prediction: This paper proposes an effective setup to use easy-to-obtain, real-world data resources (i.e., datasets on accident events, congestion incidents, weather data, and spatial information) to estimate accident impact on the surrounding area shortly after an accident occurs. Also, this framework employs a data augmentation approach to obtain accurate feature vectors from heterogeneous data sources.
- A data-driven label refinement process: This study introduces a novel feature that captures the impact of an accident on the surrounding traffic flow. The feature combines three factors, namely “severity”, “duration”, and “distance”, into a single measure that we refer to as gamma in this work.
- A cascade model for accident impact prediction: This paper proposes a cascade model that combines LSTM and CNN components to predict post-accident impact in two stages. First, it distinguishes between accident and non-accident events; then it predicts the intensity of impact for accident events.
2. Related Work
2.1. Accident Risk Prediction
2.2. Accident Impact Prediction
3. Dataset
3.1. Data Sources
3.1.1. Accident Dataset
3.1.2. Congestion Dataset
3.1.3. POI Dataset
3.1.4. Weather Condition Dataset
3.2. Data Quality Assurance
3.3. Preprocessing and Preparation
- Remove duplicated records: Since the accident data is collected from two potentially overlapping sources, some accidents might be reported twice. Therefore, we first process the input accident data to remove duplicated cases.
- Fill missing values using K-Nearest Neighbors imputation: The weather data has missing values for some features (e.g., sunrise_sunset). We impute each missing value from the two nearest neighbors, using the mean for numerical features and the mode for categorical features. Neighbors are found using distance in time and location, which leverages the smooth variation of weather across space and time. For cases with higher missingness, more advanced imputation methods could be explored in future work.
- Treat outliers: For a numerical feature f, if $f > \mu_f + 3\sigma_f$ or $f < \mu_f - 3\sigma_f$, then we replace it with $\mu_f + 3\sigma_f$ or $\mu_f - 3\sigma_f$, respectively. Here, $\mu_f$ and $\sigma_f$ are the mean and standard deviation of feature f, respectively, calculated on all weather records. This three-sigma range is a standard statistical rule that caps extreme values while leaving the vast majority of valid data unchanged.
- Omit redundant features: We remove a feature if it satisfies either of the following conditions: (1) it is highly correlated with another feature according to the Pearson correlation; (2) it is categorical and a single value accounts for more than 90% of its records.
- Discretize data: After data cleaning, both accident and congestion data are first discretized in space and time. The temporal resolution is 2 h intervals and the spatial resolution is set to 5 km × 5 km squares in uniform grids.
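The outlier treatment and discretization steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names, the use of the population standard deviation, the flat-earth conversion from degrees to kilometres, and the choice of a reference corner for the grid are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev

# Approximate length of one degree of latitude in km (assumption for this sketch).
KM_PER_DEG_LAT = 111.32


def clip_three_sigma(values):
    """Replace values outside mean +/- 3*std with the nearest bound."""
    mu = mean(values)
    sigma = pstdev(values)  # population std over all records (assumption)
    lower, upper = mu - 3 * sigma, mu + 3 * sigma
    return [min(max(v, lower), upper) for v in values]


def grid_cell(lat, lon, ref_lat, ref_lon, cell_km=5.0):
    """Map a point to the (row, col) of a uniform cell_km x cell_km grid.

    The grid is anchored at a hypothetical reference corner (ref_lat, ref_lon),
    and distances use a simple equirectangular approximation.
    """
    dy_km = (lat - ref_lat) * KM_PER_DEG_LAT
    dx_km = (lon - ref_lon) * KM_PER_DEG_LAT * math.cos(math.radians(ref_lat))
    return (math.floor(dy_km / cell_km), math.floor(dx_km / cell_km))


def time_bin(epoch_seconds, bin_hours=2):
    """Index of the bin_hours-long interval that contains the timestamp."""
    return int(epoch_seconds // (bin_hours * 3600))
```

Events that share the same `(grid_cell, time_bin)` pair would then be aggregated into one spatiotemporal record. A production pipeline would more likely rely on a geospatial library or a geohash scheme than on the approximation used here.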
3.4. Data Augmentation
3.5. Accident Duration Distribution
4. Label Development of Post-Accident Impact
4.1. Accident Impact: A Derived Factor
- Severity: This is a categorical attribute that shows severity in terms of delay in free-flow traffic due to accidents. It is defined by the original data providers based on real-time traffic conditions and reports, where 1 indicates minimal disruption and 4 indicates significant traffic delays. Although it seems to be a highly relevant feature, it suffers from skewness when looking at its distribution (see Figure 5), and is a coarse-grained factor (i.e., it is represented by just a few categorical values).
- Duration: The duration of a traffic accident shows the period from when the accident was first reported until its impact had been cleared from the road network. In this sense, the duration can be considered as another factor to determine impact. Please note that a long duration does not necessarily indicate a significant impact, as it could be related to the type of location and accessibility concerns. However, generally speaking, duration is positively correlated with impact.
- Distance: Distance shows the length of road affected by an accident. As with duration, a longer distance is positively correlated with higher impact. However, a high-impact accident does not necessarily affect a long stretch of road.
4.2. Delay as a Proxy for Impact
4.3. Estimating Delay
- Linear regression (LR) model: Given our input features, a linear model is a natural first choice for estimating the delay, so we use a linear regression model for this purpose.
- Artificial Neural Network (ANN): To examine the effect of non-linearity on estimating the delay, we also used a multi-layer perceptron network with four layers, each with three neurons, and a single neuron in the last layer. We used the Adam optimizer [53] with a learning rate of and trained the model for 200 epochs.
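To make the linear baseline concrete, the sketch below fits an ordinary least squares model and computes MSE and MAE, the two metrics reported for the LR and ANN estimators. It is a minimal pure-Python illustration, not the authors' implementation: it assumes the inputs are severity, duration, and distance, and the sample rows and coefficients are synthetic values invented for the example.

```python
def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]  # [intercept, w_1, ..., w_k]


def predict(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


# Synthetic rows: (severity, duration in minutes, distance in miles).
X_demo = [(1, 30, 0.5), (2, 45, 1.0), (3, 60, 2.5),
          (4, 20, 0.2), (2, 90, 3.0), (1, 15, 1.5)]
# Synthetic target generated from hypothetical coefficients.
TRUE_W = [0.5, 0.2, 0.01, 0.1]
y_demo = [predict(TRUE_W, x) for x in X_demo]

w_demo = fit_linear(X_demo, y_demo)
preds_demo = [predict(w_demo, x) for x in X_demo]
```

Because the synthetic target is exactly linear, the fit recovers the generating coefficients and both error metrics are close to zero; on real delay data the errors would reflect whatever non-linearity the ANN is meant to capture.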
4.4. Scope of Predicted Accident Impact
5. Accident Impact Prediction Methodology
5.1. Basic Concepts
5.1.1. Long Short-Term Memory
5.1.2. Convolutional Neural Network
5.2. Model Input and Output
5.3. Model Development
- (A) Label Prediction: The first model is a binary LSTM classifier that predicts whether the next interval will have an accident (label = 1) or not (label = 0). We adopt a cascade modeling approach instead of a single-step classifier to decompose the task into two more focused subtasks: detecting the occurrence of accidents and then assessing their impact. This decoupled structure allows each model to specialize (Model 1 learns patterns for accident likelihood, while Model 2 focuses on impact-level estimation), leading to improved performance, especially in imbalanced and noisy settings. The setup also provides greater interpretability and flexibility, because the detection and impact stages can be tuned and evaluated separately.
In accident prediction, it is vital that a likely accident is predicted in advance. The first model therefore focuses on detecting accident events, and we weight the accident class more heavily than the non-accident class. Class weights guide the model to compensate for the imbalance in the data, in particular by penalizing missed accidents more heavily than false alarms. There is no fixed theoretical formula for choosing class weights; they are typically treated as hyperparameters that must be tuned empirically. To this end, we conducted a grid search over weight ratios from 1:1 up to 1:8 (non-accident/accident) and selected the combination that yielded the best precision–recall tradeoff on a validation set. The optimum weights were 1 and 3 for the non-accident and accident classes, respectively. A similar process was applied to the second model, where we explored different weighting schemes to balance the three gamma classes (non-impact, medium-impact, and high-impact). The selected weights were those that minimized the misclassification of impactful accidents while keeping false positives low.
The first model takes two inputs: (a) the index of a given zone, and (b) the temporal feature sequence for that zone, from which the zone index is excluded. The first input is passed through an embedding layer defined over R, the set of all spatial grid cells, allowing the model to learn a distributed representation of regional characteristics such as spatial heterogeneity, traffic dynamics, and environmental context. The second input is a sequence of w vectors, each with 35 features, which is processed by two LSTM layers with 12 and 24 neurons, respectively. The resulting sequence encoding is concatenated with the embedded zone representation and passed through two fully connected layers, each with 25 neurons. Batch normalization is applied throughout the network to enhance convergence stability and reduce internal covariate shift. The structure of the first model is shown in Figure 14.
The overall architecture was selected through extensive hyperparameter tuning on a validation dataset. Simpler or shallower model variants were evaluated but failed to capture the complex temporal dependencies necessary for accurate accident prediction. To mitigate overfitting, we employed batch normalization and early stopping, which are well-established regularization methods in deep learning. Although the chosen architecture introduces additional computational cost and longer training time, this trade-off was necessary to achieve improved predictive accuracy on highly imbalanced and noisy real-world traffic data.
- (B) Impact Prediction: The second model is a three-class CNN classifier that predicts the impact level (gamma) of the next interval for intervals flagged as accident-prone by the first model. Although its input is primarily composed of samples predicted as accidents, we also include selected non-accident cases to help correct potential misclassifications from the first model and improve the cascade system’s overall reliability.
The weight of each class is assigned based on its frequency and importance in our problem setup. Class imbalance in the impact prediction task can bias predictions toward the majority class (i.e., non-impact). To mitigate this, we incorporated class weights in the loss function to amplify the contribution of under-represented classes during training. To determine the weights, we conducted an extensive grid search, varying the weights of the two impact classes (classes 1 and 2) from 1.0 to 5.0 in increments of 0.5, while adjusting the weight of the non-impact class (class 0) between roughly 0.5 and 1.0 to maintain baseline stability. This search aimed to balance the model’s sensitivity to impactful accidents (classes 1 and 2) with the need to avoid excessive false positives. The best-performing combination was selected based on validation set performance, prioritizing recall for impactful cases. The optimum weights were found to be 0.7, 4.5, and 3.5 for classes 0, 1, and 2, respectively.
The CNN model consists of three convolutional layers followed by a flattening operation and three fully connected (dense) layers, with one dropout layer applied before the final two layers. The architecture is designed to extract hierarchical spatial and short-term temporal patterns from the input feature maps, enabling the model to differentiate between varying levels of post-accident impact. The specific configuration was determined through extensive hyperparameter tuning on a validation set. Simpler architectures with fewer convolutional layers or reduced capacity were tested but did not perform as well, particularly in capturing the nuances of medium- and high-impact events.
To mitigate overfitting and stabilize training, we included early stopping and dropout layers, both of which are widely used and effective in deep learning models [57]. While the depth of the network increases the training time, we found this to be an acceptable trade-off for improved performance. In future work, we aim to investigate lighter-weight alternatives such as MobileNet-style CNNs or dilated convolutions to reduce computational cost without compromising prediction quality. The structure of the model is illustrated in Figure 15.
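The two ideas that tie the cascade together, class-weighted losses and two-stage routing, can be illustrated with a short Python sketch. It is not the authors' implementation: the function names, the 0.5 decision threshold, and the stand-in model callables are hypothetical, and the ordering of the stage-2 weights (non-impact, medium-impact, high-impact) is an assumption about how the reported values map to classes.

```python
import math

# Class weights reported in the text. The mapping of the stage-2 values to
# (non-impact, medium-impact, high-impact) is assumed for this sketch.
STAGE1_WEIGHTS = (1.0, 3.0)       # (non-accident, accident)
STAGE2_WEIGHTS = (0.7, 4.5, 3.5)  # (class 0, class 1, class 2)


def weighted_cross_entropy(probs, label, class_weights):
    """Per-sample loss: -w[label] * log(p[label])."""
    p = max(probs[label], 1e-12)  # guard against log(0)
    return -class_weights[label] * math.log(p)


def cascade_predict(sample, accident_model, impact_model, threshold=0.5):
    """Two-stage inference.

    accident_model(sample) -> probability of an accident in the next interval.
    impact_model(sample)   -> probabilities for impact classes 0, 1, and 2.
    Both callables are placeholders for trained models.
    """
    if accident_model(sample) < threshold:
        return 0  # stage 1 says "no accident": report non-impact
    impact_probs = impact_model(sample)
    # Stage 2 may still return class 0, which corrects a stage-1 false alarm.
    return max(range(len(impact_probs)), key=impact_probs.__getitem__)
```

With these weights, a missed accident in stage 1 costs three times as much as an equally confident false alarm, which is the mechanism the text relies on to favor recall on the minority classes. In a deep learning framework the same effect is usually obtained by passing per-class weights to the training loss rather than by hand-written code.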
6. Experiments and Results
6.1. Evaluation Metrics
6.2. Experimental Setup
6.3. Baseline Models
6.4. Results and Model Comparison
6.5. Influencing Factor Analysis
7. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Wang, C.; Li, X.; Zhou, X.; Wang, A.; Nedjah, N. Soft computing in big data intelligent transportation systems. Appl. Soft Comput. 2016, 38, 1099–1108.
2. Zhang, J.; Wang, J.; Fang, S. Prediction of urban expressway total traffic accident duration based on multiple linear regression and artificial neural network. In Proceedings of the 2019 5th International Conference on Transportation Information and Safety (ICTIS), Liverpool, UK, 14–17 July 2019; pp. 503–510.
3. Austin, R.D.; Carson, J.L. An alternative accident prediction model for highway-rail interfaces. Accid. Anal. Prev. 2002, 34, 31–42.
4. Oyedepo, O.; Makinde, O. Accident Prediction Models for Akure–Ondo Carriageway, Ondo State Southwest Nigeria; Using Multiple Linear Regressions. Afr. Res. Rev. 2010, 4, 30–49.
5. Wei, C.H.; Lee, Y. Sequential forecast of incident duration using artificial neural network models. Accid. Anal. Prev. 2007, 39, 944–954.
6. Salahadin Seid Yassin, P. Road accident prediction and model interpretation using a hybrid K-means and random forest algorithm approach. SN Appl. Sci. 2020, 1576.
7. Sarkar, S.; Vinay, S.; Raj, R.; Maiti, J.; Mitra, P. Application of optimized machine learning techniques for prediction of occupational accidents. Comput. Oper. Res. 2019, 106, 210–224.
8. Wenqi, L.; Dongyu, L.; Menghua, Y. A model of traffic accident prediction based on convolutional neural network. In Proceedings of the 2017 2nd IEEE International Conference on Intelligent Transportation Engineering (ICITE), Singapore, 1–3 September 2017; pp. 198–202.
9. Ozbayoglu, M.; Kucukayan, G.; Dogdu, E. A real-time autonomous highway accident detection model based on big data processing and computational intelligence. In Proceedings of the 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, USA, 5–8 December 2016; pp. 1807–1813.
10. Marcillo, P.; Valdivieso Caraguay, Á.L.; Hernández-Álvarez, M. A systematic literature review of learning-based traffic accident prediction models based on heterogeneous sources. Appl. Sci. 2022, 12, 4529.
11. Li, P.; Abdel-Aty, M.; Yuan, J. Real-time crash risk prediction on arterials based on LSTM-CNN. Accid. Anal. Prev. 2020, 135, 105371.
12. Moosavi, S.; Samavatian, M.H.; Nandi, A.; Parthasarathy, S.; Ramnath, R. Short and Long-Term Pattern Discovery Over Large-Scale Geo-Spatiotemporal Data. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’19, Anchorage, AK, USA, 4–8 August 2019; pp. 2905–2913.
13. Caliendo, C.; Guida, M.; Parisi, A. A crash-prediction model for multilane roads. Accid. Anal. Prev. 2007, 39, 657–670.
14. Najjar, A.; Kaneko, S.; Miyanaga, Y. Combining satellite imagery and open data to map road safety. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
15. Ren, H.; Song, Y.; Wang, J.; Hu, Y.; Lei, J. A deep learning approach to the citywide traffic accident risk prediction. In Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA, 4–7 November 2018; pp. 3346–3351.
16. Yuan, Z.; Zhou, X.; Yang, T. Hetero-convlstm: A deep learning approach to traffic accident prediction on heterogeneous spatio-temporal data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 984–992.
17. Chen, Q.; Song, X.; Yamada, H.; Shibasaki, R. Learning deep representation from big and heterogeneous data for traffic accident inference. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30.
18. Yu, R.; Abdel-Aty, M. Utilizing support vector machine in real-time crash risk evaluation. Accid. Anal. Prev. 2013, 51, 252–259.
19. Lin, L.; Wang, Q.; Sadek, A.W. A novel variable selection method based on frequent pattern tree for real-time traffic accident risk prediction. Transp. Res. Part C Emerg. Technol. 2015, 55, 444–459.
20. Bianchi, F.M.; Maiorino, E.; Kampffmeyer, M.C.; Rizzi, A.; Jenssen, R. Recurrent Neural Networks for Short-Term Load Forecasting: An Overview and Comparative Analysis; Springer: Cham, Switzerland, 2017.
21. Tian, Y.; Pan, L. Predicting short-term traffic flow by long short-term memory recurrent neural network. In Proceedings of the 2015 IEEE International Conference on Smart city/SocialCom/SustainCom (SmartCity), Chengdu, China, 19–21 December 2015; pp. 153–158.
22. Parsa, A.B.; Chauhan, R.S.; Taghipour, H.; Derrible, S.; Mohammadian, A. Applying Deep Learning to Detect Traffic Accidents in Real Time Using Spatiotemporal Sequential Data. arXiv 2019, arXiv:1912.06991.
23. Wang, X.; Chai, Y.; Li, H.; Wang, W.; Sun, W. Graph Convolutional Network-based Model for Incident-related Congestion Prediction: A Case Study of Shanghai Expressways. ACM Trans. Manag. Inf. Syst. 2021, 12, 21.
24. Bao, J.; Liu, P.; Ukkusuri, S.V. A spatiotemporal deep learning approach for citywide short-term crash risk prediction with multi-source data. Accid. Anal. Prev. 2019, 122, 239–254.
25. Singh, G.; Pal, M.; Yadav, Y.; Singla, T. Deep neural network-based predictive modeling of road accidents. Neural Comput. Appl. 2020, 32, 12417–12426.
26. Theofilatos, A.; Chen, C.; Antoniou, C. Comparing machine learning and deep learning methods for real-time crash prediction. Transp. Res. Rec. 2019, 2673, 169–178.
27. Moosavi, S.; Samavatian, M.H.; Parthasarathy, S.; Teodorescu, R.; Ramnath, R. Accident risk prediction based on heterogeneous sparse data: New dataset and insights. In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Chicago, IL, USA, 5–8 November 2019; pp. 33–42.
28. Chen, C.; Fan, X.; Zheng, C.; Xiao, L.; Cheng, M.; Wang, C. Sdcae: Stack denoising convolutional autoencoder model for accident risk prediction via traffic big data. In Proceedings of the 2018 Sixth International Conference on Advanced Cloud and Big Data (CBD), Lanzhou, China, 12–15 August 2018; pp. 328–333.
29. Kaffash Charandabi, N.; Gholami, A.; Abdollahzadeh Bina, A. Road accident risk prediction using generalized regression neural network optimized with self-organizing map. Neural Comput. Appl. 2022, 34, 8511–8524.
30. Parsa, A.B.; Movahedi, A.; Taghipour, H.; Derrible, S.; Mohammadian, A.K. Toward safer highways, application of XGBoost and SHAP for real-time accident detection and feature analysis. Accid. Anal. Prev. 2020, 136, 105405.
31. Khattak, A.J.; Schofer, J.L.; Wang, M.H. A simple time sequential procedure for predicting freeway incident duration. J. Intell. Transp. Syst. 1995, 2, 113–138.
32. Garib, A.; Radwan, A.; Al-Deek, H. Estimating magnitude and duration of incident delays. J. Transp. Eng. 1997, 123, 459–466.
33. Peeta, S.; Ramos, J.L.; Gedela, S. Providing Real-Time Traffic Advisory and Route Guidance to Manage Borman Incidents On-Line Using the Hoosier Helper Program; Indiana Department of Transportation and Purdue University: West Lafayette, IN, USA, 2000.
34. Yu, B.; Xia, Z. A methodology for freeway incident duration prediction using computerized historical database. In CICTP 2012: Multimodal Transportation Systems—Convenient, Safe, Cost-Effective, Efficient, Proceedings of the 12th COTA International Conference of Transportation Professionals, Beijing, China, 3–6 August 2012; American Society of Civil Engineers: Reston, VA, USA, 2012; pp. 3463–3474.
35. Wang, W.; Chen, H.; Bell, M.C. Vehicle breakdown duration modeling. J. Transp. Stat. 2005, 8, 75.
36. Vlahogianni, E.I.; Karlaftis, M.G. Fuzzy-entropy neural network freeway incident duration modeling with single and competing uncertainties. Comput.-Aided Civ. Infrastruct. Eng. 2013, 28, 420–433.
37. Yu, B.; Wang, Y.; Yao, J.; Wang, J. A comparison of the performance of ANN and SVM for the prediction of traffic accident duration. Neural Netw. World 2016, 26, 271.
38. Ma, X.; Ding, C.; Luan, S.; Wang, Y.; Wang, Y. Prioritizing influential factors for freeway incident clearance time prediction using the gradient boosting decision trees method. IEEE Trans. Intell. Transp. Syst. 2017, 18, 2303–2310.
39. Yu, R.; Li, Y.; Shahabi, C.; Demiryurek, U.; Liu, Y. Deep learning: A generic approach for extreme condition traffic forecasting. In Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, TX, USA, 27–29 April 2017; pp. 777–785.
40. Lin, Y.; Li, R. Real-time traffic accidents post-impact prediction: Based on crowdsourcing data. Accid. Anal. Prev. 2020, 145, 105696.
41. Zhang, S.; Liu, H.; Yang, Y.; Zhang, S.; Zhang, Z.; Wang, C.; Wang, M. Prediction of traffic accident impact range based on CatBoost ensemble algorithm. In Proceedings of the Second International Conference on Algorithms, Microchips, and Network Applications (AMNA 2023), Zhengzhou, China, 13–15 January 2023; Volume 12635, pp. 7–11.
42. Ekanem, I. Analysis of Road Traffic Accident Using AI Techniques. Open J. Saf. Sci. Technol. 2025, 15, 36–56.
43. Moosavi, S.; Samavatian, M.H.; Parthasarathy, S.; Ramnath, R. A countrywide traffic accident dataset. arXiv 2019, arXiv:1906.05409.
44. Xing, F.; Huang, H.; Zhan, Z.; Zhai, X.; Ou, C.; Sze, N.; Hon, K. Hourly associations between weather factors and traffic crashes: Non-linear and lag effects. Anal. Methods Accid. Res. 2019, 24, 100109.
45. Malin, F.; Norros, I.; Innamaa, S. Accident risk of road and weather conditions on different road types. Accid. Anal. Prev. 2019, 122, 181–188.
46. Fountas, G.; Fonzone, A.; Gharavi, N.; Rye, T. The joint effect of weather and lighting conditions on injury severities of single-vehicle accidents. Anal. Methods Accid. Res. 2020, 27, 100124.
47. Doane, D.P. Aesthetic frequency classifications. Am. Stat. 1976, 30, 181–183.
48. Skabardonis, A.; Varaiya, P.; Petty, K.F. Measuring recurrent and nonrecurrent traffic congestion. Transp. Res. Rec. 2003, 1856, 118–124.
49. Lv, Y.; Tang, S.; Zhao, H. Real-time highway traffic accident prediction based on the k-nearest neighbor method. In Proceedings of the 2009 International Conference on Measuring Technology and Mechatronics Automation, Zhangjiajie, China, 11–12 April 2009; Volume 3, pp. 547–550.
50. Wang, C.; Quddus, M.; Ison, S. A spatio-temporal analysis of the impact of congestion on traffic safety on major roads in the UK. Transp. A Transp. Sci. 2013, 9, 124–148.
51. Sánchez González, S.; Bedoya-Maya, F.; Calatayud, A. Understanding the Effect of Traffic Congestion on Accidents Using Big Data. Sustainability 2021, 13, 7500.
52. Zheng, Z.; Wang, Z.; Zhu, L.; Jiang, H. Determinants of the congestion caused by a traffic accident in urban road networks. Accid. Anal. Prev. 2020, 136, 105327.
53. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
54. Yuan, J.; Abdel-Aty, M.; Gong, Y.; Cai, Q. Real-time crash risk prediction using long short-term memory recurrent neural network. Transp. Res. Rec. 2019, 2673, 314–326.
55. Ounoughi, C.; Yahia, S.B. Sequence to sequence hybrid Bi-LSTM model for traffic speed prediction. Expert Syst. Appl. 2024, 236, 121325.
56. Yang, B.; Sun, S.; Li, J.; Lin, X.; Tian, Y. Traffic flow prediction using LSTM with feature enhancement. Neurocomputing 2019, 332, 320–327.
57. Ying, X. An overview of overfitting and its solutions. J. Phys. Conf. Ser. 2019, 1168, 022022.
58. Yu, B.; Yin, H.; Zhu, Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv 2017, arXiv:1709.04875.
59. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
60. Wang, J.; Wang, B.; Zhou, S.; Li, C. Prediction Model for Traffic Congestion Based on the Deep Learning of Convolutional Neural Network. In CICTP 2017: Transportation Reform and Change—Equity, Inclusiveness, Sharing, and Innovation, Proceedings of the 17th COTA International Conference of Transportation Professionals, Shanghai, China, 7–9 July 2017; American Society of Civil Engineers: Reston, VA, USA, 2018; pp. 2494–2505.
Study | Input Features | Predicted Output | Model | Best Modeling Results |
---|---|---|---|---|
Moosavi et al. [27] | 1. Weather conditions 2. Time of day 3. Location 4. Road features | Accident risk on different road segments | DNN with recurrent, fully connected, and embedding components | F1 score = 0.59 for accident class; F1 score = 0.89 for non-accident class |
Li et al. [11] | 1. Traffic flow characteristics 2. Signal timing 3. Weather condition | Real-time crash risk | LSTM-CNN | False alarm rate = 0.132; AUC = 0.932 |
Parsa et al. [30] | 1. Traffic 2. Network 3. Demographic 4. Land use 5. Weather feature | Occurrence of accidents | eXtreme Gradient Boosting (XGBoost) | Accuracy close to 100%; detection rate between 70% and 83%; false alarm rate below 0.4% |
Yu and Abdel-Aty [18] | 1. Crash data 2. Real-time traffic data | Crash occurrence | SVM with RBF kernel | AUC = 0.74 for all crash types; AUC = 0.80 for multi-vehicle crashes; AUC = 0.75 for single-vehicle crashes (evaluated on 30% of the whole dataset) |
Ozbayoglu et al. [9] | 1. Average velocity 2. Occupancy 3. The capacity difference between time t and t + 1 4. Weekday/weekend 5. Rush hour | Crash occurrence | Nearest neighbor (NN), regression tree (RT), feedforward neural network (FNN) | Accuracy = 95.12 for NN; Accuracy = 97.59 for RT; Accuracy = 99.79 for FNN |
Study | Input Features | Predicted Output | Model | Best Modeling Results |
---|---|---|---|---|
Zhang et al. [2] | 1. Temporal 2. Spatial 3. Environmental 4. Traffic 5. Accident details | Total duration and clearance time | Multiple linear regression and ANN | MAPE = 27.1% for total duration, MAPE = 49.8% for clearance time |
Ekanem [42] | 1. Accident time 2. Accident date 3. Day of the week 4. Road surface condition 5. Accident type, etc. | Accident severity class: Fatal, Serious injury, Minor injury, Property damage only | Logistic Regression, Random Forest, XGBoost | Logistic Regression achieved 100% recall for minor injuries, but all models struggled with fatal cases |
Ma et al. [38] | 1. Accident details 2. Temporal 3. Geographical 4. Environmental 5. Traffic 6. Operational | Incident clearance time | Gradient boosting decision trees (GBDT) | MAPE = 16% for clearance time less than 15 min and MAPE = 33% for clearance time more than 15 min |
Rose Yu et al. [39] | 1. Accident type 2. Downstream post mile 3. Affected traffic direction 4. Traffic speed | The delay corresponding to the accident | Mixture Deep LSTM | MAPE = 0.97 |
Yu and Abdel-Aty [18] | 1. Crash data 2. Real-time traffic data | Crash occurrence | SVM | AUC = 0.74 for SVM with RBF kernel for all crash types |
Lin and Li [40] | 1. Accident details 2. Traffic details 3. Environmental 4. Air quality 5. Geographical | TAPI (derived from post-accident congestion level and its duration) | NN, SVM, RF | MAPE = 5.5–53.8% |
Wang et al. [35] | 1. Vehicle type 2. Location 3. Time of day 4. Report mechanism | Vehicle breakdown duration | Fuzzy logic (FL) and ANN | RMSE = 24 for FL model; RMSE = 19.5 for ANN model V1; RMSE = 24.1 for ANN model V2 |
Category | Features |
---|---|
Geographical features | Start latitude, Start longitude, End latitude, End longitude, Distance, Street, Number, Airport code, City, Country, State, Zip code, Side, Amenity, Bump, Crossing, Give way, Junction, No-exit, Railway, Roundabout, Station, Stop, Traffic Calming, Traffic Signal, Turning Loop |
Temporal features | Start time, End time, Time zone, Sunrise Sunset, Civil Twilight, Nautical Twilight, Astronomical Twilight |
Other | ID, Severity, TMC |
Features | Unique Values |
---|---|
Weather Conditions | ’Mostly Cloudy’, ’Scattered Clouds’, ’Partly Cloudy’, ’Clear’, ’Light Rain’, ’Overcast’, ’Heavy Rain’, ’Rain’, ’Haze’, ’Patches of Fog’, ’Fog’, ’Shallow Fog’, ’Thunderstorm’, ’Light Drizzle’, ’Thunderstorms and Rain’, ’Cloudy’, ’Fair’, ’Mist’, ’Mostly Cloudy / Windy’, ’Fair / Windy’, ’Partly Cloudy / Windy’, ’Light Rain with Thunder’ |
Wind Directions | ’E’, ’W’, ’CALM’, ’S’, ’N’, ’SE’, ’NNE’, ’NNW’, ’SSE’, ’ESE’, ’NE’, ’NW’, ’WSW’, ’ENE’, ’SW’, ’SSW’, ’WNW’ |
Feature Category | Feature |
---|---|
Temporal | Day of Week, Part of Day (day/night), Sunrise/Sunset |
Weather | Weather Condition |
Accident | Severity, Accident Count, Duration, Distance |
Congestion | Congestion Count, Congestion Delay |
Spatial | Geohash Code, latitude, longitude, Amenity, Bump, Crossing, Give way, Junction, No-exit, Railway, Roundabout, Station, Stop, Traffic Calming, Traffic Signal, Turning Loop |
Event ID | Description | Severity | Duration (Minute) | Distance (Mile) |
---|---|---|---|---|
1 | Delays increasing and delays of nine minutes on Colorado Blvd Westbound in LA. Average speed five mph. | Slow | 49.6 | 2.42 |
2 | Delays of three minutes on Harbor Fwy Northbound between I-10 and US-101. Average speed 20 mph. | Moderate | 43.5 | 3.18 |
3 | Delays of eight minutes on Verdugo Rd Southbound between Verdugo Rd and Shasta Cir. Average speed five mph. | Slow | 41.6 | 0.55 |
4 | Delays of two minutes on I-5 I-10 Northbound between Exits 132 132A Calzona St and Exit 135A 4th St. Average speed 20 mph. | Fast | 41.6 | 2.26 |
Model | ANN | LR |
---|---|---|
MSE | 0.141 | 0.153 |
MAE | 0.239 | 0.255 |
Models | Precision Class 0 | Precision Class 1 | Precision Class 2 | Recall Class 0 | Recall Class 1 | Recall Class 2 |
---|---|---|---|---|---|---|
Gradient Boosting | 0.92 | 0.14 | 0.04 | 0.81 | 0.14 | 0.23 |
Random Forest | 0.93 | 0.13 | 0.05 | 0.69 | 0.31 | 0.30 |
LSTM | 0.94 | 0.10 | 0.04 | 0.56 | 0.34 | 0.36 |
CNN | 0.94 | 0.12 | 0.04 | 0.11 | 0.32 | 0.32 |
Cascade Model | 0.96 | 0.10 | 0.04 | 0.31 | 0.41 | 0.50 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Sajadi, P.; Qorbani, M.; Moosavi, S.; Hassannayebi, E. Accident Impact Prediction Based on a Deep Convolutional and Recurrent Neural Network Model. Urban Sci. 2025, 9, 299. https://doi.org/10.3390/urbansci9080299