# A Crash Prediction Method Based on Artificial Intelligence Techniques and Driving Behavior Event Data


## Abstract

## 1. Introduction

## 2. Literature Review

#### 2.1. Safety Indicators Based on Trajectory Data

^{3} can be counted and used as an indicator of a dangerous situation; matching and analyzing accident data showed this indicator to be statistically significant [11]. Feng et al. classified drivers into an aggressive group and a normal group and found that the frequency of large negative jerks (LNJs) and large positive jerks (LPJs) correlated with aggressive driving behavior [12]. Wu and Jovanis defined a yaw rate event as a shift in the heading of a vehicle of more than 4° and used it to assess lateral driving safety [13]. To capture a vehicle's driving variability, Kamrani et al. derived driving volatility measures and used them to build an accident prediction model [14]. Kim et al. developed an erratic driving indicator (EDI) that reflects each driver's usual driving patterns and detects aggressive driving by setting driver-specific threshold values based on those normal patterns [15].

#### 2.2. Methodology for Evaluating Crash Risk

#### 2.3. Differentiation of Research

## 3. Methodology

#### 3.1. Overall Framework

#### 3.2. Gradient Boosting

#### 3.3. Neural Network

- $x^{*}$: optimal hyperparameter
- $x$: hyperparameter
- $f(x)$: objective function (CCR)
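
Under these definitions, the tuning task can be written as a maximization of the objective over the hyperparameters. The form below is the usual statement of that problem and is given as an assumption, not as the authors' exact equation:

$$
x^{*} = \arg\max_{x} f(x)
$$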

- (1) Assuming that $f(x)$ follows a Gaussian process (GP) prior, a model is trained using the given data $D$.
- (2) The acquisition function is calculated for data not included in $D$.
- (3) The data point $(x_{n+1}, f(x_{n+1}))$ with the largest acquisition function value is added to $D$.
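
The three steps can be sketched as a small loop in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the squared-exponential kernel, its length scale, the expected improvement acquisition function, the candidate grid, and the stand-in objective `toy_ccr` are all hypothetical choices made so the example runs on its own.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel (assumed) between two scalar hyperparameter values."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Step (1): GP posterior mean and standard deviation at x_new given data D."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    mean_y = sum(ys) / len(ys)
    alpha = solve(K, [y - mean_y for y in ys])
    mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve(K, k)
    var = max(rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, v)), 1e-12)
    return mu, math.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Acquisition function (assumed): expected improvement over the best value so far."""
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - best) * cdf + sigma * pdf

def bayes_opt(f, candidates, init, n_iter=8):
    """Repeat steps (1)-(3) and return the best (x, f(x)) pair found."""
    xs = list(init)
    ys = [f(x) for x in xs]
    for _ in range(n_iter):
        pool = [x for x in candidates if x not in xs]
        if not pool:
            break
        best = max(ys)
        # Step (2): score every candidate that is not yet in D.
        scores = [expected_improvement(*gp_posterior(xs, ys, x), best) for x in pool]
        # Step (3): evaluate the highest-scoring candidate and add it to D.
        x_next = pool[scores.index(max(scores))]
        xs.append(x_next)
        ys.append(f(x_next))
    i = ys.index(max(ys))
    return xs[i], ys[i]

def toy_ccr(x):
    """Hypothetical stand-in for the CCR objective, peaking at x = 0.3."""
    return 0.95 - (x - 0.3) ** 2

grid = [i / 50 for i in range(51)]
x_star, f_star = bayes_opt(toy_ccr, grid, init=[0.0, 1.0])
print(x_star, f_star)
```

In practice the objective would be the correct classification rate of a trained model, which is expensive to evaluate; that cost is what makes a surrogate-guided search attractive compared with an exhaustive grid.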

- $\hat{f}(x)_{max}$: the maximum predicted classification accuracy for any hyperparameters
- $\mu(\hat{f}(x))$: the average predicted classification accuracy for any hyperparameters
- $\sigma(\hat{f}(x))$: the standard deviation of the predicted classification accuracy for any hyperparameters
- $G(x)$: normal cumulative distribution function
- $g(x)$: probability density function
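
These symbols match the terms of the expected improvement acquisition function in its commonly used closed form. The expression below is that standard form written with the symbols defined above; the names $EI$ and $Z$ are introduced here for illustration, and the form is an assumption, not a reproduction of the authors' equation:

$$
EI(x) = \left(\mu(\hat{f}(x)) - \hat{f}(x)_{max}\right) G(Z) + \sigma(\hat{f}(x))\, g(Z),
\qquad
Z = \frac{\mu(\hat{f}(x)) - \hat{f}(x)_{max}}{\sigma(\hat{f}(x))}
$$

with $EI(x) = 0$ when $\sigma(\hat{f}(x)) = 0$. The first term rewards hyperparameters whose predicted accuracy is already high, and the second rewards those whose prediction is still uncertain.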

## 4. Data

#### 4.1. Data and Traffic Flow Definition

#### 4.2. Safety Indicators

## 5. Results and Discussion

#### 5.1. Gradient Boosting

#### 5.2. Neural Network

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Appendix A

Indicator | Measurement | Variable Name | Threshold | Equation |
---|---|---|---|---|
Peak to peak jerk | Jerk | P to p jerk | - | Max jerk − Min jerk (analysis unit: 5 s) |
SRI | Speed | SRI_speed | Average of link | $\frac{\sum_{i=1}^{n} \text{time step when } (x_i \text{ exceeds threshold})}{n} \times 100\%$ ($n$: total number of time steps) |
SRI | Acc | SRI_acc | Average of link | Same as SRI_speed |
SRI | Jerk | SRI_jerk | Average of link | Same as SRI_speed |
SRI | Yaw | SRI_yaw | Average of link | Same as SRI_speed |
EDI | Speed | EDI_speed | Average of link | $\frac{A}{T},\ \text{if } V(t) = \frac{\sum_{i=1}^{T} \lvert V(i) - \text{threshold} \rvert}{t_{i+1} - t_i} = A$ ($V(n)$: measurement at time step $n$; $A$: total area where the variable exceeded the threshold; $T$: total number of time steps) |
EDI | Acc | EDI_acc | Average of link | Same as EDI_speed |
EDI | Jerk | EDI_jerk | Average of link | Same as EDI_speed |
EDI | Yaw | EDI_yaw | Average of link | Same as EDI_speed |
Dangerous driving events rate: speeding | Speed | Dangerous event | Speed: 20 km/h or more | $\frac{\text{Total of dangerous events}}{n} \times 100\%$ |
Dangerous driving events rate: rapid acceleration | Speed, Acc | Dangerous event | Speed: 6 km/h or more; acceleration over 6 km/h per second | Same as speeding |
Dangerous driving events rate: rapid deceleration | Speed, Acc | Dangerous event | Speed: 6 km/h or more; acceleration over 9 km/h per second | Same as speeding |
Dangerous driving events rate: sudden stop | Speed, Acc | Dangerous event | Speed: 5 km/h or less; acceleration over 9 km/h per second | Same as speeding |
Dangerous driving events rate: rapid turn | Speed, Yaw | Dangerous event | Speed: 25 km/h or more; yaw: cumulative value within 4 s of 60~120° | Same as speeding |
RDEs | Acc | RDE | 7.35 $\mathrm{m}/\mathrm{s}^{2}$ | $\frac{\sum_{i=1}^{n} \text{time step when } (x_i \text{ exceeds threshold})}{n} \times 100\%$ |
LNJ/LPJ | Jerk | LNJ_threshold | −1.5, −2, −3, −4 $\mathrm{m}/\mathrm{s}^{3}$ | $\frac{\sum_{i=1}^{n} \text{time step when } (x_i \text{ exceeds threshold})}{n} \times 100\%$ |
LNJ/LPJ | Jerk | LPJ_threshold | 1.5, 2, 3, 4 $\mathrm{m}/\mathrm{s}^{3}$ | Same as LNJ_threshold |
Peak to peak jerk rate | Jerk | P to p jerk rate | 14.7 $\mathrm{m}/\mathrm{s}^{3}$ | $\frac{\sum_{i=1}^{n} \text{time step when } (x_i \text{ exceeds threshold})}{n} \times 100\%$ |
Yaw rate | Yaw | Yaw rate | 4° | $\frac{\sum_{i=1}^{n} \text{time step when } (x_i \text{ exceeds threshold})}{n} \times 100\%$ |
Driving volatility: standard deviation | Speed | S.D_speed | - | $\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^{2}}$ |
Driving volatility: standard deviation | Acc | S.D_acc | - | Same as S.D_speed |
Driving volatility: standard deviation | Jerk | S.D_jerk | - | Same as S.D_speed |
Driving volatility: standard deviation | Yaw | S.D_yaw | - | Same as S.D_speed |
Driving volatility: mean absolute deviation | Speed | MAD_speed | - | $\frac{1}{n}\sum_{i=1}^{n}\lvert x_i-\bar{x}\rvert$ |
Driving volatility: mean absolute deviation | Acc | MAD_acc | - | Same as MAD_speed |
Driving volatility: mean absolute deviation | Jerk | MAD_jerk | - | Same as MAD_speed |
Driving volatility: mean absolute deviation | Yaw | MAD_yaw | - | Same as MAD_speed |
Driving volatility: TVSV (time-varying stochastic volatility) | Speed | TVSV_speed | - | $\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(r_i-\bar{r})^{2}}$, $r_i=\ln\left(\frac{x_i}{x_{i-1}}\right)\times 100$ |
Driving volatility: TVSV (time-varying stochastic volatility) | Acc | TVSV_acc | - | Same as TVSV_speed |
Driving volatility: TVSV (time-varying stochastic volatility) | Jerk | TVSV_jerk | - | Same as TVSV_speed |
Driving volatility: TVSV (time-varying stochastic volatility) | Yaw | TVSV_yaw | - | Same as TVSV_speed |
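
Several of the indicators in this appendix reduce to a few lines of arithmetic on a time series. The sketch below is illustrative only: the function names, the `below` flag (for negative thresholds such as LNJ), and the sample speed series are hypothetical, and the comparison direction used for "exceeds the threshold" is an assumption.

```python
import math
import statistics

def exceedance_rate(values, threshold, below=False):
    """Percentage of time steps beyond the threshold (SRI, RDE, LNJ/LPJ style)."""
    if below:
        hits = sum(1 for x in values if x < threshold)
    else:
        hits = sum(1 for x in values if x > threshold)
    return 100.0 * hits / len(values)

def peak_to_peak(values):
    """Maximum minus minimum within one analysis window."""
    return max(values) - min(values)

def mean_absolute_deviation(values):
    """Mean absolute deviation around the arithmetic mean."""
    mean = statistics.mean(values)
    return sum(abs(x - mean) for x in values) / len(values)

def tvsv(values):
    """Sample standard deviation of the log returns r_i = ln(x_i / x_{i-1}) * 100.

    Requires strictly positive measurements.
    """
    returns = [100.0 * math.log(b / a) for a, b in zip(values, values[1:])]
    return statistics.stdev(returns)

# Hypothetical speed samples (km/h) for one analysis window.
speed = [50.0, 52.0, 55.0, 53.0, 50.0]

print(exceedance_rate(speed, 52.0))    # share of steps above 52 km/h
print(peak_to_peak(speed))
print(statistics.stdev(speed))         # S.D_speed, n - 1 in the denominator
print(mean_absolute_deviation(speed))  # MAD_speed
print(tvsv(speed))                     # TVSV_speed
```

The log-return form used for TVSV is undefined when a measurement is zero or changes sign, so acceleration, jerk, and yaw series would need additional handling that is not shown here.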

## References

1. World Health Organization. Global Status Report on Road Safety 2018: Summary; World Health Organization: Geneva, Switzerland, 2018.
2. Lee, C.; Hellinga, B.; Saccomanno, F. Real-Time Crash Prediction Model for Application to Crash Prevention in Freeway Traffic. Transp. Res. Rec. J. Transp. Res. Board **2003**, 1840, 67–77.
3. Wang, L.; Abdel-Aty, M.; Lee, J. Safety analytics for integrating crash frequency and real-time risk modeling for expressways. Accid. Anal. Prev. **2017**, 104, 58–64.
4. Wu, Y.; Abdel-Aty, M.; Lee, J. Crash risk analysis during fog conditions using real-time traffic data. Accid. Anal. Prev. **2018**, 114, 4–11.
5. Abdel-Aty, M.A.; Hassan, H.M.; Ahmed, M.; Al-Ghamdi, A.S. Real-time prediction of visibility related crashes. Transp. Res. Part C Emerg. Technol. **2012**, 24, 288–298.
6. Moreno, A.T.; García, A. Use of speed profile as surrogate measure: Effect of traffic calming devices on crosstown road safety performance. Accid. Anal. Prev. **2013**, 61, 23–32.
7. Xie, K.; Yang, D.; Ozbay, K.; Yang, H. Use of real-world connected vehicle data in identifying high-risk locations based on a new surrogate safety measure. Accid. Anal. Prev. **2019**, 125, 311–319.
8. Korea Transportation Safety Authority. Traffic Safety Model Design; Korea Transportation Safety Authority: Ansan-si, Korea, 2017.
9. Chevalier, A.; Coxon, K.; Chevalier, A.J.; Clarke, E.; Rogers, K.; Brown, J.; Boufous, S.; Ivers, R.; Keay, L. Predictors of older drivers’ involvement in rapid deceleration events. Accid. Anal. Prev. **2017**, 98, 312–319.
10. Bagdadi, O. Assessing safety critical braking events in naturalistic driving studies. Transp. Res. Part F Traffic Psychol. Behav. **2013**, 16, 117–126.
11. Bagdadi, O.; Várhelyi, A. Development of a method for detecting jerks in safety critical events. Accid. Anal. Prev. **2013**, 50, 83–91.
12. Feng, F.; Bao, S.; Sayer, J.R.; Flannagan, C.; Manser, M.; Wunderlich, R. Can vehicle longitudinal jerk be used to identify aggressive drivers? An examination using naturalistic driving data. Accid. Anal. Prev. **2017**, 104, 125–136.
13. Wu, K.-F.; Jovanis, P.P. Defining and screening crash surrogate events using naturalistic driving data. Accid. Anal. Prev. **2013**, 61, 10–22.
14. Kamrani, M.; Arvin, R.; Khattak, A.J. Extracting Useful Information from Basic Safety Message Data: An Empirical Study of Driving Volatility Measures and Crash Frequency at Intersections. Transp. Res. Rec. J. Transp. Res. Board **2018**, 2672, 290–301.
15. Kim, Y.; Oh, C.; Choe, B.; Choi, S. Development of a Methodology for Detecting Intentional Aggressive Driving Events Using Multi-agent Driving Simulations. J. Korean Soc. Transp. **2018**, 36, 51–65.
16. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. **2001**, 29, 1189–1232.
17. Park, H.; Haghani, A.; Samuel, S.; Knodler, M.A. Real-time prediction and avoidance of secondary crashes under unexpected traffic congestion. Accid. Anal. Prev. **2018**, 112, 39–49.
18. Tang, J.; Liang, J.; Han, C.; Li, Z.; Huang, H. Crash injury severity analysis using a two-layer Stacking framework. Accid. Anal. Prev. **2019**, 122, 226–238.
19. Kidando, E.; Kitali, A.E.; Kutela, B.; Ghorbanzadeh, M.; Karaer, A.; Koloushani, M.; Moses, R.; Ozguven, E.E.; Sando, T. Prediction of vehicle occupants injury at signalized intersections using real-time traffic and signal data. Accid. Anal. Prev. **2021**, 149, 105869.
20. Xu, X.; Wang, X.; Wu, X.; Hassanin, O.; Chai, C. Calibration and evaluation of the Responsibility-Sensitive Safety model of autonomous car-following maneuvers using naturalistic driving study data. Transp. Res. Part C Emerg. Technol. **2021**, 123, 102988.
21. Wang, C.; Xie, Y.; Huang, H.; Liu, P. A review of surrogate safety measures and their applications in connected and automated vehicles safety modeling. Accid. Anal. Prev. **2021**, 157, 106157.
22. Sinha, A.; Chand, S.; Wijayaratna, K.P.; Virdi, N.; Dixit, V. Comprehensive safety assessment in mixed fleets with connected and automated vehicles: A crash severity and rate evaluation of conventional vehicles. Accid. Anal. Prev. **2020**, 142, 105567.
23. Abdel-Aty, M.; Pande, A.; Das, A.; Knibbe, W.J. Assessing Safety on Dutch Freeways with Data from Infrastructure-Based Intelligent Transportation Systems. Transp. Res. Rec. J. Transp. Res. Board **2008**, 2083, 153–161.
24. Harb, R.; Yan, X.; Radwan, E.; Su, X. Exploring precrash maneuvers using classification trees and random forests. Accid. Anal. Prev. **2009**, 41, 98–107.
25. Jiang, X.; Abdel-Aty, M.; Hu, J.; Lee, J. Investigating macro-level hotzone identification and variable importance using big data: A random forest models approach. Neurocomputing **2016**, 181, 53–63.
26. Shangguan, Q.; Fu, T.; Wang, J.; Luo, T.; Fang, S. An integrated methodology for real-time driving risk status prediction using naturalistic driving data. Accid. Anal. Prev. **2021**, 156, 106122.
27. Lin, L.; Wang, Q.; Sadek, A.W. A novel variable selection method based on frequent pattern tree for real-time traffic accident risk prediction. Transp. Res. Part C Emerg. Technol. **2015**, 55, 444–459.
28. Xiong, X.; Chen, L.; Liang, J. Vehicle Driving Risk Prediction Based on Markov Chain Model. Discret. Dyn. Nat. Soc. **2018**, 2018, 1–12.
29. Wang, J.; Kong, Y.; Fu, T. Expressway crash risk prediction using back propagation neural network: A brief investigation on safety resilience. Accid. Anal. Prev. **2019**, 124, 180–192.
30. Chen, J.; Wu, Z.; Zhang, J. Driving safety risk prediction using cost-sensitive with nonnegativity-constrained autoencoders based on imbalanced naturalistic driving data. IEEE Trans. Intell. Transp. Syst. **2019**, 20, 4450–4465.
31. Costela, F.M.; Castro-Torres, J.J. Risk prediction model using eye movements during simulated driving with logistic regressions and neural networks. Transp. Res. Part F Traffic Psychol. Behav. **2020**, 74, 511–521.
32. Friedman, J.H. Stochastic gradient boosting. Comput. Stat. Data Anal. **2002**, 38, 367–378.
33. Dougherty, M. A review of neural networks applied to transport. Transp. Res. Part C Emerg. Technol. **1995**, 3, 247–260.
34. Jones, D.R. A Taxonomy of Global Optimization Methods Based on Response Surfaces. J. Glob. Optim. **2001**, 21, 345–383.
35. Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian Optimization of Machine Learning Algorithms. Available online: https://arxiv.org/pdf/1206.2944.pdf (accessed on 16 April 2021).
36. Wang, Z.; de Freitas, N. Theoretical analysis of Bayesian optimisation with unknown Gaussian process hyper-parameters. arXiv **2014**, arXiv:1406.7758.
37. Gelbart, M.A.; Snoek, J.; Adams, R.P. Bayesian optimization with unknown constraints. arXiv **2014**, arXiv:1403.5607.
38. Joy, T.T.; Rana, S.; Gupta, S.; Venkatesh, S. A flexible transfer learning framework for Bayesian optimization with convergence guarantee. Expert Syst. Appl. **2019**, 115, 656–672.
39. Wu, K.-F.; Jovanis, P.P. Crashes and crash-surrogate events: Exploratory modeling with naturalistic driving data. Accid. Anal. Prev. **2012**, 45, 507–516.
40. Lee, H.-R.; Kum, K.-J.; Son, S.-N. A study on the factor analysis by grade for highway traffic accident. Int. J. Highw. Eng. **2011**, 13, 157–165.
41. Polat, K.; Güneş, S. A novel hybrid intelligent method based on C4.5 decision tree classifier and one-against-all approach for multi-class classification problems. Expert Syst. Appl. **2009**, 36, 1587–1592.

Hyperparameters | Definition |
---|---|
N tree | Number of trees |
Interaction depth | The number of branches to extend from each node (the depth of the tree) |
Shrinkage | Controls how fast the algorithm proceeds along the gradient descent (learning rate) |
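
To make the roles of these hyperparameters concrete, the following is a minimal gradient boosting sketch in plain Python. It is an illustration under simplifying assumptions and not the model used in the study: it fits a squared-error regression on a tiny one-dimensional toy data set, and the interaction depth is fixed at 1 (single-split stumps). The names `fit_stump`, `gradient_boost`, and `mse` are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a single-split regression tree (interaction depth 1) by least squares."""
    best = None
    for split in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, split, left_mean, right_mean)
    _, split, left_mean, right_mean = best
    return lambda x: left_mean if x <= split else right_mean

def gradient_boost(xs, ys, n_trees, shrinkage):
    """Add n_trees stumps, each fitted to the current residuals and scaled by shrinkage."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        # For squared error, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + shrinkage * tree(x) for x, p in zip(xs, preds)]
    return lambda x: base + shrinkage * sum(t(x) for t in trees)

def mse(model, xs, ys):
    """Mean squared error of the model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Toy data: the target switches from 0 to 1 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]

few = gradient_boost(xs, ys, n_trees=5, shrinkage=0.05)
many = gradient_boost(xs, ys, n_trees=44, shrinkage=0.05)
print(mse(few, xs, ys), mse(many, xs, ys))
```

A small shrinkage value makes each tree contribute only a fraction of its fit, so more trees are needed to reach the same training error; that trade-off is why the number of trees and the shrinkage are usually tuned together.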

Indicator | Variable Name |
---|---|
Peak to peak jerk | P to p jerk |
SRI | SRI_variable (speed, acc, jerk, yaw) |
EDI | EDI_variable (speed, acc, jerk, yaw) |
Dangerous driving events rate | Dangerous event |
RDEs | RDE |
LNJ/LPJ | LNJ/LPJ_threshold |
Rapid peak to peak jerk rate | Rapid p to p jerk rate |
Yaw rate | Yaw rate |
Driving volatility: standard deviation | S.D_variable (speed, acc, jerk, yaw) |
Driving volatility: mean absolute deviation | MAD_variable (speed, acc, jerk, yaw) |
Driving volatility: TVSV (time-varying stochastic volatility) | TVSV_variable (speed, acc, jerk, yaw) |

Hyperparameters | Optimal Value |
---|---|
N tree | 44 |
Interaction depth | 5 |
Shrinkage | 0.05 |

Rank | Variable Name | Relative Influence |
---|---|---|
1 | Dangerous event | 33.65 |
2 | SRI_yaw | 9.03 |
3 | RDEs | 8.33 |
4 | TVSV_speed | 4.85 |
5 | P to p jerk | 4.07 |
6 | MAD_speed | 3.68 |
7 | Rapid p to p jerk rate | 3.51 |
8 | SRI_acc | 3.49 |
9 | S.D_acc | 3.32 |
10 | Yaw rate | 3.06 |
11 | S.D_yaw | 2.63 |
12 | MAD_yaw | 2.51 |
13 | SRI_speed | 2.42 |
14 | TVSV_acc | 2.37 |
15 | LPJ_1.5 | 2.16 |
16 | LPJ_2 | 1.84 |
17 | S.D_jerk | 1.57 |
18 | SRI_jerk | 1.36 |
19 | MAD_acc | 1.12 |
20 | LPJ_4 | 0.87 |

Hyperparameters | Optimal Value |
---|---|
Transfer function | Symmetric sigmoid |
Number of hidden layers | 3 |
Number of neurons | (30, 75, 55) |
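
The architecture in this table can be sketched as a plain forward pass. This is a structural illustration only, with untrained random weights: the input size `N_INPUTS`, the single output unit, the weight range, and the use of `tanh` as the symmetric sigmoid are assumptions made for the example.

```python
import math
import random

def symmetric_sigmoid(x):
    """Symmetric sigmoid with outputs in (-1, 1); tanh is assumed here."""
    return math.tanh(x)

def init_network(layer_sizes, seed=0):
    """Create small random weights and zero biases for each pair of adjacent layers."""
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        network.append((weights, biases))
    return network

def forward(network, inputs):
    """Propagate one input vector through every layer."""
    activations = inputs
    for weights, biases in network:
        activations = [
            symmetric_sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations

def count_parameters(network):
    """Total number of weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in network)

N_INPUTS = 20  # hypothetical number of input indicators
layer_sizes = [N_INPUTS, 30, 75, 55, 1]  # three hidden layers of 30, 75, and 55 neurons

network = init_network(layer_sizes)
output = forward(network, [0.5] * N_INPUTS)
print(output, count_parameters(network))
```

With a symmetric transfer function on the output unit, a two-class decision could be read from the sign of the output, although the decision rule used in the study is not specified here.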

Observed | Predicted: Normal Traffic Flow | Predicted: Hazardous Traffic Flow | Correct Classification Rate (%) |
---|---|---|---|
Normal traffic flow | 341 | 10 | 97.15 |
Hazardous traffic flow | 13 | 61 | 82.43 |
Overall percentage (%) | | | 94.59 |
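
The rates in this table follow directly from the four cell counts. The short check below recomputes them; the dictionary layout and function name are illustrative, and rows are taken to be the observed classes, which is consistent with the reported row rates.

```python
# Cell counts from the classification table: observed class -> predicted class -> count.
counts = {
    "normal": {"normal": 341, "hazardous": 10},
    "hazardous": {"normal": 13, "hazardous": 61},
}

def correct_classification_rate(observed):
    """Percentage of one observed class that was predicted correctly."""
    row = counts[observed]
    return 100.0 * row[observed] / sum(row.values())

total = sum(sum(row.values()) for row in counts.values())
correct = sum(counts[label][label] for label in counts)

ccr_normal = correct_classification_rate("normal")        # 341 / 351
ccr_hazardous = correct_classification_rate("hazardous")  # 61 / 74
ccr_overall = 100.0 * correct / total                     # 402 / 425

print(round(ccr_normal, 2), round(ccr_hazardous, 2), round(ccr_overall, 2))
```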

Data Set | Total | Normal Traffic Flow | Hazardous Traffic Flow |
---|---|---|---|
Total | 1429 | 1183 | 246 |
Severity: D | 1029 | 849 | 180 |
Severity: ABC | 390 | 324 | 66 |
D/N: Day | 945 | 790 | 155 |
D/N: Night | 474 | 383 | 91 |
Weather: Good | 1101 | 908 | 193 |
Weather: Bad | 299 | 249 | 50 |

Data Set | Accuracy | Sensitivity | Specificity |
---|---|---|---|
Total | 97.15% | 96.33% | 85.92% |
Severity: D | 94.81% | 96.85% | 85.19% |
Severity: ABC | 97.44% | 96.98% | 94.44% |
D/N: Day | 93.64% | 96.6% | 79.17% |
D/N: Night | 95.07% | 97.35% | 86.21% |
Weather: Good | 94.24% | 96.34% | 84.21% |
Weather: Bad | 95.51% | 97.3% | 86.67% |

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kim, Y.; Park, J.; Oh, C.
A Crash Prediction Method Based on Artificial Intelligence Techniques and Driving Behavior Event Data. *Sustainability* **2021**, *13*, 6102.
https://doi.org/10.3390/su13116102
