# Modeling Urban Freeway Rear-End Collision Risk Using Machine Learning Algorithms

## Abstract

## 1. Introduction

#### 1.1. Background

#### 1.2. Research Gap

#### 1.3. Objectives

## 2. Literature Review

#### 2.1. Surrogate Indicator

#### 2.2. Traffic Flow Risk

## 3. Data Collection and Accuracy

#### 3.1. Data Collection

#### 3.2. Data Accuracy

## 4. Methodology

#### 4.1. Modeling Vehicle Rear-End Collision Probability (RCP)

#### 4.1.1. Vehicle Collision Analysis

#### 4.1.2. Generalized Pareto Distribution Model

#### 4.2. Modeling Freeway Rear-End Collision Risk (F-RCR)

#### 4.2.1. Classification of F-RCR Based on Fuzzy C-Means (FCM)

#### 4.2.2. Machine Learning Model

#### 4.2.3. Macroscopic Traffic Flow Variables

## 5. Results and Discussion

#### 5.1. RCP Model

#### 5.1.1. Distribution of Deceleration and Model Results

#### 5.1.2. Model Calibration and Validation

#### 5.2. F-RCR Model

#### 5.2.1. Distribution of F-RCR and Classification

#### 5.2.2. Variable Importance

#### 5.2.3. Machine Learning Model

#### 5.3. Case Study

## 6. Conclusions

## 7. Limitations and Future Work

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

Time | Vehicle ID | Speed (km/h) | Acceleration (m/s²) | Horizontal Coordinate (m) | Longitudinal Coordinate (m) | Lane No. | Vehicle Type |
---|---|---|---|---|---|---|---|
00:00:04.714 | 7161 | 112.51 | −0.5 | −3.26 | 509.09 | 1 | Car |
00:00:04.714 | 7158 | 78.30 | −0.1 | 25.13 | 757.06 | 3 | Truck |
00:00:04.714 | 7159 | 73.79 | −0.1 | 13.80 | 635.40 | 3 | Truck |
00:00:04.714 | 7160 | 82.82 | −0.1 | 1.74 | 515.21 | 2 | Car |

No. | Variable Type | Variable | Note |
---|---|---|---|
1 | Speed | Average speed of all vehicles | $\overline{v_{5\mathrm{min}}}=\frac{1}{n}\sum_{i=1}^{n}V_{i}$ |
2 | Speed | Average speed of small cars | $\overline{v_{car}}=\frac{1}{n}\sum_{i=1}^{n}V_{car,i}$ |
3 | Speed | Coefficient of variation of speed | $cv_{speed}=\sigma_{speed}/\mu_{speed}$ |
4 | Speed | Speed difference between lane 1 and lane 2 | $\Delta v_{12}=v_{lane1}-v_{lane2}$ |
5 | Speed | Speed difference between lane 2 and lane 3 | $\Delta v_{23}=v_{lane2}-v_{lane3}$ |
6 | Volume | Volume in 5 min | $Q_{5\mathrm{min}}$ |
7 | Volume | Volume difference between lane 1 and lane 2 | $\Delta Q_{12}=Q_{1}-Q_{2}$ |
8 | Volume | Volume difference between lane 2 and lane 3 | $\Delta Q_{23}=Q_{2}-Q_{3}$ |
9 | Vehicle distribution | Coefficient of variation of vehicle distribution (CVVD) | $cv_{q}=\sigma_{volume}/\mu_{volume}$. Arriving vehicles were counted every minute, and the coefficient of variation of these counts within each 5 min window was calculated to reflect the vehicle distribution characteristics. |
10 | Vehicle type | Proportion of large trucks | $P_{truck}$ |
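
The ten variables above can be computed from per-vehicle records with a few lines of code. The following is a minimal sketch, not the authors' implementation. It assumes that the trajectory data have already been reduced to one record per vehicle passing a detection cross-section within a single 5 min window, and that the population standard deviation is used for both coefficients of variation (the source does not state which estimator was used). The `Passage` record, its field names, and the `flow_variables` function are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Passage:
    """One vehicle passing the cross-section (hypothetical record layout)."""
    t: float      # arrival time within the window, seconds in [0, 300)
    speed: float  # km/h
    lane: int     # 1, 2 or 3
    vtype: str    # "car" or "truck" (case-insensitive)


def lane_mean_speed(passages, lane):
    """Mean speed of one lane, or NaN if no vehicle used that lane."""
    speeds = [p.speed for p in passages if p.lane == lane]
    return mean(speeds) if speeds else float("nan")


def flow_variables(passages, window_s=300, bin_s=60):
    """Compute the ten macroscopic variables for one non-empty 5 min window."""
    speeds = [p.speed for p in passages]
    car_speeds = [p.speed for p in passages if p.vtype.lower() == "car"]
    lane_q = {k: sum(1 for p in passages if p.lane == k) for k in (1, 2, 3)}
    lane_v = {k: lane_mean_speed(passages, k) for k in (1, 2, 3)}

    # Per-minute arrival counts, used for the CVVD (variable 9).
    counts = [0] * (window_s // bin_s)
    for p in passages:
        counts[int(p.t // bin_s)] += 1

    mu_v = mean(speeds)
    mu_q = mean(counts)
    n_truck = sum(1 for p in passages if p.vtype.lower() == "truck")
    return {
        "v_5min": mu_v,
        "v_car": mean(car_speeds) if car_speeds else float("nan"),
        "cv_speed": pstdev(speeds) / mu_v,   # assumption: population std
        "dv_12": lane_v[1] - lane_v[2],
        "dv_23": lane_v[2] - lane_v[3],
        "Q_5min": len(passages),
        "dQ_12": lane_q[1] - lane_q[2],
        "dQ_23": lane_q[2] - lane_q[3],
        "cv_q": pstdev(counts) / mu_q,       # assumption: population std
        "P_truck": n_truck / len(passages),
    }


# Speeds, lanes and types follow the sample trajectory rows above;
# the arrival times are invented for illustration only.
sample = [
    Passage(5, 112.51, 1, "car"),
    Passage(70, 78.30, 3, "Truck"),
    Passage(130, 73.79, 3, "Truck"),
    Passage(250, 82.82, 2, "car"),
]
for name, value in flow_variables(sample).items():
    print(f"{name}: {value:.3f}")
```

The sketch raises an error for an empty window, since none of the averages is defined in that case. A lane without vehicles yields NaN for the corresponding speed differences rather than a misleading zero.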

Machine Learning Model | True Value | Predicted Low-Risk | Predicted Median-Risk | Predicted High-Risk |
---|---|---|---|---|
MLP | Low-risk | 97.7% | 2.3% | 0.0% |
MLP | Median-risk | 14.4% | 84.9% | 0.7% |
MLP | High-risk | 0.0% | 13.6% | 86.4% |
RBF | Low-risk | 95.9% | 4.1% | 0.0% |
RBF | Median-risk | 14.2% | 85.8% | 0.0% |
RBF | High-risk | 0.0% | 16.7% | 83.3% |
RF | Low-risk | 91.3% | 8.8% | 0.0% |
RF | Median-risk | 7.0% | 84.2% | 8.8% |
RF | High-risk | 0.0% | 15.6% | 84.4% |
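
Each row of the confusion matrices above sums to roughly 100%, which indicates that the counts were normalized by the true class, so the diagonal entries correspond to per-class recall. The following minimal sketch shows how such a row-normalized matrix can be obtained from true and predicted labels. The function name, the label strings, and the example labels are hypothetical and are not taken from the study's data.

```python
from collections import Counter

LABELS = ("low", "median", "high")


def row_normalized_confusion(y_true, y_pred, labels=LABELS):
    """Return {true_label: {pred_label: share}}; each non-empty row sums to 1."""
    counts = Counter(zip(y_true, y_pred))
    matrix = {}
    for t in labels:
        row_total = sum(counts[(t, p)] for p in labels)
        matrix[t] = {
            # A class with no true samples has undefined shares (NaN).
            p: (counts[(t, p)] / row_total if row_total else float("nan"))
            for p in labels
        }
    return matrix


# Invented labels for illustration only.
y_true = ["low"] * 4 + ["median"] * 2 + ["high"] * 2
y_pred = ["low", "low", "low", "median", "median", "low", "high", "median"]

for true_label, row in row_normalized_confusion(y_true, y_pred).items():
    cells = "  ".join(f"{pred}: {share:.1%}" for pred, share in row.items())
    print(f"{true_label:>6} | {cells}")
```

Normalizing by row makes the models comparable across risk classes of different sizes, which matters here because high-risk intervals are usually much rarer than low-risk ones.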

