# Predicting Crash Injury Severity with Machine Learning Algorithm Synergized with Clustering Technique: A Promising Protocol

## Abstract

## 1. Introduction

#### 1.1. Background

#### 1.2. Application of Statistical Models in Crash Severity Prediction

#### 1.3. Application of Machine Learning Models in Crash Severity Prediction

#### 1.4. Artificial Neural Networks

#### 1.5. Support Vector Machine

#### 1.6. Fuzzy C-Means Clustering

Here, $u_{ij}$ is the membership value of $x_j$ for cluster $i$; $x_j$ is the $j$th of the $d$-dimensional measured data; $c_i$ is the $d$-dimensional center of cluster $i$; and $\|\cdot\|$ is the Euclidean distance between any training vector and the center.
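The quantities defined above (membership values, cluster centers, and Euclidean distances) can be tied together in a minimal sketch of the usual fuzzy c-means iteration. The sketch assumes the standard objective $J_m = \sum_{i}\sum_{j} u_{ij}^{m}\,\|x_j - c_i\|^2$ with fuzzifier $m = 2$. The function name `fuzzy_c_means`, its parameters, and the toy data are hypothetical and are not taken from the study.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, u), where u[i][j] is the membership of point j in cluster i.

    Assumes the standard FCM updates; m (the fuzzifier) must be greater than 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so each point's memberships sum to 1.
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(n_clusters)]
    for j in range(n):
        total = sum(u[i][j] for i in range(n_clusters))
        for i in range(n_clusters):
            u[i][j] /= total

    centers = []
    for _ in range(n_iter):
        # Center update: c_i is the mean of all points weighted by u_ij ** m.
        centers = []
        for i in range(n_clusters):
            w = [u[i][j] ** m for j in range(n)]
            sw = sum(w)
            centers.append(
                [sum(w[j] * points[j][k] for j in range(n)) / sw for k in range(dim)]
            )

        # Membership update: u_ij = 1 / sum_k (d_ij / d_kj) ** (2 / (m - 1)).
        shift = 0.0
        for j in range(n):
            # Clamp distances away from zero to avoid division by zero.
            d = [max(math.dist(points[j], c), 1e-12) for c in centers]
            for i in range(n_clusters):
                new = 1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0))
                                for k in range(n_clusters))
                shift = max(shift, abs(new - u[i][j]))
                u[i][j] = new
        if shift < tol:  # stop once memberships have stabilised
            break
    return centers, u


# Toy usage: two well-separated groups of 2-dimensional points (made-up data).
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_centers, toy_u = fuzzy_c_means(toy, n_clusters=2)
```

Unlike hard clustering, every point keeps a membership in every cluster, and the memberships of each point sum to 1.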

#### 1.7. Study Objectives

#### 1.8. Outline

## 2. Data Set Description

- An injury that causes a person to be detained in hospital as an in-patient for an extended period and that may have required surgery.
- An injury that will have lasting or even permanent implications for the injured person and that will affect their ability to work or involve a change to their level of independence.
- An injury that causes death 30 or more days after the accident.

## 3. Model Development

#### 3.1. Feedforward Neural Networks

#### 3.2. Support Vector Machine

#### 3.3. FCM-Based FNN and SVM

## 4. Results and Discussion

## 5. Conclusions

#### Limitations and Future Study

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

**Figure 5.** Confusion matrices for feed-forward neural network (FNN) model (training and testing data).

**Figure 7.** Confusion matrices for FNN combined with fuzzy c-means (FCM) clustering (training and testing data).

Input Variables | Data Type | No. of Categories |
---|---|---|
**Vehicle attributes** | | |
Number of vehicles involved | Numeric | - |
Vehicle type | Nominal | 12 |
**Road condition attributes** | | |
Road type | Nominal | 5 |
Junction type | Nominal | 9 |
Junction control | Nominal | 5 |
Light | Nominal | 5 |
Weather | Nominal | 9 |
Road surface condition | Nominal | 7 |
Area type | Nominal | 2 |
Speed limit | Numeric | - |
Road class | Nominal | 6 |
**Crash attributes** | | |
Number of casualties | Numeric | - |
Day of the week | Numeric | 7 |

No. of Clusters | FNN-FCM^{1} Testing Accuracy (%) | SVM-FCM^{2} Testing Accuracy (%) |
---|---|---|
1 | 70.0 | 73.0 |
2 | 71.8 | 72.2 |
3 | 71.0 | 73.0 |
4 | 70.2 | 74.2 |
5 | 67.9 | 72.1 |

^{1}FNN-FCM: fuzzy c-means clustering based feed-forward neural network.

^{2}SVM-FCM: fuzzy c-means clustering based support vector machine.
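As a small worked reading of the table above, the sketch below picks the cluster count with the highest testing accuracy for each model and reports the gain over the single-cluster row. Treating the one-cluster row as the baseline is an assumption of this sketch, and the helper names `best_cluster_count` and `gain_over_single_cluster` are hypothetical.

```python
# Testing accuracies (%) copied from the table above, keyed by number of clusters.
fnn_fcm = {1: 70.0, 2: 71.8, 3: 71.0, 4: 70.2, 5: 67.9}
svm_fcm = {1: 73.0, 2: 72.2, 3: 73.0, 4: 74.2, 5: 72.1}


def best_cluster_count(acc):
    """Return (number of clusters, accuracy) for the highest testing accuracy."""
    k = max(acc, key=acc.get)
    return k, acc[k]


def gain_over_single_cluster(acc):
    """Percentage-point gain of the best row over the one-cluster row.

    Assumption: the one-cluster row acts as the baseline.
    """
    _, best = best_cluster_count(acc)
    return round(best - acc[1], 1)


for name, acc in (("FNN-FCM", fnn_fcm), ("SVM-FCM", svm_fcm)):
    k, best = best_cluster_count(acc)
    print(f"{name}: best at {k} clusters ({best}%), "
          f"gain {gain_over_single_cluster(acc)} points")
```

On these values, FNN-FCM peaks at two clusters (71.8%) and SVM-FCM at four clusters (74.2%), gains of 1.8 and 1.2 percentage points over the respective one-cluster rows.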

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

