# Crash Frequency Modeling Using Real-Time Environmental and Traffic Data and Unbalanced Panel Data Models

## Abstract

## 1. Introduction

#### 1.1. Real-Time Crash Risk Models in Refined Temporal Scales

#### 1.2. Crash Frequency Models: Panel Data Application and Zero-Inflated Consideration

## 2. Data Description

## 3. Methods

_{it}is defined as:

Here, t_{i} is the number of repeated observations at site i (the site-specific panel data structure), and I is the total number of different sites. For balanced panel data, t_{i} is the same for all sites. Because the real-time weather, road surface, and traffic data were not recorded in a perfectly continuous manner, t_{i} differs across sites, and the panel data structure here is therefore unbalanced.
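The distinction between balanced and unbalanced panels can be checked directly from the observation records. The following is a minimal sketch, assuming the data are available as (site, hour) pairs; the function names and the sample records are hypothetical and are not taken from the study.

```python
from collections import Counter


def observations_per_site(records):
    """Return t_i, the number of recorded observations for each site i."""
    return Counter(site for site, _hour in records)


def is_balanced(records):
    """A panel is balanced only if every site has the same t_i."""
    counts = observations_per_site(records)
    return len(set(counts.values())) <= 1


# Hypothetical records: site "B" and site "C" have gaps in their hourly data.
records = [("A", 0), ("A", 1), ("A", 2), ("B", 0), ("B", 2), ("C", 1)]
t = observations_per_site(records)
balanced = is_balanced(records)
```

With gaps in the recorded hours, the sites end up with different t_{i} values, which is the situation described for the data used here.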

_{it}:

## 4. Results

The site-specific random effect is statistically significant (the t-statistic of σ_{i} is 7.54). The over-dispersion parameter α is also statistically significant (t-statistic of 3.57), which implies that the negative binomial model is preferred over the Poisson model. The choice of a zero-inflated model is supported by Vuong's test for zero-inflation (V = 4.48 for the model with site-specific random effects). The random effect zero-inflated negative binomial model is therefore the most appropriate one for the present study. To save space, only the detailed results from the random effect zero-inflated negative binomial model are presented hereafter.
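To make the Vuong comparison concrete, the sketch below computes the statistic in its commonly used form, V = sqrt(n) · mean(m) / sd(m), where m is the per-observation difference in log-likelihood between the zero-inflated model and its standard counterpart. Values above 1.96 favor the first model at the 95% level, values below −1.96 favor the second, and anything in between is inconclusive. The function name and the log-likelihood values are hypothetical, and the use of the sample standard deviation is an assumption; the study's own software may differ.

```python
import math
import statistics


def vuong_statistic(loglik_model_1, loglik_model_2):
    """Vuong statistic from per-observation log-likelihoods of two models."""
    if len(loglik_model_1) != len(loglik_model_2):
        raise ValueError("both models must be evaluated on the same observations")
    # m_k = log f1(y_k) - log f2(y_k) for each observation k
    m = [a - b for a, b in zip(loglik_model_1, loglik_model_2)]
    n = len(m)
    # Assumption: sample standard deviation (n - 1 denominator)
    return math.sqrt(n) * statistics.mean(m) / statistics.stdev(m)


# Hypothetical per-observation log-likelihoods (not from the study).
ll_zero_inflated = [-0.10, -0.20, -0.10, -0.30]
ll_standard = [-0.20, -0.40, -0.15, -0.50]
v = vuong_statistic(ll_zero_inflated, ll_standard)
```

A positive value indicates that the zero-inflated specification fits the observations better on average, which is the direction of the result reported above.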

#### 4.1. Environmental Characteristics

#### 4.2. Traffic Characteristics

#### 4.3. Temporal Characteristics

#### 4.4. Road Characteristics

## 5. Conclusions

- (1) The random effect zero-inflated negative binomial model is confirmed to be the most appropriate model according to the goodness-of-fit results. Elasticities are also computed to quantify the influence of the different factors.
- (2) The estimation results from the unbalanced panel data models show that both time-varying factors (e.g., visibility and hourly traffic volume) and site-varying factors (e.g., speed limit and number of lanes) may significantly influence crash frequency on highways like I-25. Even for a typical highway that does not experience frequent adverse weather, road surface and weather conditions have significant effects in the crash frequency model.
- (3) Among all the significant variables, visibility is found to be the most influential environment-related factor affecting crash frequency on I-25. Dark lighting conditions (night), crosswind speed, and wet road surface decrease crash frequency, while chemically wet road surface increases it. Interestingly, other hourly weather conditions, such as precipitation and temperature, are not found to be significant once the current variables are included. A likely explanation is that precipitation and temperature do not influence crash likelihood directly; instead, they act through changes in visibility and road surface conditions. Because visibility and road surface conditions are already incorporated in the model, it is not surprising that precipitation and temperature become insignificant. These findings underline the unique value and importance of real-time road surface condition data to crash frequency studies.
- (4) This paper reports an exploratory effort to develop new crash frequency models using detailed traffic, weather, and road surface condition data at a much more refined temporal scale (e.g., hourly data). Such a study has considerable potential for engineering applications that make major highways safer and more resilient to adverse conditions.

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
ZINB | Zero-inflated Negative Binomial |
RWIS | Road Weather Information System |
ZIP | Zero-inflated Poisson |
NB | Negative Binomial |
CSP | Colorado State Patrol |
CDOT | Colorado Department of Transportation |
MM | Mile Marker |
RCI | Roadway Characteristics Inventory |

Variable | Mean | Std. Dev. | Minimum | Maximum |
---|---|---|---|---|
Crash frequency | 0.004 | 0.066 | 0 | 4 |
Environmental characteristics | | | | |
Wet road surface | 0.082 | 0.275 | 0 | 1 |
Chemically wet road surface | 0.037 | 0.188 | 0 | 1 |
Visibility (miles) | 1.075 | 0.136 | 0.000 | 1.100 |
Cross wind speed (mph) | 4.147 | 3.906 | 0.000 | 31.980 |
Average precipitation rate per minute (inches) | 0.021 | 0.443 | 0 | 32 |
Average temperature (°F) | 57.016 | 24.473 | −1.333 | 159 |
Traffic characteristics | | | | |
Speed limit (mph) | 61.027 | 5.327 | 55 | 75 |
Speed limit minus traffic speed (mph) | 2.642 | 5.533 | 0.000 | 69.180 |
Hourly traffic volume (in 1000 vehicles per hour) | 2.916 | 2.101 | 0.030 | 14.988 |
Truck percentage (%) | 6.215 | 1.922 | 2.800 | 10.700 |
Temporal characteristics | | | | |
Night | 0.431 | 0.495 | 0 | 1 |
Sunset | 0.062 | 0.241 | 0 | 1 |
November | 0.095 | 0.294 | 0 | 1 |
4 am–5 am | 0.040 | 0.196 | 0 | 1 |
Road characteristics | | | | |
Number of merging ramps per lane per mile | 0.252 | 0.215 | 0.000 | 0.926 |
Segment length (miles) | 1.014 | 0.769 | 0.236 | 4.500 |
Number of lanes | 4.159 | 0.562 | 3 | 5 |
Remaining service life of rutting | 97.013 | 2.984 | 86.000 | 100.000 |
Curvature (degree) | 0.947 | 0.681 | 0.000 | 2.260 |
Good pavement condition | 0.419 | 0.493 | 0 | 1 |
Median width (ft) | 13.812 | 28.604 | 4 | 183 |
Outside shoulder width (ft) | 10.335 | 2.181 | 6 | 15 |
Inside shoulder width (ft) | 9.006 | 2.583 | 5 | 15 |
Grade (%) | −0.018 | 1.206 | −2.334 | 2.334 |

Variable | Estimated Coefficient | t-Statistic |
---|---|---|
Zero-inflated State | | |
Constant | −10.731 | −5.84 |
Environmental characteristics | | |
Visibility (miles) | 0.959 | 1.78 |
Wet road surface indicator (1 if the road surface is wet, 0 otherwise) | −1.663 | −3.64 |
Chemically wet road surface indicator (1 if the road surface is chemically wet, 0 otherwise) | −1.864 | −5.04 |
Traffic characteristics | | |
Hourly traffic volume (in 1000 vehicles per hour) | −0.611 | −9.06 |
Truck percentage (%) | 0.439 | 5.95 |
Temporal characteristics | | |
Night indicator (1 if the time period is at night, 0 otherwise) | 0.352 | 1.94 |
Road characteristics | | |
Segment length (miles) | 0.755 | 4.60 |
Number of lanes | 1.917 | 6.10 |
Good pavement condition indicator (1 if the pavement condition is good, 0 otherwise) | 0.680 | 2.63 |
Negative Binomial State | | |
Constant | −10.673 | −9.31 |
Environmental characteristics | | |
Cross wind speed (mph) | −0.013 | −1.75 |
Wet road surface indicator (1 if the road surface is wet, 0 otherwise) | −0.529 | −3.70 |
Traffic characteristics | | |
Low speed limit (1 if the speed limit is less than 60 mph, 0 otherwise) | 0.387 | 1.83 |
Difference between speed limit and current traffic speed (speed limit minus traffic speed) | 0.081 | 29.75 |
Truck percentage (%) | 0.107 | 2.69 |
Temporal characteristics | | |
Sunset indicator (1 if the time period is during sunset, 0 otherwise) | −0.200 | −1.88 |
November indicator (1 if the time period is in November, 0 otherwise) | 0.292 | 3.20 |
4 am–5 am indicator (1 if the time period is between 4 am and 5 am, 0 otherwise) | −0.608 | −1.96 |
Road characteristics | | |
Number of merging ramps per lane per mile | −1.072 | −2.65 |
Segment length (miles) | 0.786 | 5.44 |
Number of lanes | 0.849 | 3.69 |
Curvature (degree) | 0.406 | 3.07 |
Long remaining service life of rutting indicator (1 if the value of ruti is higher than 99, 0 otherwise) | 0.546 | 2.88 |
α | 1.818 | 3.57 |
σ_{i} (site-specific) | 0.484 | 7.54 |
Vuong statistic | 4.48 | |
−2 Log Likelihood | 15,145 | |
AIC (smaller is better) | 15,197 | |
BIC (smaller is better) | 15,250 | |
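The two blocks of the table (zero-inflated state and negative binomial state) correspond to the two parts of a zero-inflated mixture. The sketch below shows the textbook zero-inflated negative binomial probability mass function, assuming the common parameterization in which the count state has mean `mu` and variance `mu + alpha * mu**2`. The function names, the value of `mu`, and the zero-state probability `p_zero` are hypothetical; only α = 1.818 is taken from the table, and the site-specific random effect is omitted for brevity.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu and over-dispersion alpha."""
    r = 1.0 / alpha
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, alpha, p_zero):
    """Mixture: zero state with probability p_zero, count state otherwise."""
    count_part = (1.0 - p_zero) * nb_pmf(y, mu, alpha)
    return p_zero + count_part if y == 0 else count_part


alpha = 1.818   # over-dispersion estimate reported in the table
mu = 0.05       # hypothetical expected hourly crash count in the count state
p_zero = 0.6    # hypothetical probability of the zero state
total = sum(zinb_pmf(y, mu, alpha, p_zero) for y in range(200))
```

In this form, a zero count can arise either from the zero state or from the count state, which is why zeros are more frequent than a plain negative binomial model would predict.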

Variable | Elasticity |
---|---|
Environmental characteristics | |
Visibility (miles) | −0.562 |
Cross wind speed (mph) | −0.054 |
Wet road surface indicator (1 if the road surface is wet, 0 otherwise) | −0.243 |
Chemically wet road surface indicator (1 if the road surface is chemically wet, 0 otherwise) | 0.465 |
Traffic characteristics | |
Low speed limit (1 if the speed limit is less than 60 mph, 0 otherwise) | 0.321 |
Difference between speed limit and traffic speed (speed limit minus current traffic speed) | 0.215 |
Hourly traffic volume (in 1000 vehicles per hour) | 0.637 |
Truck percentage (%) | −0.892 |
Temporal characteristics | |
Night indicator (1 if the time period is at night, 0 otherwise) | −0.219 |
Sunset indicator (1 if the time period is during sunset, 0 otherwise) | −0.221 |
November indicator (1 if the time period is in November, 0 otherwise) | 0.253 |
4 am–5 am indicator (1 if the time period is between 4 am and 5 am, 0 otherwise) | −0.836 |
Road characteristics | |
Number of merging ramps per lane per mile | −0.270 |
Segment length (miles) | 0.350 |
Number of lanes | −0.838 |
Curvature (degree) | 0.385 |
Long remaining service life of rutting indicator (1 if the value of ruti is higher than 99, 0 otherwise) | 0.421 |
Good pavement condition indicator (1 if the pavement condition is good, 0 otherwise) | −0.484 |
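The surviving text does not state how the elasticities were computed, so the following is only a sketch under an assumption: the standard count-model formulas, namely β·x̄ for a continuous variable and (exp(β) − 1)/exp(β) for an indicator variable. Applied to variables that appear only in the negative binomial state, these formulas approximately reproduce the tabulated values from the reported coefficients and sample means. Variables that also enter the zero-inflated state (e.g., wet road surface, truck percentage) would need a combined expression that is not attempted here. The function names are hypothetical.

```python
import math


def elasticity_continuous(beta, x_mean):
    """Elasticity of the expected count for a continuous variable (log link)."""
    return beta * x_mean


def pseudo_elasticity_indicator(beta):
    """Pseudo-elasticity for a 0/1 indicator variable."""
    return (math.exp(beta) - 1.0) / math.exp(beta)


# Coefficients and sample means as reported in the tables of this article.
crosswind = elasticity_continuous(-0.013, 4.147)      # table reports -0.054
merging_ramps = elasticity_continuous(-1.072, 0.252)  # table reports -0.270
curvature = elasticity_continuous(0.406, 0.947)       # table reports 0.385
november = pseudo_elasticity_indicator(0.292)         # table reports 0.253
sunset = pseudo_elasticity_indicator(-0.200)          # table reports -0.221
low_speed_limit = pseudo_elasticity_indicator(0.387)  # table reports 0.321
```

Small differences in the third decimal place are expected because the coefficients and means are themselves rounded in the tables.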

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Chen, F.; Chen, S.; Ma, X.
Crash Frequency Modeling Using Real-Time Environmental and Traffic Data and Unbalanced Panel Data Models. *Int. J. Environ. Res. Public Health* **2016**, *13*, 609.
https://doi.org/10.3390/ijerph13060609
