Analyzing Risk Factors for Fatality in Urban Traffic Crashes: A Case Study of Wuhan, China

How to maintain public transit safety and sustainability has become a major concern for the department of Road Traffic Administration. This study aims to analyze the risk factors that contribute to fatality in road traffic crashes using a 5-year police-reported dataset from the Wuhan Traffic Management Bureau. Four types of variables, namely driving experience, environmental factors, roadway factors and crash characteristics, were examined in a case-control study. To obtain a comprehensive understanding of crash fatality, this study explored a detailed set of injury-severity risk factors such as impact direction, light and weather conditions, crash characteristics, driving experience and high-risk driving behavior. Based on the results of the statistical analyses, the fatality risk of crash-involved individuals was significantly associated with driving experience, season, light condition, road type, crash type, impact direction, and high-risk driving behavior. This study identified the risk factors for fatality of crash-involved individuals using a police-reported dataset, which could provide reliable information for implementing remedial measures and improving the sustainability of the urban road network. A more detailed list of explanatory variables could enhance the accountability of the analysis.


Introduction
Each year, road traffic crashes cause numerous deaths and economic losses, which is a major societal concern for the general public and governmental agencies. According to the National Bureau of Statistics of China, the death rate per ten thousand vehicles declined to 2.22 in 2014, a decrease of 5.1% from 2013 [1]. However, road traffic injuries were the leading cause of injury death in China [2,3]. Wuhan, the capital of Hubei province, is the most populous city in Central China. With a population of over six million on 890 km² in the central districts, the city had over 1.3 million licensed vehicles in 2012, and the number of motor vehicle crashes was likely to continue rising, which could be attributed to the ubiquitous traffic activities in the emerging urbanized area. Therefore, how to maintain public transit safety and sustainability has become a major concern for the department of Road Traffic Administration [4–7]. Deaths, injuries, and property loss resulting from crashes cause not only personal suffering but also heavy burdens on society.
Compared with other major cities in China, the death rate of road traffic crashes is relatively low in Wuhan [8]. The number of motor vehicles grew 25% annually from 2000 to 2010; however, road mileage increased only 3% annually over the same period [9]. From 2010 to 2012, road traffic volumes also increased dramatically in Wuhan. In urban areas, the number of road junctions with a peak-hour volume of more than 5000 vehicles increased from 61 to 116 over this period [8]. In addition, Wuhan has seven bridges and one tunnel across the Yangtze River. In 2012, over 443,000 vehicles crossed the river every day, a significant increase from 390,000 vehicles in 2011 [8]. With the increasing number of vehicles, the number of traffic violations also increased to almost 3 million in 2012. Therefore, how to provide for the sustainable development of the urban traffic environment has become a major concern for the Wuhan Traffic Management Bureau.
The study aimed to analyze the risk factors that contributed to fatality in road traffic crashes using a police-reported crash dataset from the Wuhan Traffic Management Bureau. Previous studies on road traffic crashes have identified five main causes affecting crash-injury severity: human error, vehicle conditions, traffic characteristics, roadway conditions and environmental factors [10–12]. These causes were interrelated, interacted with one another and influenced road traffic safety simultaneously. Moreover, human-related factors such as driver behavior, which were strongly associated with the other causes, were crucial to crash occurrence and injury severity [13–15]. Given the importance of traffic safety, numerous studies have analyzed the relationship between potential explanatory variables and road traffic crashes, although none has specifically reported on Wuhan. In China, road traffic crash records were collected and maintained by the public security traffic administrative department based on the National Standard for Data Structure for Accident File Information and Codes for Traffic Accident Information. This study made use of five years of crash records to investigate risk factors affecting fatality of crash-involved individuals.
A wide variety of statistical techniques have been used to investigate crash-severity data. The statistical methods applied by researchers largely depended on the nature of the crash-severity data [16,17] and included binary outcome models [18–20], ordered discrete outcome models [21–23], and unordered multinomial discrete outcome models [24,25]. The primary goal of this study was to analyze and identify risk factors influencing fatality of crash-involved individuals in a densely populated city of China. Due to the dichotomous nature of the dependent variable in this study, a case-control study was conducted to examine the factors affecting crash fatality in Wuhan. The significant risk factors associated with crash fatality were identified by means of a conditional logistic regression model. The next section of this paper describes data preparation and the definition of all variables and briefly introduces the statistical analyses. Section 3 presents the results of the experiments and summarizes the effects of the potential risk factors. Section 4 discusses the findings of the experiments and concludes the study.

Data Preparation
The police-reported crash records from the Wuhan Traffic Management Bureau for calendar years 2008-2012 were used in this study. In China, crash fatality is defined as immediate death or subsequent death from injuries within 7 days. Here, cases were fatal crashes and controls were crashes with property damage only (PDO). The study area consisted of a total of 19 police districts in Wuhan. According to the administrative division of Wuhan city, there were four regions: six districts in Wuchang, three districts in Hankou, one district in Hanyang and nine districts in the suburban region. The cases and controls were matched in proportions by district (Wuchang, Hankou, Hanyang, and suburban region), driver's gender (male and female), and driver's age (<26, 26-55, and >55 years old). The records of 1516 fatal crashes and 1867 PDO crashes were included in the statistical analyses.
There were four components in the police-reported dataset, namely the traffic crash profile, vehicle involvement profile, road condition profile and crash environment profile. The traffic crash profile recorded location, crash type, impact direction, crash cause, violation type, driver's action, alcohol and fatigue involvement, the demographic information of the driver, and other special circumstances; the vehicle involvement profile provided license status and registration number; the road condition profile indicated road grade, terrain form, surface condition and type, traffic control status, and road geometric features; the crash environment profile contained the occurrence time of a crash and the light and weather conditions. Several irrelevant variables such as crash code, road name, road ID, and vehicle registration number were excluded from the analyses.

Covariates Description
The independent variables were classified as groups of unordered categorical variables based on the existing findings or experiences, and some variables were transformed into dummy variables for a reasonable analysis. A number of covariates were explored in the statistical analyses including driving experience, environmental factors, roadway factors, and crash characteristics. These factors were detailed as follows.

Driving Experience
The driving experience of the driver was considered an important risk factor and was classified into four groups based on the driving license: <1, 1-5, >5 years and without a license.

Environmental Factor
Five environmental factors were examined in this study. The time of a crash was organized according to four seasons: March-May, June-August, September-November and December-February. The occurrence day was divided into two groups: weekday (Monday-Friday) and weekend (Saturday and Sunday). Light conditions were categorized into three types: nighttime without streetlight, nighttime with streetlight and daytime. Three categories of weather conditions were specified: rainy/foggy, cloudy and clear.

Roadway Factor
According to the crash location, terrain form was classified into mountainous area and flat area. Road surface conditions were classified into two categories: wet/slippery and dry. Road surface was organized into three types: asphalt, concrete and others. Three types of road were assigned: single-way carriageway, two-way carriageway and multi-/dual carriageway. Five types of road section were specified: T/Y intersection, X intersection, other intersection, special section (e.g., narrow road, bridge and crosswalk) and normal road. Road alignment was separated into two categories: flat and straight, and others. According to the design and function of the road, road grade was grouped into expressway, arterial, secondary and branch road. The effect of traffic control was also examined in the statistical analyses. Control = "yes" indicated that road traffic was controlled by traffic lights, road signs or traffic police.

Crash Characteristics
To measure the effects of crash characteristics on fatality, we classified collisions into four types: "single vehicle", "pedestrian-vehicle", "vehicle-fixed object" and "vehicle-vehicle" collision. The impact direction of a crash was classified into four groups: side, front, rear-end and others. The responsible party was categorized into two classes: pedestrian/non-motorized vehicle, and others. Based on previous findings, we further divided the high-risk driving behavior of the driver into several types: not a driver's fault, failing to yield, over-speeding, alcohol or fatigue involvement and others.
The influences of driving experience and environmental risk factors (day of week, season, weekday, light and weather conditions) were investigated in this study. We also explored the effects of roadway characteristics such as terrain form, road surface condition and type, road grade, road geometric features, and traffic control. Furthermore, the effects of crash characteristics (crash type, impact direction, responsible party, and high-risk driving behavior) were examined. Table 1 describes all the variables selected for this study.
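The transformation of an unordered categorical covariate into dummy variables can be sketched as follows. This is only an illustration: the function name `dummy_code`, the level labels and the choice of reference category are assumptions made here, not details taken from the study.

```python
def dummy_code(values, levels, reference):
    """Return one 0/1 indicator column per non-reference level.

    values    : observed category for each crash record
    levels    : all categories of the covariate
    reference : category used as the baseline (gets no column)
    """
    if reference not in levels:
        raise ValueError("reference must be one of the levels")
    columns = {}
    for level in levels:
        if level == reference:
            continue  # the baseline is represented by all-zero indicators
        columns[level] = [1 if v == level else 0 for v in values]
    return columns


# Hypothetical light-condition records with "daytime" as the baseline.
light_levels = ["night_unlit", "night_lit", "daytime"]
records = ["night_lit", "daytime", "night_unlit", "daytime"]
light_dummies = dummy_code(records, light_levels, reference="daytime")
```

With a three-level covariate, two indicator columns are produced, so each estimated effect is read relative to the omitted baseline category.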

Statistical Analyses
The Wilcoxon rank sum statistics were used to examine the distribution of traffic crash records between groups with different demographic characteristics of the driver (i.e., age and gender) and districts for the case and control groups; p-values were generated to determine statistical significance (Table 2). The chi-square (χ²) test and Cramér's V were applied to assess the contribution of potential explanatory variables to crash fatality (Table 3). The insignificant factors, such as weather condition (e_5) and road surface condition (r_2), were removed during this procedure.
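As a hedged illustration of the two association measures, the sketch below computes the Pearson chi-square statistic and Cramér's V for a contingency table in plain Python. The counts in `example` are made-up placeholders, not values from Table 3, and the function names are chosen here for illustration.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * (min(r, c) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi_square(table) / (n * (k - 1)))


# Hypothetical counts: rows = fatal / PDO, columns = three light conditions.
example = [[30, 50, 20],
           [10, 40, 50]]
v_example = cramers_v(example)
```

Cramér's V rescales the chi-square statistic to the range 0 to 1, which makes the strength of association comparable across covariates with different numbers of categories.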

Spatial Stratified Heterogeneity Analyses
The geographical detector method [26] is a spatial analysis method for measuring spatial stratified heterogeneity [27,28]. It was applied in this study to examine whether multiple variables (i.e., driver's license, day of week, terrain form, etc.) affect crash fatality occurrences independently or dependently. After interaction, the effects of two variables on crash fatality occurrences may be stronger or weaker than their separate effects (Tables 4 and 5).
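A minimal sketch of the q-statistic is given below, assuming the usual definition q = 1 - SSW/SST, where SSW is the within-stratum sum of squares and SST is the total sum of squares of the outcome. The function name and the toy data are illustrative assumptions and do not reproduce the values in Table 5.

```python
from collections import defaultdict


def q_statistic(values, strata):
    """Geographical detector q = 1 - SSW / SST.

    values : numeric outcome per record (e.g., 1 = fatal, 0 = PDO)
    strata : category of the explanatory variable per record
    """
    n = len(values)
    mean = sum(values) / n
    sst = sum((v - mean) ** 2 for v in values)  # total sum of squares
    if sst == 0:
        raise ValueError("outcome has zero variance")
    groups = defaultdict(list)
    for v, s in zip(values, strata):
        groups[s].append(v)
    ssw = 0.0  # within-stratum sum of squares
    for members in groups.values():
        m = sum(members) / len(members)
        ssw += sum((v - m) ** 2 for v in members)
    return 1.0 - ssw / sst


# Toy data: the interaction of two factors is evaluated by stratifying
# on the pair of categories.
outcome = [1, 1, 0, 0, 1, 0]
factor_a = ["x", "x", "y", "y", "x", "y"]
factor_b = ["u", "v", "u", "v", "u", "v"]
q_a = q_statistic(outcome, factor_a)
q_ab = q_statistic(outcome, list(zip(factor_a, factor_b)))
```

In the geographical detector literature, an interaction is commonly described as a nonlinear enhancement when the q value of the combined strata exceeds the sum of the two single-factor q values; readers should check the exact criteria against [26].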

Stratified/Conditional Logistic Regression Analyses
The effects of potential risk factors on crash fatality were first modeled using univariate and multivariate logistic regression (Table 6). Then, stratified/conditional logistic regression analyses were applied to model the risks of crash fatality associated with the potential risk factors for the case and control groups (Table 7). The matched odds ratio (mOR) with 95% confidence intervals (CIs) was used to estimate the influence of different risk factors on the response. The study employed the Epi package in R v.3.2.3 (R Development Core Team, Vienna, Austria) to carry out the analyses using stratified/conditional and binary logistic regression models.
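To make the odds-ratio quantities concrete, the sketch below computes an unadjusted odds ratio with a Woolf-type 95% confidence interval from a single 2 x 2 table, and a Mantel-Haenszel pooled odds ratio across strata. These are simpler stand-ins chosen here for illustration; they are not the conditional logistic regression estimator fitted in the study, and all counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for one 2 x 2 table.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


def mantel_haenszel_or(tables):
    """Pooled odds ratio over strata, each given as a tuple (a, b, c, d)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Hypothetical counts for one exposure within two matching strata.
strata_tables = [(20, 10, 10, 20), (20, 10, 10, 20)]
single = odds_ratio_ci(20, 10, 10, 20)
pooled = mantel_haenszel_or(strata_tables)
```

An interval that excludes 1 corresponds to an association that is significant at roughly the 5% level, which is how the intervals reported alongside the odds ratios are usually read.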

Results
Table 2 summarizes the characteristics of the cases and controls matched by district, driver's gender and age. There were no significant differences between the groups with respect to district (p = 0.465), driver's gender (p = 0.180), and driver's age (p = 0.109). Almost 48% of the cases occurred in the urban region of Wuhan (Wuchang, Hankou and Hanyang). The number of male drivers was 1092, almost three times the number of female drivers. Over 70% of the fatal crashes occurred in elderly (>55 years) drivers.
Table 3 shows the contribution of individual risk factors as measured by the chi-square (χ²) test and Cramér's V. A number of factors were found to be significantly associated with fatality of crash-involved individuals (p < 0.05). The driving experience was highly correlated with fatal crashes. Four environmental factors, namely season, day of week, weekdays and light condition, were important risk factors affecting crash fatality. Seven roadway factors were significantly associated with crash fatality: terrain form, surface type, road type, road section type, road alignment, road grade, and traffic control. Crash characteristics (e.g., crash type, impact direction, responsible party, and crash cause) were highly correlated with crash fatality. Table 4 summarizes the interaction effects between the important risk factors, and the q-statistics are summarized in Table 5. The q-statistic of driving experience was 0.119 using the geographical detector method. The spatial stratified heterogeneity analysis indicated a significant association (p < 0.05) between crash fatality and driving experience. For other significant factors, the q-statistics were 0.004-0.178 for environmental factors, 0.006-0.039 for roadway factors, and 0.016-0.124 for crash characteristics (Table 5). There was a nonlinear enhancement between driving experience and day of week in contributing to crash fatality, as shown in Table 4.
Table 6 presents the estimated risks of potential risk factors in all populations using univariate and multivariate logistic regression. After controlling for the effects of other explanatory variables, road alignment (r_6) was not significantly associated with crash fatality. Due to the underreporting nature of crash records, the effects of some independent variables (e.g., terrain form, expressway, traffic control, impact direction, and responsible party) could be overestimated. Therefore, we focused on how the estimated risks varied across groups of district, driver's age and gender in this study. The stratified/conditional logistic regression analyses indicated that the fatality risk of crash-involved individuals was significantly associated with driving experience (d_), season (e_1), light condition (e_4), road type (r_4), crash type (c_1), impact direction (c_2), and high-risk driving behavior (c_3). The drivers with less than five years of driving experience (d_1, d_2 and d_3) showed significantly higher mOR values than the experienced drivers (d_4). Compared with winter (e_14), summer (e_12) and autumn (e_13) had stronger associations with crash fatality in the three groups. In addition, nighttime (e_41 and e_42), "pedestrian-vehicle" collision (c_12), front impact (c_22), pedestrian-responsible crashes (c_31), and over-speeding (c_43) were significantly associated with crash fatality. Several factors had low mOR values, such as single-way carriageway (r_41), "vehicle-vehicle" crashes (c_13), and the absence of high-risk driving behavior (c_41).

Discussion and Conclusions
Based on the results of the stratified/conditional logistic regression analyses, fatality of crash-involved individuals was significantly associated with driving experience, season, light condition, road type, crash type, impact direction and high-risk driving behavior. In this study, we focused on how the estimated risks varied across groups of district, driver's age and gender. The driving experience was a significant factor affecting driving behavior, and experienced drivers could avoid the occurrence of traffic crashes effectively [29]. However, measuring driving experience by the driving license could lead to some statistical bias. A large number of drivers who held a driving license could not drive skillfully in China, especially young drivers, which could explain why less experienced drivers were associated with a significantly high risk of crash fatality, as shown in Hu and Cao [30] and an annual official report from the Bureau of Traffic Management of China.
Among the five environmental factors, time of a crash was an important risk factor for crash fatality, as shown in previous studies [16,19,20,31]. However, the temporal factors were usually correlated with other potential causes such as weather condition, road surface condition and driving behavior. To explain the contribution of the temporal factors, a more detailed set of variables should be investigated in further research. Light condition was another environmental risk factor. Poor light conditions were associated with a significant increase in fatality risk, which confirmed the previous findings [19–21,23].
Several roadway factors were associated with fatality of crash-involved individuals. An important finding was that blacktop road surface was associated with a significant increase in fatality risk for driver's gender and age groups, compared with gravel/stone and concrete surfaces. Crashes that occurred on single-way carriageways led to a significant decrease in fatality risk, which was not consistent with the findings of Sze and Wong [19]. An intuitive finding was that road traffic crashes occurring on expressways and secondary roads had a significantly increased fatality risk, which supported the previous findings [21,32]. With regard to the crash characteristics, a valuable finding was that "pedestrian-vehicle" collisions involved a higher fatality risk than "vehicle-fixed object" crashes, which could be attributed to pedestrian-vehicle conflicts in a densely populated urban area with numerous roadside activities in Wuhan. Additionally, in collisions involving pedestrians, the driver and passengers are protected by the vehicle itself and also by the seatbelt and airbags; however, specific protection measures for pedestrians are lacking.
Compared to no impact (c_24), the side, front, and rear-end impacts had stronger associations with crash fatality. Furthermore, the traffic violations of pedestrians and non-motorized vehicles were significant contributors to fatal crashes. A valuable finding was that reckless driving behavior (e.g., failing to yield and over-speeding) was significantly associated with the fatality risk of crash-involved individuals. Moreover, an expected finding was that alcohol and/or fatigue involvement was always associated with a significant increase in fatality risk, which indicated the hazardous effects of drunk and fatigued driving [21].
One disadvantage of this study was the biased parameter estimates resulting from the correlation among crash-injury observations [33]. In previous studies, some researchers have investigated crash severity by considering the injury-severity level of the driver, while others have considered the injury severity of crash-involved individuals. For the latter, it was necessary to account for the within-crash correlation among observations by applying complex models [33–35]. In this study, the issue was mitigated by considering the most severe injury (fatality) sustained in the crash-involved vehicle, which has been discussed in related works [17]. Another source of statistical bias was the underreporting nature of crash records. Not all crashes were included in this police-reported dataset, which could violate the statistical assumption that the sample data are randomly selected from a population in which each crash has an equal probability of being sampled. These statistical biases could lead to overestimated marginal impacts for key variables. If the underreporting rates in the population are known, a weighted maximum likelihood function can be used to analyze an outcome-based sample, but the true rate of underreporting is unknown in Wuhan, making corrective measures difficult. In China, the police-reported crash records are the main source of traffic crash data for road safety research and have been applied to make the "official" estimates of fatalities in road traffic crashes. Although underreporting is inevitable in traditional crash databases, the police-reported data are the only data source with any information on crashes, including more than 20 causes of crashes [36].
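One common form of the weighted maximum likelihood approach mentioned above is the weighted exogenous sample maximum likelihood (WESML) estimator for outcome-based samples. The expression below is an illustrative sketch rather than a derivation from this study; the symbols $w_j$, $Q_j$ and $H_j$ are introduced here as assumptions.

```latex
\hat{\beta} = \arg\max_{\beta} \sum_{i=1}^{n} w_{y_i} \, \ln P(y_i \mid x_i; \beta),
\qquad
w_j = \frac{Q_j}{H_j}
```

Here $y_i$ is the outcome of crash $i$ (fatal or PDO), $x_i$ its covariates, $Q_j$ the share of crashes with outcome $j$ in the population, and $H_j$ the corresponding share in the sample. Because the underreporting rates are unknown in Wuhan, $Q_j$ cannot be determined, which is why this correction could not be applied.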
This study attempted to identify the risk factors affecting fatality of crash-involved individuals in road traffic crashes using a 5-year police-reported dataset in Wuhan, in conjunction with a case-control study. Besides the crash characteristics (e.g., crash type and impact direction), we also considered the effects of driving experience, environmental factors, and roadway characteristics on crash fatality, which could provide reliable information for implementing remedial measures in Wuhan. However, a more detailed list of variables (e.g., land use, vehicle conditions, delta-v of the crash, restraint usage, and safety belt use) could enhance the accountability of the investigation.

Table 1. Summary of the variables.

Table 2. Statistical summary of cases and matched controls by district, gender and age.
1 df denotes degrees of freedom.

Table 4. Interactions between different covariates in contributing to crash fatality.

Table 5. Results of the q-statistics using the geographical detector method.

Table 6. Estimated risks (unadjusted and adjusted Odds Ratio (OR); 95% Confidence Intervals (CI)) of crash fatality associated with potential risk factors.

Table 7. Estimated risks (matched Odds Ratio; 95% Confidence Intervals) of crash fatality associated with potential risk factors.