A Random Parameters Ordered Probit Analysis of Injury Severity in Truck-Involved Rear-End Collisions

Social and economic burdens caused by truck-involved rear-end collisions are of great concern to public health and the environment. However, few efforts have focused on identifying how the factors affecting injury severity differ between car-strike-truck and truck-strike-car rear-end collisions. In light of the above, this study focuses on illustrating the impact of variables associated with injury severity in truck-related rear-end crashes. To this end, truck-involved rear-end crashes between 2006 and 2015 in the U.S. were obtained. Three random parameters ordered probit models were developed: two separate models for the car-strike-truck crashes and the truck-strike-car crashes, respectively, and one for the combined dataset. A likelihood ratio test was conducted to evaluate the significance of the difference between the models. The results show that there is a significant difference between car-strike-truck and truck-strike-car crashes in terms of the factors contributing to injury severity. In addition, indicators reflecting male, truck, starting or stopped in the road before a crash, and other vehicles stopped in lane show a mixed impact on injury severity. Implications of the findings are discussed with a view to reducing the likelihood of severe injury in truck-involved rear-end collisions.


Introduction
As reported by the World Health Organization, road traffic injuries are the 8th leading cause of death, leading to nearly 1.35 million deaths each year and causing great public health issues and environmental concerns [1]. According to the U.S. Department of Transportation [2], truck transport, the dominant mode for short-distance shipments, carried almost 60% of the total weight of shipments in 2015, the largest share of freight transportation. However, owing to the unique attributes of trucks in terms of weight, height, length, and braking performance, large truck-involved crashes have drawn considerable attention from the government and the public in recent years. According to the National Highway Traffic Safety Administration [3], the total number of people killed in large truck crashes in the United States increased by 12% over the ten years from 2007 to 2017.
Among the people killed in large truck-involved crashes in 2017, 72% were occupants of other vehicles, including passenger cars [3]. The consequences are often severe, not only in physical harm but also in costs, including property damage and social impacts (such as travel delays). Hence, to prevent injuries and costs caused by large truck crashes, an examination of the hidden dangers and contributing factors in truck-passenger car crashes is required.
Past studies have examined occupant injuries in truck-involved rear-end crashes by considering the two categories, passenger cars striking trucks (P2T) and passenger cars struck by trucks (T2P), jointly [4]. Although they share similar crash characteristics, such as vulnerable passenger car occupants and unsafe driver actions, injury severities may vary with the collision order [5]. Moreover, using a combined dataset might obscure important differences in contributing factors between P2T and T2P crashes.
With this in mind, this study focuses on the relationships between contributing factors and occupant injury severity in large truck-involved rear-end collisions through a comparative investigation based on data from the National Automotive Sampling System (NASS) General Estimates System (GES) of the United States. Rear-end collisions are among the most frequent types of truck-passenger car crashes [2]. In addition, this type of collision has a higher injury severity rate than non-truck-involved crashes [6].
This study adopted a random parameters ordered probit approach to provide a deeper understanding of the influence of contributing factors on occupant injury. This model considers the ordinal nature of injury data, and it is statistically superior to the fixed parameters ordered probit model because it accounts for possible unobserved factors [7][8][9]. The findings of this study shed light on the potential differences in contributing factors between rear-end collisions with a truck as the leading vehicle and those with a passenger car as the leading vehicle, and can thereby guide tailored countermeasures to alleviate the resulting injury severity.

Literature Review
Previous studies on truck-involved crashes have sought to understand crash characteristics and injury severity under the influence of human behavior, vehicle performance, roadway geometry, and surroundings. These studies differ in their objectives and databases. Table 1 provides a summary of these studies in terms of the truck definition, dependent variable and scale, model, and key findings.
In general, some studies sought to identify the significant factors influencing injury severity in terms of driver features, vehicle, roadway, environmental conditions, and collision characteristics [10][11][12][17], while others focused on a specific aspect only, such as weather and lighting conditions [13,22]. Some studies focused on a specific collision type, such as rear-end and single-vehicle crashes [4,9,16], while others concentrated on the difference in injury severity between locations [10,23]. In addition, crash frequency and safety evaluation approaches related to large trucks have also been examined [24,25]. Overall, these studies have contributed greatly to the understanding of the factors contributing to injury severity in truck-involved crashes.
Table 1. Summary of previous studies on truck-involved crashes.
Model: Ordered probit and heteroskedastic ordered probit. Key findings: The likelihood of fatalities and severe injury increased with the number of trailers but decreased with truck length and GVWR.
Study: Chen and Chen (2011) [11]. Truck definition: Single-unit truck, tractor with a semi-trailer, and tractor without a semi-trailer. Dependent variable and scale: Injury severities of truck drivers with 3 levels: no injury, possible/non-incapacitating injury, incapacitating injury/fatal (19,741 crashes). Model: Mixed logit/random parameters logit. Key findings: Sixteen variables were found to be significant only in single-vehicle crashes, whereas another sixteen factors were significant only in multi-vehicle crashes on a rural highway.
Model: Ordered probit. Key findings: Driver behavior variables, including driver distraction, alcohol use, and emotional factors, were found to have a statistically significant impact on severe injury.
Model: Classification and regression tree. Key findings: Drunk driving was the most detrimental factor for the injury severity of truck accidents.
Study: Uddin and Huynh (2018) [20]. Truck definition: Hazmat large trucks. Dependent variable and scale: Injury severities of most severely injured occupants with 3 levels: major injury, minor injury, no injury (1173 observations). Model: Random parameters probit. Key findings: Male occupants, truck drivers, crashes occurring in rural locations, dark-unlighted conditions, dark-lighted conditions, and weekdays were associated with increased probability of major injuries.
Study: Behnood and Mannering (2019) [21]. Truck definition: Any medium or heavy truck, excluding buses and motor homes, with GVWR greater than 10,000 lb. Dependent variable and scale: Injury severities of most severely injured occupants with 3 levels: no injury, minor injury, severe injury (large truck crashes in Los Angeles from 2010 to 2017, amount unclear). Model: Mixed logit/random parameters logit. Key findings: The effect of factors that determine injury severity varied significantly across time-of-day/time-period combinations.
However, few efforts have been made to identify the influence of collision order, and the differences in injury severity between P2T and T2P rear-end crashes have been largely overlooked. In light of the above discussion, this study assumes that the contributing factors differ between rear-end crashes with a truck as the leading vehicle and those with a passenger car as the leading vehicle, as similar differences have been shown in other comparative studies [10,11,26,27]. Specifically, the objective of this study is twofold: (i) to investigate the differences in the effects of factors that contribute to injury severity in truck-involved rear-end collisions, and (ii) to compare the model performance between the combined dataset (modeling truck-involved rear-end crashes as a whole) and the separate datasets (modeling P2T crashes and T2P crashes separately).

Data Description
Ten years of truck-involved crash records (2006 to 2015) were collected from NASS-GES, a nationally representative sample of police-reported crashes [28]. The NASS-GES data were randomly sampled and weighted from 60 geographic sites across the United States. Although a majority of crashes involving only minor property damage and no significant personal injury are not reported, NASS-GES remains a valuable database for analyzing highway safety problems.
A total of 10,455 records were obtained from the 10-year truck-involved crash database to investigate the potential contributing factors to injury severity. Each record represents the most severely injured occupant in the crash. Note that all observations were retained if several occupants in a crash shared the most severe injury level. After screening out the records with incomplete information, the final dataset consisted of 8506 observations, including 4866 P2T records and 3640 T2P records.
Trucks in this study were defined as vehicles with a Gross Vehicle Weight Rating (GVWR) greater than 10,000 lb. The dependent variable, injury severity, was described using the KABCO scale in the raw data set and was regrouped into four levels to ensure that each category had a sufficient number of observations. The descending order of injury level was considered to reduce bias and variability of the estimated parameters for the ordered probit model [29]. The four levels are severe injury, evident injury, possible injury, and no injury. Among the 4866 injury records in P2T crashes, 4.7% of occupants had a severe injury, 10.4% had an evident injury, 13.4% had a possible injury, and 71.5% had no injury. Among the 3640 injury records in T2P crashes, 3.7% of occupants had a severe injury, 12.7% had an evident injury, 30.4% had a possible injury, and 53.2% had no injury or property damage only.
To test for possible collinearity, the authors conducted Pearson's correlation test. Two pairs were found to be highly correlated, with a correlation coefficient of 0.93 between person type and seat position, and 0.69 between road surface condition and weather. Both correlations are self-explanatory, since the driver can only sit in the driver's seat, and the road surface status changes with the weather (e.g., a wet surface on a rainy day and an icy surface on a snowy day). Both seat position and weather were discarded to avoid collinearity. Consequently, a total of 36 variables were tested in the models for this study. The variables were categorized into five groups: person characteristics, vehicle characteristics, roadway and environment, crash mechanism, and temporal characteristics. A summary of the statistics is presented in Table 2. Note that all variables were dummy coded (with values of 0 and 1). Therefore, the mean value can be interpreted as a proportion. For example, the mean value of the male variable in the P2T data set indicates that 69.9% of the observations are male. These variables were subsequently examined in the injury severity model specifications of truck-involved rear-end collisions.
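The two screening ideas above (the mean of a 0/1 dummy read as a proportion, and a pairwise Pearson correlation used to spot near-duplicate variables) can be illustrated with a minimal sketch. The records, the variable names, and the pairing of `rain` with `wet_surface` are hypothetical toy values chosen for illustration only; they are not NASS-GES data and do not reproduce the reported coefficients.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy records (0/1 dummy coding), NOT taken from NASS-GES.
records = {
    "male":        [1, 1, 0, 1, 1, 0, 1, 1, 0, 1],
    "rain":        [1, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    "wet_surface": [1, 1, 0, 0, 1, 0, 1, 1, 0, 0],
}

# The mean of a dummy variable is the share of observations coded 1.
male_share = sum(records["male"]) / len(records["male"])

# A high correlation between two dummies flags a candidate pair for removal.
r_rain_wet = pearson(records["rain"], records["wet_surface"])

print(f"share of male observations: {male_share:.1%}")
print(f"corr(rain, wet_surface): {r_rain_wet:.2f}")
```

For two 0/1 variables, the Pearson coefficient coincides with the phi coefficient, so no special treatment of the dummy coding is needed.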

Methodology
This study focuses on identifying the differences in contributing factors to injury severity between P2T and T2P collisions. Four crash injury severity outcomes were considered: severe injury, evident injury, possible injury, and no injury. Ordered and unordered models, the two most frequently used categories of approaches, have both been widely applied to examine the impact of contributing factors on injury severity [4,10,12,14]. Although both have strengths as well as limitations, the current study uses the ordered probit model because it accounts for the ordinal nature of injury severity levels [30].
The ordered probit model is derived by introducing a latent variable y* as a basis for modeling the injury severity of each observation, which can be defined as follows [31]:
$$ y^{*} = \beta X + \varepsilon \qquad (1) $$
where X is a vector of independent variables considered, β is a vector of estimable parameters, and ε is a random error term assumed to be normally distributed across observations with a mean equal to 0 and a variance equal to 1. Given Equation (1), the dependent variable y is defined by the unobserved variable y* as follows:
$$ y = \begin{cases} 0, & y^{*} \le \mu_0 \\ 1, & \mu_0 < y^{*} \le \mu_1 \\ 2, & \mu_1 < y^{*} \le \mu_2 \\ 3, & y^{*} > \mu_2 \end{cases} \qquad (2) $$
where µ0 = 0, µ1, and µ2 are thresholds that are jointly estimated with the β parameters. The probability of each injury category for given variables can then be described in terms of the distribution of the random error ε:
$$ \begin{aligned} P(y=0) &= \Phi(-\beta X) \\ P(y=1) &= \Phi(\mu_1 - \beta X) - \Phi(-\beta X) \\ P(y=2) &= \Phi(\mu_2 - \beta X) - \Phi(\mu_1 - \beta X) \\ P(y=3) &= 1 - \Phi(\mu_2 - \beta X) \end{aligned} \qquad (3) $$
where Φ(·) is the standard normal cumulative distribution function.
However, this standard ordered probit model might lead to bias by treating the parameters β as constant across observations, which restricts each variable to have the same impact on every individual observation [32][33][34]. Therefore, a random parameters ordered probit model was developed to capture the unobserved heterogeneity, which is achieved by adding a randomly distributed error term ϕ (e.g., a normally distributed term with mean = 0 and variance = σ²) [35]:
$$ \beta_i = \beta + \phi_i \qquad (4) $$
Since the interpretation of the estimated coefficient β on injury severity is not straightforward, marginal effects were computed to measure the effect of a one-unit change in an independent variable on the probability of each injury severity level while keeping all other variables constant. The marginal effects are calculated as follows [31]:
$$ \frac{\partial P(y=j)}{\partial X} = \left[ f(\mu_{j-1} - \beta X) - f(\mu_j - \beta X) \right] \beta \qquad (5) $$
where f(·) is the standard normal probability density function, with f(µ−1 − βX) = 0 for the lowest category and f(µ3 − βX) = 0 for the highest category.
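As a rough illustration of the ordered probit mechanics described above, the following sketch computes the four category probabilities and the corresponding marginal effects for a single observation, using the standard textbook form with µ0 = 0. All numeric inputs (`xb`, `beta_k`, `mu1`, `mu2`) are hypothetical values chosen for illustration, not estimates from this study, and the function names are not from any particular library.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def category_probs(xb, mu1, mu2):
    """P(y = 0..3) for linear index xb and thresholds 0 < mu1 < mu2."""
    # CDF evaluated at the cut points -inf, 0, mu1, mu2, +inf.
    cdf = [0.0, norm_cdf(0.0 - xb), norm_cdf(mu1 - xb), norm_cdf(mu2 - xb), 1.0]
    return [cdf[j + 1] - cdf[j] for j in range(4)]

def marginal_effects(xb, beta_k, mu1, mu2):
    """dP(y = j)/dx_k for j = 0..3; the density vanishes at -inf and +inf."""
    pdf = [0.0, norm_pdf(0.0 - xb), norm_pdf(mu1 - xb), norm_pdf(mu2 - xb), 0.0]
    return [(pdf[j] - pdf[j + 1]) * beta_k for j in range(4)]

# Hypothetical inputs for illustration only.
xb, beta_k, mu1, mu2 = 0.3, 0.5, 0.8, 1.6
probs = category_probs(xb, mu1, mu2)
effects = marginal_effects(xb, beta_k, mu1, mu2)
print([round(p, 4) for p in probs])
print([round(e, 4) for e in effects])
```

Because the four probabilities always sum to one, the four marginal effects sum to zero; a positive coefficient raises the probability of the highest category and lowers that of the lowest one.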

Model Evaluation
To determine whether separate models for P2T and T2P crashes are warranted, the likelihood ratio test was adopted [31]. The likelihood ratio statistic is defined as follows [36]:
$$ X^2 = -2\left[ LL(\beta_{ALL}) - LL(\beta_{P2T}) - LL(\beta_{T2P}) \right] $$
where LL(β_ALL), LL(β_P2T), and LL(β_T2P) are the log-likelihoods at convergence of the joint data model, the car-strike-truck model, and the truck-strike-car model, respectively. The X² statistic is χ² distributed with degrees of freedom d equal to the sum of the number of parameters in the separate models minus the number in the joint model, which in this case is d = K_P2T + K_T2P − K_ALL. With a χ² value of 260.60 and 12 degrees of freedom, a confidence level of over 99.99% was obtained. This indicates that modeling P2T and T2P crashes separately provides a statistically superior fit compared to the joint dataset model.
Another likelihood ratio test was conducted to compare the random parameters models with their fixed parameters counterparts using the test statistic below [31]:
$$ X^2 = -2\left[ LL(\beta_{fixed}) - LL(\beta_{random}) \right] $$
where LL(β_fixed) and LL(β_random) are the log-likelihoods at convergence of the fixed parameters ordered probit model and the random parameters ordered probit model estimated using the same dataset (ALL, P2T, and T2P), respectively. This X² statistic is χ² distributed with degrees of freedom equal to the difference in the number of parameters between the two models. The χ² values and degrees of freedom for each dataset are presented in Table 3. The test results were significant for all three crash datasets, with p-values all below 0.001, indicating with more than 99.9% confidence that the random parameters ordered probit models outperform the corresponding fixed parameters models.
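The reported test outcome can be checked with a short sketch that converts a χ² statistic into a p-value. To stay dependency-free, the upper-tail probability is computed with the closed-form recurrence for integer degrees of freedom; `chi2_sf` and `lr_statistic` are illustrative helper names, not functions from any library. Only the χ² value (260.60) and the degrees of freedom (12) come from the text; the individual log-likelihoods are not reported here, so the example does not reconstruct them.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a0, q = 1.0, math.exp(-z)               # Q(1, z) = exp(-z)
    else:
        a0, q = 0.5, math.erfc(math.sqrt(z))    # Q(1/2, z) = erfc(sqrt(z))
    steps = (df - (2 if df % 2 == 0 else 1)) // 2
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    for k in range(steps):
        a = a0 + k
        q += math.exp(-z + a * math.log(z) - math.lgamma(a + 1.0))
    return q

def lr_statistic(ll_joint, ll_separate):
    """-2 * [LL(joint) - sum of LL(separate models)]."""
    return -2.0 * (ll_joint - sum(ll_separate))

# Values reported in the text: chi-square = 260.60 with 12 degrees of freedom.
p_value = chi2_sf(260.60, 12)
print(f"p-value = {p_value:.3e}, confidence = {1.0 - p_value:.4%}")
```

A p-value far below 0.0001 is consistent with the stated confidence level of over 99.99% for rejecting the joint model in favor of the separate P2T and T2P models.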

Empirical Results and Discussion
The random parameters ordered probit model was estimated through simulated maximum likelihood with 200 Halton draws, which has been demonstrated to be an efficient method for producing accurate results for discrete choice models with low dimensionality of integration [37]. In addition, Halton draws provide better simulation performance than random draws, offering dramatic speed gains with no degradation in performance. The normal distribution was chosen for the random parameters over other candidates, including the lognormal, triangular, and uniform distributions, since previous studies have shown that the normal distribution almost always outperforms the others [13,18,38].
Fixed and random parameters model estimation results for P2T and T2P crashes are presented in Tables 4 and 5, respectively. The marginal effects of the random parameters models for these two datasets are shown in Tables 6 and 7. Backward selection was performed to select the best subset of independent variables, removing variables with a p-value > 0.1. Hence, all estimated parameters included in the final models were statistically significant at a confidence level of 90%, and the results were plausible, as discussed below. Note: ***, **, * refer to significance at the 1%, 5%, and 10% levels.
To start with, because of the differences in the characteristics associated with collision order, the set of variables found statistically significant in the P2T model differed from that found in the T2P model. It is also noteworthy that 6 of the variables determining injury severity gave generally similar results in P2T and T2P crashes. The details of each model are discussed separately.
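To give a sense of how Halton draws enter the simulation, the sketch below generates a one-dimensional Halton sequence, maps it to standard normal draws, and averages a probit probability over a normally distributed parameter. The parameter mean and standard deviation (`b`, `s`) are hypothetical, the single-dummy setup is a deliberate simplification, and this is not the estimation routine used in the study; it only shows the averaging step that simulated maximum likelihood repeats for every observation.

```python
import math
from statistics import NormalDist

def halton(index, base=2):
    """Radical-inverse (Halton) value in (0, 1) for a positive integer index."""
    f, r = 1.0, 0.0
    while index > 0:
        f /= base
        r += f * (index % base)
        index //= base
    return r

def halton_normal_draws(n_draws, base=2):
    """Map the first n_draws Halton points to standard normal draws."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(halton(i, base)) for i in range(1, n_draws + 1)]

# Hypothetical random parameter: beta_i = b + s * draw (illustration only).
b, s = -0.3, 0.6
draws = halton_normal_draws(200)

# Simulated probability of the lowest category for a dummy equal to 1,
# i.e. the average of Phi(-beta_i) over the draws.
phi = NormalDist().cdf
simulated = sum(phi(-(b + s * d)) for d in draws) / len(draws)

# Closed-form benchmark: E[Phi(-(b + s*Z))] = Phi(-b / sqrt(1 + s^2)).
analytic = phi(-b / math.sqrt(1.0 + s * s))
print(f"simulated = {simulated:.4f}, analytic = {analytic:.4f}")
```

The closed-form benchmark exists only in this simplified one-parameter case; in a full model with several random parameters the integral has no closed form, which is why simulation is needed.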

P2T Crashes Variables
Regarding the model estimation results for P2T crashes, Table 4 shows that 16 variables have significant impacts on injury severity. A positive parameter sign indicates that the probability of the most severe outcome (i.e., severe injury) increases, while that of the least severe outcome (i.e., no injury) decreases. In total, 2 of the 16 significant variables were determined to be random parameters and thus had a varying effect on injury severity outcome probabilities, i.e., the male indicator and the indicator for starting/stopped in the road before the crash.

Person Characteristics
The male occupant indicator and the restraint system use indicator tend to increase the probability of less severe injuries. Possible explanations are that males are physically stronger and have greater injury-sustaining capacity than females [39], and that restraint systems such as shoulder belts or lap belts protect occupants from hitting the steering wheel or dashboard or being ejected from the seat [40]. In addition, restraint systems are mostly developed based on the body structure of men, which might be another reason for the decreased likelihood of severe injury for restrained males. However, the male indicator resulted in a normally distributed random parameter with a mean of −0.2793 and a standard deviation of 0.5610, indicating a mixed impact on injury severity. In particular, for 30.9% of male observations in P2T crashes the parameter was above zero, meaning that these males were more likely than females to sustain more severe injuries, probably due to the aggressive driving behavior of males, which is consistent with previous studies [41,42]. In addition, two age groups, under 25 and above 64, were found to have a strong association with injury severity in the P2T model. Occupants above 64 were more likely to experience more severe injuries, while those under 25 tended to have a lower risk of sustaining more severe injuries. This finding is likely attributable to the lower bone mass density and longer reaction time of older occupants compared with younger ones [43,44].

Vehicle Characteristics
The model results also suggested that three vehicle characteristics were statistically significantly associated with injury severity. Consistent with past research [45], trucks in rear-end collisions with passenger cars have natural advantages because of their heavier weight and larger body, and their occupants are therefore more likely to experience less severe injuries. Vehicles with one or more trailing units also decrease the likelihood of severe injuries (significant in P2T crashes only), possibly because drivers of these vehicles are more experienced and better trained than drivers of vehicles without a trailing unit [18]. In addition, in line with common sense, the results indicated a positive association between driver drinking and injury severity for P2T crashes. Specifically, drinking and driving increased the likelihood of possible injury, evident injury, and severe injury by 0.0567, 0.0476, and 0.0139, respectively.
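The three marginal effects reported for driver drinking also imply the change in the remaining category. As a worked check, and assuming the four marginal effects sum to zero because the four outcome probabilities must sum to one (the no-injury value itself is not quoted in the text), the implied effect on the no-injury probability is:

```latex
\Delta P(\text{no injury})
  = -\left[\Delta P(\text{possible}) + \Delta P(\text{evident}) + \Delta P(\text{severe})\right]
  = -(0.0567 + 0.0476 + 0.0139)
  = -0.1182
```

Under this assumption, driver drinking lowers the probability of no injury by roughly 0.118, up to rounding in the reported values.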

Roadway and Environment
Surface conditions and lighting conditions were found to affect severity outcomes. Interestingly, the model results showed that severe injury crashes were less likely on icy or snowy road surfaces than on normal or dry road surfaces, for P2T crashes only. This is likely a result of more cautious driving behavior on icy or snowy surfaces [13]. This finding is consistent with previous studies in which normal road surface conditions were found to be associated with more severe accidents [45,46]. As for lighting conditions, both dark and dawn or dusk conditions were found to significantly affect injury outcomes, especially dawn or dusk conditions (significant for the P2T model only), under which the likelihood of possible injury, evident injury, and severe injury rose by 0.0540, 0.0453, and 0.0132, respectively. This result is perhaps due to poor visibility at dawn, dusk, or night, which gives passenger car drivers less time to perceive the forward environment and react accordingly [47].

Crash Mechanism
Four crash mechanism variables showed a strong association with injury outcomes in P2T crashes. Going straight before the crash increased the probability of more severe injury compared to negotiating a curve or changing lanes. In contrast, the two critical event indicators, other vehicles stopped in lane and other vehicles in lane traveling in the same direction while decelerating, were found to be negatively associated with injury severity. Specifically, these two critical event indicators (significant for the P2T model only) imply that the occupant was in the striking vehicle. Past studies have shown that the front vehicle tends to suffer higher levels of injury severity in a rear-end collision [5], which explains why occupants of the striking vehicle suffer less severe injury than those of the struck vehicle. Similar model estimation results were also found for the ALL dataset, which showed that the collision order tends to determine injury severity levels. Specifically, when the passenger car was struck by a truck as opposed to the other way around, the probability of severe injury increased (please refer to Tables A1 and A2 in Appendix A for complete model results for the ALL dataset). The variable reflecting starting or stopped in the road before the crash was found to be normally distributed with a mean of −0.9347 and a standard deviation of 1.0185 in the P2T model only, implying that its impact on injury severity varied across individuals. That is to say, 82.1% of individuals (below zero) who were starting or stopped in the road before a crash experienced less severe injury, while the remaining 17.9% of observations (above zero) were more likely to sustain severe injury.

Temporal Characteristics
Two variables were found to have a positive impact on injury severity for P2T crashes only. In particular, the probability of severe injury slightly increased for crashes occurring on weekdays and in the spring. Similar results have been found in previous studies [22].

T2P Crashes Variables
Turning to the model estimation results for T2P crashes, as shown in Table 5, six variables were found to have a similar impact on injury severity as in P2T crashes, i.e., the male indicator, use of restraint system, truck indicator, dark condition, going straight before the crash, and the critical event of another vehicle stopped in lane. Moreover, the driver indicator, age between 55 and 64, other vehicles in lane traveling in the same direction at a higher speed, and the summer indicator were shown to be statistically significant in determining injury severity for T2P crashes only. In addition, the truck indicator and the other vehicles stopped in lane indicator were found to have a random effect on injury severity outcomes. This section discusses only these latter six variables.
Regarding person type, being the driver was found to decrease the probability of severe injury in T2P crashes, likely due to the seat position and the self-protection response of drivers in a rear-end collision. Another person characteristic, age between 55 and 64, showed a positive association with severe injury. This finding is in line with past studies, as the physiological strength and injury-sustaining capability of this age group are relatively low [18].
As illustrated in the section above, occupants of trucks evidently experience less severe injury than occupants of passenger cars in a rear-end collision. However, the effect appears to be mixed, as the truck indicator was found to be normally distributed with a mean of −3.2639 and a standard deviation of 1.8687. This implies that 4.0% of observations (above zero) were more likely to sustain more severe injury, while 96.0% of observations (below zero) were found to experience less severe injury in T2P crashes. This small proportion could reflect aberrant driving behavior of truck drivers or safety measures left unused by truck occupants.
The indicator representing other vehicles stopped in the lane was also found to have a mixed effect on injury severity, following a normal distribution with a mean of −0.6351 and a standard deviation of 0.9271. This indicates that 75.3% of observations (below zero) were more likely to be less severely injured, as discussed before, while 24.7% of occupants (above zero) were more likely to experience more severe injury. One possible explanation for the reduced severity is that drivers in the rear vehicle tend to maintain a safer stopping distance, which gives them enough time to react to the sudden braking of the front vehicle.
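The shares quoted for the four random parameters follow directly from the estimated normal distributions: the share of observations with a parameter above zero is 1 − Φ((0 − mean)/sd). The sketch below recomputes them from the means and standard deviations reported in the text; the dictionary labels and the helper name `share_above_zero` are illustrative only.

```python
from statistics import NormalDist

def share_above_zero(mean, sd):
    """Share of a normally distributed parameter that lies above zero."""
    return 1.0 - NormalDist(mean, sd).cdf(0.0)

# Means and standard deviations as reported in the text.
random_params = {
    "male (P2T)": (-0.2793, 0.5610),
    "starting/stopped in road (P2T)": (-0.9347, 1.0185),
    "truck (T2P)": (-3.2639, 1.8687),
    "other vehicle stopped in lane (T2P)": (-0.6351, 0.9271),
}

shares = {name: share_above_zero(m, s) for name, (m, s) in random_params.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%} above zero, {1.0 - share:.1%} below zero")
```

Given the sign convention of the models, the share above zero corresponds to observations for which the indicator raises the likelihood of more severe injury, and the share below zero to those for which it lowers it.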
The model also indicated that other vehicles in lane traveling in the same direction with a higher speed slightly decreased the probability of more severe injury. A possible reason could be the minor speed difference between truck and passenger car, which reduced the injury severity accordingly.
In T2P crashes, only one temporal characteristic variable was found to be statistically significant. The indicator reflecting summer time slightly increased the severe injury probability. The high temperature in the summertime has been suggested as an important factor leading to an increase in stress and a decrease in motor skill performance for drivers [48].
Overall, separate injury severity models for P2T crashes and T2P crashes can shed light on the significant contributing factors. However, as with previous research on accident severity, limitations exist that should be taken into account before applying the findings. First, crashes involving only minor property damage and no significant personal injury are underreported in the GES database. Second, potentially crucial variables, such as the avoidance maneuver taken by the driver during the crash, were not considered due to the lack of available data. The findings would be more generalizable if the dataset provided additional information about truck-involved rear-end collisions.

Conclusions
Injury severity in rear-end crashes is of great concern to public health and the environment. This study employed a random parameters ordered probit modeling framework to investigate the impact of contributing factors on injury severity in truck-involved rear-end collisions. Using data from NASS-GES, separate models for the most severely injured occupant were developed for all truck-involved rear-end crashes (ALL model), passenger cars striking trucks (P2T model), and passenger cars struck by trucks (T2P model). Likelihood ratio tests were conducted to evaluate the goodness of fit of these three models. The model estimation results demonstrated the necessity of modeling P2T and T2P crashes separately to analyze truck-involved rear-end collisions.
Similarities and differences were observed across the two models in terms of person characteristics, vehicle characteristics, roadway and environment, crash mechanism, and temporal characteristics. Some variables are significant only in P2T crashes but not in T2P crashes, and vice versa. Key differences include age group, trailing units, drinking and driving, road surface, and the critical events that made the crash imminent. For example, age under 25 and age above 64 were found to be significant in P2T crashes, while the indicator for age between 55 and 64 was found to be significant only in T2P crashes. Moreover, the driver drinking and dawn or dusk indicators were found to significantly increase injury levels. Furthermore, variables reflecting male, truck, starting or stopped in the road before the crash, and other vehicles stopped in the lane were found to have a mixed impact on injury severity. For these variables, a majority of observations showed an increase in the probability of less severe injuries.
The results obtained from the models developed in this study have a number of practical implications. First, the use of a restraint system was found to decrease the probability of severe injury, suggesting that measures to increase shoulder belt or lap belt compliance among truck and passenger car occupants are needed. Second, occupants aged 55 and above are more likely to be severely injured, indicating that special care targeting this age group is necessary to reduce injury severity. Third, the probability of severe injury was found to increase under dark, dawn, or dusk conditions, suggesting that safety and enforcement agencies should seek additional measures for these conditions. Fourth, a vehicle stopped in the lane before the crash tends to decrease the probability of severe injury; reminding drivers to keep a safe stopping distance between vehicles is therefore necessary to allow enough reaction time for rear vehicles in a rear-end crash. Lastly, temporal variables reflecting weekdays, spring, and summer were found to be positively associated with severe injuries, indicating that safety and enforcement agencies should pay more attention to these particular periods.
Funding: This research was supported by the National Natural Science Foundation of China (Project 51808402) and the Shanghai Sailing Program (Project 18YF1424600).

Conflicts of Interest:
The authors declare no conflict of interest.