Modeling the Unobserved Heterogeneity in E-bike Collision Severity Using Full Bayesian Random Parameters Multinomial Logit Regression

Understanding the risk factors of e-bike collisions can improve e-bike riders’ safety awareness and help traffic professionals to develop effective countermeasures. This study investigates risk factors that significantly contribute to the severity of e-bike collisions. Two months of e-bike collision data were collected in the city of Ningbo, China. A random parameters multinomial logit regression (RP-MNL) is proposed to account for the unobserved heterogeneity across observations. A fixed parameters multinomial logit regression (FP-MNL) is estimated and compared with the RP-MNL under the Bayesian framework. The full Bayesian approach based on Markov chain Monte Carlo simulation is employed to estimate the model parameters. Both parameter estimates and odds ratio (OR) are used to interpret the impact of risk factors on the severity of e-bike collisions. The model comparison results show that RP-MNL outperforms FP-MNL, indicating that accommodating the unobserved heterogeneity across observations could improve the model fit. The model estimation results show that age, gender, e-bike behavior, license plate, bicycle type, location, and speed limit are statistically significant and associated with the severity of e-bike collisions. Furthermore, four risk factors, i.e., gender, e-bike behavior, bicycle type, and speed limit, are found to have heterogeneous effects on severity of e-bike collisions, appearing in the form of random parameters in the statistical model.


Introduction
E-bikes are an increasingly popular transportation mode for commuting in China. E-bike ownership increased significantly from 58 thousand in 1998 to 466 million in 2016, an annual increase of 64.8% [1]. Due to their small size and flexible routes, e-bikes provide road users with convenient and affordable mobility. Furthermore, they benefit the public through low cost and easy parking [2]. However, a major issue hindering the use of e-bikes is the increasing number of e-bike collisions [3]. In particular, e-bike collisions can be serious because riders lack protection. As such, it is important to understand the risk factors related to e-bike collisions in order to increase e-bike riders' safety awareness and help traffic professionals develop effective countermeasures.
A number of risk factors, such as traffic and road/intersection design factors, are known to affect e-bike collisions [4-10]. However, there is unobserved heterogeneity across observations, which may [...] public safety concern. In 2015, 1541 e-bike collisions were reported by the Ningbo Police Department (NBPD). The number of e-bike collisions was almost twice that of bicycle collisions (854) and was comparable to that of motorcycle collisions (1676). However, the fatality proportion of e-bike collisions (4.5%) is higher than those of motorcycle collisions (2.3%) and bicycle collisions (3.8%).
E-bike collision data were provided by NBPD. The crash database includes rich information. Each record contains the date, time, location, collision type, crash severity, involved vehicle type, weather, lighting condition, intentions and behaviors of the involved drivers, and each driver's age, gender, ID number, license plate number, and telephone number. Two months of collision data (July and August 2015) were extracted from the crash database, which was also used in a previous study [10]. To filter the e-bike collision data, a record was retained only if at least one of the collision objects was an e-bike.
Finally, a total of 310 e-bike collision records were extracted and used for the analysis. Among them, 7 (2.26%) were reported as fatality (F), 43 (13.87%) as incapacitating injury (I), 233 (75.16%) as non-incapacitating injury (NI), and 27 (8.71%) as property damage only (PDO). Due to the small number of fatal collisions, they were combined with the incapacitating injury collisions. As such, the collision severity was reorganized into three categories: F/I, NI, and PDO.
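The regrouping of severity levels can be checked with a short sketch. The counts are those reported above; the variable names are illustrative only.

```python
# Reported counts by original severity level (taken from the text).
counts = {"F": 7, "I": 43, "NI": 233, "PDO": 27}
total = sum(counts.values())

# Share of each original level, in percent, rounded to two decimals.
shares = {level: round(100 * n / total, 2) for level, n in counts.items()}

# Merge fatal and incapacitating-injury collisions into one F/I category.
merged = {"F/I": counts["F"] + counts["I"], "NI": counts["NI"], "PDO": counts["PDO"]}
merged_shares = {level: round(100 * n / total, 2) for level, n in merged.items()}

print(total, shares)
print(merged, merged_shares)
```

After merging, the F/I category holds 50 of the 310 records, which gives the three-level outcome used in the models.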
In addition to the original information included in the crash database, the collision location (coordinates) was used to match the collision points on the Google Earth map. As such, geometric design factors such as road type, median type, and width were collected. Table 1 shows detailed information regarding variable definitions and descriptions, as well as the percentage of observed collision frequency at each severity level.

Multinomial Logit Regression
The multinomial logit regression (MNL) is commonly used in collision severity analysis, in which collisions can be categorized into more than two levels with one level as a reference category [17]. In this study, an MNL is developed to explore the risk factors associated with different severity levels of e-bike collisions, with PDO e-bike collisions as the reference level. The MNL is expressed as
$$P(Y_i = j) = \frac{\exp(X_i \beta_j)}{\sum_{m=1}^{J} \exp(X_i \beta_m)} \tag{1}$$
where $Y_i = j$ is the collision severity $j$ for the $i$th observation; $X_i = [x_{i1}, x_{i2}, \ldots, x_{iK}]$ is the vector of $K$ explanatory variables; and $\beta_j = [\beta_{j1}, \beta_{j2}, \ldots, \beta_{jK}]^T$ is the coefficient vector for the vector $X_i$. The likelihood function for the MNL is given as
$$L = \prod_{i=1}^{N} \prod_{j=1}^{J} \left[ P(Y_i = j) \right]^{\delta_{ij}}$$
where $N$ is the sample size; $J$ is the total number of outcomes (e-bike collision severity levels); and $\delta_{ij}$ is an indicator that equals 1 if the discrete outcome for sample $i$ is $j$, and 0 otherwise.
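As a minimal sketch of the MNL probability and its log-likelihood, assuming the usual softmax form with the coefficients of the reference level (PDO) fixed at zero, the computation can be written as follows. The coefficient values, the two-column design (intercept plus one binary covariate), and the function names are hypothetical and are not estimates from this study.

```python
import math

def mnl_probabilities(x, betas):
    """Return P(Y = j) for every severity level j.

    x     : explanatory-variable values for one collision
    betas : dict mapping severity level -> coefficient list; the
            reference level (PDO) carries all-zero coefficients
    """
    utilities = {j: sum(b * v for b, v in zip(beta, x)) for j, beta in betas.items()}
    # Subtract the maximum utility before exponentiating for numerical stability.
    u_max = max(utilities.values())
    expu = {j: math.exp(u - u_max) for j, u in utilities.items()}
    denom = sum(expu.values())
    return {j: e / denom for j, e in expu.items()}

def mnl_log_likelihood(X, y, betas):
    """Log of the likelihood: sum of log P(Y_i = y_i) over observations."""
    return sum(math.log(mnl_probabilities(x, betas)[yi]) for x, yi in zip(X, y))

# Hypothetical coefficients: [intercept, binary covariate].
betas = {"PDO": [0.0, 0.0], "NI": [2.0, 0.5], "F/I": [0.5, 0.8]}
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
y = ["NI", "F/I", "PDO"]

probs = mnl_probabilities(X[0], betas)
loglik = mnl_log_likelihood(X, y, betas)
print(probs, loglik)
```

Taking logs turns the product over observations and outcomes into a sum in which the indicator simply selects the observed outcome, which is why the code only looks up the probability of `yi`.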

Random Parameters Multinomial Logit Regression
The parameters of explanatory variables in the standard MNL are assumed to be fixed across observations, indicating that the impact of each explanatory variable is the same across observations [18]. However, this assumption conflicts with the fact that the effect of an explanatory variable may vary across observations. To handle the unobserved heterogeneity across observations, this study proposes a random parameters multinomial logit regression (RP-MNL) to investigate the risk factors affecting the severity of e-bike collisions. Compared with the fixed parameters MNL (FP-MNL), the RP-MNL allows parameters to vary randomly across observations. As such, more features in the collision data can be extracted and the accuracy of the regression can be improved. The RP-MNL is developed by rewriting Equation (1) as follows
$$P(Y_i = j) = \frac{\exp(X_i \beta_{ij})}{\sum_{m=1}^{J} \exp(X_i \beta_{im})}$$
where $\beta_{ij} = [\beta_{ij1}, \beta_{ij2}, \ldots, \beta_{ijK}]^T$ is the coefficient vector, and these random parameters are allowed to vary across observations. In this study, the random parameters in the RP-MNL are assumed to be normally distributed as
$$\beta_{ijk} \sim N(\beta_{jk}, \sigma_{jk}^2)$$
where $\beta_{jk}$ and $\sigma_{jk}^2$ are the mean and variance of the random parameter of the $k$th explanatory variable for severity level $j$.
Similarly, the likelihood of the RP-MNL is given as
$$L = \prod_{i=1}^{N} \prod_{j=1}^{J} \left[ P(Y_i = j \mid \beta_{ij}) \right]^{\delta_{ij}}$$
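To illustrate how observation-level random parameters change the outcome probabilities, the following sketch draws each coefficient from a normal distribution and averages the resulting logit probabilities over the draws. This Monte Carlo averaging is an illustration of the model structure, not the estimation procedure of the study; all means, standard deviations, and function names are hypothetical.

```python
import math
import random

def softmax(utilities):
    """Logit probabilities from a dict of utilities."""
    u_max = max(utilities.values())
    expu = {j: math.exp(u - u_max) for j, u in utilities.items()}
    denom = sum(expu.values())
    return {j: e / denom for j, e in expu.items()}

def draw_probabilities(x, means, sds, rng):
    """Draw one coefficient vector per severity level and return P(Y = j)."""
    utilities = {}
    for j in means:
        beta = [rng.gauss(m, s) for m, s in zip(means[j], sds[j])]
        utilities[j] = sum(b * v for b, v in zip(beta, x))
    return softmax(utilities)

def simulated_probabilities(x, means, sds, n_draws=2000, seed=1):
    """Average the outcome probabilities over random coefficient draws."""
    rng = random.Random(seed)
    totals = {j: 0.0 for j in means}
    for _ in range(n_draws):
        for j, p in draw_probabilities(x, means, sds, rng).items():
            totals[j] += p
    return {j: t / n_draws for j, t in totals.items()}

# Hypothetical values: [intercept, binary covariate]; PDO is the reference level.
means = {"PDO": [0.0, 0.0], "NI": [2.0, 1.0], "F/I": [0.5, 0.6]}
sds = {"PDO": [0.0, 0.0], "NI": [0.0, 0.7], "F/I": [0.0, 0.4]}

p_random = simulated_probabilities([1.0, 1.0], means, sds)
print(p_random)
```

Setting every standard deviation to zero collapses the sketch to the fixed parameters case, which makes the FP-MNL a special case of the RP-MNL.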

Full Bayesian Estimation
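The full Bayesian approach based on Markov chain Monte Carlo (MCMC) is utilized to estimate the RP-MNL. In the full Bayesian approach, prior information and observed data are combined to obtain the posterior distributions of the RP-MNL parameters. Let $\Theta$ represent the parameters in the RP-MNL, which is given as follows
$$\Theta = \{\beta_{ijk}, \beta_{jk}, \sigma_{jk}^2\}$$
According to Bayesian inference, the posterior distribution of parameters $\Theta$ can be estimated as follows
$$f(\Theta \mid Y) = \frac{f(Y, \Theta)}{f(Y)} = \frac{f(Y \mid \Theta)\, \pi(\Theta)}{\int f(Y \mid \Theta)\, \pi(\Theta)\, d\Theta}$$
where $f(\Theta \mid Y)$ is the posterior distribution of parameters $\Theta$ conditional on the observed dataset $Y$; $f(Y, \Theta)$ is the joint probability distribution of the observed dataset $Y$ and parameters $\Theta$; $\pi(\Theta)$ is the prior distribution of parameters $\Theta$; and $f(Y \mid \Theta)$ is the likelihood function conditional on parameters $\Theta$.
Due to the lack of information on the random parameters, non-informative prior distributions for the random parameters in the RP-MNL are used in this study. The prior distributions for the parameters are given as follows
$$\beta_{jk} \sim N(\bar{\mu}, \bar{\sigma}^2) \tag{8}$$
$$\sigma_{jk}^2 \sim \text{Inverse-Gamma}(\bar{a}, \bar{b}) \tag{9}$$
where the mean of the random parameters follows a normal distribution, and the variance of the random parameters follows an inverse gamma distribution.
The parameters with overlines in Equations (8) and (9) are hyper-parameters, which are set to fixed values so that the priors are non-informative. Based on the prior distributions of parameters $\Theta$, the posterior distribution $f(\Theta \mid Y)$ can be derived as follows
$$f(\Theta \mid Y) \propto \prod_{i=1}^{N} \prod_{j=1}^{J} \left[ P(Y_i = j \mid \beta_{ij}) \right]^{\delta_{ij}} \times \prod_{i,j,k} N(\beta_{ijk} \mid \beta_{jk}, \sigma_{jk}^2) \times \prod_{j,k} N(\beta_{jk} \mid \bar{\mu}, \bar{\sigma}^2)\, \text{Inverse-Gamma}(\sigma_{jk}^2 \mid \bar{a}, \bar{b})$$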
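The structure of the posterior (likelihood, times the normal layer for the random parameters, times the priors on their mean and variance) can be sketched on the log scale for a single random parameter. The hyper-parameter defaults below are illustrative vague choices and are not the values used in the study; the function names and the example inputs are likewise hypothetical.

```python
import math

def log_normal_pdf(x, mean, variance):
    """Log density of Normal(mean, variance) at x."""
    return -0.5 * (math.log(2.0 * math.pi * variance) + (x - mean) ** 2 / variance)

def log_inverse_gamma_pdf(v, shape, scale):
    """Log density of Inverse-Gamma(shape, scale) at v > 0."""
    return (shape * math.log(scale) - math.lgamma(shape)
            - (shape + 1.0) * math.log(v) - scale / v)

def log_prior(mean, variance, prior_mean=0.0, prior_variance=1.0e4,
              prior_shape=0.001, prior_scale=0.001):
    """Log prior of a random parameter's mean and variance.

    Default hyper-parameters are assumptions chosen to be vague.
    """
    return (log_normal_pdf(mean, prior_mean, prior_variance)
            + log_inverse_gamma_pdf(variance, prior_shape, prior_scale))

def log_posterior_kernel(betas_i, mean, variance, log_likelihood):
    """Unnormalized log posterior: likelihood + random-parameter layer + priors."""
    random_layer = sum(log_normal_pdf(b, mean, variance) for b in betas_i)
    return log_likelihood + random_layer + log_prior(mean, variance)

# Hypothetical observation-level coefficients and log-likelihood value.
value = log_posterior_kernel([0.4, 0.9, 0.7], mean=0.5, variance=0.25,
                             log_likelihood=-120.0)
print(value)
```

An MCMC sampler only needs this kernel up to an additive constant, which is why the normalizing integral in Bayes' rule never has to be evaluated.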

Risk Factors Analysis
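The odds ratio (OR) is employed to analyze the impact of risk factors on e-bike collision severities. The OR of a risk factor is the multiplicative change in the odds of an outcome (severity level), relative to the reference level, when the value of the risk factor increases by one unit [19]. The OR for a risk factor $x_k$ in the RP-MNL can be calculated as follows
$$\mathrm{OR}_{jk} = \exp(\beta_{jk})$$
where $\beta_{jk}$ is the estimated coefficient of $x_k$ for severity level $j$.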
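A short sketch of the conversion between a logit coefficient and an odds ratio, assuming the usual definition OR = exp(beta). The coefficient value is hypothetical and the function names are illustrative.

```python
import math

def odds_ratio(beta):
    """Odds ratio implied by a logit coefficient."""
    return math.exp(beta)

def percent_change_in_odds(beta):
    """Percentage change in the odds for a one-unit increase in the factor."""
    return 100.0 * (math.exp(beta) - 1.0)

beta = -0.5  # hypothetical coefficient of a binary risk factor
print(odds_ratio(beta), percent_change_in_odds(beta))
```

A negative coefficient gives an OR below 1 (lower odds than the base category), a positive coefficient gives an OR above 1, and a zero coefficient gives an OR of exactly 1.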

Models Comparison
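The Deviance Information Criterion (DIC) is used to compare the regressions estimated with the full Bayesian approach. The DIC is given as follows
$$\mathrm{DIC} = \bar{D} + p_D, \qquad p_D = \bar{D} - \hat{D}$$
where $\bar{D}$ is the posterior mean of the deviance, $\hat{D}$ is the deviance evaluated at the posterior means of the parameters, and $p_D$ measures model complexity.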
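As a minimal sketch, the DIC can be computed from MCMC output using the standard definition: the posterior mean deviance plus the effective number of parameters, where the latter is the posterior mean deviance minus the deviance at the posterior means. The deviance values below are made up for illustration.

```python
def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, p_D) from sampled deviances and the plug-in deviance."""
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean  # effective number of parameters
    return d_bar + p_d, p_d

# Hypothetical deviance draws from an MCMC run.
dic_value, p_d = dic([102.0, 98.0, 100.0], deviance_at_posterior_mean=96.0)
print(dic_value, p_d)
```

A more flexible model usually lowers the mean deviance but raises the complexity term, so the DIC rewards a better fit only when it outweighs the added complexity.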

Estimation Results
Both the FP-MNL and the RP-MNL were estimated to identify and evaluate the impact of risk factors on the severity of e-bike collisions. The performance of the two models was compared using the DIC. The MCMC simulation-based full Bayesian approach was employed to estimate the posterior distributions of the models' parameters. WinBUGS software was used as the modeling platform. Two independent Markov chains with diverse initial values were run for 40,000 iterations for each of the parameters. The first 20,000 iterations in each chain were used for monitoring convergence and then discarded as burn-in runs. The convergence of the posterior distribution was monitored by visual inspection of the trace and autocorrelation plots. In this study, the collision severity PDO was selected as the reference category.
Tables 2 and 3 show the estimation results of the FP-MNL and the RP-MNL, respectively. Only the variables that are significant based on the 95% credible interval were kept in the tables. As shown in Tables 2 and 3, seven risk factors are found to be significantly associated with e-bike collision severities: age, gender, e-bike behavior, license plate, bicycle type, location, and speed limit. The FP-MNL coefficients are similar in magnitude and sign to those from the RP-MNL. Furthermore, the parameters of gender, e-bike behavior, bicycle type, and speed limit are found to be random parameters in the RP-MNL, presenting significant heterogeneous effects on e-bike collision severities.
The DIC for the FP-MNL is 457.3 while the DIC for the RP-MNL is 423.6, a difference of 33.7. As pointed out by Spiegelhalter et al. [20], a DIC difference within 2-7 already indicates less support for the model with the higher DIC; the difference observed here is far larger than that range. The comparison results therefore indicate that the proposed RP-MNL performs better than the FP-MNL, which confirms that accommodating the unobserved heterogeneity across observations could improve the model fit.
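The burn-in step and the DIC comparison can be sketched as follows. The two toy chains are made up and far shorter than the 40,000-iteration chains described above; the function names are illustrative, and only the two DIC values come from the text.

```python
def discard_burn_in(chain, n_burn_in):
    """Drop the first n_burn_in draws of a chain."""
    return chain[n_burn_in:]

def pooled_posterior_mean(chains, n_burn_in):
    """Posterior mean from the retained draws of all chains."""
    kept = [v for chain in chains for v in discard_burn_in(chain, n_burn_in)]
    return sum(kept) / len(kept)

# Toy chains started from diverse initial values (hypothetical draws).
chain_1 = [5.0, 3.0, 1.0, 1.2]
chain_2 = [-4.0, -1.0, 0.8, 1.0]
posterior_mean = pooled_posterior_mean([chain_1, chain_2], n_burn_in=2)

# DIC values reported in the text.
dic_fp, dic_rp = 457.3, 423.6
delta_dic = round(dic_fp - dic_rp, 1)
print(posterior_mean, delta_dic)
```

Discarding the early draws removes the influence of the starting values, so that the retained draws from both chains can be pooled for the posterior summaries.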

Interpretation of Model Estimates
Given that the RP-MNL outperforms the FP-MNL, it was selected for evaluating the risk factors associated with e-bike collision severities. As shown in Table 3, gender is found to be significantly associated with e-bike collision severities. Males are 1.88 times more likely to be involved in F/I e-bike collisions than females. Moreover, males are found to be 2.88 times more likely to be involved in NI e-bike collisions than females. This finding is consistent with previous studies [10,12], which showed that males are more crash prone while women are more risk averse. Wang et al. [9] also pointed out that male e-bike riders are more likely to be at fault in collisions than female riders. The parameters for this variable are normally distributed, with (mean, standard deviation) of (0.66, 0.45) for F/I collisions and (1.13, 0.67) for NI collisions. As shown in Figure 1a,b, the parameter distributions indicate that 92.88% of male e-bike riders are more likely to be involved in F/I e-bike collisions than female riders, whereas the remaining 7.12% of male e-bike riders have a lower probability of being involved in F/I e-bike collisions than female riders. Similarly, 95.42% of male riders have a higher likelihood of NI e-bike collisions than female riders, while 4.58% of male riders are less likely to be involved in NI e-bike collisions than female riders. The result confirms the heterogeneous effects across individuals.
Age is found to be significantly related to e-bike collision severities. According to the OR analysis, young and middle-aged e-bike riders are found to be 2.47 times and 1.45 times more likely to be involved in F/I e-bike collisions than old e-bike riders. Moreover, young and middle-aged e-bike riders are 1.57 times and 1.44 times more likely to be involved in NI e-bike collisions than old e-bike riders. The finding is consistent with those of previous studies [21,22]. Bernhoft and Carstensen [21] found that older e-bike riders are more cautious than younger riders, leading to a lower collision risk. Furthermore, Wu and Liu [22] found that young and middle-aged e-bike riders have a higher probability of violating traffic rules than older e-bike riders. However, this finding is inconsistent with Hu et al. [6] and Wang et al. [9]. Hu et al. [6] found that older e-bike riders have greater injury severity than the younger group. Wang et al. [9] found that older e-bike riders are more likely to be at fault in a collision.
E-bike collision severity is found to be significantly affected by e-bike riders' behaviors. E-bike riders who commit violations are 3.24 times more likely to be involved in F/I collisions and 2.78 times more likely to be involved in NI collisions. Distracted e-bike riders are 2.56 times more likely to be involved in F/I collisions and 2.04 times more likely to be involved in NI collisions. The result is straightforward because violation and distraction could increase e-bike riders' crash risk. The parameters of this variable (distraction) follow normal distributions with (mean, standard deviation) of (0.97, 0.63) for F/I collisions and (0.77, 0.63) for NI collisions. As shown in Figure 2a,b, distraction increases the probability of F/I e-bike collisions for 93.82% of e-bike riders, while for the other 6.18% of e-bike riders the probability of F/I e-bike collisions decreases. Similarly, distraction increases the probability of NI e-bike collisions for 88.92% of e-bike riders, while it decreases the probability of NI e-bike collisions for 11.08% of e-bike riders.
The binary variable license plate is found to be significantly associated with e-bike collision severities. The result showed an OR of 0.57 for F/I e-bike collisions and 0.59 for NI e-bike collisions for this variable, indicating that e-bike riders who install a license plate on their e-bikes are less likely to be involved in both F/I and NI collisions. The result is consistent with Guo et al. [10], who found that e-bike collisions have a strong negative relation with e-bike license plate use. Weinert et al. [12] highlighted that imposing a license system for e-bikes could make it easier to enforce traffic laws. As such, encouraging the use of license plates is an effective countermeasure to improve e-bike safety.
Bicycle type is significantly related to e-bike collision severities. Based on the OR analysis, e-bikes are found to be 0.65 times as likely to be involved in F/I collisions and 0.61 times as likely to be involved in NI collisions as e-scooters. This finding is in line with several studies [4,16,22], which found that e-scooters are riskier due to their higher operating speeds and their conflicts with other road users. The parameter of this variable is normally distributed with (mean, standard deviation) of (−0.44, 0.25) for F/I e-bike collisions and (−0.47, 0.32) for NI e-bike collisions. As shown in Figure 3a,b, 96.08% of the e-bikes have a lower probability of being involved in F/I collisions than e-scooters, whereas the remaining 3.92% of e-bikes have a higher probability of being involved in F/I collisions than e-scooters. Moreover, 92.9% of the e-bikes are less likely to be involved in NI collisions than e-scooters, whereas 7.1% of the e-bikes are more likely to be involved in NI collisions than e-scooters. The result implies heterogeneous effects across bicycle types.
Location is found to be a significant risk factor for e-bike collision severities. The OR analysis shows that F/I e-bike collisions are 2.03 times more likely to occur at intersections than on road segments. NI e-bike collisions are found to be 1.87 times more likely to take place at intersections than on road segments. This finding is intuitive, since e-bikes need to handle more complicated driving behaviors, such as interacting with turning vehicles, at intersections than on segments. As such, e-bikes are exposed to more traffic conflicts at intersections than on road segments, which increases the e-bike crash risk.
Speed limit showed a significant positive relation with e-bike collision severities. According to the OR result, e-bikes are 1.79 times more likely to be involved in F/I collisions where the speed limit is greater than 45 km/h than where it is less than 45 km/h. As for NI collisions, e-bikes are found to be 1.88 times more likely to be involved in NI collisions where the speed limit is greater than 45 km/h than where it is less than 45 km/h. The parameters of this variable were found to be random, following a normal distribution with (mean, standard deviation) of (0.57, 0.44) for F/I collisions and (0.35, 0.28) for NI collisions. The heterogeneous effect of this variable on F/I collisions and NI collisions can be found in Figure 4a,b.
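The shares of riders with a positive or negative effect quoted above follow from the normal distribution of each random parameter. As a hedged sketch, treating the second number in each reported pair as a standard deviation (an assumption that reproduces the quoted shares), the share of the distribution above zero is the standard normal CDF evaluated at mean divided by standard deviation. The function names are illustrative.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def share_positive(mean, sd):
    """P(beta > 0) for beta ~ Normal(mean, sd**2)."""
    return normal_cdf(mean / sd)

# (mean, standard deviation) pairs reported in the text.
reported = {
    "gender, F/I": (0.66, 0.45),
    "gender, NI": (1.13, 0.67),
    "distraction, F/I": (0.97, 0.63),
    "distraction, NI": (0.77, 0.63),
    "bicycle type, F/I": (-0.44, 0.25),
    "bicycle type, NI": (-0.47, 0.32),
}

for name, (mean, sd) in reported.items():
    print(f"{name}: {100 * share_positive(mean, sd):.2f}% of the distribution above zero")
```

For bicycle type the mean is negative, so the quoted share (e-bikes less likely than e-scooters) corresponds to one minus the value printed here.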

Conclusions
This study investigated the unobserved heterogeneity in the severity of e-bike collisions. E-bike collision data from the city of Ningbo, China were used for the evaluation. A random parameters multinomial logit regression (RP-MNL) was developed to explore the risk factors associated with e-bike collision severities. The RP-MNL was estimated using the full Bayesian approach and compared with the fixed parameters multinomial logit regression (FP-MNL). The unobserved heterogeneous effects associated with observations were captured successfully by the proposed RP-MNL.
The comparison results showed that the RP-MNL outperformed the FP-MNL, as reflected by a lower DIC. The RP-MNL estimates showed that seven risk factors, including age, gender, e-bike behavior, license plate, bicycle type, location, and speed limit, were significantly related to both F/I e-bike collisions and NI e-bike collisions. Although previous studies found that risk factors such as age [6,21,22] and gender [9,10,12] contributed to e-bike collisions, they are limited in their capacity to explain the impact of such risk factors on different severity levels of e-bike collisions. In addition, these prior studies did not take unobserved heterogeneity into account in the modeling process. In this study, four risk factors, i.e., gender, e-bike behavior, bicycle type, and speed limit, were found to have heterogeneous effects on e-bike collision severities, appearing in the form of random parameters in the statistical model.
There are several limitations in this study. The data used in this study were limited to only one city. E-bike riders' driving patterns may vary among different cultural contexts and traffic environments, leading to different characteristics of e-bike collisions. As such, the transferability of the proposed model should be validated using data from other cities. Given the difficulty in collecting field data, e-bike riders' socioeconomic characteristics, such as income, educational background, and profession, were not included in the model. Future work could include these variables in the model and explore the mechanism by which they affect e-bike collision severities. Due to the difference in fatal injuries among age groups, future studies could model e-bike crash severity separately for different age groups.
In the DIC, $D$ represents the unstandardized deviance of the postulated model; $\bar{D}$ represents the posterior mean of $D$; $\hat{D}$ represents the point estimate obtained by substituting the posterior means of the model's parameters in $D$; $p_D$ is a measure of model complexity. The model with a smaller DIC outperforms the model with a larger DIC.

Figure 1. Varying effects of gender on e-bike collision severities. (a) Distribution of parameter estimates for F/I collision; (b) Distribution of parameter estimates for NI collision.

Figure 2. Varying effects of e-bike behavior on e-bike collision severities. (a) Distribution of parameter estimates for F/I collision. (b) Distribution of parameter estimates for NI collision.

Figure 3. Varying effects of bicycle type on e-bike collision severities. (a) Distribution of parameter estimates for F/I collision. (b) Distribution of parameter estimates for NI collision.

Figure 4. Varying effects of speed limit on e-bike collision severities. (a) Distribution of parameter estimates for F/I collision. (b) Distribution of parameter estimates for NI collision.

Table 1. Descriptive statistics of e-bike collision severity and explanatory variables.
* Selected as the base of the categorical variable.

Table 2. Estimates of parameters in FP-MNL.

Table 3. Estimates of parameters in RP-MNL.