Surrogate Safety Measures from Traffic Simulation: Validation of Safety Indicators with Intersection Traffic Crash Data



Introduction
In transportation engineering, analytical methods have been applied to address traffic congestion by shifting demand from private vehicles to transit systems [1] and by putting into operation better road traffic control tools, such as traffic simulation [2][3][4][5][6][7], network dynamic equilibrium models [8][9][10], and other methodologies devoted to changing user route choice [11][12][13][14][15]. Analytical methods based on traffic network simulation, however, have rarely been used to evaluate the safety level of a given traffic scenario. In this paper, traffic simulation is applied to assess safety levels related to the risk of road crashes.
Classical statistical methods can extract relevant information, such as the probable causes of crashes, from crash databases.

• Time to collision (TTC): the time interval before two vehicles collide if they keep their present trajectories. Since this value changes as the vehicles move, the value significant for the risk of a conflict is taken as the minimum TTC over the evolution of the encounter.

• Post-encroachment time (PET): generally defined, for two vehicles following each other in the same direction, as the interval between the instant when the leader vehicle leaves a given road space and the instant when the follower vehicle occupies the same space, as in Allen et al. [40].
Another notable indicator is:

• Deceleration required for avoiding a crash (DRAC): the maximum deceleration rate that must be applied to prevent a crash, calculated as DRAC = ΔV²/(2D), where ΔV is the closing speed between the two vehicles and D is the gap between them.

All the aforementioned indicators can be calculated on vehicle trajectories, and a microsimulation tool can be used to generate those trajectories. In this paper we used the TTC and PET indicators obtained from microsimulation.
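As an illustration of how these indicators can be computed from trajectory data, the following sketch assumes a simplified one-dimensional car-following geometry; the function names are our own and do not correspond to any specific package:

```python
def time_to_collision(gap, v_follower, v_leader):
    """TTC (s) for a follower closing in on a leader along the same lane.

    gap: bumper-to-bumper distance (m); speeds in m/s.
    Returns None when the vehicles are not on a collision course.
    """
    closing_speed = v_follower - v_leader
    if closing_speed <= 0:
        return None
    return gap / closing_speed


def post_encroachment_time(t_leader_leaves, t_follower_arrives):
    """PET (s): time between the leader leaving a road space and the
    follower occupying the same space."""
    return t_follower_arrives - t_leader_leaves


def drac(gap, v_follower, v_leader):
    """Deceleration (m/s^2) the follower must apply to avoid a crash:
    DRAC = closing_speed**2 / (2 * gap)."""
    closing_speed = v_follower - v_leader
    if closing_speed <= 0:
        return 0.0
    return closing_speed ** 2 / (2.0 * gap)
```

In a simulation run, the minimum TTC observed over an encounter would be retained as the conflict's risk value, and a conflict would be counted whenever TTC or PET falls below its threshold.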
Microsimulation software can provide detailed information on the dynamic evolution of traffic flow and performance measures such as delays, travel times, and/or queue lengths (e.g., Caliendo [41,42]), as well as pollutant and noise emissions. However, most microsimulation packages do not explicitly evaluate safety levels. Only recently has research been carried out on this issue [3,43,44].
The U.S. Federal Highway Administration [50] proposes the use of simulated conflict indicators to identify dangerous road spots and analyze suitable planning solutions. The idea is to use traffic conflicts as surrogate safety indicators instead of crash data: conflicts derived not from trajectories observed in the field (which are difficult or impossible to obtain for new projects) but from microsimulation tools.
Conflict-based techniques can also be used to assess crash risk levels where new connected and autonomous vehicles are introduced [59][60][61].
A new microsimulation package has instead been developed by some co-authors of this paper. The software, named TRITONE, has been described by Astarita et al. [35,62,63]. TRITONE was developed and implemented with the aims of reproducing traffic dynamics [2,64] and evaluating road safety levels. Since the first original version it has allowed users to evaluate, directly and dynamically, many road safety indicators, such as the deceleration rate to avoid a collision (DRAC) and time to collision (TTC) [65]. In more recent versions, TRITONE has been checked against the SSAM software, guaranteeing coherence in the evaluation of safety indicators: the evaluation of traditional safety indicators performed inside TRITONE produces exactly the same results as applying the SSAM software to the same set of trajectories. New versions also incorporate the simulation of new technologies, such as connected vehicles [43,66,67,68,69].
TRITONE can estimate traffic conflicts in defined areas of a road scenario, allowing users to compare areas and to identify where there could be an increased risk of crashes.
The application of microsimulation to evaluate safety levels is potentially very useful, since the surrogate safety measures allow the engineers to foresee dangerous spots in the planning phase of a road project, and thus to apply solutions that can improve a road project in terms of potential road safety.

Some Authors' Concerns in the Use of Classical Surrogate Safety Measures
It is our opinion that some concerns exist in the use of classic microsimulation-produced surrogate safety measures regarding four important issues:

1. Human factor modelling;
2. Limitations of traffic simulation packages;
3. Limitations of traffic safety indicators;
4. Effects of friction and shear forces in traffic flows.

The new safety indicator that is introduced in this paper and tested with real crash data can resolve these four issues, which are briefly discussed in the following:

Human Factor Modelling
The surrogate safety performance indicators frequently used in the literature are based on measures that do not consider the real dynamics of vehicle crashes. Some crashes cannot be captured, especially those resulting from a specific serious driver mistake, for example, a car crossing a signal-regulated intersection against the red light, or crashes resulting from trajectories that do not intersect. In reality, crashes can originate from driver errors, and a deviation of the given trajectories must be applied, as explained by Astarita and Giofrè in [70], to allow microsimulation models to consider these risks.
Astarita and Giofrè, in [71], present a probabilistic approach applied to establish a general causal model of traffic conflicts and crashes. The possibility of drivers committing specific errors has practically never been considered in previous works on surrogate safety performance. Bevrani and Chung, in [72], investigate the meaning of driver mistakes, and Stutts and Gish, in [73], report that 30% of the crashes considered in the U.S. originated from some driver error. These works list among the possible driver failures: using a smartphone, shaving, applying makeup, consuming food or beverages, tuning the radio or adjusting audio features, and turning around to talk to an infant seated in the back of the car.

Traffic Simulation Packages
Most microsimulation software does not include the feature of evaluating surrogate safety performance. The SSAM software, which can be used on trajectories produced by other packages (Stutts et al. [74]), has some issues discussed by Souleyrette and Hochstein in [75], such as not being able to classify the severity of conflicts or to produce maps of conflict locations on the road. In this respect, Pu et al. [75] propose some possible changes to the SSAM software.

Traffic Safety Indicators
Most surrogate safety performance indicators do not evaluate the potential outcome of a conflict. Different dynamics can cause different crashes, and a measure of conflict magnitude related to the potential seriousness of a crash is desirable, as noted in Souleyrette et al. [76]. Only recently has a paper been published on this issue (Ivan et al. [77]).

Effects of Friction and Shear Forces in Traffic Flows
Effects of friction and shear forces in traffic flows are not normally taken into consideration in surrogate safety measures. In other words, conflicts between vehicles moving on trajectories that do not intersect, and potential conflicts with road-side objects and obstacles, are not taken into account in the most commonly used surrogate safety indicators. Vehicles sometimes move in proximity to dangerous road-side objects, and can also move in proximity to vehicles coming from the opposite direction. In these circumstances, even a small trajectory modification might translate into a serious crash; yet in most published papers these circumstances are not considered, an important lacuna that has hitherto not been satisfactorily addressed in traffic engineering.
In this paper a new traffic safety methodology, which addresses all four points above and in particular introduces human error and the perturbation of trajectories, is presented. This methodology, named 'Zombie Driver,' is described in the subsequent paragraph and applied to certain urban intersection scenarios that have already been investigated in the papers by Caliendo et al. [26,52]. The novelty of this paper is that of applying the new Zombie Driver methodology to an established traffic and crash data set (obtained from the city administration of Salerno) and comparing the results with the traditional safety indicators that were validated by Caliendo et al. in [26,52]. Moreover, the validation produced in this paper is based on a comparison with real crashes using a more recent statistical approach based on random-parameter models.

Materials and Methods
The methodology presented in this paper was introduced in [5,70,78,79] and is based on the following two assumptions: (i) human error must be considered in traffic conflict indicators, since it is the main cause of crashes; (ii) a perturbation of the observed (correct) trajectories must be applied when defining a traffic conflict indicator, in order to also consider collisions with road-side objects and barriers.
The general procedure is depicted in Figure 1 and has the following steps: A. Starting data set: the set of trajectories for a given time period in a defined traffic network in a road scenario. B. Zombie (or bullet) vehicle choice: according to a random law (or a deterministic procedure), a travelling vehicle position at a defined instant of time t is extracted from the traffic network, x being the chosen zombie vehicle (Figure 2) and p(x,t) its position at time t. The speed s(x,t) of vehicle x at time t is also known (as a bi-dimensional vector). All calculations in this paper are based on a deterministic procedure: every vehicle is considered, at every second of the simulation, for a deviated trajectory, and in this way a great number of potential crashes is generated.
C. Deviation of trajectory: the choice of a deviated trajectory in case of driver error, driver distraction, or mechanical failure is an open research field. With this model we would like to replicate all kinds of human mistakes a driver could possibly commit. This is not a simple task, given the lack of an established probability distribution for driver mistakes. For this reason, and in order to implement the first applications of our trajectory-perturbation-based model, we adopted a simple hypothesis, namely that the deviated trajectory follows a straight path at a constant speed equal to the speed of the vehicle before the deviation. The laws of momentum suggest that this is the most likely event when the driver or the vehicle fails to perform normally. Moreover, a curved deviated trajectory corresponds to a straight trajectory at a certain (time-varying) angle. For this reason, an arrangement was also contrived to introduce the possibility of a certain angle between the deviated trajectory and the original correct trajectory.
In other words, the deviated trajectories can be generated in the Zombie Driver software according to any probabilistic (or deterministic) law, considering the initial correct position p(x,t) and speed s(x,t) of vehicle x at time t. In the application reported in this paper we decided to generate three perturbed trajectories (Figure 3) for each vehicle, with three different fixed angles corresponding to a straight trajectory and two deviated trajectories, (+α) to the right and (−α) to the left.
The angle and the speed define the path of the deviated motion that is simulated for a given time of distraction (DT).
D. Simulation of the deviated trajectory: starting at time t, vehicle x is simulated as moving on the deviated trajectory for the next DT seconds. E. Potential collisions: when the deviated trajectory of a zombie vehicle brings it to collide with another vehicle (Figure 4) (it is important to note that in our first implementation all other vehicles are considered as moving on their original trajectories and cannot perform evasive maneuvers) or with a road-side object, a potential collision is recorded and the severity of the crash is assessed in terms of impact energy or any other indicator.
The choice of a Y-shaped intersection in Figures 2-4 was made just to depict some aspects of the methodology; the methodology applies to all kinds of intersections and road traffic layouts.
The impact energy is calculated by applying the laws of physics for inelastic collisions and considering individual vehicle parameters such as weight and geometric dimensions. Further details can be found in Astarita and Giofrè [5].
The general procedure repeats until an established number of zombie vehicles has been considered, or until an established number of crashes has been generated. This is one of the parameters of the methodology, indicated in the following as the Zombie Driver frequency.
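A minimal sketch of steps B-E, assuming point-like vehicles, a simple proximity test in place of the rectangular-footprint collision check used by the software, and a perfectly inelastic impact-energy model (all names and parameter values here are illustrative):

```python
import math


def deviated_position(p, v, angle_deg, t):
    """Position after t seconds on a straight deviated trajectory:
    constant speed, heading rotated by angle_deg from the current one."""
    speed = math.hypot(*v)
    heading = math.atan2(v[1], v[0]) + math.radians(angle_deg)
    return (p[0] + speed * math.cos(heading) * t,
            p[1] + speed * math.sin(heading) * t)


def impact_energy(m1, v1, m2, v2):
    """Kinetic energy (J) dissipated in a perfectly inelastic collision."""
    dvx, dvy = v1[0] - v2[0], v1[1] - v2[1]
    return 0.5 * (m1 * m2) / (m1 + m2) * (dvx ** 2 + dvy ** 2)


def potential_crashes(vehicles, angles=(-10.0, 0.0, 10.0),
                      dt_max=5.0, step=0.5, radius=2.0):
    """Deterministic sweep: every vehicle, on every deviated angle, is
    simulated for up to dt_max seconds of distraction; a potential crash
    is recorded when it comes within `radius` metres of another vehicle,
    which keeps moving on its original (undisturbed) trajectory.
    Vehicles are dicts with keys 'p' (x, y), 'v' (vx, vy), 'm' (kg)."""
    crashes = []
    for i, zombie in enumerate(vehicles):
        for angle in angles:
            t = step
            while t <= dt_max:
                zp = deviated_position(zombie['p'], zombie['v'], angle, t)
                for j, other in enumerate(vehicles):
                    if j == i:
                        continue
                    op = (other['p'][0] + other['v'][0] * t,
                          other['p'][1] + other['v'][1] * t)
                    if math.hypot(zp[0] - op[0], zp[1] - op[1]) <= radius:
                        crashes.append({
                            'zombie': i, 'other': j, 'angle': angle,
                            'time': t,
                            'energy': impact_energy(zombie['m'], zombie['v'],
                                                    other['m'], other['v'])})
                        t = dt_max  # stop this deviated trajectory
                        break
                t += step
    return crashes
```

For example, two 1000 kg vehicles approaching head-on at 10 m/s each generate two potential crashes (one per zombie choice) on the straight deviated trajectory, each with an impact energy of 0.5 × 500 × 20² = 100 kJ.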
This general procedure solves the four above-indicated issues. In fact:

1. The methodology considers human error and can give a methodological explanation of why some kinds of crashes occur in the field (e.g., single-vehicle crashes, which are not considered in classical surrogate safety performance indicators). The general assumption that "proximity" of vehicles is dangerous is maintained, and near-crash events are replaced with "potential crashes."
2. With the Zombie Driver software, micro-simulators can be used to evaluate the level of safety also in the presence of road-side objects. In this respect, the procedure requires the set of vehicle trajectories plus additional information such as the location, geometrical shape, and elasticity coefficient of road-side objects and barriers.
3. Since Zombie Driver explicitly simulates potential crashes, it is possible to evaluate the exact crash dynamics in terms of impact energy and the probability of injuries and deaths.
4. It allows one to consider conflicts between vehicles travelling on close, dangerous trajectories that do not overlap.
Many parameters can be modified in the proposed methodology:

• The maximum distraction time: the time frame in which a driver is distracted and during which a collision with another vehicle or an obstacle may (or may not) occur. In our calculations we assumed deviated trajectories of 5 s.

• The distraction angle: the angle given to the two new lateral trajectories (a third straight trajectory is always projected with an angle equal to 0°). Currently this angle is set as a deterministic variable, but in the future it could be generated according to a Gaussian distribution with known mean and standard deviation.

• The minimum energy threshold: as a standard, this parameter is equal to zero in order to take into account all collisions, but it can be modified to consider only the more dangerous collisions that produce at least a fixed energy level.

• The size of the grid: one of the most useful outputs of Zombie Driver is the risk map. Risk maps are drawn using a grid superimposed over the map, on which risk areas are identified using a chromatic scale. This parameter represents the size of the discretization of the study area, and therefore the mesh size of the grid. The standard grid size used in this work is 5 m (square cells). The intersection indicators were calculated by manually defining the intersection area with a set of square cells belonging to the 5 m grid.

• The Zombie Driver frequency: the frequency according to which a driver commits a driving "error", moving on a deviated trajectory. In all calculations performed in this work, as already stated, we considered every vehicle, at every second of the simulation, as moving on three deviated trajectories.
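The grid-based risk map can be sketched as a simple binning of potential-crash coordinates into 5 m square cells, whose counts then drive the chromatic scale; the function below is our own illustrative sketch:

```python
from collections import Counter


def risk_map(crash_points, cell=5.0):
    """Bin potential-crash locations (x, y in metres) into square grid
    cells of side `cell` and count events per cell; the per-cell counts
    can then be rendered on a chromatic scale."""
    counts = Counter()
    for x, y in crash_points:
        counts[(int(x // cell), int(y // cell))] += 1
    return counts
```

An intersection-level indicator would then be obtained by summing the counts of the cells manually assigned to the intersection area.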
The methodology is designed to return quick information and to allow the user to understand what the critical points of the network are. Starting from regular safe trajectories, it generates three deviated trajectories for every vehicle at every second in time. Repeating this for every vehicle yields a great number of potential crashes, and for every potential crash the crash dynamics are fully known. With this information it is possible to extract many output parameters, as described in the following:

• Collision energy: average energy (E_a), total energy (E_t), and maximum energy (E_m) of the potential collisions detected on the network. It can be generated by collisions either between vehicles only or between vehicles and objects (e.g., trees, walls, etc.) and is divided into: general, based on all crashes detected on the road network under study; and single, for a single vehicle, distinguishing the part of the energy absorbed by the "zombie" vehicle causing the crash from that absorbed by the vehicle suffering it. This is also calculated for single-vehicle crashes and according to the collided object: vehicle-vehicle collisions, vehicle-rigid objects, and vehicle-elastic objects.

• Difference between vehicle speed vectors (Delta-V): average (dV_a), total (dV_t), and maximum (dV_m). This is the magnitude of the vector difference between the two vehicles' speed vectors during a collision.

• Index of collision severity: average (G_a), total (G_t), and maximum (G_m). This is the ratio between the energy generated during the collision and the distraction time necessary for the event.

• Number of collisions: total (C_t) or per angle. These are the collisions that occurred during the analysis of the network; they can also be differentiated according to the angle of the deviated trajectory.

• Distraction time: average (T_a) or per angle. This represents the average time that a "zombie" vehicle was driving on a deviated trajectory before being involved in a collision.
Since the Delta-V value for every vehicle involved in a potential crash is known, it is possible to estimate the probability of a death or an injury for every vehicle involved in a potential crash. The formula, proposed by Joksch in [80] and used by Evans in [81], directly establishes a probability of death or injury as a function of Delta-V: P = (ΔV/α)^k, where α and k are two fixed parameters.
With this formula it is possible to calculate indicators for the potential number of deaths and injuries:

• Number of deaths, with (D_y) or without (D_n) a safety belt: the sum over all collisions of the probabilities of a person being killed, depending on the value of the difference between the vehicle speed vectors.

• Number of injured, with (I_y) or without (I_n) a safety belt: the sum over all collisions of the probabilities of a person being injured, depending on the value of the difference between the vehicle speed vectors.
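Under a Joksch-type relation P = (ΔV/α)^k, the death and injury indicators reduce to sums of per-vehicle probabilities. The sketch below is our own; the α and k values in the test are purely illustrative, not the parameters calibrated in the paper:

```python
def casualty_probability(delta_v, alpha, k):
    """Joksch-type risk curve: P = (delta_v / alpha) ** k, capped at 1.
    alpha and k are fixed parameters, calibrated separately for deaths
    vs. injuries and for belted vs. unbelted occupants."""
    return min((delta_v / alpha) ** k, 1.0)


def expected_casualties(delta_vs, alpha, k):
    """Indicators such as D_y, D_n, I_y, I_n: the sum of per-vehicle
    probabilities over every vehicle involved in a potential crash."""
    return sum(casualty_probability(dv, alpha, k) for dv in delta_vs)
```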
It must be noted that in this work we assumed a uniform distribution of errors among drivers, which is a simplification, since one could imagine a population of drivers with different characteristics and different potential errors. The model has the potential to explore changes in the driver population and the introduction of automated vehicles by differentiating the potential errors among vehicle categories. This is beyond the scope of this paper and represents a promising future development of our work, worth investigating once the necessary input data set is obtained.
The three main factors of traffic crashes are human factors, vehicle factors, and road environment factors. Classical safety performance analysis considers the first two by flagging a risky situation when vehicles are driving close to each other, in situations where a human error or a vehicle failure could lead to a crash. The Zombie Driver methodology also considers the road environment factor, since a potentially risky situation is also flagged when a vehicle is driving near a dangerous road-side object. The Zombie Driver procedure can in general assess the safety of a specific road-side design; in this paper this feature has not been applied, since road-side obstacles were not simulated in the Zombie Driver modelling.

Results
The crash data were extracted from the official reports of the urban police of the city of Salerno, while traffic flows were computed using video cameras placed at each intersection (for further details see El-Basyouny et al. [53]). In particular, four intersection scenarios (labelled I to IV), containing one or more single intersections (i = 1, . . . , 9), were investigated: I (i = 1, 2, and 3), II (i = 4 and 5), III (i = 6, 7, and 8), and IV (i = 9). Table 1 reports the corresponding crash data observed during the peak hours over a period of 5 years, together with the entering traffic volumes VHP_e,i.

Micro-Simulation Calibration
The study scenarios were simulated with TRITONE. The supply and demand of the four areas in the various peak periods were modelled. Each scenario was calibrated using the traffic flow information available for the various road sections and the GEH statistic. The GEH statistic, introduced by Geoffrey E. Havers in the 1970s, is an empirical formula that has proven useful for various traffic analysis purposes. In our case we applied the GEH statistic by comparing simulated flows M with real measured flows C: GEH = sqrt(2(M − C)² / (M + C)). Applying this methodology to each section of the various simulated scenarios, the GEH value was less than 5.0 in all the cases studied, so the total success rate satisfies the standard requirement of positive cases in more than 85% of the total, as indicated by the UK Highways Agency's Design Manual for Roads and Bridges [82].
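The calibration check can be sketched as follows, with M the simulated hourly flow and C the observed one; the GEH formula is standard, while the acceptance helper is our own wrapper around the DMRB-style 85% criterion:

```python
import math


def geh(simulated, observed):
    """GEH statistic for hourly flows (vehicles/hour)."""
    return math.sqrt(2.0 * (simulated - observed) ** 2
                     / (simulated + observed))


def calibration_ok(pairs, threshold=5.0, required_share=0.85):
    """DMRB-style acceptance: GEH below `threshold` in at least 85% of
    the (simulated, observed) flow pairs, one pair per road section."""
    passed = sum(1 for sim, obs in pairs if geh(sim, obs) < threshold)
    return passed / len(pairs) >= required_share
```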
In all the simulations, Gipps' car-following model was used and the flow was generated on the network with a negative exponential distribution of headways between departures, while 30 random simulations were carried out for each scenario. The main output that TRITONE provided, for the analysis with the Zombie Driver methodology, was the set of vehicle trajectories. The data exchanged between TRITONE and Zombie Driver included not only the coordinates of the center of gravity of each vehicle, but also its four corners and vehicle-specific parameters such as speed and mass.

Analysis of Simulation Results
The simulation outcomes, expressed in terms of the mean collision energy (E_a) and the number of collisions (C) evaluated by the Zombie Driver software, are reported in Table 2, together with the number of collisions (C*) calculated using TTC and PET as safety indicators. It must be noted that the collisions obtained with the Zombie Driver software are the (potential) crashes obtained by perturbing the trajectories of all vehicles every second, while for TTC and PET the collisions are merely the violations of the TTC and PET threshold values.
Summary statistics of the results are reported in Table 3.

Literature Review on Random-Parameter Models
Given the simulation outputs, in terms of the safety indicators summarized in Table 3, we performed a statistical crash analysis to establish how well each indicator estimates the known number of real crashes that occurred.
The potential of traditional statistical models and newer ones has been discussed above all in [83][84][85], as well as in [24,25,52] with reference to Italian studies. In the last decade, random-parameter models, in which the regression parameters are assumed to be random, have been increasingly adopted to account for the unobserved heterogeneity associated with the variables (covariates). In this respect, several studies show the advantages of random-parameter models over fixed-parameter ones in terms of goodness-of-fit [26,86,87,88,89]. However, it should be noted that in the aforementioned studies the random parameters are assumed to be independent of each other. According to Conway and Kniesner [90], instead, when the correlations between the random parameters are neglected, the estimates may be biased.
Correlated random-parameter models have so far been little investigated in the literature. Conway et al. [91] employed a correlated random-parameter Tobit model, showing a better goodness-of-fit compared to the corresponding uncorrelated random-parameter model. Coruh et al. [92] found that a correlated random-parameter negative binomial (NB) model was statistically superior to the corresponding uncorrelated random-parameter NB model. Hou et al. [93] also found that the correlated random-parameter NB model is more appropriate. By investigating crash frequency in Italian tunnels, Caliendo et al. [18] developed a univariate correlated random-parameter model, which is better in terms of goodness-of-fit than both the uncorrelated random-parameter model and the random-intercept one (the latter is also known as the random-effect model; for more in-depth knowledge see Greene [94] and Hilbe [95]). Finally, Saeed et al. [96] found that the correlations among the random parameters were statistically significant.
In light of the above considerations, and given the aforementioned lack of studies, in this paper we apply a correlated random-parameter model for the statistical analysis of the crashes that occurred at the four intersection scenarios investigated.

Methodology
In crash analysis, it is typically assumed that the crash count Y_i, occurring at peak hour i of the investigated intersection scenarios during the 5-year monitoring period, follows a negative binomial (NB) distribution [53]. However, since a preliminary statistical analysis showed that α (the over-dispersion parameter of the NB model) converged to zero, a univariate correlated random-parameter Poisson (CRPP) model was used in the present paper. Therefore, Y_i is distributed as follows:

P(Y_i = y_i) = exp(−λ_i) λ_i^{y_i} / y_i!

Let λ_i and X_i denote the expectation of Y_i and the vector of k covariates, respectively. For λ_i, the following log-linear function is used:

ln(λ_i) = β_i X_i

where β_i is the vector of random parameters for observation i, which is assumed to follow a multivariate normal distribution:

β_i = b + ω_i,   ω_i ~ MVN(0, C)

where b is the mean vector; C is the variance-covariance matrix, which allows one to take into account the correlations among the elements of β_i; and ω_i is the randomly distributed disturbance vector. In particular, b = (b_1, . . . , b_j)' and C is the j × j matrix of the variances and covariances of the random parameters, j being the number of random parameters. Clearly, in an uncorrelated random-parameter model, the off-diagonal elements of the variance-covariance matrix are equal to zero.
In the present paper, we investigate and compare two different CRPP models, which include different sets of covariates: (i) Model A establishes a relationship between the crashes and the collisions (C*) calculated in simulation on the basis of safety indicators such as TTC and PET, the traffic flow (TF, also indicated in the text as VHP_e,i), and a dummy variable (Du) accounting for the much higher number of crashes at Scenario I in the afternoon peak hours (between 12:00 p.m. and 1:00 p.m., and between 1:00 p.m. and 2:00 p.m.); (ii) Model B involves the following explanatory variables: mean energy (E_a), traffic flow (TF), and the dummy variable (Du). Note that Du takes the value 1 for the afternoon peak hours of Scenario I and 0 elsewhere.
In each model, the parameters are estimated by maximizing the likelihood function, while a method based on 2000 Halton draws (see Saeed et al. [97] for details) is adopted to obtain efficient convergence of the numerical integration in the estimation.
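The simulated-likelihood machinery behind such a model can be illustrated with a small numpy sketch. For simplicity it uses pseudo-random normal draws rather than Halton sequences, and the function name and toy data are our own; the correlation among parameters enters through the Cholesky factor of C:

```python
import math

import numpy as np


def crpp_simulated_loglik(Y, X, b, C, n_draws=500, seed=0):
    """Simulated log-likelihood of a correlated random-parameter Poisson
    model: beta_i ~ MVN(b, C), ln(lambda_i) = x_i . beta_i.
    Plain Monte Carlo draws stand in for Halton draws."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(np.asarray(C, dtype=float))
    b = np.asarray(b, dtype=float)
    ll = 0.0
    for y_i, x_i in zip(Y, X):
        x_i = np.asarray(x_i, dtype=float)
        # correlated parameter draws: b + L @ standard normals
        betas = b + rng.standard_normal((n_draws, len(b))) @ L.T
        lam = np.exp(betas @ x_i)                              # lambda per draw
        pmf = np.exp(-lam) * lam ** y_i / math.factorial(y_i)  # Poisson pmf
        ll += math.log(pmf.mean())                             # average over draws
    return ll
```

Maximizing this function over b and (the free elements of) C would yield the parameter estimates; a full implementation would replace the pseudo-random draws with 2000 Halton draws.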

Procedure for Choosing the Statistically More Significant Variables and Models' Comparison
In order to identify the subset of explanatory variables whose effect on crash counts is not negligible, a procedure based on the likelihood ratio test (LRT) is adopted. The LRT statistic, which in our case has an asymptotic chi-square distribution (χ 2 ) with 1 degree of freedom, is defined as follows:

$LRT = -2\,\big[\,l(\beta, \beta_k = 0, \varphi) - l(\beta, \varphi)\,\big]$

where $l(\beta, \varphi)$ is the log-likelihood of the regression model containing all covariates, and $l(\beta, \beta_k = 0, \varphi)$ is the log-likelihood of the regression model with covariate k excluded. Since a significance level of 10% is assumed in this paper, a variable is included in the regression model when the LRT statistic corresponding to the model without it is at least 2.71.
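The variable-selection rule above reduces to a simple comparison. A sketch with hypothetical log-likelihood values (not the paper's actual fits):

```python
# Hypothetical log-likelihoods, for illustration only
ll_full = -152.4          # model containing all covariates
ll_restricted = -156.1    # model with covariate k excluded (beta_k = 0)

# LRT statistic, asymptotically chi-square with 1 degree of freedom
lrt = -2.0 * (ll_restricted - ll_full)

# Chi-square critical value at the 10% significance level, 1 d.o.f.
CRITICAL_10PCT = 2.71
keep_covariate = lrt >= CRITICAL_10PCT
```

Here the statistic (7.4) exceeds 2.71, so the covariate would be retained in the model.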
With the purpose of comparing Models A and B in terms of goodness-of-fit, the root mean square error (RMSE) is also used. In general, models with a lower RMSE are preferred:

$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (Y_i - \lambda_i)^2}$

in which Y i is the crash count associated with observation i; λ i denotes the expectation of Y i ; and n is the sample size. Tables 4 and 5 show the estimation results provided by Models A and B, respectively. From Table 4 it is possible to observe that the expected number of crashes increases in a non-linear way with the collisions (C*), the traffic flow (TF), and the dummy variable (Du). For Model B (Table 5), instead, one may note that the expected number of crashes is negatively associated with the mean energy (E a ) and positively associated with the other variables (i.e., traffic flow (TF) and dummy variable (Du)). In both models, the LRT statistic associated with each variable is at least 2.71, which means that all covariates are statistically significant at a level below 10%.
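The RMSE criterion can be sketched directly; the observed counts and fitted Poisson means below are hypothetical values for illustration:

```python
import math

def rmse(observed, expected):
    """Root mean square error between observed crash counts and model expectations."""
    n = len(observed)
    return math.sqrt(sum((y - lam) ** 2 for y, lam in zip(observed, expected)) / n)

# Hypothetical crash counts Y_i and fitted expectations lambda_i
y = [2, 0, 1, 3, 1]
lam = [1.8, 0.4, 1.1, 2.5, 0.9]

score = rmse(y, lam)
```

Computing this score for Models A and B on the same observations gives the goodness-of-fit comparison reported in Table 8.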

Discussion and Analysis of Results
Given the values of the log-likelihood function, Model B is statistically equivalent to Model A.
Since the negative sign associated with the covariate E a (in Model B) does not appear to have a logical explanation, the correlation coefficient matrix of the random parameters deserves closer examination in order to better understand the effect of E a on crashes. Tables 6 and 7 report the correlation coefficient matrices of the random parameters for Model A and Model B, respectively. In particular, Table 7 shows that the mean energy (E a ) is positively correlated with the traffic flow (TF) (correlation coefficient = +0.11). This means that an increase in E a makes the effect of traffic flow on increasing crashes more significant.
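The correlation coefficients reported in Tables 6 and 7 are obtained by standardizing the estimated variance-covariance matrix C. A sketch with a hypothetical 2×2 covariance between the E a and TF random parameters, chosen so that the implied correlation matches the +0.11 reported for Model B:

```python
import math

def correlation_matrix(C):
    """Standardize a variance-covariance matrix (list of rows) into a
    correlation coefficient matrix: r_ij = sigma_ij / (sd_i * sd_j)."""
    sd = [math.sqrt(C[i][i]) for i in range(len(C))]
    return [[C[i][j] / (sd[i] * sd[j]) for j in range(len(C))]
            for i in range(len(C))]

# Hypothetical covariance matrix (E_a, TF); values are illustrative only
C = [[0.25, 0.055],
     [0.055, 1.00]]
R = correlation_matrix(C)
```

The diagonal of R is 1 by construction, and the off-diagonal entry gives the E a –TF correlation discussed above.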
The comparison between Models A and B in terms of goodness-of-fit was carried out by means of the RMSE. From Table 8 one can see that the estimated number of crashes is quite similar for both models, and almost equal to the number of real crashes observed in the field. Summing up, Models A and B are statistically equivalent both in terms of the log-likelihood function and the RMSE. The total numbers of crashes in 5 years estimated by the two models are also similar to each other, and almost equal to the number measured in the field over the same period of time. Therefore, the results appear to justify the use of the mean energy (E a ), as an alternative to the number of collisions (C*), as a surrogate safety measure.

Summary and Conclusions
This research was motivated by the need to explore the potential of the proposed methodology for assessing safety at urban intersections, using the collision energy as a surrogate safety measure as an alternative to the number of collisions based on the TTC and PET.
The results show that the statistical model in which the mean energy (E a ) appears (Model B) is statistically equivalent to the model based on collisions (Model A), and that it estimates essentially the same number of crashes as observed in the field.
The statistical analysis of Section 6, performed with state-of-the-art statistical modelling, shows through the correlation coefficient matrix that E a makes the effect of traffic flow on increasing crashes more significant, in contrast with the negative sign of the point estimate.
In conclusion, we can affirm that this methodology allows engineers to quickly identify the critical points of a road network by using simulated vehicular trajectories and the collision energy as a surrogate safety indicator.
The proposed Zombie Driver methodology depends on many parameters that should be properly calibrated. It would be logical to calibrate our methodology according to a rich set of traffic networks where crash data and traffic data are both available. These calibration efforts fall outside the scope of this paper and can be investigated in future research.
This study was carried out in a geographically limited area: Some intersections in the urban network of Salerno. Many parameters and model characteristics were set to default values instead of being accurately calibrated on a limited data set. The TTC and PET analyses carried out in this paper are also based on many parameters that assume default values. In fact, the default values adopted in the SSAM software were not obtained through a real calibration on an extensive set of traffic and crash data.
However, for a more consolidated verification of the proposed methodology, a more extensive data set (possibly built from a larger number of intersections in different cities) could become the basis for obtaining more general and more reliable results. Therefore, further research and data gathering will help to establish which safety indicator offers the more accurate crash estimations.
The proposed Zombie Driver methodology might also be usefully applied to mixed traffic scenarios with traditional and autonomous vehicles, once accurate information on the specification and development of autonomous vehicles is available. In other words, given the calibrated error frequency of both autonomous and human-driven vehicles, the Zombie Driver procedure could be useful to evaluate the safety of different road traffic layouts. Information on autonomous vehicles could be obtained from traffic sequence charts [98], which could provide "a concise, intuitive specification language for capturing expected and forbidden behaviors of autonomous vehicles in the space of all possible traffic situations."

It must be noted that, while some of the crashes observed during the 5 years may have been caused by adverse weather and poor road surface conditions, our database did not include this information. When one or more variables are omitted from a crash database, the so-called "unobserved heterogeneity" phenomenon across the data might occur. However, according to the international literature, unobserved heterogeneity may be captured by using random-parameter models in the statistical analysis of crashes. This is the case in our study, in which a statistical analysis based on random-parameter models is used (for a more in-depth treatment, see Milton et al. [87]); in this sense, our approach (and also the classic SSAM indicators) indirectly accounts for weather and road surface risks. This is because a deviated trajectory could originate from these adverse conditions, and a vehicle crash due to this kind of adverse condition would more likely occur where there is exposure in terms of traffic flows or out-of-scale safety indicators.
The reproduction of all 8760 yearly hours of traffic through simulation is possible, given enough traffic and weather data, and is one of several possible future developments of the topics discussed in this research.