A Comprehensive Assessment of the Existing Accident and Hazard Prediction Models for the Highway-Rail Grade Crossings in the State of Florida

Accidents at highway-rail grade crossings can cause fatalities and injuries, as well as significant property damage. In order to prevent accidents, certain upgrades need to be made at highway-rail grade crossings. However, due to limited monetary resources, only the most hazardous highway-rail grade crossings should receive a priority for upgrading. Hence, accident/hazard prediction models are required to identify the most hazardous highway-rail grade crossings for safety improvement projects. This study selects and evaluates the accident and hazard prediction models found in the highway-rail grade crossing safety literature to rank the highway-rail grade crossings in the State of Florida. Three approaches are undertaken to evaluate the candidate accident and hazard prediction models, including the chi-square statistic, grouping of crossings based on the actual accident data, and Spearman rank correlation coefficient. The analysis was conducted for the 589 highway-rail grade crossings located in the State of Florida using the data available through the highway-rail grade crossing inventory database maintained by the Federal Railroad Administration. As a result of the performed analysis, a new hazard prediction model, named the Florida Priority Index Formula, is recommended to rank/prioritize the highway-rail grade crossings in the State of Florida. The Florida Priority Index Formula provides a more accurate ranking of highway-rail grade crossings as compared to the alternative methods. The Florida Priority Index Formula assesses the potential hazard of a given highway-rail grade crossing based on the average daily traffic volume, average daily train volume, train speed, existing traffic control devices, accident history, and crossing upgrade records.


Introduction
The intersection of a railway and a roadway is generally referred to as a highway-rail grade crossing. As reported by the Florida Department of Transportation (FDOT) in the year of 2011, the State of Florida had a total of 4503 highway-rail grade crossings, 79% of which were controlled by the State (i.e., public highway-rail grade crossings), while the rest were owned and maintained by private entities [1]. All of these 4503 highway-rail grade crossings represent a potential for accident occurrence between railway traffic and highway traffic. A highway-rail grade crossing accident may have serious consequences, such as fatalities, injuries, property damage, spillage of hazardous materials, and delays Sustainability 2020, 12, 4291 3 of 27 hazard prediction models including the chi-square statistic, grouping of crossings based on the actual accident data, and Spearman rank correlation coefficient. Based on the conducted analysis, the most promising model will be recommended to rank/prioritize the highway-rail grade crossings in the State of Florida.
The remainder of this manuscript is organized as follows. Section 2 presents a concise overview of the research efforts on highway-rail grade crossing accidents. Section 3 selects the candidate accident and hazard prediction models from the ones which have been widely used in the highway-rail grade crossing safety literature. Section 4 provides a detailed description of the evaluation methodology adopted in this study. Section 5 presents the findings from the evaluation of the candidate accident and hazard prediction models for the highway-rail grade crossings in the State of Florida. Finally, Section 6 concludes this study and discusses future prospects.

Review of the Previous Efforts
This section of the manuscript presents an overview of the research efforts on the accident prediction models and the hazard prediction models, which have been used over the years to prioritize highway-rail grade crossings for upgrades. The scientific literature and the efforts undertaken by the state Departments of Transportation (DOTs) will be further reviewed throughout this section.

Review of Scientific Literature
A number of scientific articles on highway-rail grade crossing safety have been published to date [2,4,8–21]. For example, Austin and Carson [10] argued that the Peabody-Dimmick Formula, the New Hampshire Index, and the National Cooperative Highway Research Program (NCHRP) Report 50 Accident Prediction Formula have a limited descriptive potential, as these formulae rely on a few explanatory variables only. It was also outlined that the U.S. DOT Accident Prediction Formula had complexity issues because of its three-stage structure, and it was losing accuracy over time. Hence, the study presented an alternative model which was based on the negative binomial regression. It was underlined that the proposed model was simpler than the U.S. DOT Accident Prediction Formula and had great potential. Saccomanno et al. [11] proposed a risk-based model to identify the highway-rail grade crossings with high vulnerability to accidents. The developed model incorporated two prediction components: (a) accident frequency prediction; and (b) accident consequence prediction. The developed methodology was applied to the highway-rail grade crossings located in Canada.
Oh et al. [13] developed several statistical models to establish relationships between highway-rail grade crossing accidents and crossing characteristics. It was found that the number of accidents at highway-rail grade crossings increased with increasing total traffic volume and average daily train volume. The proximity of highway-rail grade crossings to commercial areas and the distance of the train detector from crossings also substantially influenced the accident occurrence. Yan et al. [15] developed a hierarchical tree-based regression model to analyze train-vehicle accidents at passive highway-rail grade crossings. The FRA's database and 27 years of the train-vehicle accident history in the U.S. (from 1980 to 2006) were considered in the study. The results demonstrated that installation of stop signs at passive highway-rail grade crossings can significantly improve the safety level. Chadwick et al. [4] conducted a literature review regarding highway-rail grade crossing accidents for high-speed passenger rail and heavy freight rail in the U.S. It was highlighted that adoption of high-speed passenger rail services on the existing freight railroads most likely would cause additional safety issues.
Hao and Daniel [16] assessed the influence of existing protection on driver injury severity in highway-rail grade crossing accidents. Ordered probit models were developed for both passive and active protection. The results indicated that speeds of vehicles and trains had a significant impact on driver injury severity. In particular, vehicles and trains traveling at speeds greater than 50 mph substantially increased the probability of fatal accidents for both passive and active crossings. Lu and Tolliver [18] applied various modeling techniques using the same datasets in order to tackle the under-dispersion issue (i.e., sample mean is greater than sample variance). The study results indicated that the existing protection, train volume, traffic volume, maximum train speed, highway pavement, number of tracks, and accident history were the critical factors influencing the occurrence of accidents at highway-rail grade crossings. Khan et al. [20] proposed a binary logit regression model, which was validated with the 2000-2016 highway-rail grade crossing accident data collected for the State of North Dakota. Several important predictors were highlighted, including the number of daily trains, number of through railroad tracks, maximum typical train speed, number of highway/traffic lanes, and presence of pavement markings.
Many of the aforementioned studies aimed to improve safety of highway-rail grade crossings in the U.S. However, there are many studies that investigated safety issues and evaluated different safety improvements for highway-rail grade crossings in other countries as well, including Australia [3,5,22–27], Finland [28], Great Britain [29], Hungary [30], and Taiwan [31], among others. Therefore, accidents at highway-rail grade crossings can be considered as an important issue not only in the U.S. but around the globe.

Review of the State DOT Efforts
The Virginia Highway & Transportation Research Council analyzed nationally recognized accident/hazard prediction models [32]. The study identified 13 accident/hazard prediction models, five of which were selected for evaluation, including: (1) the U.S. DOT Accident Prediction Formula; (2) the Peabody-Dimmick Formula; (3) the NCHRP Report 50 Accident Prediction Formula; (4) the Coleman-Stewart Model; and (5) the New Hampshire Formula. The chi-square test and the power factor analysis revealed that the U.S. DOT Accident Prediction Formula was superior to the other formulae in terms of ranking the most hazardous highway-rail grade crossings in the State of Virginia. Bowman [33] performed a survey among the rail-highway safety program coordinators in each state of the U.S., excluding Hawaii. The following accident/hazard prediction formulae were identified by the survey: (1) the NCHRP Report 50 Accident Prediction Formula, adopted by one state; (2) the Peabody-Dimmick Formula, adopted by two states; (3) the New Hampshire Hazard Index Formula, adopted by six states; and (4) the U.S. DOT Accident Prediction Formula, adopted by 11 states. Custom accident/hazard prediction formulae were used by 13 states that participated in the conducted survey. The survey results outlined that about 83% of the states that relied on the New Hampshire Hazard Index Formula were generally satisfied with its performance. Moreover, about 82% of the states that relied on the U.S. DOT Accident Prediction Formula were satisfied with it.
The Illinois DOT and the Illinois Transportation Research Center performed a study with an objective to evaluate various accident and hazard prediction formulae for the State of Illinois [34]. A total of 32 states participated in the survey that was undertaken as a part of the project. The key predictors used in various accident and hazard prediction formulae were discussed. Furthermore, the study conducted a comprehensive regression analysis to identify the factors that influence the accident occurrence the most based on the highway-rail grade crossing data for the State of Illinois. The Modified Expected Accident Frequency Formula was proposed by the study to prioritize highway-rail grade crossings. The University of Missouri-Columbia/Rolla conducted a study in collaboration with the Missouri DOT, aiming to evaluate a total of seven accident/hazard prediction models [35]. A new exposure index formula was also proposed, which relied on the features of the Kansas Design Hazard Rating Formula. The Spearman rank correlation coefficient factor was adopted in the study to quantify the difference between the predicted rankings and the baseline rankings (i.e., the rankings that were based on the actual accident data). The analysis results demonstrated that the Illinois Hazard Index Formula was the most accurate for active highway-rail grade crossings. On the other hand, the California Hazard Rating Formula was the most accurate for passive highway-rail grade crossings. Weissmann et al. [36] performed a study aiming to design a new methodology for prioritizing public highway-rail grade crossings for safety improvement projects in the State of Texas. The Revised Texas Priority Index was developed in order to address the existing drawbacks of the previously used Texas Priority Index Formula. 
The results from the analysis, conducted using 2011 accident data for 9108 highway-rail grade crossings in the State of Texas, showed a clear superiority of the Revised Texas Priority Index over the original Texas Priority Index Formula in terms of ranking the most hazardous highway-rail grade crossings. A benefit-cost analysis had been used by the Iowa DOT to prioritize the highway-rail grade crossings for safety improvement projects in the State of Iowa [37]. Furthermore, the Iowa DOT, in collaboration with the Institute for Transportation at Iowa State University, presented a weighted-index method along with a Microsoft Excel spreadsheet-based tool in order to prioritize the most hazardous public highway-rail grade crossings [38]. A number of factors were identified which were deemed critical by the stakeholders, including the truck traffic volume, traffic volume, proximity to schools, proximity to emergency medical services, and road system type, as well as out-of-distance travel.
Ryan and Mielke [39] identified the most common factors used in the existing accident/hazard prediction formulae with the aim to prioritize the highway-rail grade crossings in the State of Nevada. Based on the collected data, it was found that the train volume and the highway traffic volume were the key components for each one of the considered accident/hazard prediction formulae. It was also stated that the Nevada Hazard Index Model, which was used by the Nevada DOT at that time, should include a factor for train speed. Sperry et al. [40] evaluated the U.S. DOT Accident Prediction Formula against several alternative formulae for the highway-rail grade crossings in the State of Ohio. A total of six formulae were considered, including the following: (1) the Texas Priority Index Formula; (2) the North Carolina Investigative Index Formula; (3) the Missouri Exposure Index Formula; (4) the Florida Accident Prediction and Safety Index Formula; (5) the NCHRP Report 50 Accident Prediction Formula; and (6) the New Hampshire Hazard Index Formula. The U.S. DOT Accident Prediction Formula was found to be superior to the alternative formulae.

Literature Summary and Contributions of This Work
A review of the previous research efforts suggests that accident and hazard prediction formulae could become outdated. Moreover, an accident/hazard prediction formula can be accurate for one state but may not work for the other states. Besides, the values of different predictors may not be available in the state highway-rail grade crossing accident databases for certain accident/hazard prediction models. Considering the significant number of accidents at the highway-rail grade crossings in the State of Florida, this study aims to evaluate different accident/hazard prediction formulae for the highway-rail grade crossings in the State of Florida using the publicly available information provided by the FRA's highway-rail grade crossing accident database and the FRA's highway-rail grade crossing inventory database. Three evaluation approaches, identified from the reviewed literature, will be undertaken in this study, including the following: (1) chi-square statistic; (2) grouping of crossings based on the actual accident data; and (3) Spearman rank correlation coefficient. Based on the results, a final model will be recommended for the highway-rail grade crossings in the State of Florida.

Selection of the Candidate Accident and Hazard Prediction Models
The accident/hazard prediction formulae, identified through the review of the highway-rail grade crossing safety literature, were classified into two groups: (1) accident prediction formulae; and (2) hazard prediction formulae. The accident prediction models can be used to forecast the expected number of accidents over a certain time period. On the other hand, the hazard prediction models can be used to forecast the expected vulnerability of highway-rail grade crossings to accidents without specifying the number of predicted accidents. A total of 21 accident/hazard prediction formulae have been identified as a result of the conducted literature review. Out of 21 identified formulae, six formulae or 29% can be categorized as accident prediction formulae. The remaining 15 formulae or 71% can be categorized as hazard prediction formulae.
As a result of a detailed literature review, the following predictors were found to be the most common in the identified accident and hazard prediction models: (1) total number of trains per day; (2) total number of vehicles per day; (3) existing protection (i.e., type of warning devices used at a highway-rail grade crossing); (4) accident history; (5) train speed; (6) number of tracks; (7) sight distance; (8) number of traffic lanes; (9) highway vehicular speed; and (10) location (i.e., urban or rural designation). These predictors have been widely used by different state DOTs for ranking highway-rail grade crossings for safety improvement projects in the respective states.
Note that some of the accident and hazard prediction formulae could not be evaluated for the State of Florida due to the limited data available in the FRA's highway-rail grade crossing accident database and the FRA's highway-rail grade crossing inventory database. For example, the FRA's highway-rail grade crossing accident database and the FRA's highway-rail grade crossing inventory database do not provide the information regarding the sight distance at highway-rail grade crossings, which serves as a predictor for a number of hazard prediction models (e.g., the Kansas Design Hazard Rating Formula, the Missouri Exposure Index Formula, the New Mexico Hazard Index Formula, the South Dakota Hazard Index Formula). Furthermore, the North Carolina Investigative Index Formula requires the information regarding the average number of school bus passengers [34], which is not provided in the FRA's highway-rail grade crossing inventory database. The Nevada Hazard Index Model relies on the total number of near misses within the past three years in order to estimate a hazard index for a given highway-rail grade crossing [39], which is not available in the FRA's highway-rail grade crossing accident database.
The Florida Accident Prediction and Safety Index Formula was not evaluated in this study, since it requires information regarding the sight distance at highway-rail grade crossings, and the existing state highway-rail grade crossing inventory databases do not provide any up-to-date information regarding the sight distance at the highway-rail grade crossings in Florida. Hence, the Florida Accident Prediction and Safety Index Formula had to be withdrawn from the analysis. As a result of the data availability issue, four accident prediction formulae and six hazard prediction formulae were selected for the final analysis. A detailed description of the candidate accident prediction models and the candidate hazard prediction models that were considered in this study is provided in Appendix A. Note that the description of the models was compiled based on the available literature [34–42].

Evaluation Methodology
This section of the manuscript focuses on the methodology and criteria that were used to evaluate the candidate accident/hazard prediction models.

Input Data and Key Assumptions
As indicated earlier, the FRA's highway-rail grade crossing accident database and the FRA's highway-rail grade crossing inventory database were used as the primary data sources to evaluate the candidate accident/hazard prediction models for the highway-rail grade crossings in the State of Florida. A number of assumptions were adopted throughout the evaluation process. The key assumptions include the following:

1. A given highway-rail grade crossing will be excluded from the analysis if there is a missing value (i.e., "empty cell") in the FRA's highway-rail grade crossing accident database and/or the FRA's highway-rail grade crossing inventory database for a certain predictor which is directly used by a given accident/hazard prediction model.

2. If the FRA's highway-rail grade crossing inventory database provides "zero" values for certain predictors that are associated with a given highway-rail grade crossing (including AADT, total number of trains per day, maximum timetable train speed, number of main tracks, number of main and other tracks, and number of traffic lanes), they will be reset to "1." Such a modification in the predictor values is required to ensure that no abnormal accident or hazard prediction values (e.g., "-∞", "+∞") will be returned by the candidate accident/hazard prediction models. For example, if the AADT value and/or the total number of trains per day are reported to be "zero" in the FRA's highway-rail grade crossing inventory database for a given highway-rail grade crossing (i.e., less than one vehicle and/or less than one train traverse a given highway-rail grade crossing per day), there will be an issue when estimating the additional parameter (K) for the Peabody-Dimmick Formula. Specifically, the unbalanced accident factor (l_u) will become "zero," which is outside the allowable boundaries, as the l_u value cannot be lower than "0.5" throughout estimations of the additional parameter K (see Appendix A for more details). However, resetting the AADT value and the total number of trains per day to "1" would resolve the issue.

3. If a given accident/hazard prediction model does not provide the protection factor values for certain protection types, the worst-case values for protection factors will be used in the analysis. For example, the New Hampshire Hazard Index Formula does not recommend any particular protection factor value for the highway-rail grade crossings that are equipped with crossbucks. Since the New Hampshire Hazard Index Formula assumes the worst-case protection factor value for stop signs (i.e., PF = 1.00), the protection factor for the highway-rail grade crossings with crossbucks will be set to PF = 1.00. Such an assumption is required in order to avoid a significant elimination of highway-rail grade crossings from the analysis due to absence of the protection factor values for certain protection types.

4. The accident data will be excluded from the analysis throughout the validation of a given candidate accident/hazard prediction model if these data were used to develop that model. For example, if a given candidate accident/hazard prediction model was developed using the 2012-2016 accident data, that model can be validated using the 2017 accident data (or the data collected for the years after 2016).

5. The actual accident data collected for the highway-rail grade crossings in the State of Florida for the year of 2017 will be used to develop the baseline ranking of highway-rail grade crossings. The exposure will serve as a secondary ranking criterion for the highway-rail grade crossings (e.g., if the same number of accidents were recorded for two highway-rail grade crossings over the considered time horizon, the highway-rail grade crossing that has a higher exposure value will receive a higher rank). The exposure of a given highway-rail grade crossing will be estimated as the product of the number of trains per day and the number of vehicles per day.

6. The estimated values of the Spearman rank correlation coefficient will be multiplied by a factor of "5" in order to accentuate the degree of correlation between the predicted rankings and the baseline rankings for each one of the considered accident/hazard prediction models. Such an approach is in line with the methodology that was adopted by Qureshi et al. [35] to evaluate different accident/hazard prediction models for the highway-rail grade crossings in the State of Missouri.

Note that only one year of the actual accident data (i.e., the year of 2017) was selected for evaluation of the accident and hazard prediction models due to the following reasons:

• First, based on the information provided by the FDOT [1], the majority of accidents at the highway-rail grade crossings in the State of Florida have similar features, including the following: (1) the accidents are caused by risky behavior of drivers; (2) the accidents are recorded for public highway-rail grade crossings; (3) the accidents primarily involve motor vehicles; and (4) the accidents are recorded for the locations that have active warning devices. Furthermore, the number of accidents at the highway-rail grade crossings in the State of Florida was reported to be within the same range (i.e., approximately 90 to 100 accidents per year) over the past five years based on the information provided by the FRA [7]. Therefore, the year of 2017 can be considered as a representative year for evaluation of the accident and hazard prediction models.
• Second, the present study was initiated at the beginning of 2019. The FRA may update the past records for the accidents that occurred several months ago in the highway-rail grade crossing accident database. Therefore, the 2018 highway-rail grade crossing accident database was still not finalized at the moment of conducting the present study, while the 2019 highway-rail grade crossing accident database was not even completed. As a result, the 2017 highway-rail grade crossing accident database was the most accurate for evaluation of the accident and hazard prediction models at the moment of conducting the present study.
• Third, the adopted approach (i.e., selection of one representative year of the accident data for evaluation of the candidate accident and hazard prediction models for highway-rail grade crossings in a given state) was found to be common in the highway-rail grade crossing safety literature and was used by previously conducted studies [33–36].
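
The data-preparation assumptions of this subsection (exclusion of crossings with missing predictor values, resetting zero-valued predictors to 1, and ranking by the 2017 accident count with exposure as the secondary criterion) can be illustrated with a minimal Python sketch. The field names (`aadt`, `trains_per_day`, `accidents_2017`) and the sample records are hypothetical and do not correspond to the actual column names or values in the FRA databases.

```python
from typing import Optional

# Hypothetical crossing records; keys are illustrative, not FRA field names.
crossings = [
    {"id": "A", "aadt": 12000, "trains_per_day": 10, "accidents_2017": 1},
    {"id": "B", "aadt": 0, "trains_per_day": 4, "accidents_2017": 0},
    {"id": "C", "aadt": 3000, "trains_per_day": 20, "accidents_2017": 1},
    {"id": "D", "aadt": None, "trains_per_day": 6, "accidents_2017": 2},
]

PREDICTORS = ("aadt", "trains_per_day")


def prepare(record: dict) -> Optional[dict]:
    """Drop records with a missing predictor; reset zero-valued predictors to 1."""
    if any(record[p] is None for p in PREDICTORS):
        return None  # assumption 1: exclude the crossing
    cleaned = dict(record)
    for p in PREDICTORS:
        if cleaned[p] == 0:
            cleaned[p] = 1  # assumption 2: avoid abnormal model outputs
    return cleaned


def exposure(record: dict) -> int:
    """Exposure as the product of trains per day and vehicles per day."""
    return record["trains_per_day"] * record["aadt"]


def baseline_ranking(records: list) -> list:
    """Rank by accident count, breaking ties by exposure (both descending)."""
    usable = [r for r in (prepare(rec) for rec in records) if r is not None]
    ordered = sorted(
        usable,
        key=lambda r: (r["accidents_2017"], exposure(r)),
        reverse=True,
    )
    return [r["id"] for r in ordered]


ranking = baseline_ranking(crossings)
print(ranking)
```

In this toy example, crossing D is dropped because its AADT is missing, and crossings A and C share the same accident count, so the one with the higher exposure is ranked first.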

Considered Highway-Rail Grade Crossings
The candidate accident/hazard prediction models were applied to the most hazardous highway-rail grade crossings located in the State of Florida. A detailed analysis of the highway-rail grade crossing accident data showed that at least one accident had been recorded for a total of 586 highway-rail grade crossings in the State of Florida between 2007 and 2017. However, the candidate accident/hazard prediction models could be evaluated only for 489 out of the 586 highway-rail grade crossings, since the FRA's highway-rail grade crossing inventory database did not have sufficient information for the remaining 97 highway-rail grade crossings that experienced at least one accident between 2007 and 2017. Furthermore, 50 passive highway-rail grade crossings and 50 active highway-rail grade crossings, which did not have any accidents between 2007 and 2017 but had the highest exposure values, were considered throughout the analysis as well. Therefore, a total of 589 highway-rail grade crossings were further analyzed using the candidate accident/hazard prediction models. Such an approach (i.e., selecting a diverse group of highway-rail grade crossings, rather than all the existing highway-rail grade crossings, to evaluate the candidate accident/hazard prediction models in a given state) had been widely used in the highway-rail grade crossing safety literature [33–35]. Figure 1 presents the distribution of the selected highway-rail grade crossings by protection type. It can be observed that active protection devices (such as bells, wigwags, highway traffic signals, flashing lights, gates, or other active devices) were installed at 494 highway-rail grade crossings (or 83.9% of the highway-rail grade crossings considered in the analysis).
On the other hand, passive protection devices (such as crossbucks, stop signs, or other passive signs or signals) were installed at 79 highway-rail grade crossings (or 13.4% of the highway-rail grade crossings considered in the analysis). A total of 16 highway-rail grade crossings (or 2.7% of the highway-rail grade crossings considered in the analysis) did not have any signals or signs.
The analysis of the highway classification data showed that a total of 447 roadways (or 75.9% of roadways) at the highway-rail grade crossings selected for evaluation of the candidate accident/hazard prediction models were categorized as urban roadways. A total of 142 roadways at the considered highway-rail grade crossings (or 24.1% of roadways) were categorized as rural roadways. Based on the available accident data, it was found that more accidents were recorded for the highway-rail grade crossings that were located in urban areas as compared to the ones that were located in rural areas. In particular, a total of 323 highway-rail grade crossings (or 54.8% of the highway-rail grade crossings considered in the analysis) experienced accidents between the year of 2007 and the year of 2016 in urban areas. On the other hand, a total of 115 highway-rail grade crossings (or 19.5% of the highway-rail grade crossings considered in the analysis) experienced accidents between the year of 2007 and the year of 2016 in rural areas. A total of 537 highway-rail grade crossings (or 91.2%) had paved roadways, while 52 highway-rail grade crossings (or 8.8%) had unpaved roadways. Descriptive characteristics of some other operational features of the selected highway-rail grade crossings are presented in Table 1, including the average values (mean), median values (median), standard deviation values (STD), maximum values (max), and minimum values (min).

Approaches for the Evaluation of the Candidate Accident/Hazard Prediction Models
The candidate accident/hazard prediction models were evaluated using the following approaches: (1) chi-square statistic; (2) grouping of crossings based on the actual accident data; and (3) Spearman rank correlation coefficient.

Chi-Square Statistic
The chi-square statistic had been used by some of the previously conducted studies on highway-rail grade crossing safety. In particular, the chi-square statistic was adopted by Faghri and Demetsky [32] to evaluate a number of accident prediction models (i.e., the U.S. DOT Accident Prediction Formula, the Coleman-Stewart Model, the Peabody-Dimmick Formula, and the NCHRP Report 50 Accident Prediction Formula) for the highway-rail grade crossings that are located in the State of Virginia. The chi-square statistic, which will be further used to quantify the goodness of fit of the candidate accident prediction models, can be calculated using the following equation [32,43,44]:

$$\chi^2 = \sum_{x=1}^{n} \frac{(AO_x - AC_x)^2}{AC_x}$$

where: $AO_x$ = the number of accidents observed at highway-rail grade crossing $x$; $AC_x$ = the number of accidents estimated using a given candidate accident prediction model for highway-rail grade crossing $x$; $n$ = the number of highway-rail grade crossings.
The chi-square statistic value ($\chi^2$) quantifies the goodness of fit between the estimated data and the observed data. If the chi-square statistic value is low, then the expected data (i.e., the accident prediction values provided by a given candidate accident prediction model for the considered highway-rail grade crossings) fit the observed data (i.e., the actual number of accidents recorded for the considered highway-rail grade crossings) well. On the contrary, if the chi-square statistic value is high, then the expected data do not fit the observed data well. Note that the chi-square formula is used to assess the accuracy of the accident prediction models only and is not applied to the hazard prediction models (since the hazard prediction models do not specify the number of predicted accidents), which is in line with the methodology that was adopted by Faghri and Demetsky [32].
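As an illustration, the chi-square computation described above can be sketched in a few lines of Python. The sketch assumes the standard form of the statistic, i.e., the sum of (AO_x − AC_x)² / AC_x over all crossings. The function name and the sample values are hypothetical and are not taken from the Florida data.

```python
def chi_square_statistic(observed, predicted):
    """Sum of (AO_x - AC_x)^2 / AC_x over all crossings x.

    observed  : actual accident counts AO_x (one value per crossing)
    predicted : model estimates AC_x (one value per crossing, must be > 0)
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    total = 0.0
    for ao, ac in zip(observed, predicted):
        if ac <= 0:
            # The statistic is undefined when the predicted value is zero or negative.
            raise ValueError("predicted accident counts must be positive")
        total += (ao - ac) ** 2 / ac
    return total


# Hypothetical example with four crossings (illustrative values only).
observed_accidents = [0, 1, 2, 0]
predicted_accidents = [0.5, 1.0, 1.0, 0.5]
chi2 = chi_square_statistic(observed_accidents, predicted_accidents)
```

A lower returned value indicates a closer fit between the predicted and the observed numbers of accidents. Hazard prediction models return dimensionless hazard scores rather than accident counts, which is why this function would not be meaningful for them.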

Grouping of Crossings Based on the Actual Accident Data
Based on the second approach, the actual or real-world accident data are used to validate the ability of accident/hazard prediction models to rank highway-rail grade crossings for safety improvement projects. A number of states adopted the latter approach in the past (e.g., the State of Alabama [33]; the State of Illinois [34]; the State of Texas [36]; the State of Ohio [40]). The FRA's highway-rail grade crossing accident database was further used to retrieve the actual accident data. The highway-rail grade crossings in the State of Florida were categorized into the top 15%, 20%, 25%, 30%, 40%, and 50% of the most hazardous highway-rail grade crossings using the actual accident data. After that, the candidate accident/hazard prediction models were deployed to rank the highway-rail grade crossings in the State of Florida using the data which were available in the FRA's highway-rail grade crossing accident database and the FRA's highway-rail grade crossing inventory database. The ranking of the highway-rail grade crossings was performed using the predicted number of accidents provided by the candidate accident prediction models. Similarly, the hazard values, provided by the candidate hazard prediction models, were also used to perform the ranking of the highway-rail grade crossings.
The number of highway-rail grade crossings captured by a given candidate accident/hazard prediction model for the top 15%, 20%, 25%, 30%, 40%, and 50% of the most hazardous highway-rail grade crossings was adopted as the key performance indicator for the model evaluation. The accident prediction model or the hazard prediction model that captures the largest number of highway-rail grade crossings for these hazardous highway-rail grade crossing categories can be considered the most effective model. This approach has some similarities with the power factor method, which was previously applied by Faghri and Demetsky [32] for the highway-rail grade crossings in the State of Virginia. The objective of the power factor analysis is to estimate the percentage of accidents that were recorded for the most hazardous highway-rail grade crossings, which were determined by the candidate accident prediction model or the candidate hazard prediction model [32]. As discussed earlier, the accident data were excluded from the validation of a given candidate accident/hazard prediction model if these data were used to develop that model. For example, if a given candidate accident/hazard prediction model was developed using the 2012-2016 accident data, then that model could be validated using the 2017 accident data.
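The grouping approach can be illustrated with the following minimal Python sketch, which counts how many crossings in the top share of the actual accident ranking also appear in the top share of a model ranking. The rounding of the group size (ceiling) and the tie-breaking rule (by crossing identifier) are assumptions made for the sketch, since the text does not specify them. The crossing identifiers and the values are hypothetical.

```python
import math


def top_group(scores, fraction):
    """Return the IDs of the top `fraction` of crossings (higher score = more hazardous).

    Assumptions: the group size is rounded up, and ties are broken by crossing ID.
    """
    group_size = math.ceil(fraction * len(scores))
    ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return set(ranked[:group_size])


def captured_crossings(actual, predicted, fraction):
    """Number of crossings that a model places in the same top group as the actual data."""
    return len(top_group(actual, fraction) & top_group(predicted, fraction))


# Hypothetical data: actual accident counts and model scores for eight crossings.
actual_accidents = {"A": 3, "B": 0, "C": 1, "D": 2, "E": 0, "F": 0, "G": 1, "H": 0}
model_scores = {"A": 9.1, "B": 7.5, "C": 1.2, "D": 8.0, "E": 0.3, "F": 0.9, "G": 2.2, "H": 0.1}

captured_top_25 = captured_crossings(actual_accidents, model_scores, 0.25)
captured_top_50 = captured_crossings(actual_accidents, model_scores, 0.50)
```

Repeating the call for each of the considered shares (15%, 20%, 25%, 30%, 40%, and 50%) and for each candidate model would yield the kind of comparison that the key performance indicator describes.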

Spearman Rank Correlation Coefficient
Based on the third approach, the performance of the candidate accident/hazard prediction models was assessed using the Spearman rank correlation coefficient. The Spearman rank correlation coefficient had been previously used by Qureshi et al. [35] to evaluate certain accident/hazard prediction models for the highway-rail grade crossings in the State of Missouri. The Spearman rank correlation coefficient can be defined as a nonparametric measure for rank correlation. The values of the Spearman rank correlation coefficient can range between −1 and +1. A value of −1 corresponds to a perfect negative correlation between the predicted ranking set, which was derived by the candidate model (i.e., the ranking of highway-rail grade crossings obtained using a given candidate accident/hazard prediction model), and the baseline ranking set (i.e., the ranking of highway-rail grade crossings obtained using the actual accident data). A value of +1 for Spearman rank correlation coefficient shows that there is a perfect positive correlation between the predicted ranking set and the baseline ranking set. On the other hand, a value of 0 demonstrates that no correlation exists between the considered datasets [35].
The Spearman rank correlation coefficient can be calculated using the following equation [35,45]:
$$r_s = \frac{\sum_{x=1}^{n} (P_x - \bar{P})(B_x - \bar{B})}{\sqrt{\sum_{x=1}^{n} (P_x - \bar{P})^2 \cdot \sum_{x=1}^{n} (B_x - \bar{B})^2}}$$
where: $r_s$ = the Spearman rank correlation coefficient; $P_x$ = the rank of highway-rail grade crossing $x$, proposed by a given candidate accident/hazard prediction model; $\bar{P}$ = the average ranking value of highway-rail grade crossings, proposed by a given candidate accident/hazard prediction model; $B_x$ = the baseline rank of highway-rail grade crossing $x$; $\bar{B}$ = the average baseline ranking value; $n$ = the number of highway-rail grade crossings.
The estimated values of the Spearman rank correlation coefficient will be multiplied by a factor of 5 in order to accentuate the degree of correlation between the predicted rankings and the baseline rankings for each one of the considered accident/hazard prediction models. The latter approach is in line with the methodology that was used by the study conducted by Qureshi et al. [35] for the highway-rail grade crossings in the State of Missouri. The FRA's highway-rail grade crossing accident database was used to develop the baseline ranking of the highway-rail grade crossings in the State of Florida based on the actual accident data.
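The scaled coefficient can be sketched in Python as follows. The sketch assumes that the coefficient is computed as the Pearson correlation of the two rank sets and that tied values receive the average of their ranks; the treatment of ties is an assumption, as the text does not describe it. The function names and the sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values in descending order (rank 1 = highest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def scaled_spearman(predicted_scores, baseline_scores, scale=5.0):
    """Spearman rank correlation multiplied by `scale` (5 in this study)."""
    p = average_ranks(predicted_scores)
    b = average_ranks(baseline_scores)
    p_mean = sum(p) / len(p)
    b_mean = sum(b) / len(b)
    numerator = sum((pi - p_mean) * (bi - b_mean) for pi, bi in zip(p, b))
    denominator = math.sqrt(
        sum((pi - p_mean) ** 2 for pi in p) * sum((bi - b_mean) ** 2 for bi in b)
    )
    if denominator == 0:
        raise ValueError("rank correlation is undefined when all values are tied")
    return scale * numerator / denominator


# Hypothetical model scores and actual accident counts for four crossings.
model_scores = [9.1, 7.5, 1.2, 8.0]
actual_accidents = [3, 1, 0, 2]
r_scaled = scaled_spearman(model_scores, actual_accidents)
```

With the factor of 5 applied, the returned value ranges between −5 and +5 instead of between −1 and +1.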

Case Study
This section of the manuscript presents the results of the analysis of the candidate accident and hazard prediction models for the highway-rail grade crossings in the State of Florida. Along with the canonical Connecticut Hazard Rating Formula, the canonical California Hazard Rating Formula, and the canonical Texas Priority Index Formula, this study evaluated modified versions of the aforementioned formulae. The modified formulae will be further referred to as the Modified Connecticut Hazard Rating Formula, the Modified California Hazard Rating Formula, and the Modified Texas Priority Index Formula, respectively. The canonical Connecticut Hazard Rating Formula, the canonical California Hazard Rating Formula, and the canonical Texas Priority Index Formula use the total number of accidents in the last 5 years, the last 10 years, and the last 5 years, respectively. The canonical versions of the formulae have a major drawback: they do not consider upgrades at highway-rail grade crossings, which generally cause significant changes in the crossing operational characteristics. In order to address this drawback, the Modified Connecticut Hazard Rating Formula, the Modified California Hazard Rating Formula, and the Modified Texas Priority Index Formula consider the total number of accidents observed in the last years (i.e., the last 5 years, the last 10 years, and the last 5 years, respectively) or after the year of the most recent upgrade (if the crossing was upgraded).
The accident data, downloaded from the FRA's highway-rail grade crossing accident database for the year of 2017, were used throughout evaluation of the candidate accident/hazard prediction models. On the other hand, the accident data between 2007 and 2016 were used by the candidate accident/hazard prediction models to develop the predicted rankings of highway-rail grade crossings. The baseline rankings of highway-rail grade crossings, derived using the actual accident data for the year of 2017, were compared against the rankings of highway-rail grade crossings, derived using the candidate accident/hazard prediction models. MATLAB [46] was used to conduct all the statistical analyses throughout this study.

Analysis of the Accident Prediction Models Based on the Chi-Square Statistic
The chi-square test was performed, where the observed number of accidents was obtained from the FRA's highway-rail grade crossing accident database for the year of 2017 and the predicted number of accidents was obtained using the candidate accident prediction models. The results of the conducted analysis are provided in Figure 2. It can be observed that the Peabody-Dimmick Formula had the closest fit to the observed number of accidents that were recorded for the 589 highway-rail grade crossings. In particular, the lowest value of the chi-square statistic ($\chi^2$ = 482.74) was estimated for the Peabody-Dimmick Formula. On the other hand, the chi-square statistic values comprised 1341.68, 1800.79, and 17,099.01 for the Coleman-Stewart Model, the U.S. DOT Accident Prediction Formula, and the NCHRP Report 50 Accident Prediction Formula, respectively (see Figure 2). Therefore, the worst performance was demonstrated by the NCHRP Report 50 Accident Prediction Formula for the considered highway-rail grade crossings in the State of Florida. As indicated earlier, the chi-square statistic values were not computed for the hazard prediction models, as the predicted number of accidents is necessary in order to conduct the chi-square test.

Analysis of the Accident/Hazard Prediction Models Based on the Crossing Groups
Figure 3 presents the number of the most hazardous highway-rail grade crossings captured by the candidate accident/hazard prediction models. The results from the conducted analysis show that the Texas Priority Index Formula, the Modified Texas Priority Index Formula, and the Michigan Hazard Index Formula typically performed better than the other candidate accident/hazard prediction models, since they were able to capture more highway-rail grade crossings in the groups that represent the top 15%, 20%, 25%, 30%, 40%, and 50% of the most hazardous highway-rail grade crossings in the State of Florida. Furthermore, it can be observed that fewer highway-rail grade crossings were generally captured by the U.S. DOT Accident Prediction Formula for the considered highway-rail grade crossing groups. The Connecticut Hazard Rating Formula and the Modified Connecticut Hazard Rating Formula also demonstrated a fairly weak performance as compared to the other candidate accident/hazard prediction models that were considered throughout the analysis.
The candidate accident prediction models were typically outperformed by the candidate hazard prediction models for the considered highway-rail grade crossings in the State of Florida. Such a finding can be explicated by the nature of the accident prediction models. Specifically, the accident prediction models are based on many coefficients which have to be calibrated using the historical data that contain the operational and physical characteristics of the highway-rail grade crossings in a given state. Changes in the operational and physical characteristics of highway-rail grade crossings are unavoidable over time. Hence, some of the coefficients become outdated for certain accident prediction models. Moreover, if the coefficient values were calibrated for the highway-rail grade crossings of a particular state or the entire U.S., these values may not be appropriate for the highway-rail grade crossings that are located in the State of Florida. On the other hand, the hazard prediction models are typically more generic and do not use a large number of coefficients which have to be continuously updated over a certain period of time. The hazard prediction models assess the crossing hazard based on the key operational and physical characteristics (e.g., number of trains per day, number of vehicles per day, existing protection, train speed, accident history, number of traffic lanes, number of tracks, etc.).
Furthermore, the analysis results show that the canonical Connecticut Hazard Rating Formula, the canonical California Hazard Rating Formula, and the canonical Texas Priority Index Formula were typically outperformed by the Modified Connecticut Hazard Rating Formula, the Modified California Hazard Rating Formula, and the Modified Texas Priority Index Formula, respectively. Such a finding can be explained by the fact that the Modified Connecticut Hazard Rating Formula, the Modified California Hazard Rating Formula, and the Modified Texas Priority Index Formula consider the total number of accidents which were observed in the last years (i.e., the last 5 years, the last 10 years, and the last 5 years, respectively) or after the year of the most recent upgrade (if the crossing was upgraded). On the other hand, the canonical versions of these hazard prediction formulae ignore the upgrades that were previously implemented at the considered highway-rail grade crossings. Highway-rail grade crossing upgrades may substantially change the operational and physical characteristics of the crossings and negatively influence the accuracy of the canonical Connecticut Hazard Rating Formula, the canonical California Hazard Rating Formula, and the canonical Texas Priority Index Formula.
The scope of this study also included an additional analysis, aiming to determine the number of common highway-rail grade crossings that were selected by all the considered accident and hazard prediction models for each one of the highway-rail grade crossing groups (each group was analyzed individually). The considered accident and hazard prediction models were able to identify a total of 10, 24, 31, 36, 68, and 148 common highway-rail grade crossings in the groups that represent the top 15%, 20%, 25%, 30%, 40%, and 50% of the most hazardous highway-rail grade crossings in the State of Florida, respectively. The common highway-rail grade crossings that were identified by all the considered accident and hazard prediction models for a given highway-rail grade crossing group (e.g., top 15% of the most hazardous highway-rail grade crossings) should receive more attention from the relevant stakeholders throughout selection of safety improvement projects.

Analysis of the Accident/Hazard Prediction Models Based on the Spearman Rank Correlation Coefficient
Figure 4 presents the Spearman rank correlation coefficient values which were estimated for the candidate accident/hazard prediction models. The results from the conducted analysis demonstrate that the Texas Priority Index Formula, the Modified Texas Priority Index Formula, and the Michigan Hazard Index Formula had the closest match with the rankings of highway-rail grade crossings, which were obtained based on the actual accident data. In particular, the Spearman rank correlation coefficient values comprised 3.636, 3.641, and 3.732 for the Texas Priority Index Formula, the Modified Texas Priority Index Formula, and the Michigan Hazard Index Formula, respectively. Note that the Spearman rank correlation coefficient value of −5 indicates a perfect negative correlation between the predicted rankings and the baseline rankings, while the value of +5 implies a perfect positive correlation. Hence, the Spearman rank correlation coefficients of the Texas Priority Index Formula, the Modified Texas Priority Index Formula, and the Michigan Hazard Index Formula show a strong positive relationship between the predicted rankings and the baseline rankings.
It was found that the U.S. DOT Accident Prediction Formula, the Connecticut Hazard Rating Formula, and the Modified Connecticut Hazard Rating Formula had the lowest Spearman rank correlation coefficient values (i.e., 1.500, 2.377, and 2.384, respectively), which show a fairly weak positive relationship between the predicted rankings and the baseline rankings. Moreover, the candidate accident prediction models generally had lower Spearman rank correlation coefficient values as compared to the candidate hazard prediction models (see Figure 4). The latter finding confirms that the predicted rankings of the highway-rail grade crossings in the State of Florida, provided by the candidate hazard prediction models, were more accurate as compared to the ones that were provided by the candidate accident prediction models (which is in line with the results of the analysis of the candidate accident/hazard prediction models based on the crossing groups; see Section 5.2 of the manuscript for more details).
The conducted analysis also shows that the canonical Connecticut Hazard Rating Formula, the canonical California Hazard Rating Formula, and the canonical Texas Priority Index Formula had lower Spearman rank correlation coefficient values as compared to the Modified Connecticut Hazard Rating Formula, the Modified California Hazard Rating Formula, and the Modified Texas Priority Index Formula, respectively. The latter finding confirms that the modified versions of the Connecticut Hazard Rating Formula, the California Hazard Rating Formula, and the Texas Priority Index Formula provided more accurate rankings of the highway-rail grade crossings in the State of Florida as compared to the ones that were provided by the canonical versions of the formulae (which is in line with the results of the analysis of the candidate accident/hazard prediction models based on the crossing groups; see Section 5.2 of the manuscript for more details). The accuracy of the modified versions of the aforementioned hazard prediction formulae can be explained by the fact that they directly account for the upgrades at highway-rail grade crossings when estimating the number of accidents that occurred in the past.

Final Model Recommendation
A detailed evaluation of the 13 candidate accident/hazard prediction models demonstrated that the Texas Priority Index Formula, the Modified Texas Priority Index Formula, and the Michigan Hazard Index Formula outperformed the other candidate accident/hazard prediction models in terms of the adopted performance indicators for the highway-rail grade crossings in the State of Florida. In particular, the Texas Priority Index Formula, the Modified Texas Priority Index Formula, and the Michigan Hazard Index Formula were able to capture more highway-rail grade crossings in the groups that represent the top 15%, 20%, 25%, 30%, 40%, and 50% of the most hazardous highway-rail grade crossings in the State of Florida. Moreover, the Texas Priority Index Formula, the Modified Texas Priority Index Formula, and the Michigan Hazard Index Formula returned the highest values of the Spearman rank correlation coefficient (i.e., 3.636, 3.641, and 3.732, respectively).
The Texas Priority Index Formula and the Modified Texas Priority Index Formula are more advantageous as compared to the Michigan Hazard Index Formula since they directly account for the accident history at highway-rail grade crossings. However, the canonical Texas Priority Index Formula does not consider any upgrades throughout estimations of the hazard index for highway-rail grade crossings and had a lower Spearman rank correlation coefficient value as compared to the Modified Texas Priority Index Formula. Therefore, this study recommends the Modified Texas Priority Index Formula, which will be further referred to as the "Florida Priority Index Formula", for prioritizing the highway-rail grade crossings for safety improvement projects in the State of Florida. The Florida Priority Index can be calculated for the highway-rail grade crossings using the following equation:
$$FPI = V \cdot T \cdot (0.1 \cdot S) \cdot PF \cdot (0.01 \cdot A^{1.15})$$
where: FPI = the Florida Priority Index; V = average daily traffic volume; T = average daily train volume; S = train speed, mph; PF = protection factor (1.00 for passive; 0.70 for mast-mounted flashing lights; 0.15 for cantilever flashing lights; and 0.10 for gates); A = accident history parameter, i.e., the number of accidents in the past 5 years or the number of accidents since the latest upgrade if the latest upgrade was made within the past 5 years (default = 1).
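The following Python sketch illustrates how the accident history parameter and the index could be computed for a single crossing. It assumes the product form V·T·(0.1·S)·PF·(0.01·A^1.15) that is commonly cited for the Texas Priority Index. It also assumes that the 5-year window covers the five calendar years preceding the evaluation year, that accidents are counted from the upgrade year onward, and that "default = 1" means a value of 1 is used when no accidents were recorded. All function and variable names are hypothetical.

```python
# Protection factor values as listed in the text.
PROTECTION_FACTORS = {
    "passive": 1.00,
    "mast_mounted_flashing_lights": 0.70,
    "cantilever_flashing_lights": 0.15,
    "gates": 0.10,
}


def accident_history(accident_years, evaluation_year, upgrade_year=None, window=5):
    """Accident history parameter A.

    Counts accidents in the `window` years before `evaluation_year`, or from the
    upgrade year onward if the latest upgrade falls inside that window (assumption).
    Returns 1 when no accidents are counted (assumed meaning of "default = 1").
    """
    first_year = evaluation_year - window
    if upgrade_year is not None and upgrade_year > first_year:
        first_year = upgrade_year
    count = sum(1 for year in accident_years if first_year <= year < evaluation_year)
    return max(count, 1)


def florida_priority_index(v, t, s, protection, a):
    """Assumed product form: V * T * (0.1 * S) * PF * (0.01 * A^1.15)."""
    pf = PROTECTION_FACTORS[protection]
    return v * t * (0.1 * s) * pf * (0.01 * a ** 1.15)


# Hypothetical crossing: 1000 vehicles/day, 10 trains/day, 40 mph, gates,
# accidents recorded in 2010, 2013, 2014, and 2016, upgraded in 2015.
a_param = accident_history([2010, 2013, 2014, 2016], evaluation_year=2017, upgrade_year=2015)
fpi = florida_priority_index(v=1000, t=10, s=40, protection="gates", a=a_param)
```

In this hypothetical case, only the 2016 accident is counted because the crossing was upgraded in 2015; without the upgrade record, the three accidents between 2012 and 2016 would be counted, which is how the canonical formula would treat the crossing.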

Conclusions and Future Work
A significant number of accidents are reported at highway-rail grade crossings across the United States (U.S.) every year. For example, a total of 1649 highway-rail accidents which resulted in 268 deaths and 755 injuries were reported for the highway-rail grade crossings in the State of Florida between January 2000 and December 2018. The state Departments of Transportation (DOTs) implement certain countermeasures to prevent accidents, such as upgrading of highway-rail grade crossing surface, replacement of warning signs, installation of reflective strips, replacement of gate mechanism, installation of median barrier systems, and others. However, applying countermeasures to all the highway-rail grade crossings in a region is not economically feasible. Therefore, highway-rail grade crossings have to be prioritized for safety improvement projects. This study focused on evaluation of the accident and hazard prediction formulae which have been widely used in the literature for the highway-rail grade crossings in the State of Florida.
A total of 21 accident/hazard prediction models were identified through a review of the relevant literature. Only four accident prediction models and six hazard prediction models were selected for a detailed evaluation due to the limited data that are available in the Federal Railroad Administration's (FRA's) highway-rail grade crossing accident database and the FRA's highway-rail grade crossing inventory database. In addition, modified versions of the Connecticut Hazard Rating Formula, the California Hazard Rating Formula, and the Texas Priority Index Formula were developed and assessed. The key difference between the canonical Connecticut Hazard Rating Formula, the canonical California Hazard Rating Formula, the canonical Texas Priority Index Formula, and their modified versions lies in the approach for estimating the number of accidents which occurred in the last years (i.e., the modified versions of the formulae directly account for the upgrades and estimate the number of accidents since the latest upgrade). The candidate accident/hazard prediction models were applied to the 589 most hazardous highway-rail grade crossings located in the State of Florida. A total of three performance indicators were used to assess the performance of the candidate accident/hazard prediction models, including the following: (1) chi-square statistic; (2) grouping of crossings based on the actual accident data; and (3) Spearman rank correlation coefficient.
A detailed evaluation of the 13 candidate accident/hazard prediction models demonstrated that the Texas Priority Index Formula, the Modified Texas Priority Index Formula, and the Michigan Hazard Index Formula outperformed the other candidate accident/hazard prediction models in terms of the adopted performance indicators for the highway-rail grade crossings in the State of Florida. The Modified Texas Priority Index Formula (referred to as the "Florida Priority Index Formula") was found to be methodologically superior to the canonical Texas Priority Index Formula as well as the Michigan Hazard Index Formula and was recommended for prioritizing the highway-rail grade crossings for safety improvement projects in the State of Florida. The Florida Priority Index Formula assesses a potential hazard of a given highway-rail grade crossing based on the average daily traffic volume, average daily train volume, train speed, protection factor, and accident history parameter. Unlike the canonical Texas Priority Index Formula, the Florida Priority Index Formula computes the accident history parameter based on the total number of accidents in the last five years or since the year of the last improvement (if there was an upgrade).
It is expected that the findings of this study and the developed Florida Priority Index Formula will assist the Florida DOT (FDOT) with an accurate prioritization of the highway-rail grade crossings in the State of Florida for safety improvement projects. This study can be extended further in several ways. First, a hazard prediction methodology for different severity categories (e.g., fatality, injury, property damage only, etc.) should be devised as an extension of the Florida Priority Index Formula. Second, additional factors could be considered in the proposed hazard prediction formula (e.g., vehicle composition, percentage of heavy vehicles, highway type, posted highway speed limit). Third, more advanced statistical methods could be developed for hazard prediction of the highway-rail grade crossings in the State of Florida and then compared to the Florida Priority Index Formula. Fourth, a resource allocation model, which directly relies on the Florida Priority Index values, should be developed to identify the highway-rail grade crossings that must be upgraded and select the appropriate upgrading type. Exact and heuristic methods should be considered to solve the developed resource allocation model. Fifth, the developed Florida Priority Index Formula should be evaluated for other states in the U.S., as it may be effective for ranking highway-rail grade crossings for safety improvement projects in the other states as well (not just Florida). Sixth, the protection factor values of the Florida Priority Index Formula should be calibrated for different types of warning devices (e.g., crossbucks, wigwags, bells).

Appendix A
This Appendix provides a detailed description of the candidate accident prediction models and the candidate hazard prediction models that were considered in this study.
Candidate Accident Prediction Formulae
This section of the manuscript provides a detailed description of the candidate accident prediction models, which were further evaluated for the highway-rail grade crossings in the State of Florida.
Coleman-Stewart Model. Sources: Faghri and Demetsky [32], Elzohairy and Benekohal [34] log 10 A = B 0 + B 1 ·log 10 C + B 2 ·log 10 T + B 3 ·(log 10 T) 2 (A1) where: Tables A1 and A2 present the values of coefficients for the accident prediction equation (derived based on a multiple linear regression analysis) and the associated R-squared values, which were reported by Coleman and Stewart. Table A1 provides the information for urban highway-rail grade  crossings, while Table A2 reports the data for rural highway-rail grade crossings. Note that Tables A1 and A2 present the values of the Coleman-Stewart Model coefficients based on protection type (i.e., automatic gates, flashing lights, crossbucks, other active protection types, stop signs, and no protection) and number of tracks (i.e., single track or multiple tracks). For example, the value of the B 0 coefficient comprises −2.17 for single-track urban highway-rail grade crossings with automatic gates (see Table A1). However, the value of the B 0 coefficient was found to be higher in rural settings and comprises −1.42 for single-track rural highway-rail grade crossings with automatic gates (see Table A2). Similarly, the values of the B 0 coefficient comprise −2.58 and −1.63 for multiple-track highway-rail grade crossings with automatic gates that are located in urban and rural settings, respectively.  where:

where:
A = factor based on a 10-year annual average daily traffic (AADT) (see Table A3);
B = factor based on the existing warning devices and urban/rural classification (see Table A4);
T = current train volume per day.

where:
$A_5$ = expected number of accidents in 5 years;
V = annual average daily traffic factor;
T = average daily train traffic factor;
P = protection coefficient.

The values of the factors required to estimate the expected number of accidents in five years ($A_5$) can be determined from Table A5 and the set of curves presented in Figures A1-A3. Furthermore, the additional parameter (K) is estimated based on the unbalanced accident factor ($l_u$). The unbalanced accident factor ($l_u$) can be calculated using the following equation:

Figure A3. Relationship between the additional parameter (K) and the unbalanced accident factor ($l_u$).

U.S. DOT Accident Prediction Formula. Sources: Qureshi et al. [35], U.S. DOT [41], FRA [42], Chadwick et al. [4], Ryan and Mielke [39].
The U.S. DOT Accident Prediction Formula is based on three stages: (1) estimation of the initial accident prediction; (2) estimation of the second accident prediction; and (3) estimation of the final accident prediction.

The Initial Accident Prediction
where:
a = the initial accident prediction, accidents per year at a highway-rail grade crossing;
K = formula constant;
EI = factor for exposure index based on the product of highway and train traffic;
MT = factor for the number of main tracks;
DT = factor for the number of through trains per day during daylight;
HP = factor for highway paved (yes or no);
MS = factor for maximum timetable speed;
HT = factor for highway type;
HL = factor for the number of highway lanes.

Table A6 presents the values of the highway-rail grade crossing characteristic factors for the highway-rail grade crossings with different types of protection (i.e., passive, flashing lights, and gates).

Notes:
c = annual average number of highway vehicles per day (total of both directions);
t = average total train movements per day;
mt = number of main tracks;
d = average number of through trains per day during daylight;
hp = highway paved (1.0 for paved and 2.0 for unpaved);
ms = maximum timetable speed, mph;
ht = highway type factor value (see Table A7);
hl = number of highway lanes.

The Second Accident Prediction
where:
B = the second accident prediction, accidents per year at a highway-rail grade crossing;
a = initial accident prediction, accidents per year at a highway-rail grade crossing;
$\frac{N}{T}$ = accident history prediction, accidents per year, where N is the number of observed accidents in T years at a highway-rail grade crossing;
$T_0$ = formula weighting factor = $\frac{1}{0.05 + a}$.
The second accident prediction formula yields the most accurate results when all the available accident history is considered. However, accident history collected over more than five years can be misleading, because the highway-rail grade crossing characteristics change over time. If a given highway-rail grade crossing was upgraded within the last five years (e.g., installation of flashing lights at a passive highway-rail grade crossing), only the accident history recorded after the upgrade should be considered in the estimation of the second accident prediction.
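The second stage can be illustrated with a short sketch. It assumes the weighted-average form in which the U.S. DOT formula is commonly published, B = T0/(T0 + T) · a + T/(T0 + T) · (N/T), since the equation itself is not reproduced in this Appendix. The function and argument names are hypothetical.

```python
def second_accident_prediction(a, n_accidents, t_years):
    """Blend the initial prediction with the observed accident history.

    Assumed form: B = T0/(T0 + T) * a + T/(T0 + T) * (N / T),
    with the weighting factor T0 = 1 / (0.05 + a) as defined in the text.
    """
    if a < 0 or n_accidents < 0 or t_years <= 0:
        raise ValueError("a and n_accidents must be >= 0 and t_years > 0")
    t0 = 1.0 / (0.05 + a)
    history_rate = n_accidents / t_years  # N / T, accidents per year
    weight_initial = t0 / (t0 + t_years)
    weight_history = t_years / (t0 + t_years)
    return weight_initial * a + weight_history * history_rate


# Illustrative inputs only: a = 0.05 accidents/year and one accident
# observed in five years of history.
b = second_accident_prediction(a=0.05, n_accidents=1, t_years=5)
print(f"Second accident prediction: {b:.3f} accidents per year")
```

Under this form, a longer history T shifts weight from the initial prediction toward the observed rate N/T. When a crossing was upgraded recently, T would cover only the years after the upgrade.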
The Final Accident Prediction
The final accident prediction (A) relies on the application of a normalizing constant in order to consider the current accident trends. The normalizing constant should be estimated for each category of highway-rail grade crossings (crossings with passive traffic control, crossings with flashing lights, and crossings with gates) by setting the sum of the number of predicted accidents multiplied by the corresponding normalizing constant equal to the number of accidents, which were recorded over a given time period.
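The calibration described above amounts to one ratio per crossing category. The following sketch shows one way to compute it, assuming the observed accident counts are expressed on the same time basis as the predictions (e.g., accidents per year). All names and numbers are hypothetical.

```python
def normalizing_constants(predicted_by_category, observed_by_category):
    """Return one constant per category so that
    constant * sum(predictions) == observed accidents."""
    constants = {}
    for category, predictions in predicted_by_category.items():
        total_predicted = sum(predictions)
        if total_predicted <= 0:
            raise ValueError(f"no positive predictions for {category!r}")
        constants[category] = observed_by_category[category] / total_predicted
    return constants


def final_predictions(predictions, constant):
    """Scale the second-stage predictions by the category constant."""
    return [constant * value for value in predictions]


# Hypothetical second-stage predictions and observed annual accidents.
predicted = {"passive": [0.1, 0.3], "flashing lights": [0.2, 0.2, 0.1]}
observed = {"passive": 0.5, "flashing lights": 0.4}

constants = normalizing_constants(predicted, observed)
passive_final = final_predictions(predicted["passive"], constants["passive"])
```

After scaling, the final predictions within each category sum to the observed total, which is the condition stated in the text.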
Candidate Hazard Prediction Formulae
This section of the manuscript provides a detailed description of the candidate hazard prediction models, which were further evaluated for the highway-rail grade crossings in the State of Florida.
New Hampshire Hazard Index Formula. Sources: Chadwick et al. [4], Ryan and Mielke [39].

where:
CoHI = the Connecticut Hazard Index;
AADT = annual average daily traffic;
T = number of trains per day;
PF = protection factor (see Table A8);
A = accident history (the total number of accidents in the last 5 years).

Protection factor (PF) values by warning device:
Flashing-light signals with cantilever arms and traffic signal interconnect: 0.24
Flashing-light signals with half-roadway gates: 0.11
Flashing-light signals with cantilever arms and half-roadway gates: 0.08
Flashing-light signals with cantilever arms, half-roadway gates, and traffic signal interconnection: 0.05
The addition of warranted motion sensor or predictor circuitry further reduces PF by 0.02.
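The protection factor values listed above, together with the 0.02 reduction for motion sensor or predictor circuitry, can be captured in a small lookup. This is only a sketch: the dictionary keys are hypothetical short labels, and only the four device combinations that appear in the text are included.

```python
# PF values as listed in the text; keys are hypothetical short labels.
PROTECTION_FACTORS = {
    "lights+cantilever+interconnect": 0.24,
    "lights+half_gates": 0.11,
    "lights+cantilever+half_gates": 0.08,
    "lights+cantilever+half_gates+interconnect": 0.05,
}

# Reduction for warranted motion sensor or predictor circuitry.
MOTION_SENSOR_REDUCTION = 0.02


def protection_factor(device, has_motion_sensor=False):
    """Look up PF for a device combination, applying the reduction if needed."""
    pf = PROTECTION_FACTORS[device]
    if has_motion_sensor:
        pf -= MOTION_SENSOR_REDUCTION
    return round(pf, 2)  # rounding removes floating-point noise


pf_gates = protection_factor("lights+half_gates", has_motion_sensor=True)
```

A lower PF indicates more protective warning devices, so the most complete device combination has the smallest value.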
The New Hampshire Hazard Index Formula had been used by the Michigan DOT to prioritize the highway-rail grade crossings for safety improvement projects. The key difference between the methodology used by the Michigan DOT and the canonical New Hampshire Hazard Index Formula lies in the protection factor (PF) values. Table A9 presents the values of the protection factor adopted by the State of Michigan for various types of countermeasures. If the value of the Michigan Hazard Index exceeds 4000 for a given highway-rail grade crossing that has only stop signs, crossbuck signs, yield signs, wigwag signals, bells, or manual warning, a system of flashing lights can be recommended for installation at that highway-rail grade crossing. Texas