Comparative Analysis of the Effectiveness of Three Proposed Network Screening Methods for Safety Improvement Sites on Rural Highways

Dhakal, Bishal; Al-Kaisy, Ahmed

doi:10.3390/su18042008

by Bishal Dhakal ¹,² and Ahmed Al-Kaisy ²,*

¹ WSP USA Inc., Seattle, WA 98154, USA
² Western Transportation Institute, Montana State University, Bozeman, MT 59717, USA
* Author to whom correspondence should be addressed.

Sustainability 2026, 18(4), 2008; https://doi.org/10.3390/su18042008

Submission received: 18 November 2025 / Revised: 8 February 2026 / Accepted: 9 February 2026 / Published: 15 February 2026

(This article belongs to the Collection Accident Prevention and Risk Management for Safe and Sustainable Transportation)

Abstract

Effective network screening methods play a significant role in highway safety management programs and contribute to sustainable mobility by facilitating the reduction in all crashes, including fatalities and injuries across the transportation system. This study presents a comprehensive analysis comparing the effectiveness of three new network screening techniques for pinpointing safety improvement locations on rural roads. The proposed methods are the Global Risk Scoring (GRS), the Crash Risk Index (CRI), and the Predicted Empirical Bayes (P-EB) methods. The analysis utilized 10 years of roadway geometry, traffic, and crash data from state-owned rural highways in Oregon, with the first five years (2011–2015) used for model development and the subsequent five years (2016–2020) for validation. Comparative tests assessed consistency with historical crash rankings and temporal stability across observation periods. The analysis revealed distinct strengths among the screening methods. The GRS method demonstrated a high level of consistency with historical crash data, while the P-EB method exhibited superior consistency across different time periods, suggesting its value for long-term safety planning. The CRI method demonstrated reasonable consistency in performance, irrespective of the test carried out. While no single method outperforms the others in all scenarios, each has unique advantages and data requirements that can better suit the agency’s needs, given available resources. This research provides actionable insights for improving safety management strategies and advancing sustainable mobility.

Keywords: network screening; rural road safety; sustainable mobility; highway safety improvement programs; comparative analysis

1. Introduction

Highway Safety Improvement Programs (HSIPs) serve as a fundamental means for enhancing safety across entire highway networks, thereby supporting the objectives of sustainable mobility. By accurately identifying high-risk locations through effective network screening, HSIPs enable transportation agencies to allocate limited resources where they yield the greatest safety benefits. This precision not only improves economic efficiency but also promotes social equity by prioritizing interventions in areas—such as rural corridors—where vulnerable populations face disproportionate crash risks. The HSIP is a federally funded initiative designed to assist states in advancing roadway safety through a data-driven and strategic framework [1]. The primary goal of this program is to minimize the occurrence of fatalities and severe injuries across the roadway network. Within the HSIP framework, network screening represents a fundamental analytical step aimed at identifying roadway segments that exhibit a higher potential for safety enhancement. Network screening results in a list of sites that are prioritized for detailed engineering studies that seek to identify crash patterns, contributing factors, and potential countermeasures. A robust network screening method facilitates the identification of locations where safety benefits can be maximized, optimizing the allocation of limited safety improvement resources. Traditional screening methods are reactive and rely heavily on crash history and traffic data [2]. Conversely, proactive safety approaches, which are not exclusively dependent on historical crash data, generally require comprehensive information on roadway geometric design elements and roadside conditions [2].

According to Fatality Facts 2021, published by the Insurance Institute for Highway Safety (IIHS), rural areas accounted for about 40% of traffic fatalities, even though only 20% of the U.S. population lived there and only 32% of total vehicle miles traveled (VMT) occurred there [3]. These statistics highlight the urgent need to enhance rural road safety and to ensure it is a priority in current state-level safety improvement programs. One of the main challenges in improving safety on rural highways is that crashes tend to be scattered across hundreds of miles, unlike the more concentrated pattern observed in urban areas.

Although numerous studies have evaluated network screening approaches, ranging from Empirical Bayes approaches to crash modification factor-based and risk factor-based methods, most require extensive data or are primarily reactive, limiting their applicability to rural, low-resource settings [4]. The reliability of predictive models is often questioned. One strategy to address this concern is to assess the accuracy of a model during its development phase. Therefore, examining the variables that influence the accuracy of crash count estimations can enhance decision-making processes and overall road safety [5,6,7,8]. Moreover, very limited research has compared multiple screening methods using different crash-severity groupings (such as fatal-and-injury versus all-crash datasets) to assess the stability and robustness of method performance across severity levels [4]. These gaps restrict the ability of agencies, particularly those with limited technical expertise, to select effective and practical network-screening tools.

This study seeks to evaluate and compare the effectiveness of three proposed network screening methods, namely the Global Risk Scoring (GRS), Crash Risk Index (CRI), and Predicted Empirical Bayes (P-EB) methods, developed specifically for crash risk prediction on rural roads. While all three methods are considered proactive and require minimal data and technical resources, their foundational frameworks differ. The GRS approach was developed entirely from empirical data on risk factors, together with crash history and traffic exposure. The development of the CRI method, on the other hand, was largely influenced by published crash modification factors for various roadway and roadside variables. Lastly, the P-EB method was developed using predictive regression models for the EB expected number of crashes, a well-established metric in safety analyses. The analysis is guided by the hypothesis that the P-EB method will exhibit greater temporal stability due to its regression-based foundation, whereas the GRS method will demonstrate stronger consistency with crash history because of its integration of empirical risk factors. Using historical crash records, traffic data, and roadway geometry data from Oregon’s rural two-lane highway network, this research aims to guide transportation agencies, particularly local agencies with limited resources, in selecting network screening approaches that optimize safety outcomes while advancing the broader objectives of sustainable mobility. Unlike emerging machine learning or AI-driven predictive tools, which often require large datasets, advanced computational resources, and specialized expertise, the methods examined in this study require only limited data and technical expertise.

2. Literature Review

Network screening is a fundamental component of highway safety management, aimed at identifying sites with the greatest potential for crash reduction. Over the past decades, numerous methods have been developed, ranging from simple crash-frequency-based approaches to advanced statistical techniques such as Empirical Bayes (EB). While these methods have been widely applied, their comparative performance and practical applicability under rural highway conditions remain insufficiently explored [4]. This section reviews existing research on network screening methods, highlights comparative evaluations, and identifies gaps that motivate the current study.

2.1. Comparative Studies and Performance Metrics

Persaud and Hauer [9] conducted a comparative study evaluating the Empirical Bayes approach and a nonparametric approach for correcting bias in before-and-after crash analyses. The analysis of multiple datasets indicated that the Bayesian approach generally provided more accurate and reliable estimates. In another study, Persaud and Hauer [10] introduced the concept of false identifications to assess the performance of various hotspot identification (HSID) methods. The research resulted in a tool for examining the performance of identification procedures based on performance measures that are easy to understand. Higle and Hecht [11] performed a simulation-based experiment to assess and compare different techniques for identifying hazardous roadway sites based on crash rate information. They found that a variation of the Bayesian method tended to perform well, producing low numbers of both false negative and false positive errors. Maher and Mountain [12] also used a simulation-based approach to compare methods, including ranking of sites on the basis of annual accident totals and accident reduction potential.

Kwon et al. [13] conducted an assessment of three network screening techniques aimed at identifying locations with high collision concentrations on highways. The techniques that were used in this study include Continuous Risk Profile (CRP), Peak Searching (PS), and Sliding Moving Window (SMW). By using the Empirical Bayes adjustment, the study estimated excess expected average crash frequency, considering two sets of Safety Performance Functions (SPFs). The results from each method were then compared with observed high collision locations. The study concluded that the Continuous Risk Profile method achieved low rates of false positives, demonstrating superior effectiveness in identifying sites requiring safety investigation.

Ambros et al. [14] conducted a study to examine the differences in the results of network screening when using multivariate versus simple crash prediction models, utilizing data from the road network in South Moravia, Czech Republic. The analysis involved comparing segment rankings produced by the two models using Spearman’s rank correlation coefficient. Additionally, the study evaluated the statistical distribution of Potential for Safety Improvement (PSI) values and determined the percentage of segments that appeared on both model-generated lists. The findings from the study showed that the results from both techniques were comparable and corresponded well with each other. In another study by Ambros et al. [15] focusing on a rural-road network spanning approximately 1000 km in South Moravia in the Czech Republic, various models were developed for network screening, and the performance of these models was evaluated. The models were a traditional accident-based approach, the Empirical Bayes accident prediction model, and simple proactive safety index methods. It was found that the simple safety index models, incorporating traffic volume, segment length, and curvature change rate, perform well for effective network screening.

An Italian study [16] evaluated the performance of various network screening methods based on multiple criteria, including their effectiveness in detecting hazardous locations over time, detecting sites with potential safety improvement efficiently, and maintaining ranking consistency. The evaluation emphasized factors such as site consistency, method consistency, overall rank discrepancies, and combined score assessments. The results indicated that the EB method performed better than all other evaluated methods. In another Italian study [17], simulated datasets were employed to assess the accuracy of different hotspot identification methods. Using Monte Carlo simulations, synthetic crash data—reflecting empirical trends—were generated to predefine hazardous locations and test the performance of each method. This study examined EB estimates, Potential for Safety Improvement (PSI), historical crash numbers, and crash rate, concluding that the EB and PSI approaches were more effective than traditional crash frequency and crash rate methods in identifying high-risk locations.

Elvik [18] performed a comparative analysis of five different techniques (accident count, accident rate, accident rate and count, EB estimate of accidents, and EB dispersion criterion) for identifying hazardous road locations using Norwegian road data. The performance of the five techniques was evaluated in relation to epidemiological criteria (sensitivity and specificity). The study concluded that applying the Empirical Bayes estimate results in the most reliable identification of hazardous road locations.

Dhakal and Al-Kaisy [19] evaluated an alternative approach that employed a predictive model to estimate the Empirical Bayes (EB) expected number of crashes using categorized representations of roadway geometry and roadside features rather than exact quantitative data. The model was applied to identify potential safety improvement locations on rural two-lane highways, utilizing data from the state of Oregon. The results indicated that the predictions from the new method compared unfavorably with those of the PSI and the conventional EB approaches. The analysis was performed using crash frequency and crash density metrics. In another study by Dhakal and Al-Kaisy [20], a proposed method utilizing a heuristic scoring scheme showed better performance than the EB method, particularly on lower traffic volume segments. However, for higher traffic volume roads, the EB method demonstrated superior performance.

Ghadi and Torok [21] investigated the impact of different road network segmentation approaches on the performance of various hotspot identification methods (HSIDs). The study compared four HSID techniques—Empirical Bayes (EB), excess EB, crash frequency, and crash rate—alongside four segmentation strategies: spatial clustering, equal segment length, equal traffic volume, and the standard segmentation method recommended by the Highway Safety Manual (HSM).

Khattak et al. [22] compared the performance of Empirical Bayes (EB) and Potential for Safety Improvement (PSI) methods for crash hotspot identification using advanced variants of the Negative Binomial model. By estimating both a random-parameter NB model and a varying-dispersion NB model, the authors showed that accounting for heterogeneity significantly improves crash prediction and hotspot ranking. Their evaluation using generalized consistency and ranking tests demonstrated that EB—especially when paired with the random-parameter NB model—outperforms PSI and provides more reliable hotspot identification, particularly for out-of-sample data. The study reinforces EB as a robust, computationally efficient standard for safety management applications.

Saedi et al. [23] assessed the reliability of the Empirical Bayes method for predicting crash frequency on two-lane rural highways by incorporating uncertainty and Monte Carlo-based reliability analysis. Using data from 64 segments in Iran, they showed that EB estimates can vary significantly across sites and identified which model variables contribute most to prediction uncertainty, offering a framework to evaluate the robustness of EB-based safety assessments.

Carvalho et al. [24] developed a mixed-effects crash prediction model to examine how roadway geometry and environmental conditions influence crash frequency along a mountainous segment of Brazil’s BR-116 highway. Using a decade of segment-level monthly data, the authors applied a Negative Binomial Generalized Linear Mixed Model (GLMM) to account for overdispersion and unobserved heterogeneity across space and time. The study demonstrates the value of integrating geometric, traffic, and climatic variables in predictive modeling and provides a robust framework for identifying high-risk locations in complex terrain, contributing to more sustainable and data-driven road safety management in Brazil.

Islam et al. [25] evaluated machine learning approaches for predicting crash severity and identifying crash hotspots in Al-Ahsa, Saudi Arabia. Using crash data from 2016 to 2018, the authors compared logistic regression, gradient boosting, and random forest models, finding that random forest achieved the highest predictive accuracy and identified key factors such as crash cause, collision type, distracted driving, and speeding as major contributors to severe injuries. The study also incorporated spatial autocorrelation and hotspot analysis (Getis-Ord Gi* and Moran’s I), revealing strong spatial clustering of severe crashes. Their combined machine learning and GIS framework demonstrates an effective approach for both severity prediction and hotspot detection in urban road networks.

Research on network screening methods has consistently highlighted the effectiveness of the Empirical Bayes (EB) method. However, smaller agencies lack technical expertise and data to use the more sophisticated EB method. Further, most of the evaluation studies of existing screening methods have focused on urban roadway networks with little research, if any, in comparing alternative screening methods in rural contexts where crashes are sparse, and roadways are operated by small agencies with limited resources. These gaps underscore the need to identify network screening (i.e., crash risk prediction) approaches that can perform effectively in rural contexts and be used by agencies with limited access to data and technical expertise.

2.2. Research Gap and Contribution of the Study

This study seeks to address the research gap discussed earlier by conducting a comprehensive evaluation of three proposed network screening methods (GRS, CRI, and P-EB methods) under rural highway conditions with limited data availability. The study examines multiple evaluation metrics, including consistency with crash history and temporal stability, to provide a nuanced understanding of method reliability over time. By systematically comparing these methods and clarifying their strengths and limitations, this research contributes to the development of data-driven strategies that support long-term safety planning and sustainable mobility objectives.

3. An Overview of the Proposed Network Screening Methods

To enhance clarity and facilitate comparison, Table 1 provides a comparative overview of the three network screening methods evaluated in this study—Global Risk Scoring (GRS), Crash Risk Index (CRI), and Predicted Empirical Bayes (P-EB). The table summarizes the key inputs and ranking basis, offering readers a concise reference to understand the distinguishing features of each approach.

The following subsections provide a descriptive overview of the proposed methods.

3.1. The Global Risk Scoring Method (GRS)

The GRS method applies two heuristic-based scoring systems to evaluate intersections and roadway segments separately along rural two-lane highways [18]. The scoring schemes were based on the Highway Safety Manual. The proposed method takes into account a variety of factors, including roadway and roadside features, observed crashes, and traffic volume. By assigning individual scores to variables related to roadway features, traffic conditions, and crash data, the method yields a total score for each site, which can then be used to rank sites across the network. The proposed scoring system for the roadway segments is shown in Table 2. Detailed information about this method can be found in a study by Al-Kaisy and Raza [26] and is not included here due to space constraints.

3.2. The Crash Risk Index Method (CRI)

A Crash Risk Index (CRI) was developed and proposed to serve as a multi-criteria risk assessment tool, integrating three key elements: geometric and roadside features, crash history, and traffic exposure [19]. The CRI is defined by the equation below.

$$CRI = W_{G} \cdot X_{G} + W_{C} \cdot X_{C} + W_{T} \cdot X_{T} \tag{1}$$

where CRI is the crash risk index; W_G and X_G are the weight and numerical score of geometric and roadside features; W_C and X_C are the weight and score for the contribution of the crash history; and W_T and X_T are the weight and score for the contribution of traffic exposure. Detailed equations for calculating the scores X_G, X_C, and X_T, along with the underlying variables, can be found in a study by Al-Kaisy et al. [27]. These models/equations were developed using extensive crash, traffic, and roadway data on rural two-lane highways in Oregon.
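As a minimal sketch of how Equation (1) combines its three components, the weighted sum can be written as a small function. The weights and component scores below are hypothetical placeholders chosen only for illustration; the actual weights and the equations for X_G, X_C, and X_T are given in [27] and are not reproduced here.

```python
def crash_risk_index(x_g, x_c, x_t, w_g, w_c, w_t):
    """Weighted sum of Equation (1): CRI = W_G*X_G + W_C*X_C + W_T*X_T."""
    return w_g * x_g + w_c * x_c + w_t * x_t


# Hypothetical weights and component scores (assumptions, not values from the study).
weights = {"w_g": 0.4, "w_c": 0.3, "w_t": 0.3}
cri = crash_risk_index(x_g=0.6, x_c=0.2, x_t=0.5, **weights)
print(f"CRI = {cri:.3f}")
```

Keeping the weights as explicit arguments makes it easy to test how sensitive a segment's index is to the relative emphasis placed on geometry, crash history, and exposure.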

3.3. The Predicted Empirical Bayes Method (P-EB)

The P-EB approach was introduced for application on rural two-lane highways [28]. The primary advantage of this method lies in its simplicity of implementation, particularly for local transportation agencies that may lack access to comprehensive roadway databases or advanced analytical expertise. In contrast to more complex techniques such as the traditional Empirical Bayes (EB) method, which depend on precise quantitative inputs for multiple roadway attributes, the P-EB approach employs categorized representations of geometric and roadside features. These categorical variables can be readily used by local entities without detailed data inventories or major field measurements, thereby enhancing the practicality and accessibility of the method for smaller jurisdictions. Table 3 shows the explanatory variables for the proposed model.

Using data collected from sample roadway segments, the approach involves developing an EB-based predictive model for the target network through multivariate linear regression analysis. Once established, this model can be subsequently applied to rank the roadway segments based on their safety improvement needs. The original model formulated using sample roadway segment data from Oregon is presented in Equation (2).

$$\ln(Exp) = -8.2709 + 0.7292 \cdot \frac{1}{FO} + 0.02717 \cdot DD + 0.98309 \cdot \ln(V) + 0.10126 \cdot DC + 0.94290 \cdot \ln(SL) \tag{2}$$

where Exp represents the predicted Empirical Bayes expected number of crashes; SL denotes the length of the segment in miles; V refers to traffic exposure (AADT); FO indicates fixed object rating; DD stands for driveway density; and DC represents the degree of curvature. More information on variable classification and other details on method development can be found in a study by Al-Kaisy and Huda [28].
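Equation (2) can be evaluated directly once the inputs are available. The sketch below uses the coefficients reported above; the input values are hypothetical, and the sketch assumes that FO, DD, and DC have already been coded according to the variable classification described in [28], which is not reproduced here.

```python
import math


def predicted_eb_crashes(fo, dd, aadt, dc, sl_miles):
    """Predicted EB expected crashes from Equation (2).

    fo: fixed object rating (must be nonzero), dd: driveway density,
    aadt: traffic exposure (AADT), dc: degree of curvature,
    sl_miles: segment length in miles.
    """
    ln_exp = (
        -8.2709
        + 0.7292 * (1.0 / fo)
        + 0.02717 * dd
        + 0.98309 * math.log(aadt)
        + 0.10126 * dc
        + 0.94290 * math.log(sl_miles)
    )
    return math.exp(ln_exp)


# Hypothetical one-mile segment (input values are assumptions for illustration).
example = predicted_eb_crashes(fo=2, dd=5, aadt=2000, dc=1, sl_miles=1.0)
density = example / 1.0  # predicted crashes per mile, the quantity used for ranking
print(f"Predicted EB crashes: {example:.3f}, per mile: {density:.3f}")
```

Because the model is linear in the logarithm of Exp, the prediction must be exponentiated before it is divided by segment length to obtain a density.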

4. Study Design

This study compared the performance of the three network screening methods described above. The study data and the comparison procedure are discussed below.

4.1. Study Data

Given the critical importance of data availability and accessibility for ensuring methodological rigor and transparency, this study focused on a state-owned rural two-lane highway network in Oregon. The methodology—based on segment-level analysis and network-wide sampling—was designed to be adaptable and replicable in other contexts where similar roadway and traffic data are available. Specifically, a sample consisting of roadway segments with a total length of 1495 miles was used in this study. The study sample comprised roadways from various parts across the state to ensure adequate geographic coverage, as shown in Figure 1.

All the state-owned rural two-lane highways were located using online geographic information system (GIS) data [29]. Data collection for roadway segments occurred in increments of 0.05 miles to ensure accurate coverage of physical characteristics, thereby minimizing the risk of overlooking changes in roadway attributes between consecutive observation points. The study sample included only roadway segments with a posted speed limit of 55 mph and explicitly excluded intersections. In addition, the 0.05-mile segments immediately upstream and downstream of intersection approaches were excluded from the dataset.

Data on traffic volumes, roadway geometry, roadside features, and crashes were compiled using the Oregon Department of Transportation (ODOT) online databases and video logs [31,32].

Crash data covering a 10-year period (2011–2020) were obtained from the ODOT online crash database, and corresponding traffic data for the same time frame were compiled separately. The ODOT online database was reviewed to confirm that no significant changes in roadway geometry occurred within the study segments during the analysis period [33]. Further, the initial study sample was reviewed and validated by Oregon DOT personnel to ensure that no modifications had occurred on any of the selected segments.

The final dataset was divided into uniform roadway segments using key variables such as traffic volume (AADT), lane type, lane width, shoulder type, and shoulder width. A change in any of these variables marks the end of one segment and the beginning of the next.
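The segmentation rule described above can be sketched as a pass over consecutive 0.05-mile records that merges neighbors until one of the key variables changes. The record layout, field names, and attribute values below are hypothetical; the study's actual data structure is not described in the text.

```python
from itertools import groupby

# Key variables that define a uniform segment (field names are assumptions).
KEYS = ("aadt", "lane_type", "lane_width", "shoulder_type", "shoulder_width")


def build_segments(points, step_miles=0.05):
    """Merge consecutive 0.05-mile records that share all key attributes."""
    segments = []
    # groupby only groups adjacent records, so a change in any key starts a new segment.
    for key, group in groupby(points, key=lambda p: tuple(p[k] for k in KEYS)):
        count = len(list(group))
        segment = dict(zip(KEYS, key))
        segment["length_miles"] = round(count * step_miles, 2)
        segments.append(segment)
    return segments


# Hypothetical records: AADT changes after the third 0.05-mile increment.
base = {"lane_type": "travel", "lane_width": 12, "shoulder_type": "paved", "shoulder_width": 4}
points = [{**base, "aadt": 1200}] * 3 + [{**base, "aadt": 1500}] * 2
segments = build_segments(points)
print(segments)
```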

Following this segmentation process, a total of 377 segments covering 1495 miles were identified. To maintain the consistency of the analysis, segments with incomplete data were removed from the sample to avoid potential bias in the study outcomes.

4.2. Methods

The first five years of data (2011–2015) were used for model development, and the following five years (2016–2020) were used for evaluation. Three ranking approaches were applied to identify high-risk segments: the GRS method, the CRI method, and the P-EB method. In the GRS method, segments are ranked based on crash density, derived by dividing the final score by the total length of the segment, whereas in the CRI method, segments are ranked based on the crash rate, obtained by dividing the risk index by the total segment length and traffic exposure. As in the GRS method, segments in the P-EB method are ranked based on crash density.
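The ranking step can be illustrated with a short sketch. It assumes that the CRI is normalized by the product of segment length and AADT, which is one reading of "segment length and traffic exposure"; the segment values are hypothetical.

```python
def rank_descending(values):
    """Return 1-based ranks, rank 1 = highest value (ties broken by input order)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


# Hypothetical segments (all values are assumptions for illustration).
segments = [
    {"id": "A", "grs_score": 42.0, "cri": 0.45, "length": 3.0, "aadt": 1500},
    {"id": "B", "grs_score": 30.0, "cri": 0.60, "length": 1.5, "aadt": 900},
    {"id": "C", "grs_score": 55.0, "cri": 0.30, "length": 6.0, "aadt": 2500},
]

# GRS: score per mile. CRI: index per mile per unit of AADT (assumed normalization).
grs_density = [s["grs_score"] / s["length"] for s in segments]
cri_rate = [s["cri"] / (s["length"] * s["aadt"]) for s in segments]

grs_rank = rank_descending(grs_density)
cri_rank = rank_descending(cri_rate)
print(grs_rank, cri_rank)
```

Note that segment C has the highest raw GRS score but the lowest rank once the score is divided by its length, which is the purpose of the normalization.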

Following the ranking process, two tiers of consistency testing were performed to assess and compare the reliability of the screening methods. Test 1 evaluated each method’s alignment with historical crash patterns and was conducted separately for:

(a): Fatal and injury crashes only;
(b): All reported crashes.

For each crash category, method performance was assessed using the Spearman rank correlation coefficient, average rank difference, root mean square error (RMSE), and true positive identification of high-risk segments. This dual-level evaluation allowed for a more detailed understanding of how each method responds to variations in crash severity distributions.
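The Test 1 metrics can be sketched in plain Python as follows. The text does not give formulas for these metrics, so the definitions here are assumptions: Spearman's coefficient is computed as the Pearson correlation of tie-averaged ranks, the average rank difference and RMSE are taken over per-segment rank differences, and true positives are counted as segments appearing in the top n of both rankings. The data are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks in descending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def rank_errors(x, y):
    """Average absolute rank difference and RMSE of rank differences."""
    diffs = [a - b for a, b in zip(average_ranks(x), average_ranks(y))]
    avg_abs = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return avg_abs, rmse


def true_positives(x, y, n):
    """Number of segments in the top n of both rankings."""
    def top(values):
        return set(sorted(range(len(values)), key=lambda i: -values[i])[:n])
    return len(top(x) & top(y))


# Hypothetical method scores and observed crash densities for five segments.
method_score = [9.0, 7.5, 3.0, 5.0, 1.0]
crash_density = [8.0, 4.0, 0.0, 6.0, 0.0]

rho = spearman(method_score, crash_density)
avg_diff, rmse = rank_errors(method_score, crash_density)
hits = true_positives(method_score, crash_density, n=2)
print(rho, avg_diff, rmse, hits)
```

In practice a library routine such as `scipy.stats.spearmanr` would normally replace the hand-written correlation; the pure-Python version is used here only to keep the sketch self-contained.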

Test 2 evaluated the consistency of each method across time periods (temporal stability) using the method consistency test, total rank difference, and average rank difference. This systematic approach ensured a robust comparison of ranking methodologies and provided insights into the reliability and effectiveness of the proposed approaches for network screening.
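A minimal sketch of the Test 2 measures is given below. The formulas are assumptions based on how these tests are commonly defined in the hotspot identification literature, since they are not spelled out here: the method consistency test counts segments flagged in the top n in both periods, and the total rank difference sums the absolute rank changes of the period-1 top-n segments. The scores are hypothetical.

```python
def ranks_from_scores(scores):
    """Map segment id -> 1-based rank, rank 1 = highest score."""
    ordered = sorted(scores, key=lambda seg: -scores[seg])
    return {seg: rank for rank, seg in enumerate(ordered, start=1)}


def temporal_consistency(scores_p1, scores_p2, n):
    """Return (common top-n count, total rank difference, average rank difference)."""
    r1, r2 = ranks_from_scores(scores_p1), ranks_from_scores(scores_p2)
    top1 = {seg for seg, rank in r1.items() if rank <= n}
    top2 = {seg for seg, rank in r2.items() if rank <= n}
    common = len(top1 & top2)
    total_diff = sum(abs(r1[seg] - r2[seg]) for seg in top1)
    return common, total_diff, total_diff / n


# Hypothetical scores from one method in two periods (assumed values).
period_1 = {"A": 9.0, "B": 7.0, "C": 5.0, "D": 2.0}
period_2 = {"A": 8.0, "B": 3.0, "C": 6.0, "D": 4.0}
common, total_diff, avg_diff = temporal_consistency(period_1, period_2, n=2)
print(common, total_diff, avg_diff)
```

Under these definitions, a temporally stable method yields a high common count and small rank differences.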

The study approach in performing comparative analysis is shown in the flowchart in Figure 2.

5. Analysis and Results

The metrics used for evaluating the methods are discussed in detail in this section. Specifically, two types of analysis are performed to compare the performance of the methods: a consistency check of each method against crash history, and a consistency check within each method across time periods. The comparison is performed using upper tail segment groups of different sizes (20, 40, 60, 80, and 100 segments) and the total study sample of 377 segments. The roadway segments are ranked in descending order, with high-risk sites at the top of the list. Upper tail groups are subsets of roadway segments selected from the highest-ranked portion of this ordered list. For example, upper tail 20 includes the top 20 segments after the ranking is performed. Thus, successive upper tail groups represent progressively larger sets of roadway segments with comparatively high crash density values. The crash data characteristics associated with each upper tail group are summarized using descriptive statistics of five-year crash density, including the mean, median, standard deviation, minimum, and maximum values, as shown in Table 4. These statistics characterize both the central tendency and variability of crash density within each upper tail group and facilitate comparison across different ranking thresholds.

As shown in Table 4, upper tail groups exhibit substantially higher crash density values than the full sample. For example, the top 20 segments have a mean crash density of 30.18 crashes per mile over five years, compared to 6.46 crashes per mile for the full dataset. Crash density decreases monotonically as the upper tail threshold expands, indicating that the ranking procedure effectively concentrates the highest-risk segments in the upper portion of the distribution.

The crash density distributions are right-skewed across all upper tail groups, as evidenced by mean values exceeding median values and relatively large standard deviations. The maximum crash density remains constant across upper tail groups because the highest-ranked segment is included in all upper tail thresholds. Elevated maximum values are partly attributable to short roadway segments for which a small number of crashes yields high crash density.
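The behavior of the upper tail statistics described above can be reproduced with a small sketch. The crash densities are hypothetical, and the sample standard deviation is an assumption, since the text does not state which form was used for Table 4.

```python
import statistics


def upper_tail_summary(crash_density, n):
    """Descriptive statistics for the n highest crash densities."""
    top = sorted(crash_density, reverse=True)[:n]
    return {
        "mean": statistics.mean(top),
        "median": statistics.median(top),
        "std": statistics.stdev(top),  # sample standard deviation (assumption)
        "min": min(top),
        "max": max(top),
    }


# Hypothetical five-year crash densities (crashes per mile). The first value could
# arise from a short segment, e.g. 3 crashes on a 0.1-mile segment.
density = [30.0, 12.0, 8.0, 4.0, 2.0, 0.0]
tail_3 = upper_tail_summary(density, 3)
full = upper_tail_summary(density, len(density))
print(tail_3)
print(full)
```

In this toy example the maximum is identical for both groups, the mean falls as the group grows, and the mean exceeds the median, mirroring the patterns reported for Table 4.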

5.1. Consistency with Crash History

Figure 3 presents scatterplots of the site rankings produced by the three methods versus the rankings based on observed crashes for the two crash severity groupings: (i) fatal and injury crashes only and (ii) all reported crashes. The 45-degree lines shown in orange represent 100% correlation between the two respective ranks. A close examination of the scatterplots for the fatal and injury crash analysis (Figure 3a–c) reveals a few noteworthy observations. First, the two rankings correspond well to each other for most roadway segments under the GRS method, followed by the P-EB method, as indicated by the clustering of data points around the diagonal line. However, Figure 3a, which shows the scatterplot for the CRI method versus crash history, suggests that the rankings do not correspond well to each other. The scatterplots also reveal a subtle downward shift of data points below the diagonal line, which becomes more noticeable as the ranks increase. This trend is primarily due to multiple sites having identical ranks based on crash history but differing ranks when evaluated using the respective method, particularly in cases where segments recorded zero crashes during the evaluation period. This is supported by the observation that the number of distinct ranks obtained from observed crashes is smaller than that obtained from the proposed methods. Overall, the close alignment of points along the diagonal line in Figure 3b,c, corresponding to the GRS and P-EB methods, respectively, suggests strong agreement between the two ranking approaches. This is reinforced by correlation coefficients of 0.786 for the P-EB method and 0.82 for the GRS method. On the other hand, the CRI method exhibited weaker correlation, as evidenced by the scattered data points and a correlation coefficient of 0.62. A perfect correlation would place all data points on the straight diagonal line and yield a correlation coefficient of 1.
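The effect of tied crash-history ranks can be seen in a small sketch. It assumes that tied segments share a rank and that the next distinct value takes the next rank (dense ranking); the tie-handling rule actually used for Figure 3 is not stated, and the values are hypothetical.

```python
def dense_ranks(values):
    """Rank 1 = highest value; equal values share one rank with no gaps."""
    distinct = sorted(set(values), reverse=True)
    lookup = {value: rank for rank, value in enumerate(distinct, start=1)}
    return [lookup[value] for value in values]


# Hypothetical data: three segments recorded zero crashes, so they tie on
# crash history, while the continuous method score separates them.
crash_density = [4.0, 0.0, 2.0, 0.0, 0.0]
method_score = [3.1, 0.4, 2.2, 0.9, 0.2]

crash_rank = dense_ranks(crash_density)
method_rank = dense_ranks(method_score)
print(crash_rank, len(set(crash_rank)))
print(method_rank, len(set(method_rank)))
```

The zero-crash segments all receive crash-history rank 3 but method ranks of 3 to 5, so their points fall below the 45-degree line, consistent with the downward shift described above.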

A similar pattern is observed for the all-crash dataset, shown in Figure 3d–f, although the ordering of the CRI and P-EB methods is reversed. Again, the GRS method provides the strongest correlation with historical crash ranks, followed by the CRI and P-EB methods. The first-place position of the GRS method in both crash datasets indicates that it maintains the highest level of agreement with observed crash frequencies, regardless of crash severity classification.

Spearman’s rank correlation coefficient is also computed between each of the three methods and the historical crash rankings, as illustrated in Figure 4. The graph presents Spearman’s rank correlation coefficients calculated for the entire study sample, along with subsets consisting of the top 20, 40, 60, 80, and 100 highest-ranked sites (i.e., the upper tail of the ranked list). For both crash-severity groupings, fatal and injury crashes only (Figure 4a) and all crashes (Figure 4b), the results show that the GRS method consistently outperforms the CRI and P-EB methods at every upper tail level. The GRS method exhibits higher Spearman’s rank correlation coefficients throughout, reflecting a stronger relationship and better alignment with observed crashes. For instance, for the upper tail of 20 segments in the fatal and injury crash analysis, which represents 5.31% of the total sample, the GRS method achieved a correlation coefficient of 0.912, substantially higher than 0.08 for the CRI method and 0.133 for the P-EB method. Similarly, for the upper tail of 100 segments, the GRS method yielded a coefficient of 0.90, outperforming the CRI (0.55) and P-EB (0.45) methods. When the entire dataset was considered, the Spearman’s rank correlation coefficient for the GRS method (0.82) remained the highest, followed by the P-EB (0.71) and CRI (0.62) methods. The higher correlation value signifies stronger agreement between the rankings obtained using the GRS method and those based on crash history.

When all recorded crashes are considered, the findings are similar to those for the fatal and injury crash analysis, with minor differences. Again, for every upper tail considered, the GRS method outperforms the other two methods in terms of Spearman’s rank correlation coefficient. For the total study sample, the GRS method also outperforms the CRI method, with coefficient values of 0.821 and 0.776, respectively.
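The upper tail correlation described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes the upper tail is selected by crash history, that tied values share their average rank, and that higher scores mean higher risk. The function names and the sample values are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values so that 1 is the highest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based positions in the block
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def upper_tail_spearman(crash_density, method_score, n):
    """Spearman's rho for the n sites ranked highest by crash history (assumed)."""
    top = sorted(range(len(crash_density)),
                 key=lambda i: crash_density[i], reverse=True)[:n]
    return spearman([crash_density[i] for i in top],
                    [method_score[i] for i in top])


if __name__ == "__main__":
    # Hypothetical crash densities and screening scores for six segments.
    crash_density = [12.0, 9.5, 8.0, 4.0, 1.0, 0.0]
    method_score = [88, 70, 75, 40, 10, 5]
    for n in (3, 6):
        print(n, round(upper_tail_spearman(crash_density, method_score, n), 3))
```

Restricting the calculation to a small upper tail makes the coefficient sensitive to a handful of rank swaps, which is one reason the values for the top 20 sites can differ sharply from those for the full sample.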

Table 5 shows the rank root mean square error (RMSE) and the average rank difference of the three methods relative to the observed crash history, evaluated separately for fatal and injury crashes and all crashes. Metrics were calculated for upper tails comprising 20, 40, 60, 80, and 100 segments, as well as for the entire study sample. A brief review of Table 5 shows that, in general, as the number of upper tail segments increases, both the average rank difference and the RMSE values tend to increase across all methods. This pattern suggests a greater difference in rankings as more segments are included. For the fatal and injury crash evaluation, the GRS method consistently demonstrated superior performance, yielding lower average rank differences and RMSE values compared to the other two methods. Specifically, the average difference in rank for the GRS method is 1.45 compared to 20.70 for the CRI method and 18.55 for the P-EB method, when the upper tail of 20 is considered.

In the all-crash analysis, the GRS method still demonstrates better performance than the other methods for every upper tail considered. For example, the RMSE value for the upper tail of 100 is 20.224 for the GRS method, followed by the CRI method with a value of 30.226 and the P-EB method with a value of 64.176. However, it is noteworthy that, for the total study sample analyzed using all crashes, the average difference in rank and the RMSE are lowest for the CRI method.
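The two rank-error metrics reported in Table 5 can be illustrated with the following sketch. It assumes that the average rank difference is the mean absolute difference between a method's rank and the crash-history rank, that the RMSE is taken over the same differences, and that the upper tail is chosen by crash-history rank. The function name and the sample ranks are hypothetical.

```python
from math import sqrt


def rank_errors(crash_rank, method_rank, n):
    """Return (average absolute rank difference, rank RMSE) for the n sites
    ranked highest by crash history. Both inputs map site id -> rank (1 = worst)."""
    top_sites = sorted(crash_rank, key=crash_rank.get)[:n]
    diffs = [method_rank[s] - crash_rank[s] for s in top_sites]
    avg_diff = sum(abs(d) for d in diffs) / len(diffs)
    rmse = sqrt(sum(d * d for d in diffs) / len(diffs))
    return avg_diff, rmse


if __name__ == "__main__":
    # Hypothetical ranks for four segments.
    crash_rank = {"A": 1, "B": 2, "C": 3, "D": 4}
    method_rank = {"A": 2, "B": 1, "C": 3, "D": 10}
    for n in (3, 4):
        avg_diff, rmse = rank_errors(crash_rank, method_rank, n)
        print(n, round(avg_diff, 3), round(rmse, 3))
```

Because the RMSE squares each difference, a single badly misranked segment raises it more than it raises the average rank difference, so the two columns of Table 5 need not order the methods identically.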

Figure 5 illustrates the rank differences between the historical crash ranking and the three evaluated methods for the top 30 highest-ranking segments within the study sample using (a) fatal and injury crash analysis and (b) all-crash analysis. The y-axis indicates the rank difference, while the x-axis corresponds to the rank of that segment based on observed crash history. In the fatal and injury crash analysis (Figure 5a), the GRS method demonstrates the smallest and most stable rank deviations across all thirty segments, indicating the strongest consistency with historical crash patterns. Notably, the P-EB method exhibits smaller deviations than the CRI method for the majority of segments, placing P-EB in the second position for this crash-severity grouping. The CRI method, by contrast, shows larger and more frequent fluctuations, including several spikes exceeding 200 rank positions, suggesting lower stability when rankings are based solely on severe crashes. For example, the difference in ranking of a subject segment is less than 25 for the GRS method, but up to 100 for the P-EB method and above 300 for the CRI method.

In all-crash analysis, as shown in Figure 5b, while GRS again shows the highest consistency, the CRI method performs better than the P-EB method, with generally smaller deviations and fewer outliers. Specifically, the difference in rank is less than 19 for the GRS method compared to 27 for the CRI method and 164 for the P-EB method. Overall, study results suggest that the GRS method is more consistent with crash history, but the relative performance of CRI and P-EB depends on the dataset.

Figure 6 presents a comparison of the three proposed methods (CRI, GRS, and P-EB) across various upper tail groups for the fatal-and-injury crash dataset (Figure 6a) and the all-crash dataset (Figure 6b), in terms of detecting true positive sites based on observed crash history. The comparison includes the percentage of common segments identified by each method.

Overall, the GRS method detected more true positive sites across all upper tail groups in both analyses, except for the upper tail of 20 analyzed using all crashes, where the performance was comparatively less favorable. In the fatal-and-injury crash analysis (Figure 6a), the GRS method correctly identified 100% of the segments across all upper tails except for the upper tail of 20, where it identified 85% of the true positive sites.

In all-crash analysis (Figure 6b), the GRS method accurately identified all the segments as those pinpointed by the historical crash ranking for upper tail groups of 60 or more. For the upper tail group of 40, it correctly identified 97.5% of the true positive segments. Although the CRI method achieved 80% or higher overlap with observed crash rankings across all upper tail groups, the GRS method consistently showed stronger alignment with historical crash rankings.
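The true positive percentages discussed above amount to a set overlap between two top-n lists. The sketch below assumes a "true positive" is a segment that appears in both the top n by observed crash history and the top n by the screening method; ties at the cut-off are broken by input order, which is a simplification. All names and values are hypothetical.

```python
def top_n(scores, n):
    """Set of the n site ids with the highest scores (ties broken by input order)."""
    return set(sorted(scores, key=scores.get, reverse=True)[:n])


def true_positive_rate(crash_density, method_score, n):
    """Percentage of the top-n sites by crash history that the method also flags."""
    reference = top_n(crash_density, n)
    flagged = top_n(method_score, n)
    return 100.0 * len(reference & flagged) / n


if __name__ == "__main__":
    # Hypothetical crash densities and screening scores for five segments.
    crash_density = {"A": 30, "B": 25, "C": 12, "D": 5, "E": 0}
    method_score = {"A": 90, "B": 40, "C": 70, "D": 60, "E": 10}
    for n in (2, 3, 4):
        print(n, round(true_positive_rate(crash_density, method_score, n), 1))
```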

5.2. Consistency Check Within Network Screening Methods

A test was carried out to assess the consistency of rankings for identified safety improvement locations between two distinct time periods: 2016–2017 and 2018–2020. This evaluation test is performed for all three network screening methods. The term “within-method consistency” was employed to differentiate this assessment from the consistency of ranked lists using different screening methods with crash history discussed in Section 5.1.

A method consistency test, which evaluates the performance of a method by measuring the number of the same hotspots identified over two time periods, is proposed in the literature [16,34] and used in this study. Since no improvements or upgrades were carried out during the study period, road sections are expected to remain in the same operational state, and thus their expected safety performance remains virtually unaltered over the two periods. A good network screening method will identify the same set of safety improvement locations across two study periods. A greater overlap in identified hot spots between periods indicates higher reliability and consistency in the performance of the method. This within-method consistency test involves comparing and evaluating two ranked lists of the same network screening method in two consecutive periods (i and i + 1) using the following evaluation criteria:

$$MCT = \{k_{1}, k_{2}, \dots, k_{n}\}_{j,i} \cap \{k_{1}, k_{2}, \dots, k_{n}\}_{j,i+1}$$

where $MCT$ is the number of segments identified in both time periods; $k_{i}$ is the $i^{th}$ ranked site identified as a safety improvement location; $n$ is the number of upper tails considered; and $j$ is the network screening method being compared.

In this test, the intersection of the segments identified as high risk in two subsequent study periods is assessed for each method. The method resulting in the largest overlap of sites is considered the most consistent. The method consistency test is carried out for multiple upper tail groups, specifically 20, 40, 60, 80, and 100.
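As a rough illustration of the method consistency test, the sketch below counts the sites that appear in the top n of a method's ranked list in both periods. It assumes each ranked list is ordered from highest to lowest risk. The site identifiers are hypothetical.

```python
def method_consistency(ranked_p1, ranked_p2, n):
    """Number of sites in the top n of both ranked lists (highest-risk site first)."""
    return len(set(ranked_p1[:n]) & set(ranked_p2[:n]))


if __name__ == "__main__":
    # Hypothetical rankings from one screening method in two periods.
    period_i = ["S3", "S1", "S7", "S2", "S5"]
    period_next = ["S1", "S3", "S2", "S9", "S7"]
    for n in (2, 3, 5):
        common = method_consistency(period_i, period_next, n)
        print(n, common, f"{100.0 * common / n:.0f}%")
```

The order of sites inside each top-n list does not affect the count, so this test measures agreement on membership only.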

Figure 7 shows the percentage of similarly identified safety improvement sites by alternative network screening methods across two evaluation periods. The P-EB method demonstrated superior performance in this evaluation by identifying the highest number of consistent sites across all upper tail groups, with 20, 38, 59, 76, and 96 sites for the upper tails of 20, 40, 60, 80, and 100, respectively. Specifically, the P-EB method identified 96 percent of sites in the 2016–2017 period that were also identified in the 2018–2020 period for the upper tail of 100. The GRS method ranked second, slightly outperforming the CRI method, by identifying 70, 80, 71, 77, and 81 percent of consistent sites for the corresponding upper tails. The CRI method ranked last, yielding the fewest consistent sites between the two study periods. However, it is worth mentioning that the results obtained from the CRI method and the GRS method are highly comparable. The results obtained from this test may seem initially unexpected, given what was obtained when comparing the performance of network screening with crash history as a reference, where the P-EB method showed the worst performance. However, this can partly be explained by the nature of crash occurrence, which is sporadic and random, and due to the regression-to-mean effect, which is implicitly accounted for by the P-EB method.

A total rank difference test was also performed to compare the performance of the three network screening methods. The evaluation is performed by summing the rank differences of the segments identified across the two analysis periods. A smaller total rank difference indicates higher consistency of the method, signifying stable site rankings over time. The total rank difference for a method being compared is calculated as:

$$TRD = \sum_{k=1}^{n} \left\{ R(k_{j,i}) - R(k_{j,i+1}) \right\}$$

where $TRD$ is the total rank difference; $R$ is the rank for site $k$ in period $i$ for method $j$; $i+1$ is the subsequent time period; and $n$ is the number of upper tails considered.

The total rank difference and average difference in rank of the different upper tail groups using the three different methods—CRI Method, GRS Method, and P-EB Method—over two periods are calculated and presented in Table 6. This table clearly shows that the P-EB method is vastly superior in this test. Specifically, in every upper tail group used in analysis, the P-EB method has a significantly smaller rank difference. For example, for the upper tail 20, the total rank difference for the P-EB is 86.75% lower than the GRS method and 94.52% lower than the CRI method. Similarly, for the upper tail 100, the average rank difference for the P-EB is 74.66% less compared to the GRS method, and 86.45% less compared to the CRI method. These results suggest that the P-EB method is the best network screening method (of the three evaluated here) for ranking sites consistently from period to period.
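The total rank difference test can be sketched as follows. Two assumptions are made here that the text does not state explicitly: each per-site difference is taken as an absolute value (a signed sum would let upward and downward moves cancel), and the sites summed over are the top n in the first period. The helper `percent_lower` mirrors the percentage comparisons quoted above. All names and values are hypothetical.

```python
def total_rank_difference(rank_p1, rank_p2, n):
    """Return (total, average) absolute rank difference for the n sites ranked
    highest in the first period. Inputs map site id -> rank (1 = highest risk)."""
    top_sites = sorted(rank_p1, key=rank_p1.get)[:n]
    total = sum(abs(rank_p1[s] - rank_p2[s]) for s in top_sites)
    return total, total / n


def percent_lower(value, baseline):
    """How much lower value is than baseline, as a percentage of baseline."""
    return 100.0 * (baseline - value) / baseline


if __name__ == "__main__":
    # Hypothetical ranks of three sites in two consecutive periods.
    rank_p1 = {"A": 1, "B": 2, "C": 3}
    rank_p2 = {"A": 2, "B": 5, "C": 1}
    print(total_rank_difference(rank_p1, rank_p2, 2))
    print(total_rank_difference(rank_p1, rank_p2, 3))
    print(percent_lower(25.0, 100.0))
```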

5.3. Comparison Findings

The comparative results observed across the GRS, CRI, and P-EB methods can be explained by the fundamental scientific principles underlying each approach. The GRS method is strongly influenced by empirical crash counts and heuristic risk-factor scoring. Because it directly incorporates observed crash history and assigns substantial weight to high-risk geometric and roadside features, GRS is highly responsive to actual crash patterns. This responsiveness explains its strong consistency with historical crash rankings; however, it also makes the method more susceptible to random year-to-year crash variability, particularly in rural environments where crashes are sparse and stochastic.

The CRI method integrates crash modification factor-based weights to represent the relative influence of geometric, roadside, and exposure-related variables. This structure provides a more systematic representation of risk compared to GRS, but it may reduce sensitivity to localized geometric nuances that are not fully captured by published CMFs. As a result, CRI tends to produce moderate but stable performance across both historical and temporal consistency tests, reflecting its hybrid nature that balances reactive and proactive elements.

The P-EB method relies on regression-based predictions of Empirical Bayes expected crashes using categorized roadway and roadside variables. This modeling framework inherently smooths random variation in crash counts and emphasizes long-term geometric and exposure-related risk factors. Consequently, P-EB exhibits the highest temporal stability among the three methods, as its predictions are less influenced by short-term fluctuations in crash occurrence. However, this smoothing effect also reduces its ability to replicate historical crash rankings with the same fidelity as GRS.

Taken together, these distinctions clarify why GRS excels in historical consistency, P-EB demonstrates superior temporal stability, and CRI maintains intermediate performance across both dimensions. Understanding these methodological differences is essential for selecting a screening approach that aligns with agency priorities, data availability, and long-term safety management objectives.

6. Discussion

Because of the random nature of crash occurrence and the regression-to-the-mean effect, sites experiencing unusually high crash rates in one period are likely to see lower crash rates in subsequent periods, and vice versa, due to chance alone. This regression-to-the-mean effect suggests that even without any intervention or improvement in safety measures, the crash rates at these sites are likely to stabilize or decrease over time simply due to random variation.

To assess the performance of the proposed network screening methods, two distinct tests were conducted. The first test involved evaluating the performance of the methods based on historical crash data, essentially using past crash occurrences as a reference to identify candidate sites for safety improvements. The second test, however, focused on assessing the consistency of the methods in identifying hotspot locations across two distinct periods, irrespective of crash history. Since no safety improvements were implemented during the study period, a reliable network screening method should consistently identify the same sites as hotspot locations in both periods.

A temporal test was conducted to understand the nature of crash occurrence, focusing solely on crash history. Sites were ranked based on crash density alone for two consecutive periods (2016–2017 and 2018–2020), assessing whether sites identified as safety improvement locations in the first period were also identified in the second period. The resulting metrics are presented in Table 7.

The results of the temporal test revealed the random occurrence of crashes. Sites identified as hotspot locations in one period did not consistently retain their hotspot status in the subsequent period. For example, using 20 upper tail segments, only seven segments identified as safety improvement locations in the first period (2016–2017) were also identified as such in the second period (2018–2020), which represents a 35% overlap between the two groups. Similarly, when 100 upper tail segments are considered, only 65% of the sites identified as hotspots in period 1 are also identified as such in period 2. In addition, the average and total rank differences are high, which indicates instability in the rankings. This inconsistency highlights the transient nature of crash occurrences and emphasizes the importance of considering factors beyond historical crash data alone when evaluating the performance of network screening methods.
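The instability of crash-only rankings can be reproduced qualitatively with a small simulation. The sketch below is illustrative only and does not use the study data: it assumes each site has a fixed long-run crash mean, draws independent Poisson counts for two periods, and measures how much the two top-n lists overlap. The gamma distribution of site means, the site count default, and the seed are arbitrary assumptions.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson variate with mean lam (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def top_n(counts, n):
    """Indices of the n sites with the highest counts (ties broken by index)."""
    order = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
    return set(order[:n])


def simulated_overlap(n_sites=377, n_top=20, seed=1):
    """Share of period-1 top-n sites that are also in the period-2 top n."""
    rng = random.Random(seed)
    # Hypothetical long-run crash means, identical in both periods.
    true_mean = [rng.gammavariate(1.0, 3.0) for _ in range(n_sites)]
    period1 = [poisson(rng, m) for m in true_mean]
    period2 = [poisson(rng, m) for m in true_mean]
    return len(top_n(period1, n_top) & top_n(period2, n_top)) / n_top


if __name__ == "__main__":
    for n_top in (20, 100):
        print(n_top, simulated_overlap(n_top=n_top))
```

Even though the underlying risk at every simulated site is unchanged between periods, the overlap generally falls short of 100%, which is the behavior the temporal test attributes to random variation.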

To provide deeper insight, the analysis of prediction deviations within threshold intervals demonstrates that smaller segment sets (e.g., upper tail 20) exhibit greater instability, while broader sets (e.g., upper tail 100) improve consistency but still fall short of reliability. This finding suggests that agencies relying exclusively on crash history may misallocate resources by targeting sites that do not persist as high-risk over time. Consequently, proactive methods are essential for identifying locations with sustained risk potential, thereby supporting long-term safety planning and sustainable mobility objectives.

Overall, these results emphasize the importance of integrating predictive approaches into network screening. Methods that account for traffic exposure, roadway characteristics, and statistical rigor can mitigate the limitations of crash-history-based rankings and enable more effective, data-driven safety interventions.

7. Summary and Conclusions

A road safety management program is a vital component in promoting sustainable mobility, and the accurate detection of safety improvement sites through effective network screening is essential to its success. The study presented in this paper evaluated the performance of three proposed network screening methods for use on rural roadway networks: the Global Risk Scoring (GRS), the Crash Risk Index (CRI), and the Predicted Empirical Bayes (P-EB) methods. The research first assessed the performance of the three network screening methods through a consistency check with crash history as a reference using two crash severity definitions: (a) fatal and injury crashes and (b) all reported crashes. Then, the research evaluated the performance based on the temporal consistency test, where the consistency in ranking is checked within each network screening method for two different observation periods. The major findings of the study are summarized below.

The GRS method was found to be more consistent with observed crash history across both fatal and injury crashes and all reported crashes. This result was consistently supported by higher Spearman rank correlation coefficients, higher true positive identification rates, lower average rank differences, and lower rank-based root mean square errors compared with the CRI and P-EB methods.
Differences in relative performance between the CRI and P-EB methods were observed depending on the crash severity considered. When fatal and injury crashes were used as the reference, the P-EB method exhibited stronger consistency with crash history than the CRI method. In contrast, when all reported crashes were analyzed, the CRI method outperformed the P-EB method and ranked second after GRS. These findings indicate that method performance is sensitive to the crash severity definition used in the evaluation and highlight the importance of aligning screening objectives with the appropriate crash dataset.
In examining the temporal consistency of network screening methods, contrary to its poor results for consistency with crash history, the P-EB method exhibited the highest level of consistency in rankings over the two observation periods used in the study. In this test, the P-EB method was followed by the GRS and the CRI methods, respectively.

The comparative analyses presented in this paper highlighted the strengths and limitations of the three methods investigated. The results allow for improved action strategies in terms of sustainability and safety by guiding agencies toward methods that best align with their objectives and resource constraints. For instance, the strong correlation of the GRS method with crash history suggests that it is particularly well-suited for agencies seeking to identify locations with a documented pattern of severe crashes. Because GRS directly incorporates empirical crash counts and high-risk geometric features, it can help practitioners quickly pinpoint segments where safety interventions are likely to yield immediate and measurable benefits. This makes GRS a practical choice for short-term project selection, especially in rural areas where severe crashes tend to be geographically dispersed and difficult to detect using traditional frequency-based approaches. Meanwhile, the superior temporal stability of the P-EB method enables agencies to prioritize sites that remain high-risk across planning horizons, fostering sustainable investment strategies. Because P-EB predictions are less sensitive to short-term crash fluctuations, the method is well-positioned to support multi-year investment strategies, network-level risk monitoring, and proactive identification of emerging safety concerns. This is especially important for rural agencies that may not experience enough crashes annually to rely on reactive methods alone. The ability of P-EB to operate effectively with categorical roadway data also makes it accessible to smaller jurisdictions with limited data inventories.

Owing to their simplicity and minimal data requirements, all three methods offer practical solutions for network screening on rural highway networks, especially for local agencies (such as counties, townships, and tribal governments) that may have limited access to data or technical resources. Furthermore, these methods can be readily adapted to other rural networks or low-resource contexts beyond Oregon, provided basic inputs are available. By selecting the most appropriate method based on performance and context, transportation agencies can enhance roadway safety while advancing the broader goals of sustainable and resilient mobility systems.

8. Limitations and Future Research

While this study provides valuable insights into the comparative performance of network screening methods, several limitations should be acknowledged. First, the analysis was limited to rural two-lane highways within a single state, which may restrict the generalizability of findings to other roadway types or regions with different traffic and crash characteristics. Second, the evaluation relied on available crash and roadway inventory datasets, which may contain reporting inconsistencies or unobserved factors that could influence the resulting performance assessments. Future research should explore the application of these methods to urban networks and multilane facilities, incorporate additional risk factors such as roadway environment and driver behavior, and assess the effectiveness of prioritized sites through before-and-after safety studies. Integrating advanced data sources—such as real-time traffic monitoring and connected vehicle data—could further enhance predictive accuracy and support proactive safety management strategies aligned with sustainability goals.

Author Contributions

Conceptualization: A.A.-K.; data compilation: B.D.; analysis and interpretation of results: B.D. and A.A.-K.; draft manuscript preparation: B.D. and A.A.-K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by US DOT Small Urban, Rural, and Tribal Center on Mobility (SURTCOM).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are openly available in figshare at https://doi.org/10.6084/m9.figshare.28119944.v1.

Conflicts of Interest

Author Bishal Dhakal was employed by WSP USA Inc. The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Herbel, S.; Laing, L.; McGovern, C. Highway Safety Improvement Program Manual: The Focus Is Results; Publication FHWA-SA-09-029; Federal Highway Administration: Washington, DC, USA, 2010.
2. AASHTO. Highway Safety Manual, 1st ed.; American Association of State Highway and Transportation Officials: Washington, DC, USA, 2010; ISBN 978-1-56051-477-0.
3. Fatality Facts. IIHS-HLDI Crash Testing and Highway Safety. Available online: https://www.iihs.org/topics/fatality-statistics/detail/urban-rural-comparison (accessed on 20 July 2023).
4. Cheng, W.; Washington, S.P. Experimental Evaluation of Hotspot Identification Methods. Accid. Anal. Prev. 2005, 37, 870–881.
5. Rashidi, M.H.; Keshavarz, S.; Pazari, P.; Safahieh, N.; Samimi, A. Modeling the Accuracy of Traffic Crash Prediction Models. IATSS Res. 2022, 46, 345–352.
6. Song, Y.; Kou, S.; Wang, C. Modeling Crash Severity by Considering Risk Indicators of Driver and Roadway: A Bayesian Network Approach. J. Saf. Res. 2021, 76, 64–72.
7. Gooch, J.P.; Gayah, V.V.; Donnell, E.T. Safety Performance Functions for Horizontal Curves and Tangents on Two Lane, Two Way Rural Roads. Accid. Anal. Prev. 2018, 120, 28–37.
8. Khattak, M.W.; Pirdavani, A.; De Winne, P.; Brijs, T.; De Backer, H. Estimation of Safety Performance Functions for Urban Intersections Using Various Functional Forms of the Negative Binomial Regression Model and a Generalized Poisson Regression Model. Accid. Anal. Prev. 2021, 151, 105964.
9. Persaud, B.; Hauer, E. Comparison of Two Methods for Debiasing Before-and-After Accident Studies. Transp. Res. Rec. 1984, 975, 43–49.
10. Hauer, E.; Persaud, B.N. Problem of Identifying Hazardous Locations Using Accident Data. Transp. Res. Rec. 1984, 975, 36–43.
11. Higle, J.L.; Hecht, M.B. Comparison of Techniques for the Identification of Hazardous Locations. Transp. Res. Rec. 1989, 1238, 10.
12. Maher, M.; Mountain, L. The Sensitivity of Estimates of Regression to the Mean. Accid. Anal. Prev. 2009, 41, 861–868.
13. Kwon, O.H.; Park, M.J.; Yeo, H.; Chung, K. Evaluating the Performance of Network Screening Methods for Detecting High Collision Concentration Locations on Highways. Accid. Anal. Prev. 2013, 51, 141–149.
14. Ambros, J.; Valentová, V.; Janoška, Z. Investigation of Difference Between Network Screening Results Based on Multivariate and Simple Crash Prediction Models. In Proceedings of the Transportation Research Board 94th Annual Meeting, Washington, DC, USA, 11–15 January 2015.
15. Ambros, J.; Havránek, P.; Valentová, V.; Křivánková, Z.; Striegler, R. Identification of Hazardous Locations in Regional Road Network—Comparison of Reactive and Proactive Approaches. Transp. Res. Procedia 2016, 14, 4209–4217.
16. Montella, A. A Comparative Analysis of Hotspot Identification Methods. Accid. Anal. Prev. 2010, 42, 571–581.
17. Cafiso, S.; Di Silvestro, G. Performance of Safety Indicators in Identification of Black Spots on Two-Lane Rural Roads. Transp. Res. Rec. 2011, 2237, 78–87.
18. Elvik, R. Comparative Analysis of Techniques for Identifying Locations of Hazardous Roads. Transp. Res. Rec. 2008, 2083, 72–75.
19. Dhakal, B.; Al-Kaisy, A. A New Approach for Identifying Safety Improvement Sites on Rural Highways: A Validation Study. Appl. Sci. 2024, 14, 1413.
20. Dhakal, B.; Al-Kaisy, A. An Empirical Evaluation of a New Heuristic Method for Identifying Safety Improvement Sites on Rural Highways: An Oregon Case Study. Sustainability 2024, 16, 2047.
21. Ghadi, M.; Török, Á. A Comparative Analysis of Black Spot Identification Methods and Road Accident Segmentation Methods. Accid. Anal. Prev. 2019, 128, 1–7.
22. Khattak, M.W.; De Backer, H.; De Winne, P.; Brijs, T.; Pirdavani, A. Comparative Evaluation of Crash Hotspot Identification Methods: Empirical Bayes vs. Potential for Safety Improvement Using Variants of Negative Binomial Models. Sustainability 2024, 16, 1537.
23. Saedi, H.; Kordani, A.A.; Behnood, H.R. Reliability Analysis of the Empirical Bayes Method in Estimating Crash Frequency on Two-Lane, Two-Way Rural Highways. Transp. Res. Rec. J. Transp. Res. Board 2026, 03611981251409203.
24. Carvalho, F.L.d.; Larocca, A.P.C.; Albarracin, O.Y.E. Predictive Modeling of Crash Frequency on Mountainous Highways: A Mixed-Effects Approach Applied to a Brazilian Road. Sustainability 2026, 18, 395.
25. Islam, M.K.; Reza, I.; Gazder, U.; Akter, R.; Arifuzzaman, M.; Rahman, M.M. Predicting Road Crash Severity Using Classifier Models and Crash Hotspots. Appl. Sci. 2022, 12, 11354.
26. Al-Kaisy, A.; Raza, S. A Novel Network Screening Methodology for Rural Low-Volume Roads. J. Transp. Technol. 2023, 13, 599–614.
27. Al-Kaisy, A.; Ewan, L.; Hossain, F. Identifying Candidate Locations for Safety Improvements on Low-Volume Rural Roads: The Oregon Experience. Transp. Res. Rec. 2019, 2673, 690–698.
28. Huda, K.T.; Al-Kaisy, A. Network Screening on Low-Volume Roads Using Risk Factors. Future Transp. 2024, 4, 257–269.
29. ODOT TransGIS. Available online: https://gis.odot.state.or.us/transgis (accessed on 5 June 2023).
30. Esri. ArcGIS Pro, version 3.1; Environmental Systems Research Institute: Redlands, CA, USA, 2023.
31. Oregon Department of Transportation. Road Assets and Mileage: Data & Maps: State of Oregon. Available online: https://www.oregon.gov/odot/Data/pages/road-assets-mileage.aspx (accessed on 5 June 2023).
32. TDS—Crash Reports. Available online: https://tvc.odot.state.or.us/tvc/ (accessed on 5 June 2023).
33. Oregon Department of Transportation. Project Information: Transparency, Accountability and Performance: State of Oregon. Available online: https://www.oregon.gov/odot/TAP/Pages/ProjectInformation.aspx (accessed on 20 July 2023).
34. Cheng, W.; Washington, S. New Criteria for Evaluating Methods of Identifying Hot Spots. Transp. Res. Rec. 2008, 2083, 76–85.

Figure 1. Study area description: (a) Oregon state highlighted in the map of the United States; (b) sample highway considered in the State of Oregon [29,30].

Figure 2. Data flowchart for study design.

Figure 3. Comparison of rankings using three proposed methods versus observed crash history for fatal and injury crashes and all crashes.

Figure 4. Comparison of Spearman’s rank correlation coefficient between three proposed methods and crash history: (a) fatal and injury crashes and (b) all crashes.

Figure 5. Difference in ranking for the three compared methods for various analysis periods. (a) fatal and injury crash analysis. (b) all-crash analysis.

Figure 6. True positive identification of the three compared methods: (a) fatal and injury crash evaluation and (b) all-crash evaluation.

Figure 7. Method Consistency Test Results of Compared Methods for Different Upper Tails.

Table 1. Overview of the proposed network screening methods.

Proposed Methods	Key Inputs	Ranking Basis
GRS Method	Crash data, traffic volume, Roadway and roadside characteristics	Global Risk Score
CRI Method	Crash history, Traffic exposure, Roadway geometry, and roadside features	Weighted Risk Index
P-EB Method	Traffic exposure, Categorized roadway and roadside variables	Predicted crashes using EB-based regression model

Table 2. Proposed scoring schemes for roadway segments [26].

Safety Related Questions	If Yes, Add:
Risk Factors
Total Width (TD)
TD ≤ 20 ft?	7
20 ft < TD ≤ 24 ft?	4
Horizontal curve [Radius (R)]
Flatter curve (R ≥ 300 ft)	30
Sharper curve (R < 300 ft)	60
Grade steeper than 4%?	3
Six or more driveways per mile?	5
Side slope steeper than 1 V:3 H?	4
Fixed objects within 15 ft of travel lane?	4
Unpaved road?	14
Poor pavement conditions? (Rutting, potholes, etc.)	7
Crash History Available?
Number of fatal or serious injury crashes (N1)	N1 × 80
Number of other crashes (N2)	N2 × 5
Relative Risk Compound Score (RRCS)
Speed ≥ 50 mph?	RRCS × 1.25
Annual Daily Traffic (ADT) available?
ADT ≤ 300	RRCS × 1.0
300 < ADT ≤ 600	RRCS × 3.0
600 < ADT ≤ 1000	RRCS × 5.0
ADT > 1000	RRCS × 7.0
Global Risk Score (GRS)
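
The scoring scheme in Table 2 can be read as a short calculation. The sketch below is one plausible interpretation rather than the authors' implementation: it assumes the risk-factor points and crash-history points are summed into the RRCS, that the speed and ADT multipliers are then applied in sequence to give the GRS, that the curve points apply only when a segment contains a horizontal curve, and that an ADT of exactly 1000 falls in the 600 to 1000 band. The `Segment` class and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    """Hypothetical container for the Table 2 inputs of one roadway segment."""
    total_width_ft: float
    curve_radius_ft: Optional[float]  # None means no horizontal curve (assumption)
    grade_pct: float
    driveways_per_mile: float
    side_slope_steeper_than_1v3h: bool
    fixed_objects_within_15ft: bool
    unpaved: bool
    poor_pavement: bool
    fatal_serious_crashes: int  # N1
    other_crashes: int          # N2
    speed_mph: float
    adt: float


def relative_risk_compound_score(seg: Segment) -> float:
    """Sum the risk-factor and crash-history points (assumed to be additive)."""
    score = 0.0
    if seg.total_width_ft <= 20:
        score += 7
    elif seg.total_width_ft <= 24:
        score += 4
    if seg.curve_radius_ft is not None:
        score += 60 if seg.curve_radius_ft < 300 else 30
    if abs(seg.grade_pct) > 4:
        score += 3
    if seg.driveways_per_mile >= 6:
        score += 5
    if seg.side_slope_steeper_than_1v3h:
        score += 4
    if seg.fixed_objects_within_15ft:
        score += 4
    if seg.unpaved:
        score += 14
    if seg.poor_pavement:
        score += 7
    score += 80 * seg.fatal_serious_crashes + 5 * seg.other_crashes
    return score


def adt_multiplier(adt: float) -> float:
    """ADT bands from Table 2; ADT == 1000 is placed in the 5.0 band (assumption)."""
    if adt <= 300:
        return 1.0
    if adt <= 600:
        return 3.0
    if adt <= 1000:
        return 5.0
    return 7.0


def global_risk_score(seg: Segment) -> float:
    """Apply the speed and ADT multipliers to the RRCS (assumed order)."""
    rrcs = relative_risk_compound_score(seg)
    if seg.speed_mph >= 50:
        rrcs *= 1.25
    return rrcs * adt_multiplier(seg.adt)
```

For a hypothetical 22 ft wide segment with a 250 ft radius curve, a 5% grade, a steep side slope, poor pavement, one fatal or serious injury crash, and two other crashes, the RRCS under these assumptions is 4 + 60 + 3 + 4 + 7 + 80 + 10 = 168; at 55 mph and an ADT of 800 the GRS would be 168 × 1.25 × 5.0 = 1050.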

Table 3. Explanatory variables of the proposed model [28].

Risk Factors	Approximate Ranges of Variables	Categories	Terms
Segment Length (SL)	Exact Length
Lane Width (LW)	LW < 11	1	Narrower
Lane Width (LW)	LW ≥ 11	2	Wider
Shoulder Width (SW)	SW < 1.8	1	Narrower
Shoulder Width (SW)	SW ≥ 1.8	2	Wider
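Degree of Horizontal Curvature (DC)	DC = 0	0	Straight
Degree of Horizontal Curvature (DC)	0 < DC < 10	1	Mild
Degree of Horizontal Curvature (DC)	10 ≤ DC < 27	2	Moderate
Degree of Horizontal Curvature (DC)	DC ≥ 27	3	Sharp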
Grade (G)	G < 4	0	Mild
Grade (G)	G ≥ 4	1	Steep
Driveway Density (DD) (driveways per mile)	Exact Number
Side Slope (SS)	Steep	1	Steep
Side Slope (SS)	Moderate	2	Moderate
Side Slope (SS)	Flat	3	Flat
Fixed Objects (FO)	Many	1	Many
Fixed Objects (FO)	Some	2	Some
Fixed Objects (FO)	Few	3	Few
Volume (V)	Exact Volume
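
The category coding in Table 3 can be illustrated with a few lookup functions. This is a minimal sketch, not the authors' code: the function and dictionary names are hypothetical, and because the table does not state units, the inputs are assumed to be in whatever units the thresholds were defined in. Segment length, driveway density, and volume enter the model as exact values and are therefore not coded here.

```python
def lane_width_category(lw: float) -> int:
    """1 = narrower (LW < 11), 2 = wider (LW >= 11)."""
    return 1 if lw < 11 else 2


def shoulder_width_category(sw: float) -> int:
    """1 = narrower (SW < 1.8), 2 = wider (SW >= 1.8)."""
    return 1 if sw < 1.8 else 2


def curvature_category(dc: float) -> int:
    """0 = straight, 1 = mild, 2 = moderate, 3 = sharp."""
    if dc == 0:
        return 0
    if dc < 10:
        return 1
    if dc < 27:
        return 2
    return 3


def grade_category(g: float) -> int:
    """0 = mild (G < 4), 1 = steep (G >= 4)."""
    return 0 if g < 4 else 1


# Qualitative ratings map directly to the codes listed in Table 3.
SIDE_SLOPE_CODES = {"steep": 1, "moderate": 2, "flat": 3}
FIXED_OBJECT_CODES = {"many": 1, "some": 2, "few": 3}
```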

Table 4. Descriptive Statistics of Five-Year Crash Density (All Reported Crashes) by Upper Tail Group.

Upper Tail Group	Segment Proportion	Mean Crash Density (Crash/Mile)	Median Crash Density	Standard Deviation	Minimum Value	Maximum Value
Upper Tail (20)	5.31%	30.18	25.12	20.95	19.33	120
Upper Tail (40)	10.61%	22.05	19.15	16.53	12	120
Upper Tail (60)	15.92%	18.08	13.85	14.62	9.33	120
Upper Tail (80)	21.22%	15.45	11.30	13.18	7.8	120
Upper Tail (100)	26.53%	13.88	9.96	12.32	6.61	120
Full Sample (377)	100%	6.46	4.44	8.62	0	120
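
The statistics in Table 4 can be reproduced in outline as follows. The sketch assumes that each upper-tail group consists of the top-n segments by five-year crash density and that the reported standard deviation is the sample standard deviation; neither detail is stated in the table, and the function name is hypothetical.

```python
import statistics


def upper_tail_summary(crash_density, top_n):
    """Summarize the top_n segments with the highest crash density."""
    tail = sorted(crash_density, reverse=True)[:top_n]
    return {
        "segment_proportion_pct": 100 * len(tail) / len(crash_density),
        "mean": statistics.mean(tail),
        "median": statistics.median(tail),
        # Sample standard deviation (assumption); undefined for a single value.
        "std_dev": statistics.stdev(tail) if len(tail) > 1 else 0.0,
        "min": min(tail),
        "max": max(tail),
    }
```

With 377 segments, a top-20 group corresponds to 20/377, or about 5.31% of the sample, which matches the segment proportion reported for Upper Tail (20).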

Table 5. Average Rank Difference and Root Mean Square Error for Different Evaluations.

Segment Range (#)	Segment Proportion (%)	Average Rank Difference			Root Mean Square Error
Segment Range (#)	Segment Proportion (%)	CRI Method	GRS Method	P-EB Method	CRI Method	GRS Method	P-EB Method
Fatal and Injury Crash Evaluation
Upper Tail (20)	5.31%	20.700	1.450	18.550	30.396	3.801	48.303
Upper Tail (40)	10.61%	24.425	2.375	17.500	34.171	4.558	38.061
Upper Tail (60)	15.92%	30.083	3.567	21.367	44.584	6.382	45.051
Upper Tail (80)	21.22%	30.825	5.400	26.238	43.491	9.721	45.640
Upper Tail (100)	26.53%	37.580	7.070	28.070	55.793	13.396	46.222
Total Sample (377)	100%	66.154	46.239	63.692	92.566	76.615	90.010
All-Crash Evaluation
Upper Tail (20)	5.31%	8.600	6.500	54.000	10.266	7.053	87.711
Upper Tail (40)	10.61%	10.850	7.475	60.875	14.866	10.652	92.701
Upper Tail (60)	15.92%	13.933	9.933	55.817	20.210	15.965	85.393
Upper Tail (80)	21.22%	16.363	12.275	60.875	23.687	16.862	90.826
Upper Tail (100)	26.53%	24.650	14.190	60.800	42.079	20.575	88.354
Total Sample (377)	100%	48.971	54.594	75.252	70.412	81.408	92.564
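
The two measures in Table 5 can be sketched as below. The code assumes that the average rank difference is the mean absolute difference between a segment's rank under a screening method and its rank by crash history, that the root mean square error is taken over the same differences, and that the upper tail is defined by the crash-history ranking. These are assumptions for illustration; the function and argument names are hypothetical.

```python
import math


def rank_agreement(reference_rank, method_rank, top_n):
    """Return (average rank difference, RMSE) over the top_n reference segments.

    Both arguments map a segment ID to its rank, with 1 as the highest risk.
    """
    top = sorted(reference_rank, key=reference_rank.get)[:top_n]
    diffs = [abs(method_rank[seg] - reference_rank[seg]) for seg in top]
    avg_diff = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return avg_diff, rmse
```

Because the RMSE squares each difference, a few badly misranked segments raise it much more than they raise the average rank difference, which is why the two columns in Table 5 can diverge for the same method.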

Table 6. Total and Average Rank Difference Results for Various Compared Methods.

Segment Range (#)	Segment Proportion (%)	Average Rank Difference			Total Rank Difference
Segment Range (#)	Segment Proportion (%)	CRI Method	GRS Method	P-EB Method	CRI Method	GRS Method	P-EB Method
Upper Tail (20)	5.31%	18.250	7.550	1.000	365	151	20
Upper Tail (40)	10.61%	48.950	11.225	2.000	1958	449	80
Upper Tail (60)	15.92%	43.650	14.867	2.800	2619	892	168
Upper Tail (80)	21.22%	39.375	18.350	3.738	3150	1468	299
Upper Tail (100)	26.53%	38.680	20.680	5.240	3868	2068	524

Table 7. Temporal Test for Crash History Ranking in Two Periods.

Segment Range (#)	Segment Proportion (%)	# of Common Segments	% of Common Segments	Average Rank Difference	Total Rank Difference
Upper Tail (20)	5.31%	7	35	40.65	813
Upper Tail (40)	10.61%	25	62.5	44.975	1799
Upper Tail (60)	15.92%	38	63.33	47.633	2858
Upper Tail (80)	21.22%	49	61.25	50.25	4020
Upper Tail (100)	26.53%	65	65	54.17	5417
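
The temporal test in Table 7 can be sketched in the same way. The code assumes that the common segments are those appearing in the top-n list of both periods and that rank differences are taken over the first period's top-n segments; the table does not spell out either choice, and the function names are hypothetical.

```python
def top_segments(ranks, top_n):
    """Return the top_n segment IDs, where rank 1 is the highest risk."""
    return sorted(ranks, key=ranks.get)[:top_n]


def temporal_consistency(rank_period1, rank_period2, top_n):
    """Compare two periods' rankings over the first period's upper tail."""
    top1 = top_segments(rank_period1, top_n)
    common = set(top1) & set(top_segments(rank_period2, top_n))
    total_diff = sum(abs(rank_period1[s] - rank_period2[s]) for s in top1)
    return {
        "common": len(common),
        "common_pct": 100 * len(common) / len(top1),
        "total_rank_diff": total_diff,
        "avg_rank_diff": total_diff / len(top1),
    }
```

The total and average columns are linked by the group size: for Upper Tail (20), an average rank difference of 40.65 over 20 segments gives the reported total of 813, and 7 common segments out of 20 gives 35%.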

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

