Development of a Network-Level Road Safety Assessment Procedure Based on Human Factors Principles

Abstract: Road safety is a central issue in the management and development of a road network. Road agencies must try to identify the most dangerous sections of their network and act on them to improve safety. The most common procedure for this purpose relies on crash-based indicators. However, a mature road safety management system must be able to assess the safety of a road section before accidents occur. The European community is moving in this direction with the update of Directive 2008/96/EC (Directive 1936/2019). This paper proposes a new methodology for carrying out a network-wide road safety assessment on rural single-carriageway, two-lane two-way roads. The procedure accounts for the influence of road characteristics on drivers' perceptions. The methodology has been developed based on the human factors concepts from PIARC, and it includes a series of checklists that guide an inspector in carrying out a visual inspection of single-carriageway roads. The results from the checklists are then processed by an algorithm, which provides the level of risk of the analyzed section. The objectives of the procedure are (a) to account for the perceptive aspects that are one of the major causes of road accidents, (b) to provide a proactive procedure in line with the requirements of the European Directive, and (c) to provide a useful instrument that can be easily implemented by road agencies and integrated with other analysis procedures. The procedure has been applied and tested on a case study of six different stretches of two-lane, two-way rural highways in Italy, Germany, and Slovenia (about 65 km). The results show a high degree of concordance with a risk classification based on the accident rate, mainly for high-risk sections. The procedure therefore demonstrated its potential as a useful instrument to be included in network safety assessments. Road agencies should consider the use of this procedure in their network safety analysis and ranking.


Introduction

Network Safety Assessments
Road traffic accidents represent a major public health issue and one of the leading causes of death, injury, and disability all over the world. Approximately 1.35 million people die each year because of road traffic crashes [1]. It is therefore crucial to continuously make all the necessary efforts to improve the safety of our roads. This must be the objective of road administrations and road practitioners.
Examining road safety involves various methods aimed at assessing the safety levels of specific roads or entire networks. These analyses vary in their nature and approach, leading to different methodologies and terminology. For instance, network screening involves statistically analyzing the crashes within specific road segments, entire roads, or networks, considering both the number and characteristics of the accidents. Elvik [2] defines network screening as the identification of road sections with safety issues, be it an unusually high number of accidents, a prevalence of severe accidents, or a particular accident type. This approach, which is based on accidents that have already occurred, is termed "reactive". In contrast, a "proactive" approach aims to identify risky spots before accidents happen. Examples include Road Safety Inspections (RSIs). An RSI aims to identify problem features that are not yet apparent from the accident history or to identify new problems introduced by engineering changes or the ways they are used. RSIs typically rely on visual inspections, with outcomes subject to some level of evaluator subjectivity.
The process of analyzing the safety of the road network is the so-called Network Safety Assessment (NSA). The objective of the process is "to identify sites for further investigation and potential treatment" [3] or "to identify sections of the network that should be targeted by more detailed road safety inspections and to prioritize investment according to its potential to deliver network-wide safety improvements" [4]. The identification of the riskiest sections of a road and the classification of all the sections belonging to the road network allow road agencies to prioritize interventions, choosing the sites with the highest impact on road safety. Nowadays, NSA procedures mainly rely on accident indices, such as the accident frequency, the accident density, and the accident rate [5]. Accident data are an important source of information for road safety analysis, but unfortunately, they are sometimes missing or prone to errors and statistical bias (e.g., regression to the mean). To overcome this issue, Accident Prediction Models (or Crash Prediction Models) with Empirical Bayes adjustment were introduced. These models relate crash expectations to specific road features through statistical analyses [6]. However, drawbacks include potential oversight of site-specific details and limited representativeness across countries due to the varying influence of the factors involved [7]. Accident Prediction Models have proven to be a consistent method for making reliable crash frequency predictions, and thus for supporting a reliable NSA, but their implementation requires a large amount of data. Those data are not always available, mainly in low- and middle-income countries. Moreover, once an Accident Prediction Model has been defined, there is no guarantee that it is suitable for every analysis. Sometimes, it is necessary to adapt existing Accident Prediction Models to local conditions [8,9].
An interesting proactive approach to an NSA procedure was proposed by Cafiso et al. [10] and considered by Erieba et al. [11]. This approach relies on RSIs and provides a safety index suitable for ranking the analyzed sections. Moreover, another study by Cafiso et al. [12] demonstrated that an RSI carried out following a systematic and structured procedure produces the same results even when carried out by different inspectors. In addition, some drawbacks related to the time required to implement RSIs can be solved using fast procedures and new technologies [13].
Recent years saw the development of new methodologies attempting to combine the efficiency of Accident Prediction Models with the detailed analysis of standard RSIs. Examples include the iRAP methodology [14-16], the Australian National Risk Assessment Model "ANRAM" [17], and the method proposed in the "Network Wide Road Safety Assessment-Methodology and Implementation Handbook" [18]. Moreover, among the different ways of carrying out road safety analysis at the network level, some consider the use of new technologies [19-21] or surrogate safety measures [22,23], or focus more on speed consistency [24].
Finally, it must be noted that the updated European Community Directive 2008/96/EC now mandates Network-Wide Road Safety Assessment in member states. It requires a visual examination of road design features and an analysis of sections that have been in operation for over three years and have a high proportion of serious accidents relative to traffic flow [4].
For these reasons, it is crucial to identify assessment indices. Such indices can be qualitative, such as the one identified in [10]. Qualitative indices generally help to identify the main risk level of a section. However, it is difficult to rank different sections within the same risk level and thus give priority to one or the other. For this reason, quantitative indices are also crucial. Such indices also allow the analyzed road sections to be classified, as demonstrated in [10,14,16,25-27]. Road agencies can plan their interventions based on those indices, identifying the most critical sections of the road and scheduling when and how to intervene.

Human Factors
It has been proved that most accidents are a consequence of drivers' behavior [28,29] and that this behavior is strictly related to the road, both to its geometrical elements and to its environment [30]. Drivers are the core of the road system. Drivers are humans, and they must not be considered machines. They cannot strictly obey specific geometrical rules nor analyze all aspects of a complex situation in a few seconds in a perfect, mechanical way. For this reason, the concept of human factors (HFs) should be applied to road safety. Many studies identify HFs as a key factor in correct road perception and thus in accident occurrence [31]. Some of these studies consider, for example, bend curvature perception [32,33], reaction time and perception of potentially critical locations [34,35], human workload and decision making [36], road familiarity and the influence of experience [37,38], and the influence of road marginal elements on speed and behavior [39-41]. Understanding the factors that influence the road-driver relationship, which are the triggering factors of accidents, helps to avoid driver mistakes. A road that does not lead drivers to make mistakes can be defined as a self-explaining road [42]. The basic notion of a self-explaining road, which originated in the Netherlands, is a "traffic environment which elicits safe behavior simply by its design" [43,44]. According to this concept, drivers should clearly understand the road they are driving on, its elements, and its features, and change their driving behavior accordingly. The self-explaining road concept may appear utopian; nevertheless, designers should strive to make roads on a human scale.

Research Objective and Overcoming the Gaps
The main knowledge gap addressed by this research is the translation of the already known human factors principles [45,46] into a systematic procedure capable of quantitatively measuring the level of safety of a specific road stretch. This objective also falls within the scope of the updated Directive 2008/96/EC, which requires a procedure able to visually and proactively investigate the road safety level. Moreover, as also highlighted by other authors [47,48], segmentation is still an issue in carrying out NSAs. Different road segmentations often produce different results, with a lack of consistency across segmentations. This procedure tries to overcome this gap by proposing three different levels of segmentation: a first level with long sections for a fast first screening of the stretch, a second level with short segments for the detailed analysis and evaluation of the stretch, and a third level with longer sections that group the results of the shorter segments into a segmentation more practical for management by road agencies.
The new procedure allows us to carry out an NSA based on visual inspections, which are in turn based on human factors principles. The procedure has been developed for rural single-carriageway roads. The procedure described in this paper was already introduced in the work of Paliotto et al. [49] and recently finalized [50]. The current version has been strengthened and amended to make it more suitable for use. Specifically, the version presented in [49] has some limitations:
- During the first step of the procedure, the evaluation of Potentially Critical Locations (PCLs, see Section 2.2.2) was made without following a systematic procedure and considered all the different expectation-related aspects together (i.e., general expectations, punctual expectations, and visibility), with a high possibility of subjective judgments.
- During the second step, the Human Factor Tool from PIARC [51] was used with only a few adjustments for the score calculation.
- The procedure was applied to three road stretches.
The new procedure aims at improving these points by achieving the following:
- Consider a systematic and automated process to ensure an objective and fast first screening of the PCLs. This reduces the time required to analyze all the PCLs in the second step.
- Improve the Human Factor Tool by introducing and modifying some requirements, developing guidelines for its application (available in the material attached to the paper), and developing a digital tool (in Excel) to expedite its application.
- Apply the procedure to the already analyzed stretches (for a comparison of the results) and to three new stretches.

Materials and Methods
The following paragraphs present the steps to follow to carry out the procedure. Please refer to the paragraph "List of acronyms and definition" for the meanings of each acronym.

Structure of the Procedure
The procedure is divided into three different steps according to the conceptual scheme shown in Figure 1.
The first step (top-down process) allows us to make a first screening of the road to identify the Potentially Critical Locations (PCLs) that have a high possibility of being critical. The PCLs considered [49] are listed in Section 2.2.2. In the first step, the road analyzed is divided into Expectation Sections (EXSEs), which are sections of several km on average. EXSEs are crucial to identify the road characteristics that allow us to determine whether a PCL is expected or not, as explained in Section 2.2.1. In the second step (evaluation process), the Human Factors Evaluation Tool (HFET) is applied to each Human Factors Evaluation Segment (HFES); HFESs are shorter segments of about 300-400 m on average. The third step (bottom-up process) allows us to organize the results so that they are suitable for a network classification (longer sections). This means grouping many HFESs into single Network Assessment Sections (NASs). For each NAS, a risk code (RC) is calculated. The RC allows us to identify four different levels of risk and to rank the NASs. The obtained level of risk is calculated without considering accident data. For this reason, the procedure can be considered a fully proactive procedure based on visual inspections.

Step 1 of the Procedure
The main aim of the first step is to identify the characteristics of the roads analyzed while considering their influence on driver expectations. In this step, EXSEs are identified. EXSEs are road sections where the driver has specific, similar driving demands, like a curvy section with a similar radius or an interurban section with a logical consistency of design elements and speed. At the same time, the roadside gives the driver a consistent impression that contributes to an overall impression of the road section. So, the driver subconsciously builds up a specific expectation of how the road alignment develops and which driving program is appropriate. PCLs are qualitatively evaluated by judging the level of compliance between the expectations induced by the road and the real road development and configuration. This is achieved by assigning to each PCL a risk level for the general expectations (GEXs) (three risk levels: high, medium, and low), visibility (VIS) (three risk levels: high, medium, and low), and punctual expectations (PEXs) (two risk levels: high and low). The assignment is made while taking care of the following:
- GEX: how much the PCL is expected considering the EXSE to which it belongs (e.g., a pedestrian crossing cannot be expected within a forest where no buildings are visible);
- VIS: whether sufficient decision sight distance is present, based on the speed and alertness that the driver is expected to have while driving the specific EXSE that includes the analyzed PCL (the driver needs more time to see, understand, and react to the pedestrian crossing considered in the previous example);
- PEX: punctual expectations generated by the specific configuration of the road and the field of view close to the PCL; they are evaluated through a fast visual inspection of the road by human factors experts.
Contrary to GEX and VIS, which can be calculated, PEX is derived from the qualitative judgments of inspectors. For this reason, only two levels of risk are considered for PEX: "some main issues are present" (high risk level) or "no or very low issues are present" (low risk level).

Identification of PCLs and EXSEs
The steps to follow for the identification of PCLs and EXSEs are the same as defined in Paliotto et al. [49]. PCLs are any area where drivers must adapt their driving program by changing speed, braking, steering, or changing lanes. Normally, they are junctions, intersections, public transport stops, exits, driveways, curves, carriageway width reductions, or pedestrian/cyclist crossings. PCLs are identified directly by driving along the road, consulting the road database, or looking at the design documents or satellite images. An example of PCL identification is provided in Figure 2.
Infrastructures 2024, 9, x FOR PEER REVIEW 5 of 29
EXSEs are road sections characterized by the same expectations for the driver. Driving performance is thus related to driver expectations about the road and their consistency with the real road development. EXSEs are defined based on the section's main characteristics, which are the road category, the road winding, and the perception of possible interaction (PPI).
The road category includes motorway, rural highway, rural local, urban arterial, urban connector, and urban local. This paper focuses on the application of the procedure to rural highways only.
Road winding is defined based on the value of the curvature change rate (CCR), following Equation (1):
$$\mathrm{CCR} = \frac{\sum_{i=1}^{n} |\gamma_i|}{\sum_{i=1}^{n} L_i} \qquad (1)$$
where the following apply: CCR = curvature change rate [gon/km]; $\gamma_i$ = angular change of the geometric element "i" [gon]; $L_i$ = length of the geometric element "i" [km]; $n$ = number of geometric elements of the road section (tangents, circular curves, spirals) [-].
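The CCR calculation can be illustrated with a minimal Python sketch. It assumes the usual definition of the CCR as the sum of the absolute angular changes divided by the total length of the section; the function name, the input format, and the sample section are hypothetical and only serve the illustration. The 350 gon/km threshold for high winding is the one stated in the text.

```python
from typing import Iterable, Tuple

# Threshold for "high winding" on rural highways, as stated in the text.
HIGH_WINDING_THRESHOLD = 350.0  # gon/km


def curvature_change_rate(elements: Iterable[Tuple[float, float]]) -> float:
    """Return the CCR in gon/km.

    `elements` holds one (angular change [gon], length [km]) pair per
    geometric element (tangent, circular curve, spiral). The pair format
    is an assumption made for this sketch.
    """
    elements = list(elements)
    total_angle = sum(abs(gamma) for gamma, _ in elements)
    total_length = sum(length for _, length in elements)
    if total_length <= 0:
        raise ValueError("total section length must be positive")
    return total_angle / total_length


# Hypothetical 1 km section: one tangent followed by two opposite curves.
section = [(0.0, 0.5), (40.0, 0.25), (-60.0, 0.25)]
ccr = curvature_change_rate(section)
print(ccr, ccr > HIGH_WINDING_THRESHOLD)  # 100.0 False
```

Using absolute values means that left and right curves both add to the winding instead of cancelling out.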
For rural highways, three levels of road windings have been identified based on the value of the CCR.

• High winding: CCR > 350 gon/km;
The PPI represents how much the driver expects interactions with other road users crossing the road. Two levels of PPI are defined for rural highways: low and medium (details on how those levels are calculated can be found in [50], together with some examples).
• Low level: only rural areas can be considered at a low level. A low level means very few or no perceived possible interactions. The surrounding environment is almost natural, without any trace of anthropization other than the road itself.
• Medium level: this level represents the upper level of rural areas. Such a PPI is often representative of suburban areas, where the density of houses and commercial activities is reduced. The medium level can also apply to rural road stretches that pass through small villages or groups of houses along the road, or through an area with many driveways and at-grade intersections due to the presence of many activities and factories.
A new EXSE starts when there is a change in the winding level or in the PPI level. The consistency of each PCL with its EXSE must then be evaluated, as explained in detail in Section 2.2.2. The current procedure differs from that proposed in the previous work by Paliotto et al. [49] because it considers different risk levels for each aspect analyzed (i.e., GEX, VIS, PEX). An example of EXSE identification is provided in Figure 3. The value of the computed CCR and the identified level of PPI are shown in the same figure.
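The rule that a new EXSE starts whenever the winding level or the PPI level changes can be sketched as a simple grouping of consecutive stretches. This is only an illustration: the dictionary keys, the elementary stretches, and the way a winding and PPI level is assigned to each stretch beforehand are assumptions, not part of the published procedure.

```python
from itertools import groupby


def split_into_exses(stretches):
    """Merge consecutive stretches sharing the same (winding, PPI) levels.

    `stretches` is a list of dicts ordered along the road, each with the
    hypothetical keys "start_km", "end_km", "winding", and "ppi".
    """
    exses = []
    for (winding, ppi), group in groupby(
        stretches, key=lambda s: (s["winding"], s["ppi"])
    ):
        group = list(group)
        exses.append(
            {
                "start_km": group[0]["start_km"],
                "end_km": group[-1]["end_km"],
                "winding": winding,
                "ppi": ppi,
            }
        )
    return exses


# Hypothetical road: the levels change at km 5 (winding) and km 7 (PPI).
road = [
    {"start_km": 0, "end_km": 2, "winding": "low", "ppi": "low"},
    {"start_km": 2, "end_km": 5, "winding": "low", "ppi": "low"},
    {"start_km": 5, "end_km": 7, "winding": "high", "ppi": "low"},
    {"start_km": 7, "end_km": 9, "winding": "high", "ppi": "medium"},
]
for exse in split_into_exses(road):
    print(exse)
```

With this input, the first two stretches merge into one EXSE (km 0 to 5), while the remaining two each form their own EXSE.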
Based on the winding level and PPI level, the levels of expected speed (V_E) and alertness can be defined. V_E is the range of speed the driver is expected to travel at. Alertness defines the level of attention of the driver, and thus, indirectly, it influences the level of resources activated to process information from the environment. High alertness corresponds to high available resources, which translates into a reduced time required to process information and respond to environmental stimuli.
All the different levels of V_E and alertness corresponding to the different combinations of winding and PPI for rural highways are presented in Table 1. The relationships in Table 1 highlight, for example, that a driver on a rural highway with a high level of winding and a medium level of PPI is expected to have a high level of alertness and to drive at a relatively moderate speed (50-80 km/h). On the other hand, a driver on a rural highway with low winding and low PPI is not ready to react to any unexpected event (low level of alertness), and their speed is expected to be relatively high (80-100 km/h). At the end of the EXSE identification process, all EXSEs are characterized by a level of winding, a level of PPI, a level of V_E, and a level of alertness.

GEX Evaluation
The evaluation for GEX is made automatically considering the criteria in Table 2. These criteria were derived from both a literature review of the design standards of several countries (Italy, Germany, England, Slovenia, Portugal, Australia, Canada, Austria, Switzerland) and a survey conducted on a sample of about 50 people of different ages and genders holding a valid driver's license. Detailed data about the design standards and the survey are provided in [50].

Table 2 column headings: PCL type; PCL; winding/PPI level (H = High, M = Medium, L = Low).
From Table 2, it can be derived, for example, that a relatively sharp curve (curve20, which requires a speed reduction of about 20 km/h) is not expected on a section with low curvature and low PPI (thus a high risk). Conversely, the same curve is expected on a high winding section with medium PPI (thus a low risk). The maximum level of PPI for rural roads is medium (M).

VIS Evaluation
To assess the VIS level, the visibility of the PCL must be checked. The visibility of locations along the road is commonly used in engineering; however, most of the time, the distance checked is the stopping sight distance. For specific locations, many design standards also account for the decision sight distance (for example, [52]). The stopping sight distance covers an instinctive reaction to a sudden problem in order to avoid an accident, while the decision sight distance also accounts for the time needed to correctly perceive a specific situation on the road and plan how to react to it. PIARC defines a general decision sight distance by subdividing the space (and time) approaching the PCL into four different sections [46]. Starting from the one closest to the PCL, the sections are the maneuver section (where the braking action is mainly carried out), the response section (2-3 s, mainly needed by the driver to set up the maneuver), the anticipation section (2-3 s, needed to comprehend the location), and the warning section (4 s, needed under specific conditions to advise drivers about the oncoming location). A well-designed road should have the maneuver, response, and anticipation sections; for this reason, PIARC defines the First Rule of Human Factors (4-6 s rule). Consequently, the assessment of the VIS level considers conditions optimal when the PCL is visible from a distance that takes more than 6 s to travel. This means that the maneuver, response, and anticipation sections are all present (assuming that the maneuver section hardly takes more than 2 s; otherwise, a long emergency braking is required, and the stopping sight distance must be considered). The traveled distance $D_2$ is calculated as the highest speed in the $V_E$ range multiplied by $t_2$, as defined in Equation (2):
$$D_2 = V_{E,\max} \cdot t_2 \qquad (2)$$
where $V_{E,\max}$ is the highest speed in the $V_E$ range and $t_2$ is the time threshold. The time $t_2$ is taken as 6 s for an EXSE with low alertness, 5 s for an EXSE with medium alertness, and 4 s for an EXSE with high alertness: the higher the alertness of the driver, the less time is needed to perceive and detect the PCL. Considering 6 s or less as the "optimal" threshold may not appear precautionary; however, the speed used to calculate the distance is the maximum of the $V_E$ range, which compensates for the choice of a relatively short time. More detailed and precise calculations are possible but are not suggested at this stage, because a fast screening is required here to exclude the less dangerous PCLs from the detailed analysis carried out in Step 2.
Table 3 shows the outcome of the calculations for the two $V_E$ levels for rural highways. Within that distance, the PCL must be clearly and continuously visible. If the available sight distance is greater than $D_2$, it can be assumed that the PCL has no visibility problem, and thus the VIS level of risk is low. On the other hand, if the available sight distance is less than $D_1$, a high risk is present concerning VIS. $D_1$ is calculated according to Equation (3). An available sight distance between $D_1$ and $D_2$ leads to a medium risk for VIS.
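The VIS screening can be sketched in Python as follows. The values of t_2 per alertness level and the three-way classification come from the text; the conversion from km/h to m/s, the function names, and the sample value of D_1 are assumptions of this sketch. In particular, D_1 is passed in as an input because its formula (Equation (3)) is not reproduced here.

```python
# Time t_2 [s] per alertness level of the EXSE, as stated in the text.
T2_BY_ALERTNESS = {"low": 6.0, "medium": 5.0, "high": 4.0}


def d2_metres(v_e_max_kmh: float, alertness: str) -> float:
    """Distance D_2 [m] travelled in t_2 at the top of the V_E range.

    The division by 3.6 (km/h to m/s) is an assumption of this sketch.
    """
    return v_e_max_kmh / 3.6 * T2_BY_ALERTNESS[alertness]


def vis_risk(available_sight_m: float, d1_m: float, d2_m: float) -> str:
    """Classify the VIS risk level from the available sight distance."""
    if d1_m > d2_m:
        raise ValueError("D_1 is expected to be shorter than D_2")
    if available_sight_m > d2_m:
        return "low"
    if available_sight_m < d1_m:
        return "high"
    return "medium"


# Low winding and low PPI: V_E of 80-100 km/h and low alertness (from the text).
d2 = d2_metres(100.0, "low")
print(round(d2, 1))  # 166.7
# d1_m=100.0 is a purely hypothetical value used for illustration.
print(vis_risk(available_sight_m=120.0, d1_m=100.0, d2_m=d2))  # medium
```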

PEX Evaluation
PEX levels consider the composition of the road and the road environment, and thus the composition of the field of view and its influence on the correct perception of the road. These conditions must be evaluated close to the PCL, starting from about 6-10 s before the PCL. Unfortunately, it is not possible to define objective criteria to calculate the PEX level; thus, this process is up to the inspector. To assess the PEX level of each PCL, the inspector must drive along the road stretch in both directions, trying to figure out whether some PEX issues are present. Because of the subjectivity of this evaluation, it has been decided to consider only two levels of PEX: low risk and high risk. A low risk level means that there are no issues or only a few. A high risk level means that many issues, or a few major issues, are present.
The evaluation should be carried out without a deep analysis. The main aspects to consider are reported in Table 4. The table is divided into three main investigation topics ("density and shape of the field of view", "elements in the lateral roadside environment support optimal lane keeping", and "depth of the field of view") with their relative subsections. The contents of this table represent a sort of checklist for the inspector. Generally, it can be assumed that if inspectors find some issues concerning two out of three of the investigation topics, then the PCL should be classified as a high level of PEX. However, this is not a strict rule; the inspector must try to understand whether the issues found are relevant. For this reason, the inspector carrying out the assessment must be trained in human factors.
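The "two out of three topics" rule of thumb can be written as a short Python sketch that only suggests a default PEX level. The function name and the input format are hypothetical, and the suggestion is not binding: as stated above, the trained inspector must still judge whether the issues found are relevant.

```python
# The three investigation topics of Table 4, as named in the text.
PEX_TOPICS = (
    "density and shape of the field of view",
    "elements in the lateral roadside environment support optimal lane keeping",
    "depth of the field of view",
)


def suggested_pex_level(issues_by_topic: dict) -> str:
    """Suggest a PEX level from per-topic findings (True = issues found).

    Only a default: the inspector may override it after judging relevance.
    """
    unknown = set(issues_by_topic) - set(PEX_TOPICS)
    if unknown:
        raise ValueError(f"unknown topics: {sorted(unknown)}")
    topics_with_issues = sum(
        bool(issues_by_topic.get(topic, False)) for topic in PEX_TOPICS
    )
    return "high" if topics_with_issues >= 2 else "low"


findings = {
    "density and shape of the field of view": True,
    "depth of the field of view": True,
}
print(suggested_pex_level(findings))  # high
```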

[Table 4: checklist for the PEX evaluation. Depth of field of view: dominant eye-catching objects support the detection of the challenging location; presence of optical illusion; the course of the road is visible.]
PCL evaluation is not mandatory. However, this first screening of PCLs is highly recommended to ensure that only the locations with the higher possibility of being dangerous are analyzed in step 2 with the Human Factors Evaluation Tool (HFET). The detailed analysis of all the PCLs with the HFET would require more time than the fast evaluation of the PCLs described in Step 1.

Step 2 of the Procedure
In the second step, the detailed analysis of the road is carried out and the Human Factors Evaluation Tool (HFET) is applied to the Human Factors Evaluation Segments (HFESs). The HFESs are segments composed of one or more challenging locations (CHLs) and their respective challenging transitions (CHTs). The updated version of the HFET allows all the CHLs belonging to the same HFES to be analyzed at the same time. The HFET is available together with the guidelines for its application in the Supplementary Materials of this paper (GUIDELINE for the application of the HUMAN FACTORS EVALUATION TOOL). The second step is divided into three main sub-steps:

Identification of CHLs
CHLs are PCLs that are not clearly perceived by the driver because of problems concerning VIS, GEXs, and/or PEXs. The consequence is that drivers do not change their driving program, or try to change it too late, causing hazardous maneuvers. A PCL is promoted to CHL when at least one risk level related to expectations (VIS, GEX, or PEX) is high and one is medium. This concept is clarified in Table 5, where all the possible combinations of VIS, GEX, and PEX levels are presented, together with the outcome of each combination.
A PCL could be a CHL for only one direction of travel. For this reason, the direction in which the location is challenging must be specified in the analysis.
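The promotion rule can be sketched as follows. This is a hedged reading of the sentence above, not a transcription of Table 5: it assumes that "one is high and one is medium" means one level is high and at least one other level is medium or higher, and that GEX uses the same three levels as VIS. Table 5 remains the reference in case of any discrepancy. The function name is hypothetical.

```python
RANK = {"low": 0, "medium": 1, "high": 2}


def is_challenging_location(vis: str, gex: str, pex: str) -> bool:
    """Return True if a PCL is promoted to CHL for one direction of travel.

    Assumed rule: the highest level is "high" and the second highest is at
    least "medium". PEX only takes the values "low" and "high".
    """
    if pex not in ("low", "high"):
        raise ValueError("PEX has only two levels: low and high")
    ranks = sorted((RANK[vis], RANK[gex], RANK[pex]), reverse=True)
    return ranks[0] == RANK["high"] and ranks[1] >= RANK["medium"]


# The check is run once per direction, since a PCL may be a CHL one way only.
print(is_challenging_location("high", "medium", "low"))  # True
print(is_challenging_location("high", "low", "low"))     # False
```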

Identification of Challenging Transitions (CHTs) and Human Factors Evaluation Segments (HFESs)
Once the CHLs have been identified, the area to evaluate with the HFET must be chosen. This area is the road stretch preceding and including the CHL, and it is called the challenging transition (CHT). CHTs typically start 10-12 s before the CHLs and can include other elements of the road that are not CHLs, or even other CHLs. In the latter case, the two overlapping CHTs (one for each CHL) must be merged, creating a single CHT. This final CHT is then called the Human Factors Evaluation Segment (HFES) because it is the segment that is evaluated with the HFET.
The scheme in Figure 4 summarizes the three steps required. CHLs and CHTs must be considered first in one direction and then in the other because they can change with the direction of travel. Consequently, HFESs are also different for each direction.
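Merging overlapping CHTs into HFESs is an interval-merging problem. The sketch below assumes each CHT is given as a (start, end) pair of chainages in meters for a single direction of travel; the function name and the sample chainages are hypothetical.

```python
def merge_chts(chts: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Merge overlapping CHTs (start_m, end_m) into HFESs for one direction."""
    hfess: list[tuple[float, float]] = []
    for start, end in sorted(chts):
        if hfess and start <= hfess[-1][1]:
            # This CHT overlaps the previous segment: extend that segment.
            prev_start, prev_end = hfess[-1]
            hfess[-1] = (prev_start, max(prev_end, end))
        else:
            # No overlap: this CHT starts a new HFES.
            hfess.append((start, end))
    return hfess


# Two overlapping CHTs and one isolated CHT (illustrative chainages).
print(merge_chts([(1200.0, 1450.0), (1000.0, 1250.0), (3000.0, 3300.0)]))
# [(1000.0, 1450.0), (3000.0, 3300.0)]
```

The function is applied separately to each direction, because the CHLs and CHTs, and therefore the HFESs, differ by direction.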

As noted above, the typical length of a single CHT is the distance traveled in 10-12 s [46], in addition to the length of the CHL itself. The speed required to calculate the distance can be taken as the maximum speed in the EXSE's V_E range, reduced by 10 km/h, as shown in (4).
Table 6 shows the calculated distances for the two different V_E levels. Nevertheless, those distances do not need to be exact; they must be used as references. Inspectors may decide to reduce or increase those distances if the operating speed seems to be very different from the speed considered in (4). Distances shorter than 150 m or longer than 300 m are discouraged. Figure 5 shows an example of two CHLs with their CHTs on road SR2 in Tuscany (red line). One CHL is a pedestrian crossing (yellow); the other is a curve (green). The two CHTs overlap; thus, they are merged to create a single CHT, the HFES (purple). The HFES in the example of Figure 5 is about 450 m long.
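The reference distance described in the text (travel time multiplied by the maximum V_E speed reduced by 10 km/h) can be sketched as below. The function names are hypothetical, the input speed in the example is illustrative and not taken from Table 6, and the length of the CHL itself must still be added to obtain the full CHT.

```python
def cht_reference_distance_m(v_e_max_kmh: float, travel_time_s: float = 12.0) -> float:
    """Distance traveled before the CHL at (V_E,max - 10 km/h) in travel_time_s."""
    if not 10.0 <= travel_time_s <= 12.0:
        raise ValueError("the text indicates a travel time of 10-12 s")
    speed_ms = (v_e_max_kmh - 10.0) * 1000.0 / 3600.0  # km/h to m/s
    return speed_ms * travel_time_s


def is_discouraged(distance_m: float) -> bool:
    """Distances shorter than 150 m or longer than 300 m are discouraged."""
    return distance_m < 150.0 or distance_m > 300.0


d = cht_reference_distance_m(100.0, travel_time_s=12.0)  # illustrative speed
print(round(d, 1), is_discouraged(d))
```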

Application of the Human Factors Evaluation Tool (HFET)
The HFET consists of three different sheets, each of them representing one of the rules of human factors. The tool should be applied following the guidelines for its application provided in the documentation attached to this paper.
The results provide the Human Factors Score (HFS), a numerical index that ranges between 0% and 100% and is interpreted as follows: HFS < 40% highlights a high-risk HFES, 40% ≤ HFS ≤ 60% is medium risk, and HFS > 60% is low risk. The HFS is provided for each of the three rules of human factors [46] and for all the rules considered together (Total HFS).
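The HFS thresholds translate directly into a small classifier. The sketch below is only an illustration of the stated thresholds; the function name is hypothetical, and the HFS value itself is produced by the HFET, not by this code.

```python
def hfs_risk_level(hfs_percent: float) -> str:
    """Map a Human Factors Score (0-100 %) to the risk level of an HFES."""
    if not 0.0 <= hfs_percent <= 100.0:
        raise ValueError("HFS must lie between 0 and 100")
    if hfs_percent < 40.0:
        return "high"
    if hfs_percent <= 60.0:
        return "medium"
    return "low"


for score in (25.0, 40.0, 60.0, 85.0):
    print(score, hfs_risk_level(score))
```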


Step 3 of the Procedure
The third step is needed to group the results obtained from the analysis of the HFESs. Longer sections are more useful when implementing a Network-wide Road Safety Assessment because they can be more easily represented and because road administrations often prefer to intervene on longer road stretches [16]. These sections are called Network Assessment Sections (NASs). Moreover, the results obtained from the second step of the procedure for each HFES must be combined into a single result that is representative of the NAS. For these reasons, the third step of the procedure concerns the following:

• Identification of NASs;
• Calculation of the Risk Code (RC) to assign to each NAS.

Identification of NASs
NASs are road stretches taken as a reference by road agencies. The results of the entire procedure are provided for each NAS and allow the NASs to be ranked, so that the road agency can define its intervention priorities. The scope of the NASs is thus related to the scope of the road agency. For this reason, road agencies may choose the length of the NASs based on their requirements. Different segmentation criteria can be used. A very common one is to consider sections with the same traffic level. Another is to use sections of a fixed length, which can be easily compared and easily associated with the km posts. A specific length may also be chosen because other evaluations with a given segmentation length have been made, and the road agency wants to compare the results of this procedure with those results. All these options are possible because of the flexibility, at this step, in the choice of the NAS length.
However, some limitations should be considered to account for the road category and its characteristics. On non-motorway roads, NASs longer than 5 km are discouraged because the road can greatly differ within 5 km. Moreover, if the road is quite complex with a changing environment, a maximum length of 2 km is suggested. Finally, it is recommended to divide the network into NASs of lengths as similar as possible. However, while determining the NASs, each HFES should be wholly included in a NAS, and thus an exact length is not always assured. On two-lane, two-way rural highways, a length of 1 km is suggested [53]. Such a length is long enough to be applied on the network but still short enough to provide sufficiently focused results. An example of NAS identification is provided in Figure 6.
After the NASs have been identified, all the road stretches that belong to a NAS but do not belong to any HFES are classified as Inconspicuous Segments (INCSs). INCSs are considered segments with an HFS of 100%.

Calculation of the Risk Code (RC)
Focusing on the objectives of the new procedure [4], the code to assign to each NAS must allow for the identification of at least three safety levels and should allow for a ranking of the NASs. Therefore, each NAS is defined by an alphanumerical code that is divided into three parts, as shown in Figure 7: a first part composed of a letter and a number, a second part composed of a number, and a third part composed of two numbers. The first term identifies the safety level of the NAS. The second term gives a numerical value that represents the most critical HFES within the NAS. The last term is instead a measure of the variance of the results within the same NAS. It shows whether, within the same NAS, a single critical area is present while the remaining part of the NAS is in good condition (or the opposite). The meanings of the alphanumerical values are given in Figure 7. The RC refers to both directions considered together. The ranking of the NASs follows a two-level ranking.
The first ranking identifies the risk level of the NAS, and thus it is based on four groups:
• Very high risk: "AB" part of the code is equal to "R4";
• High risk: "AB" part of the code is equal to "R2" or "R3";
• Medium risk: "AB" part of the code is equal to "R1" or "Y4";
• Low risk: all the remaining cases.
These four risk levels are consistent with the number of different risk levels required by the new European Directive [4].
High and very high levels both mean a high probability of accident occurrence, while medium and low risk can be associated with a low risk of accident.
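The first-level ranking depends only on the "AB" part of the RC, so it can be sketched as a lookup. The function name is hypothetical, and the code "Y1" used in the example is only an illustrative stand-in for "all the remaining cases"; the full set of admissible AB values is defined in Figure 7.

```python
def nas_risk_level(ab_code: str) -> str:
    """Return the first-level risk group of a NAS from the AB part of its RC."""
    ab = ab_code.strip().upper()
    if ab == "R4":
        return "very high"
    if ab in {"R2", "R3"}:
        return "high"
    if ab in {"R1", "Y4"}:
        return "medium"
    return "low"  # all the remaining cases


for code in ("R4", "R2", "Y4", "Y1"):
    print(code, nas_risk_level(code))
```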
The second-level ranking instead orders the NASs within the same risk level. This second ranking follows the criteria presented in Table 7. The parameters composing the code are considered in the order in which they are presented (AB-C-D/E). This means that, for NASs with the same part "A" of the code, priority is given to the NAS with the higher part "B". Generally, parts D and E of the code are not necessary for the ranking. However, they provide important information about the composition of the NAS, which can help the road agency in the subsequent intervention stage. At the end of step 3, the ranking is obtained for all the NASs belonging to the road network.

Validation of the Procedure
The procedure has been validated by comparing the results obtained from its application to six stretches of rural two-lane, two-way roads with an accident index calculated for the same roads. The accident index considered is the accident rate (AR), which is the number of accidents per vehicle kilometer traveled in one year. Three different safety levels, based on the AR results, have been considered. Finally, both the safety levels and the rankings obtained from the two indices (AR and RC) have been compared, using the Freeman-Halton extension of Fisher's test for the safety levels, because of the low number of variables considered, and Kendall's coefficient of concordance (W) for the rankings. Fisher's test [54] is used for categorical data that result from classifying objects in two different ways; it examines the significance of the association between the two kinds of classification (in a 2 × 2 contingency table) and calculates exactly the significance of the deviation from a null hypothesis (i.e., the p-value). The Freeman-Halton extension of the test [55] extends it to contingency tables larger than 2 × 2. In this research, a 3 × 3 contingency table has been used. Kendall's W (also known as Kendall's coefficient of concordance) is a non-parametric statistic for rank correlation [56]. The coefficient is a measure of the agreement between several judges who have rank-ordered a set of entities. Kendall's W ranges from 0 (no agreement) to 1 (complete agreement).
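For reference, Kendall's W can be computed with the standard textbook formula for untied ranks, W = 12 S / (m^2 (n^3 - n)), where m is the number of judges, n the number of ranked items, and S the sum of squared deviations of the rank sums from their mean. The sketch below assumes no tied ranks and is not necessarily the implementation used in the study; the rankings in the example are hypothetical. Here the two "judges" would be the RC-based and AR-based rankings of the NASs.

```python
def kendalls_w(rankings: list[list[int]]) -> float:
    """Kendall's coefficient of concordance for m complete, untied rankings."""
    m = len(rankings)
    n = len(rankings[0])
    if m < 2 or n < 2:
        raise ValueError("need at least two judges and two items")
    for r in rankings:
        if sorted(r) != list(range(1, n + 1)):
            raise ValueError("each ranking must be a permutation of 1..n")
    # Sum of the ranks received by each item across all judges.
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((x - mean_sum) ** 2 for x in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings of four sections by two indices.
print(kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4]]))  # complete agreement
print(kendalls_w([[1, 2, 3, 4], [4, 3, 2, 1]]))  # opposite rankings
```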

Repeatability
The procedure is mainly based on a visual inspection of the road, which is carried out in step 2 while applying the HFET. While applying the HFET, the inspector is asked to judge different aspects and characteristics of the road and its environment, and this may lead to differences in judgment if different inspectors carry out the analysis. A clear guideline for the application of the HFET, with clear criteria for judgments, is expected to greatly reduce these differences. To test the repeatability of the procedure, in 2022, another team of inspectors was asked to analyze the same stretch of road SR2. The results are discussed in Section 2.7.

Calculation of the AR and AR levels
The AR performance measure has been chosen as the most representative for a comparison. Indeed, AR is a safety performance measure that quantifies the safety of a single vehicle driving along a road stretch. As defined in Equation (5), the AR for a road segment is the number of accidents in the analysis period (i.e., accident frequency) divided by the number of vehicles that pass through that segment in the same period (in million vehicles) and divided by the length of the segment. To define the risk level based on AR, it has been decided to follow the procedure proposed by Miar [53], which builds upon the proposal from Norden et al. [57]. Two thresholds have been identified: AR_max and AR_min. The risk levels are assigned by comparing AR_i with these thresholds, where AR_i is the accident rate for section "i", calculated following Equation (5). AR_min and AR_max can be calculated following Equations (6) and (7),
where the following apply:
K = constant of the Poisson probability distribution function, taken as 1.282 (confidence interval of 90%) [58];
AR_m = average accident rate of the analyzed site (e.g., road stretch), calculated with Equation (8);
M_i = exposure momentum of section "i", calculated with (9);
n_p = total number of accidents that occurred in the considered period "p";
t = total number of sections in the analyzed site;
L_i = length of section "i";
AADT_i,p = average annual daily traffic of section "i" in the whole considered period "p" (sum of the AADT_i of each year).
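The verbal definition of the accident rate can be sketched as follows. Equation (5) is not reproduced here, so this is an assumption-laden illustration: the function name is hypothetical, the number of vehicles in the period is assumed to be 365 times the sum of the yearly AADT values, and the input numbers are invented for the example. The thresholds AR_min and AR_max of Equations (6) and (7) are not computed.

```python
def accident_rate(accidents: int, aadt_by_year: list[float], length_km: float) -> float:
    """Accidents per million vehicle-km for one section over the analysis period.

    aadt_by_year holds one AADT value (vehicles/day) per year of the period.
    """
    if length_km <= 0 or not aadt_by_year:
        raise ValueError("length and traffic data must be provided")
    # Assumption: 365 days per year for every year of the period.
    million_vehicles = 365.0 * sum(aadt_by_year) / 1e6
    return accidents / (million_vehicles * length_km)


# Invented example: 5 accidents in 5 years, AADT 10,000, section of 1 km.
print(round(accident_rate(5, [10_000.0] * 5, 1.0), 3))
```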

Consistency
The procedure allows road administrations to choose the most suitable segmentation for the NASs. To test the consistency of the results against different NAS segmentations, the stretches of road SR2 and road B38 have been analyzed considering three different segmentations: 1 km, 2 km, and one based on traffic changes.

Test Roads
The road stretches considered for the validation of the procedure are two-lane, two-way rural roads, both primary and secondary. A total of about 65 km of roads has been considered. Because the procedure is intended to be applicable internationally, the stretches considered are from two Italian roads, three German roads, and one Slovenian road.
The two roads from Italy are SR2 and SR206, two rural highways located in the center of Italy, which differ from each other in both geometrical and functional characteristics. The road SR2 stretch ranges from km 280.600 to km 292.400 (11.8 km total) in a hilly environment with many curves and few short tangents. The road SR206 stretch ranges from km 27.800 to km 42.400 (14.6 km total) in plain terrain with many long tangents and few curves. The traffic database has been provided by the Tuscany Region. The analysis period considered for traffic is 2014-2018. The accident database was provided by ISTAT (Istituto Nazionale di Statistica).
The German stretches considered are from road B38, road L3106, and road L3408. The stretch of road B38 analyzed ranges between km 1.200 after Section 6118-001 and Section 6118-036. The road develops through plain and hilly terrain. The curve radii are quite large, and the operating speeds are high (between 80 km/h and 100 km/h). The road L3106 stretch analyzed runs from Section 6218-045 to Section 6118-005 for a total length of about 7.5 km. The road develops through hilly terrain and follows a gently curving alignment. The road L3408 stretch analyzed starts 1.600 km after Section 6418-217 and ends 0.4 km before Section 6418-207, for a total length of about 3.0 km. The road characteristics are similar to those of road L3106. Geometrical, traffic, and accident data for the German roads have been provided by Hessen Mobil. For all the stretches, the accident database refers to the period 2018-2020.
The Slovenian Road 106 stretch runs from the southern part of Ljubljana for about 16 km. The road stretch analyzed corresponds to Section 261. Road 106 is an important road that connects the capital, Ljubljana, with the southern part of the country. Thus, many vehicles of different types travel the road. From km post 3.200 to km post 5.200, the carriageway cross-section is a 2 + 1 configuration, with a double lane in the south direction. The road is mainly a fast road, developing first in plain terrain and then in hilly terrain. Traffic data have been provided as a single data set for each year in the period 2015-2019. The accident database has been provided by the Slovenian Infrastructure Agency and refers to the period 2015-2020.
Figure 8 shows the satellite images of each road stretch. The images are of different scales and are mainly provided to illustrate the geometry of the roads and the environment they pass through.

Main Results
After the RC was calculated for each NAS, the AR was calculated for each NAS. Table 8 shows the list of identified NASs, the road they belong to, their length, their average AADT, the obtained RC with the consequent risk levels and ranking, and the AR for each NAS with the consequent risk levels and ranking. The results are ordered from the worst section to the best one, considering the RC. The outcomes of the statistical analysis are as follows:
• Freeman-Halton extension of Fisher's test: p-value of 0.004 < 0.05; thus, the null hypothesis that the variables assume these risk levels by chance can be rejected;
• Kendall's W of 0.78, with a p-value of 0.001 < 0.05; thus, the null hypothesis that the variables assume this ranking by chance can be rejected.
All the statistics confirm a good correspondence, and the null hypothesis can be rejected in every case. Moreover, it is possible to build the contingency table for the risk level comparison, which is shown in Table 9. The greater the values on the matrix diagonal, the better the correspondence between the two classifications. In the table, it can be observed that there is a very good correspondence between the medium levels and a good correspondence between the high levels. In the same table, the NASs classified as low or medium risk by both indices are colored in green. These NASs can be considered "sections that do not require interventions". Conversely, risky sections ("sections that require interventions") are colored in red. Considering the sum of the numbers on the diagonal divided by the total number of NASs, the overall concordance is 56%. The results show that a better correspondence is present considering the high risk level. It must also be noted that in only one NAS is the difference between the results "two levels", which means that the section is classified as high risk by one index and as low risk by the other. In all other cases, if the risk level is not the same, it is the immediately preceding or following one. Indeed, considering only the "sections that require interventions", which are the high-risk sections (red cells), and the "sections that do not require interventions", which are the low- and medium-risk sections (green cells), the correspondence is 81%.
In addition, a regression analysis of RC values vs. AR has been performed to test the existence of a numerical correlation. Because RC is a qualitative variable, it has been translated into numbers by assigning a number to each combination of the first part of the RC, following the criteria in Table 10. The scale used is linear, and the difference between consecutive RC values has been set to 1. The relationship between AR and the associated value of the RC is plotted in Figure 9. The results show a low correlation when all the NASs are considered. However, some interesting patterns in the results can be observed. The most variable results are linked to the value "5", corresponding to RC = R1. Such an RC means that one of the rules presents major problems that need attention, but the other aspects are quite good. It identifies a medium risk level. Because the procedure gives more weight to the worst situation, such results can be expected as a side effect: in the R1 sections, a high variance of accidents is possible because the critical issue identified may cause some accidents, but if many other aspects are good, a small number of accidents can also be expected. This confirms that the procedure can identify critical locations and locations that are not critical at all, but it is not very precise at the intermediate level. However, the same generally holds for accident-based performance measures because the intermediate level is much more sensitive to accident variations over the different years of the analyzed period. So, it can be stated that, in this higher variance around RC = R1, the two measures are to some extent concordant.
Finally, considering the possible variation of accidents over the years (a common issue with accident statistics), it has been decided to test a linear correlation using the average of the AR values of the NASs having the same RC. The correlation obtained is very good (r = 0.86; R^2 = 0.7391). The results are shown in Figure 10. This confirms the overall good prediction of the procedure. Nevertheless, additional analyses and evaluations (with a larger dataset) are suggested before defining a numerical relationship between the two variables.
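The two concordance figures discussed above (exact agreement on the diagonal, and agreement after grouping low and medium risk together) can be computed from a 3 × 3 contingency table as sketched below. The matrix in the example is invented purely to exercise the code and does not reproduce Table 9; the function names are hypothetical. Rows and columns are assumed to be ordered low, medium, high.

```python
def overall_concordance(table: list[list[int]]) -> float:
    """Share of NASs with the same risk level under both indices (diagonal)."""
    total = sum(sum(row) for row in table)
    diagonal = sum(table[i][i] for i in range(len(table)))
    return diagonal / total


def intervention_concordance(table: list[list[int]]) -> float:
    """Share of NASs on which both indices agree about needing intervention.

    Low and medium (indices 0 and 1) are grouped as "no intervention";
    high (index 2) is "intervention required".
    """
    total = sum(sum(row) for row in table)
    no_intervention = sum(table[i][j] for i in (0, 1) for j in (0, 1))
    intervention = table[2][2]
    return (no_intervention + intervention) / total


# Invented 3 x 3 table (rows: RC level, columns: AR level).
example = [[2, 1, 0],
           [1, 3, 1],
           [0, 1, 4]]
print(round(overall_concordance(example), 2))
print(round(intervention_concordance(example), 2))
```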

Repeatability
The repeatability of the procedure has been investigated by having another team apply the procedure to road SR2. The result, which is shown in Table 11, demonstrates that even if applied by different inspectors, the proposed procedure leads to very similar results. Minor differences are present in the length of the sections, which derive from differences in the definition of the CHLs and CHTs, and minor differences are present in the RC. This also creates small differences in the ranking. However, the most interesting finding is that the risk level is the same for each NAS, the number of NASs is the same, and the two critical sections are the same (the two sections classified as "high" or "very high" risk). These results are very encouraging, although additional teams should apply the procedure on different roads.


Consistency
The consistency of the procedure has been investigated, considering the possibility of different segmentation.Segmentation is a crucial task in road safety analysis, as highlighted by Cafiso et al. [47].Two different segmentations are proposed in addition to those considered during the test of the procedure (1 km segmentation): a fixed length of 2 km and a variable length based on traffic changes.This new application was made both on road SR2 and road B38 stretches.Figures 11 and 12 show the RC and the characteristics of the road SR2 and road B38 NASs obtained considering a segmentation of 1 km, 2 km, and a segmentation based on traffic characteristics (when the AADT changes, another segment is defined).Dark red indicates a very high-risk section, red indicates a high-risk section, yellow indicates a medium-risk section, and green indicates a low-risk section.In the traffic-based segmentation, HFESs containing intersections where a change in traffic is present have been included wholly within the same NAS; thus, for a very short segment close to the intersection, the traffic of the NAS assumes a different value.It must be noted that a segmentation based on traffic must be made only when traffic data are reliable.

Consistency
The consistency of the procedure has been investigated, considering the possibility of different segmentation.Segmentation is a crucial task in road safety analysis, as highlighted by Cafiso et al. [47].Two different segmentations are proposed in addition to those considered during the test of the procedure (1 km segmentation): a fixed length of 2 km and a variable length based on traffic changes.This new application was made both on road SR2 and road B38 stretches.Figures 11 and 12 show the RC and the characteristics of the road SR2 and road B38 NASs obtained considering a segmentation of 1 km, 2 km, and a segmentation based on traffic characteristics (when the AADT changes, another segment is defined).Dark red indicates a very high-risk section, red indicates a high-risk section, yellow indicates a medium-risk section, and green indicates a low-risk section.In the traffic-based segmentation, HFESs containing intersections where a change in traffic is present have been included wholly within the same NAS; thus, for a very short segment close to the intersection, the traffic of the NAS assumes a different value.It must be noted that a segmentation based on traffic must be made only when traffic data are reliable.Looking at the results, the following considerations can be made.

•
NAS length is related to the HFESs.Sometimes, it is not easy to define short sections (e.g., 1 km), and sometimes, it is not easy to define longer sections (e.g., 2 km).For this reason, the results should be consistent, even if NASs sometimes have a length that differs much from the reference one.

•
When a lack of traffic data is present or when traffic is constant for a long stretch, NAS defined, considering traffic, may become very long, missing much of its significance.This is the case of road SR2.In this case, even if traffic is chosen as a reference for NAS segmentation, it is suggested to consider a maximum section length (2 km is suggested).

•
It has been confirmed that longer NASs are likely classified as riskier than shorter NASs because they may include more "red results" (high risk) from the HFESs.However, the choice of considering the most critical results within the section to provide

Consistency
The consistency of the procedure has been investigated, considering the possibility of different segmentation.Segmentation is a crucial task in road safety analysis, as highlighted by Cafiso et al. [47].Two different segmentations are proposed in addition to those considered during the test of the procedure (1 km segmentation): a fixed length of 2 km and a variable length based on traffic changes.This new application was made both on road SR2 and road B38 stretches.Figures 11 and 12 show the RC and the characteristics of the road SR2 and road B38 NASs obtained considering a segmentation of 1 km, 2 km, and a segmentation based on traffic characteristics (when the AADT changes, another segment is defined).Dark red indicates a very high-risk section, red indicates a high-risk section, yellow indicates a medium-risk section, and green indicates a low-risk section.In the traffic-based segmentation, HFESs containing intersections where a change in traffic is present have been included wholly within the same NAS; thus, for a very short segment close to the intersection, the traffic of the NAS assumes a different value.It must be noted that a segmentation based on traffic must be made only when traffic data are reliable.Looking at the results, the following considerations can be made.

• NAS length is related to the HFESs. Sometimes it is not easy to define short sections (e.g., 1 km), and sometimes it is not easy to define longer sections (e.g., 2 km). For this reason, the results should be consistent even if NASs sometimes have a length that differs considerably from the reference one.

• When traffic data are lacking or when traffic is constant over a long stretch, NASs defined on the basis of traffic may become very long, losing much of their significance. This is the case for road SR2. In this case, even if traffic is chosen as the reference for NAS segmentation, a maximum section length should be considered (2 km is suggested).

• It has been confirmed that longer NASs are likely to be classified as riskier than shorter NASs because they may include more "red results" (high risk) from the HFESs. However, the choice of using the most critical results within the section to provide the RC for the section was made during the development of the procedure to ensure that the presence of high-risk HFESs is never hidden. This is very important; otherwise, some risky sections could be underestimated.
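
The traffic-based segmentation with a maximum section length can be illustrated with a minimal sketch. It assumes that the stretch is described only by its length and by the chainages at which the AADT changes; the function name `segment_by_traffic` and its parameters are hypothetical, and the sketch does not reproduce the rule that HFESs are kept wholly within one NAS.

```python
import math
from typing import List, Tuple


def segment_by_traffic(
    length_km: float,
    aadt_change_points_km: List[float],
    max_length_km: float = 2.0,  # maximum NAS length suggested in the text
) -> List[Tuple[float, float]]:
    """Split [0, length_km] at AADT change points, then cap each piece."""
    # Keep only change points strictly inside the stretch, sorted and unique.
    cuts = sorted({p for p in aadt_change_points_km if 0.0 < p < length_km})
    bounds = [0.0] + cuts + [length_km]
    sections: List[Tuple[float, float]] = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        # Number of equal pieces needed so that none exceeds the cap.
        n = max(1, math.ceil((end - start) / max_length_km - 1e-9))
        step = (end - start) / n
        for i in range(n):
            sections.append((start + i * step, start + (i + 1) * step))
    return sections


# Illustrative call: a 5 km stretch with one AADT change at km 3.
print(segment_by_traffic(5.0, [3.0]))
```

Splitting an over-long piece into equal parts is only one possible design choice; an agency could equally cut at fixed 2 km steps from the start of each traffic-homogeneous stretch.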

Discussions and Conclusions
The main purpose of this work was to develop and validate an innovative network-wide road safety assessment procedure based on human factors. Moreover, based on the requirements of the European Directive [4], the procedure should include visual inspections of the road, be proactive, and provide at least three levels of risk. All these requirements have been met. The procedure allows the identification of risky road sections of the network and can be applied to rural two-lane, two-way roads. This research demonstrates the importance and influence of human factors in road safety.
The procedure has been validated by comparing the results to accident data (i.e., accident rate). The results are good overall. The main strengths and achievements of this work can be summarized as follows:

• This research provides a validated instrument that analyzes the road using human factors. Such an instrument is innovative and much needed because most accidents occur because of human errors induced by the road.

• The procedure has a high capacity to identify dangerous locations. The comparison with the accident-based analysis shows a good statistical correspondence between the results. An average of 56% of NASs were assigned the same risk level by both analyses. Considering only two levels, sections that require intervention (high risk) and sections that do not require intervention (low and medium risk), the concordance rises to 81%. Overall, the statistical analysis also demonstrates that such concordance is significant.

• This procedure overcomes many of the segmentation issues that always burden road safety analysis. Those issues are very common in standard RSI analysis and accident-based analysis. This has been achieved by considering different segmentations for different types of analysis (step 1, step 2, and step 3 of the procedure). In steps 1 and 2, the segmentation is linked to the specific issues that need to be analyzed, while in step 3, the segmentation is chosen considering the most efficient way for road agencies to use the results on a network level.

• It is a proactive procedure because accident data are not required. Thus, it can be applied to those road stretches for which no accident data are available or when accident data are not reliable.

• It provides information about the specific risks of the road, aiding in the decision of possible interventions (in step 2, it provides a detailed analysis of the issues of the segment).

• It does not need much data, and the data needed can be easily found.

• It allows the definition of intervention priority. The calculated RC allows the identification of four levels of risk and the ordering of the NASs within those levels from the most critical to the least critical.

• It proves to be repeatable and easy to implement after short training courses. This has been proven by the application of the procedure by a different inspection team that was trained in human factors principles. The training course was a two-day course.
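
The two concordance figures quoted in the list above (same risk level, and same intervention decision) can be computed as in the following minimal sketch. The level names and the example lists are hypothetical and are not the data of this study; the helper names `concordance` and `needs_intervention` are also illustrative.

```python
from typing import Sequence


def concordance(a: Sequence, b: Sequence) -> float:
    """Share of sections that receive the same class in both analyses."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def needs_intervention(level: str) -> bool:
    """Two-level view: only high-risk sections require intervention."""
    return level == "high"


# Hypothetical classifications of four NASs (not taken from the paper).
procedure = ["high", "medium", "low", "medium"]
accident_based = ["high", "low", "low", "high"]

three_level = concordance(procedure, accident_based)
two_level = concordance(
    [needs_intervention(x) for x in procedure],
    [needs_intervention(x) for x in accident_based],
)
print(three_level, two_level)
```

Collapsing "low" and "medium" into a single class can only keep or increase the agreement, which is consistent with the two-level concordance being higher than the three-level one.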
However, together with many achievements, some limitations are also present. The identified limitations are listed below.

• Despite an overall good concordance between the results, some sections show differences in the evaluation. Thus, a detailed analysis of those sections must be carried out with detailed inspections and/or by monitoring the accidents of the following years.

• Identifying EXSEs is a crucial task. The application of the procedure by a different inspection team shows that when longer sections of the road are considered, the PPI level can be ambiguous. Different evaluations of PPI are reflected in the evaluation of PCLs, concerning both GEXs and VISs. However, if an EXSE is judged as "medium" PPI instead of "low", but the winding level is the same, the risk of missing information is moderate, and medium-level CHLs are likely missed. Conversely, if the PPI is judged as "low" instead of "medium", no information is missed, but more CHLs are identified, with some additional computing time while applying the HFET.

• Subjective judgments of PEX. Even if references and short checklists have been provided to evaluate PEXs, the judgments are still subjective; thus, different results in the evaluation may occur. However, the application of the procedure shows that those differences are few. Moreover, the choice of giving greater importance to the most critical result when evaluating HFESs and NASs reduces the effect of errors in PEX judgments in step 1. That is because very critical locations for PEX are likely to be identified by all inspectors. The main differences in the evaluations relate to locations around the "medium level" because they can be ambiguous, and the inspector cannot be sure whether they must be selected or not.

• Limited sample for the study. Despite the overall good results, additional tests should be made considering additional road stretches and additional inspector groups.

• The procedure must be thoroughly tested against inspectors' subjective judgments. Here too, this research provides a first important step in that direction, but to be sure about the reliability and repeatability of the procedure, more inspectors should apply it to the same road (and also to others), and the results of the judgments should be compared.

• The procedure is consistent across different NAS segmentations when considering the capacity to identify the most critical section of the road. On the other hand, when very long sections are considered, the probability of section risk overestimation increases, thus reducing the consistency of the overall judgment. For these reasons, a maximum length of 2 km is suggested. This measure should be validated in subsequent experiments.

• It must be remembered that this procedure focuses on the identification of road stretches that are prone to cause accidents. Thus, it identifies those points of the road where an accident is likely to occur. It does not consider the consequences of an accident; hence, it does not consider the severity of an accident. For this reason, to obtain a comprehensive analysis of all the safety aspects of the road, this procedure should be complemented by others that can identify how severe the possible outcome could be.

• The procedure does not account for the influence of traffic; thus, it cannot provide an index of risk accounting for the number of vehicles traveling along the stretch. However, this is only a partial limitation. The proposed procedure has been developed to analyze the safety level of a road without considering traffic exposure. The results of the procedure identify the risk that a single vehicle driving along the road incurs an accident because road-induced driver behavior does not match the road characteristics. This value of risk is fixed for a specific road stretch based on its characteristics.

• The resources necessary to carry out the procedure on a network scale. The application of the procedure requires time and trained inspectors. Fast analyses that consider only observed accidents are easier to implement.
A comparison has also been made between the achievements of Paliotto et al. [50] and those of the present work. In Section 1.3 of the Introduction, some limitations of the previous work were highlighted, and the objectives of this new research were listed.

INCS
Inconspicuous Segment. Part of a road section that is easy to drive and without any obvious design deficiencies or human factors deficiencies. It does not have to be evaluated with the Human Factors Evaluation Tool.

NAS
Network Assessment Section. This is the section for which the result of the procedure is provided. The whole analyzed network will be divided into many NASs. NASs will be the elements to which the road agency will refer to decide where to intervene. NASs may include many HFESs.

NSA
Network safety assessment.

PCL
Potentially critical location. Any area where drivers must adapt their driving program by changing their speed, braking, steering, or changing lanes. Normally they are junctions, intersections, stops for public transport, exits, driveways, curves, carriageway-width reductions, or pedestrian/cyclist crossings.

PEX
Punctual expectation. Expectations the driver has about the road, derived from the surrounding location: the punctual road image (and the Gestalt) creates specific expectations about the specific location.

PPI
Perception of possible interaction. This is a quantification of how the road layout shows possibilities for interactions with other road users (e.g., the presence of intersections, accesses, and pedestrian crossings).

RC
Risk code, which summarizes the outcomes of the application of the procedure for each NAS.

RSIs
Road safety inspections.

V_E
Expected speed. It has been introduced to provide a range of possible speeds for an EXSE. Based on the EXSE's characteristics, it is expected that drivers traveling the EXSE will hold a speed within the range of the V_E.

VIS
Visibility. When the VIS term is used, it means the available sight distance between the driver and the PCL, which allows the driver to see, perceive, and understand the PCL.

Figure 1. Conceptual scheme of the procedure.

Figure 2. Example of PCL identification: PCLs of the road SR2 stretch. The numbers in brackets are the number of each PCL of that type along the stretch.

Figure 3. Example of EXSE identification on the road SR2 stretch.

D_2 = threshold distance between the medium and low VIS level [m]; t_2 = 6 for low alertness, 5 for medium alertness, and 4 for high alertness [s].

D_1 = threshold distance between the medium and high VIS level [m]; t_1 = 4 [s].
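
The equation to which these symbols belong is not reproduced in this part of the text. The following minimal sketch therefore assumes that each threshold distance is the distance traveled at the expected speed during the corresponding time, i.e., D = (V_E / 3.6) × t with V_E in km/h. This relation, the function name `vis_thresholds`, and the example speed are assumptions made for illustration only.

```python
# Times taken from the definitions above [s].
T1_S = 4.0
T2_S = {"low": 6.0, "medium": 5.0, "high": 4.0}  # keyed by alertness level


def vis_thresholds(v_e_kmh: float, alertness: str) -> tuple:
    """Return (D_1, D_2) in metres, ASSUMING distance = speed x time."""
    v_ms = v_e_kmh / 3.6  # convert km/h to m/s
    return v_ms * T1_S, v_ms * T2_S[alertness]


# Illustrative call with a hypothetical expected speed of 90 km/h.
d1, d2 = vis_thresholds(90.0, "low")
print(d1, d2)
```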

Figure 4. Scheme of the CHT and HFES definition process.

Figure 5 shows an example of two CHLs with their CHT from road SR2 in Tuscany (red line). One CHL is represented by a pedestrian crossing (yellow); the other CHL is represented by a curve (green). The two CHTs overlap; thus, they are merged to create a single CHT, the HFES (purple). The HFES in the example of Figure 5 is about 450 m long.
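
The merging of overlapping CHTs into a single HFES can be sketched as a standard interval merge. The sketch assumes that each CHT is given as a (start, end) pair of chainages in metres along one direction, and that CHTs that merely touch are also merged; both the representation and the function name `merge_chts` are hypothetical, and the chainages in the example are not those of road SR2.

```python
from typing import List, Tuple


def merge_chts(chts: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Merge overlapping (start, end) CHT intervals into HFESs."""
    merged: List[Tuple[float, float]] = []
    for start, end in sorted(chts):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Illustrative chainages: the first two CHTs overlap, the third is isolated.
print(merge_chts([(100, 400), (300, 550), (900, 1000)]))
```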

Figure 6. Example of NAS composition: light blue and red lines identify HFESs in the ascending direction and descending direction, respectively (considering the km posts). The green line is the road track.

After NASs have been identified, all the road stretches that belong to a NAS but do not belong to HFESs are classified as Inconspicuous Segments (INCSs). INCSs are considered segments with an HFS of 100%.

Figure 7. Example and format of NAS final Risk Code (RC).

• A = letter representing the worst level of HFS present, including the HFS for each rule and the Total HFS (R = red, at least one score < 0.40 within the rules; Y = yellow, no red scores, and at least one score < 0.60 within the rules; G = green, all other results), for both directions;
• B = the number of results of level "A" (see above), considering the worst results of the HFSs for each rule and the Total HFS (min = 0, max = 4);
• C = the worst total result within the NAS;
• D = weighted average of the Total HFS and length of each HFES and INCS;
• E = standard deviation of the Total HFS of each HFES and INCS, considering the segment length.
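
The composition of the five RC components can be sketched as follows. The sketch assumes three rule scores per segment (inferred from the maximum of 4 for component B, i.e., three rules plus the Total HFS), scores expressed on a 0 to 1 scale, a length-weighted population standard deviation for component E, and segments of both directions pooled in one list. The `Segment` class, the function names, and the example values are hypothetical; how the Total HFS is obtained from the rule scores is not covered here.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    length_m: float
    rule_scores: Tuple[float, float, float]  # HFS of each rule (assumed three)
    total: float                             # Total HFS of the segment


def level(score: float) -> str:
    """Colour level from the thresholds given in the list above."""
    if score < 0.40:
        return "R"
    if score < 0.60:
        return "Y"
    return "G"


def risk_code(segments: List[Segment]) -> Tuple[str, int, float, float, float]:
    # Worst (minimum) score for each rule and for the Total HFS.
    worst = [min(s.rule_scores[i] for s in segments) for i in range(3)]
    worst.append(min(s.total for s in segments))
    levels = [level(w) for w in worst]
    a = min(levels, key="RYG".index)        # R is worse than Y, Y than G
    b = levels.count(a)                     # results at the worst level
    c = worst[3]                            # worst Total HFS in the NAS
    weight = sum(s.length_m for s in segments)
    d = sum(s.length_m * s.total for s in segments) / weight
    e = math.sqrt(sum(s.length_m * (s.total - d) ** 2 for s in segments) / weight)
    return a, b, c, d, e


# Hypothetical NAS: one 400 m HFES and one 600 m INCS (HFS of 100%).
nas = [
    Segment(400.0, (0.35, 0.70, 0.55), 0.53),
    Segment(600.0, (1.0, 1.0, 1.0), 1.0),
]
print(risk_code(nas))
```

Because components A, B, and C use the minimum over the segments, a single critical HFES determines them regardless of how long the NAS is, whereas D and E are diluted by the length of the INCSs.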
For the accident rate, the following definitions apply: AR = accident rate [accidents/(km × Mvehicles)]; n = number of accidents in the analysis period [accidents]; L = segment length [km]; AADT = average annual daily traffic value in the analysis period [vehicles/day].
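
The accident rate equation itself is not reproduced in this part of the text. As a minimal sketch consistent with the units listed above, the common form AR = n × 10^6 / (365 × T × L × AADT) is assumed here, where T is the length of the analysis period in years. The symbol T, the function name `accident_rate`, and the example values are assumptions and are not taken from the paper.

```python
def accident_rate(n: int, length_km: float, aadt: float, years: float) -> float:
    """Accidents per (km x million vehicles), ASSUMING the common form
    AR = n * 1e6 / (365 * years * length_km * aadt)."""
    if length_km <= 0 or aadt <= 0 or years <= 0:
        raise ValueError("length, AADT, and period must be positive")
    exposure_mveh_km = 365.0 * years * length_km * aadt / 1e6
    return n / exposure_mveh_km


# Hypothetical segment: 5 accidents in 5 years, 2 km, AADT of 5000 vehicles/day.
print(accident_rate(5, 2.0, 5000.0, 5.0))
```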

Figure 8. Satellite view of the different roads.

Figure 9. Distribution and linear correlation between values assigned to the RC and accident rate values.

Figure 10. Distribution and linear correlation between values assigned to the RC and accident rate values averaged within each RC.

Figure 11. Graphical representation of the risk level obtained for each NAS for each different NAS segmentation, road SR2.

Figure 12. Graphical representation of the risk level obtained for each NAS for each different NAS segmentation, road B38.

Table 1. Level of expected speed and alertness based on winding and PPI levels.

Table 2. Classification of PCLs based on the level of winding and PPI.

Table 3. Decision sight distance to account for VIS for each V_E level: upper thresholds.

Table 4. Aspects to consider while assessing the PEX level.

Table 5. Combinations of VIS, GEX, and PEX levels for the identification of CHLs (X).

Table 6. D_CHT for each V_E level.

Table 7. Ranking criteria within the same risk level group.

Table 8. Summary of the results from the application of the proposed procedure and accident rate calculation, with all NASs analyzed.

Table 9. Contingency table for all NASs. The green cells highlight those sections that do not require any interventions. The red cell highlights those sections that require interventions.

Table 10. Relationship between the RC index and the value considered for the regression analysis.

Table 11. Comparison of the results of the procedure carried out by different teams on the same road (SR2).