2.1. Safety Risk Assessment Index System
Given the complex traffic composition, poor road alignment conditions, and hard-to-control personnel management factors of ordinary arterial highways, the selection of risk assessment indicators should mainly consider the road characteristics, accident forms, environmental characteristics, and traffic safety facilities of arterial highways.
Combined with the study of traffic accidents in References [34,35] and based on road traffic safety constraints, a multi-dimensional, multi-level, and multi-factor traffic safety risk index system of ordinary arterial highways is constructed. The system uses risk possibility as the input dimension, with three sub-dimensions (the traffic operation status factor, the road environment factor, and the traffic facility factor), and accident severity as the output dimension, with two sub-dimensions (accident risk and accident loss).
Traffic operation status factors: The traffic operation status of ordinary arterial highways is relatively complex. It is characterized by high speed limits, many types of vehicles, and road conditions that lead to blocked roads and frequent accidents, both of which interfere with road operations.
Road environment factors: The design conditions of arterial highways are limited, and unique environments are an important factor affecting the traffic safety of arterial highways. If the basic highway performance, geometric conditions, and roadside environmental conditions produce adverse combinations, they will bring a great threat to road traffic safety.
Traffic facilities factors: In a complex road environment, where road conditions are unfavorable and road construction funding is constrained, traffic safety facilities that are poorly set up on deficient road sections cannot provide the necessary protection.
Accident risk: By analyzing road traffic accident data, we can identify the patterns of accident occurrence, predict the degree of traffic safety risk, and further isolate the sources of accident risk to improve driving safety.
Accident losses: The loss degree after the accident is closely related to road design, the management system, the rescue system, and other factors. It is necessary to adopt reasonable statistical indicators to calculate loss from traffic accidents.
2.3. Risk Assessment Model
After comprehensively analyzing the advantages and disadvantages of each method and optimizing them, this paper proposes a risk assessment model based on RCLE. First, through the hybrid model, the data processing and benchmarking analysis of the risk objects are carried out. The calculated rank-sum ratios of the possibility and severity dimensions are used as the basis for classifying risk input and accident output, and the risk judgment matrix is constructed. The road risk calculation results are then placed in the matrix to determine the risk level of each road. Secondly, considering the randomness of parameters, an LSSVM surrogate model is established. Finally, through the model analysis and evaluation results, the safety benchmark objects and risk sensitivity factors are identified, and the corresponding traffic safety improvement measures and their implementation order are put forward, so that greater safety gains are obtained with less investment and the overall level of regional road risk is reduced. The calculation process and steps of the RCLE model are as follows:
1. Initial risk matrix
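According to the road traffic safety risk index system, combined with the actual survey data, the initial risk matrix, $X$, is established as follows:
$$X = (x_{ij})_{n \times m} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix}$$
where $n$ represents the number of risk assessment objects; $m$ represents the number of risk indicators (it contains $p$ possibility dimension indicators and $q$ severity dimension indicators, so $m = p + q$); and $x_{ij}$ represents the value of the $j$th risk indicator of risk object $i$.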
2. Risk matrix standardization
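In the traffic safety risk index system, there are positive indicators (such as pavement skid resistance, smoothness, etc.) and negative indicators (such as the poor-alignment ratio, overloaded vehicle ratio, etc.). In order to keep the same change trend and eliminate the dimensional influence, for traffic safety positive indexes, the conversion method is as follows:
$$y_{ij} = \frac{x_{ij} - \min_{i} x_{ij}}{\max_{i} x_{ij} - \min_{i} x_{ij}}$$
For the negative effect indexes, the conversion method is
$$y_{ij} = \frac{\max_{i} x_{ij} - x_{ij}}{\max_{i} x_{ij} - \min_{i} x_{ij}}$$
This yields the standardized matrix, $Y = (y_{ij})_{n \times m}$, where $y_{ij}$ represents the standardized value of the $j$th risk indicator of risk object $i$ (with $x_{ij}$ the corresponding initial value), in which $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$.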
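The standardization step can be illustrated with a minimal sketch. It assumes min-max scaling for both indicator types; the function name `standardize`, the `positive` flag list, and the three-section data set are hypothetical and serve only as an illustration.

```python
def standardize(matrix, positive):
    """Min-max standardize each indicator column.

    positive[j] is True for a positive indicator and False for a negative one.
    """
    n_cols = len(matrix[0])
    result = [[0.0] * n_cols for _ in matrix]
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        lo, hi = min(column), max(column)
        span = hi - lo
        for i, value in enumerate(column):
            if span == 0:
                result[i][j] = 0.0  # assumption: a constant column is mapped to 0
            elif positive[j]:
                result[i][j] = (value - lo) / span
            else:
                result[i][j] = (hi - value) / span
    return result


# Hypothetical data: 3 road sections, columns = skid resistance (positive),
# overloaded vehicle ratio (negative).
X = [[60.0, 0.10], [45.0, 0.30], [50.0, 0.20]]
Y = standardize(X, positive=[True, False])
```

After this step, every column lies in [0, 1] and larger values have the same meaning for both indicator types.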
3. Index weight by CRITIC
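We can calculate the standard deviation of each index and the linear correlation coefficient between the indexes, obtain the amount of information contained in the evaluation index, and determine the weight coefficient of the index:
$$\sigma_j = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(y_{ij} - \bar{y}_j\right)^2}, \qquad r_{jk} = \frac{\sum_{i=1}^{n}\left(y_{ij} - \bar{y}_j\right)\left(y_{ik} - \bar{y}_k\right)}{\sqrt{\sum_{i=1}^{n}\left(y_{ij} - \bar{y}_j\right)^2 \sum_{i=1}^{n}\left(y_{ik} - \bar{y}_k\right)^2}}$$
$$C_j = \sigma_j \sum_{k=1}^{m}\left(1 - r_{jk}\right), \qquad w_j = \frac{C_j}{\sum_{j=1}^{m} C_j}$$
where $\sigma_j$ represents the standard deviation of indicator $j$, $\bar{y}_j$ represents the expected value of indicator $j$, $r_{jk}$ represents the linear correlation coefficient of indicator $j$ and indicator $k$, $C_j$ represents the information on indicator $j$, and $w_j$ represents the weight of indicator $j$.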
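A minimal sketch of the CRITIC weighting is given below, under the assumption that the information content of an indicator is its standard deviation multiplied by the sum of one minus its correlation with every indicator, and that the sample standard deviation (divisor n - 1) is used. The function name `critic_weights` and the data are hypothetical.

```python
import math


def critic_weights(Y):
    """Return CRITIC weights for the columns of a standardized matrix Y."""
    n, m = len(Y), len(Y[0])
    cols = [[row[j] for row in Y] for j in range(m)]
    means = [sum(c) / n for c in cols]
    # Sample standard deviation; a constant column (std = 0) is not handled here.
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (n - 1))
            for c, mu in zip(cols, means)]

    def corr(a, b):
        cov = sum((x - means[a]) * (y - means[b])
                  for x, y in zip(cols[a], cols[b])) / (n - 1)
        return cov / (stds[a] * stds[b])

    # Information content: contrast intensity times conflict with other indicators.
    info = [stds[j] * sum(1.0 - corr(j, k) for k in range(m)) for j in range(m)]
    total = sum(info)
    return [c / total for c in info]


# Hypothetical standardized data: 3 risk objects, 3 indicators.
Y = [[0.0, 1.0, 0.2], [0.5, 0.0, 1.0], [1.0, 0.5, 0.0]]
weights = critic_weights(Y)
```

The weights sum to one by construction, and an indicator that varies strongly and is weakly correlated with the others receives a larger weight.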
4. Write ranks and calculate weighted rank-sum ratio
The nonintegral RSR method is used to rank the risk matrix after standardization. For each indicator $j$, the risk evaluation objects are sorted according to the size of the index value; the maximum observation value is given $n$ as a rank, the minimum observed value is given 1 as a rank, and the remaining index values are ranked by linear interpolation:
$$R_{ij} = 1 + (n - 1)\,\frac{y_{ij} - y_j^{\min}}{y_j^{\max} - y_j^{\min}}$$
where $R_{ij}$ represents the rank of the $j$th indicator of the $i$th risk object, $y_j^{\min}$ represents the minimum of the $j$th indicator value, $y_j^{\max}$ represents the maximum of the $j$th indicator value, and $n$ is the number of risk objects.
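The ranking rule described here (rank 1 for the minimum, rank n for the maximum, linear interpolation in between) can be sketched as follows. The function name `nonintegral_ranks` and the example values are hypothetical.

```python
def nonintegral_ranks(Y):
    """Assign nonintegral ranks column by column.

    The smallest value in a column gets rank 1, the largest gets rank n,
    and the rest are placed by linear interpolation.
    """
    n, m = len(Y), len(Y[0])
    R = [[0.0] * m for _ in range(n)]
    for j in range(m):
        column = [row[j] for row in Y]
        lo, hi = min(column), max(column)
        for i, value in enumerate(column):
            if hi == lo:
                R[i][j] = 1.0  # assumption: a constant column gets rank 1 everywhere
            else:
                R[i][j] = 1.0 + (n - 1) * (value - lo) / (hi - lo)
    return R


# Hypothetical standardized values for 3 risk objects and 2 indicators.
Y = [[0.0, 1.0], [0.25, 0.0], [1.0, 0.5]]
R = nonintegral_ranks(Y)
```

Unlike integer ranks, these ranks preserve the relative spacing of the original values.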
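After calculating the risk rank matrix, $R = (R_{ij})_{n \times m}$, the input dimension weighted rank-sum ratio, $WRSR_i^{P}$; the output dimension weighted rank-sum ratio, $WRSR_i^{S}$; and the risk evaluation individual weighted rank-sum ratio, $WRSR_i$, are calculated as follows:
$$WRSR_i^{P} = \frac{1}{n\alpha}\sum_{j=1}^{p} w_j R_{ij}, \qquad WRSR_i^{S} = \frac{1}{n\beta}\sum_{j=p+1}^{m} w_j R_{ij}, \qquad WRSR_i = \frac{1}{n}\sum_{j=1}^{m} w_j R_{ij}$$
$$\alpha = \sum_{j=1}^{p} w_j, \qquad \beta = \sum_{j=p+1}^{m} w_j$$
where $WRSR_i^{P}$ represents the possibility dimension weighted rank-sum ratio of the $i$th risk object, $WRSR_i^{S}$ represents the severity dimension weighted rank-sum ratio of the $i$th risk object, $WRSR_i$ represents the individual weighted rank-sum ratio of the $i$th risk object, and $\alpha$, $\beta$ represent the intermediate variables. Here $n$ is the number of risk objects, $w_j$ is the weight of indicator $j$, and the first $p$ of the $m$ indicators belong to the possibility dimension.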
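The weighted rank-sum ratios can be sketched as below. This is only an illustration under a stated assumption: the two intermediate variables are taken to be the weight sums of the possibility and severity indicators, used to renormalize the weights within each dimension. The function name `weighted_rsr`, the argument `n_possibility`, and the data are hypothetical.

```python
def weighted_rsr(R, weights, n_possibility):
    """Return (possibility, severity, overall) weighted RSR for each risk object.

    Assumption: the first n_possibility columns are possibility indicators,
    the remaining columns are severity indicators.
    """
    n = len(R)
    p = n_possibility
    alpha = sum(weights[:p])  # weight mass of the possibility dimension
    beta = sum(weights[p:])   # weight mass of the severity dimension
    results = []
    for row in R:
        poss = sum(w * r for w, r in zip(weights[:p], row[:p])) / (n * alpha)
        sev = sum(w * r for w, r in zip(weights[p:], row[p:])) / (n * beta)
        overall = sum(w * r for w, r in zip(weights, row)) / n
        results.append((poss, sev, overall))
    return results


# Hypothetical ranks for 3 risk objects: 2 possibility indicators, 1 severity indicator.
R = [[3.0, 1.0, 2.0], [1.0, 3.0, 1.0], [2.0, 2.0, 3.0]]
weights = [0.5, 0.25, 0.25]
wrsr = weighted_rsr(R, weights, n_possibility=2)
```

Under this assumption the overall ratio equals the weight-mass-weighted combination of the two dimension ratios, and every ratio lies between 1/n and 1.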
5. Determine the rank and ratio distribution
The values of $WRSR^{P}$, $WRSR^{S}$, and $WRSR$ are each arranged from smallest to largest, and identical values are taken as one group. The frequency, $f$, and cumulative frequency, $\sum f\!\downarrow$, of each group are listed, and the rank range and average rank, $\bar{R}$, of each group are determined. After calculating the cumulative frequency, $\bar{R}/n \times 100\%$ (the cumulative frequency of the last group is set to $\left(1 - \frac{1}{4n}\right) \times 100\%$, where $n$ is the number of risk objects), the probability unit value, $Probit$, corresponding to the percentile is listed according to the table of percentiles and their corresponding probability units.
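The distribution step can be sketched as follows. Instead of a lookup table, the sketch computes the probability unit with the classical convention of 5 plus the standard normal quantile, which is an assumption about the table the authors used. The function name `rsr_distribution` and the values are hypothetical.

```python
from statistics import NormalDist


def rsr_distribution(values):
    """Build the RSR distribution table.

    Returns one tuple per group of tied values:
    (value, frequency, average rank, cumulative frequency, probit).
    """
    n = len(values)
    ordered = sorted(values)
    table = []
    i = 0
    while i < n:
        j = i
        # Exact float equality defines a tie in this sketch.
        while j + 1 < n and ordered[j + 1] == ordered[i]:
            j += 1
        frequency = j - i + 1
        average_rank = ((i + 1) + (j + 1)) / 2
        table.append([ordered[i], frequency, average_rank, average_rank / n])
        i = j + 1
    # The last group would otherwise reach 100 %, where the probit is infinite.
    table[-1][3] = 1 - 1 / (4 * n)
    std_normal = NormalDist()
    return [(v, f, r, c, 5 + std_normal.inv_cdf(c)) for v, f, r, c in table]


# Hypothetical weighted rank-sum ratios of 4 risk objects (two of them tied).
table = rsr_distribution([0.55, 0.40, 0.80, 0.55])
```

Each row of `table` gives the quantities listed in this step for one group of equal values.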
6. Calculate regression equation
The values of $WRSR^{P}$, $WRSR^{S}$, and $WRSR$ are used as dependent variables (represented by $RSR$), and the corresponding probability unit value, $Probit$, is used as an independent variable to estimate the regression equation. Error analysis is carried out to ensure that the regression equation is statistically significant.
$$RSR = a + b \times Probit$$
where $a$, $b$ represent the estimates of the parameters.
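A minimal sketch of the regression step is shown below, assuming ordinary least squares with a single regressor. The coefficient of determination is reported only as a rough goodness-of-fit check; it does not replace the significance testing mentioned in the text. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1 - ss_res / ss_tot


# Hypothetical probability units and rank-sum ratios for 5 groups.
probit = [4.0, 4.5, 5.0, 5.5, 6.0]
rsr = [0.30, 0.42, 0.50, 0.58, 0.70]
a, b, r_squared = fit_line(probit, rsr)
```

A slope close to zero or a low coefficient of determination would suggest that the rank-sum ratios do not follow the assumed normal distribution well.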
7. Construct a risk judgment matrix
According to the actual situation, the optimal number of groups of $WRSR^{P}$, $WRSR^{S}$, $WRSR$ is selected. According to the number of groups, the corresponding percentiles and their probability unit values are listed, and the interval is calculated according to Formula (11):
$$\widehat{RSR} = a + b \times Probit_k \qquad (11)$$
where $\widehat{RSR}$ represents the rank-sum ratio calculated by probability unit, $Probit_k$ represents the probability unit value corresponding to the number of groupings, and $a$, $b$ represent the estimates of the parameters.
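The interval calculation can be sketched as follows. The sketch assumes three grades with the commonly tabulated probability unit cut points 4 and 6; the regression coefficients, the function names `rsr_thresholds` and `grade`, and the sample values are hypothetical.

```python
from bisect import bisect_right


def rsr_thresholds(a, b, probit_cutoffs):
    """Convert probit cut points into rank-sum ratio thresholds via a + b * Probit."""
    return [a + b * cutoff for cutoff in probit_cutoffs]


def grade(value, thresholds):
    """Return the 1-based grade of a rank-sum ratio given ascending thresholds."""
    return bisect_right(thresholds, value) + 1


# Hypothetical regression coefficients; cut points 4 and 6 give three grades.
thresholds = rsr_thresholds(a=-0.5, b=0.2, probit_cutoffs=[4.0, 6.0])
grades = [grade(v, thresholds) for v in (0.20, 0.50, 0.90)]
```

With more grades, only the list of probit cut points changes; the threshold formula stays the same.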
According to the reasonable grouping method in the RSR method [36], the risk judgment matrix was constructed by using the grading values of $WRSR^{P}$ and $WRSR^{S}$ as the grading standards of risk possibility and accident severity, respectively. At the same time, a homogeneity-of-variance test and an analysis of variance were carried out to ensure that the resulting grades were statistically significant and that there were significant differences between groups.
8. Road risk judgment
After the judgment model is constructed, each risk evaluation object is classified into the matrix according to its $WRSR^{P}$ and $WRSR^{S}$ scores, and the degree of road safety risk is judged to determine the priority for implementing safety measures.
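The classification step can be illustrated with a small lookup sketch. The 3 x 3 matrix, the level names, and the road sections are all hypothetical; the actual matrix depends on the grading chosen for the possibility and severity dimensions.

```python
# Hypothetical judgment matrix: rows = possibility grade (1..3),
# columns = severity grade (1..3).
RISK_MATRIX = (
    ("low", "low", "medium"),
    ("low", "medium", "high"),
    ("medium", "high", "high"),
)
PRIORITY = {"high": 0, "medium": 1, "low": 2}


def judge(possibility_grade, severity_grade):
    """Look up the risk level for a pair of 1-based grades."""
    return RISK_MATRIX[possibility_grade - 1][severity_grade - 1]


# Hypothetical road sections with (possibility grade, severity grade).
sections = {"A": (3, 2), "B": (1, 1), "C": (2, 2)}
levels = {name: judge(p, s) for name, (p, s) in sections.items()}
# Sections with higher risk are treated first.
treatment_order = sorted(levels, key=lambda name: PRIORITY[levels[name]])
```

Sorting by risk level gives a simple implementation order for the safety measures.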
9. Surrogate model based on LSSVM
The surrogate model can be understood as a mathematical model that predicts unknown responses by performing regression or interpolation on discrete data points through approximation techniques, with fitting accuracy as a constraint. The least squares support vector machine (LSSVM) transforms the support vector machine problem into a set of linear equations, which improves the solution speed and reduces memory usage. At the same time, the sum of squared errors of the training samples is used as the empirical loss, which improves the convergence accuracy of the model [37]. LSSVM is one of the most commonly used surrogate models.
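From the above, the RSR method yields the regression equations of $WRSR^{P}$, $WRSR^{S}$, $WRSR$, and the LSSVM model can be obtained as follows. Let each sample be an n-dimensional vector, and let the $N$ samples be $\{(x_i, y_i)\}_{i=1}^{N}$, with $x_i \in \mathbb{R}^{n}$ and $y_i \in \mathbb{R}$. The data can be mapped from the original space, $\mathbb{R}^{n}$, to the high-dimensional space, $\varphi(x)$, through the kernel function, which gives the model $f(x) = \omega^{T}\varphi(x) + b$. According to structural risk minimization,
$$\min_{\omega, b, e} J(\omega, e) = \frac{1}{2}\omega^{T}\omega + \frac{\gamma}{2}\sum_{i=1}^{N} e_i^{2}, \qquad \text{s.t. } y_i = \omega^{T}\varphi(x_i) + b + e_i, \quad i = 1, 2, \ldots, N \qquad (12)$$
where $\gamma$ represents the regularization parameter, $e_i$ represents the relaxation factor, and $\omega$, $b$ represent the coefficients.
Introducing the kernel function, $K(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j)$, and using the Lagrange method to solve Equation (12), we have
$$L(\omega, b, e, \alpha) = J(\omega, e) - \sum_{i=1}^{N} \alpha_i \left[\omega^{T}\varphi(x_i) + b + e_i - y_i\right] \qquad (13)$$
where $\alpha_i$ represents the Lagrange multiplier.
Using the least square method to solve Equation (13), we can obtain the following:
$$f(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b$$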
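The validity of the LSSVM fitting results is tested as follows:
$$\varepsilon_i = \frac{\left|y_i - \hat{y}_i\right|}{\left|y_i\right|}$$
where $y_i$ represents the true value of the $i$th sample, $\hat{y}_i$ represents the test value of the $i$th sample, $\varepsilon_i$ represents the relative error, and $A$ represents availability.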
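The LSSVM fit and the relative-error check can be sketched in a few lines. The sketch assumes a Gaussian (RBF) kernel and solves the standard LSSVM linear system with a small Gaussian elimination routine; the kernel choice, the parameter values, the function names, and the training data are hypothetical and are not taken from the paper.

```python
import math


def rbf(u, v, sigma):
    """Gaussian kernel between two points."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2 * sigma ** 2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    N = len(X)
    A = [[0.0] + [1.0] * N]
    for i in range(N):
        A.append([1.0] + [rbf(X[i], X[j], sigma) + (1.0 / gamma if i == j else 0.0)
                          for j in range(N)])
    solution = solve(A, [0.0] + list(y))
    return solution[0], solution[1:]


def lssvm_predict(x, X, b, alpha, sigma):
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X))


# Hypothetical training data: one input factor and a linear response.
X_train = [[0.0], [0.25], [0.5], [0.75], [1.0]]
y_train = [0.2 + 0.6 * x[0] for x in X_train]
b, alpha = lssvm_fit(X_train, y_train, gamma=1000.0, sigma=0.5)
predicted = [lssvm_predict(x, X_train, b, alpha, sigma=0.5) for x in X_train]
relative_errors = [abs(t - p) / abs(t) for t, p in zip(y_train, predicted)]
```

In practice, the relative errors would be computed on held-out samples rather than on the training points, and the regularization and kernel parameters would be tuned.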
10. EFAST sensitivity analysis
Using the surrogate model, a sensitivity analysis of the factors influencing risk possibility is performed; the high-risk sensitivity indicators are identified and ranked, and the traffic safety improvement strategies are then put forward. This paper uses the extended Fourier amplitude sensitivity test (EFAST), which is a global sensitivity analysis method based on variance [26]. It builds on the Fourier amplitude sensitivity test and incorporates the idea of Sobol variance decomposition, so that the first-order and higher-order sensitivity indexes can be calculated. The number of samples required [27] is related to the number of influencing factors considered; the computational cost is relatively small, and the method has good robustness.
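The variance of the input parameter, $x_i$, is denoted by $V_i$; the variance of the interaction between the input parameters is denoted by $V_{ij}$, $V_{ijm}$, and $V_{12\cdots k}$; and the total output variance is denoted by $V$, where $k$ is the number of input parameters. The first-order sensitivity index, $S_i$ (main effect), of $x_i$ is
$$S_i = \frac{V_i}{V}$$
Higher-order sensitivity indices caused by the interaction of $x_i$ and other input parameters are as follows:
$$S_{ij} = \frac{V_{ij}}{V}, \qquad S_{ijm} = \frac{V_{ijm}}{V}, \qquad \ldots, \qquad S_{12\cdots k} = \frac{V_{12\cdots k}}{V}$$
The total effect, $S_{Ti}$, of $x_i$ is the sum of its own contribution and the contributions of all its interactions to the total variance, which can be expressed as
$$S_{Ti} = S_i + \sum_{j \neq i} S_{ij} + \cdots + S_{12\cdots k}$$
The high-sensitivity risk factors are determined by sorting the total-effect sensitivity values.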
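The first-order and total-effect indices can be illustrated numerically. The sketch below does not implement the EFAST frequency-based sampling; it swaps in plain Monte Carlo pick-freeze estimators (a Saltelli-type first-order estimator and a Jansen-type total-effect estimator), which target the same variance-based indices. The response function `model`, the sample size, and all names are hypothetical stand-ins for the surrogate model.

```python
import random


def model(x):
    """Hypothetical surrogate response with one interaction term (x[0] * x[2])."""
    return 4.0 * x[0] + 2.0 * x[1] + 2.0 * x[0] * x[2]


def sensitivity_indices(f, k, n_samples, seed=0):
    """Estimate first-order and total-effect indices for k inputs uniform on [0, 1]."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(k)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(k)] for _ in range(n_samples)]
    fA = [f(row) for row in A]
    fB = [f(row) for row in B]
    both = fA + fB
    mean = sum(both) / len(both)
    variance = sum((v - mean) ** 2 for v in both) / (len(both) - 1)
    first, total = [], []
    for i in range(k):
        # Rows of A with only column i taken from B.
        fAB = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Centering f(B) by the mean reduces estimator noise without adding bias.
        v_i = sum((yb - mean) * (yab - ya)
                  for yb, yab, ya in zip(fB, fAB, fA)) / n_samples
        v_ti = sum((ya - yab) ** 2 for ya, yab in zip(fA, fAB)) / (2 * n_samples)
        first.append(v_i / variance)
        total.append(v_ti / variance)
    return first, total


S, ST = sensitivity_indices(model, k=3, n_samples=20000)
# Rank the input factors by total effect, highest first.
ranking = sorted(range(3), key=lambda i: ST[i], reverse=True)
```

For an input that interacts with others, the total-effect index exceeds the first-order index, which is why the ranking is based on the total effect.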