Modeling of Low Visibility-Related Rural Single-Vehicle Crashes Considering Unobserved Heterogeneity and Spatial Correlation

: Accident analysis and prevention are helpful to ensure the sustainable development of transportation. The aim of this research was to investigate the factors associated with the severity of low-visibility-related rural single-vehicle crashes. Firstly, a latent class clustering model was implemented to partition the whole-dataset into a relatively homogeneous sub-dataset. Then, a spatial random parameters logit model was established for each dataset to capture unobserved heterogeneity and spatial correlation. Analysis was conducted based on the crash data (2014–2019) from 110 two-lane road segments. The results show that the proposed method is a superior crash severity modeling approach to accommodate the unobserved heterogeneity and spatial correlation. Three variables—seatbelt not used, motorcycle, and collision with ﬁxed object—have a stable positive correlation with crash severity. Motorcycle leads to a 12.8%, 23.8%, and 12.6% increase in the risk of serious crashes in the whole-dataset, cluster 3, and cluster 4, respectively. In the whole-dataset, cluster 2, and cluster 3, the risk of serious crashes caused by seatbelt not used increased by 5.5%, 0.1%, and 30.6%, respectively, and caused by collision with ﬁxed object increased by 33.2%, 1.2%, and 13.2%, respectively. The results can provide valuable information for engineers and policy makers to develop targeted measures. model and the latent class logit model. The regression results indicate that both of these statistical models can effectively identify risk factors, but the latent class logit model


Introduction
The impact of traffic crashes on sustainable development cannot be ignored because accidents will cause significant property damage and personal injury. The analysis of traffic crashes can provide targeted measures to improve traffic safety performance and promote sustainable development. Among various crash types, single-vehicle (SV) crashes account for a high fatality rate; according to the National Highway Traffic Safety Administration, the number of SV crashes and fatalities accounted for 16% and 36.9% of the respective totals for all crashes [1,2]. The impact of SV crashes on road safety is daunting. This phenomenon is particularly obvious in rural areas of China because road infrastructure and medical assistance in these areas are worrisome. The number of fatalities caused by rural SV crashes increased from 2013 to 2017, showing an average growth rate of 4% [3,4]. Therefore, analysis on the severity of rural SV crashes has aroused widespread interest from transportation professionals [5,6]. A primary objective of this research is to clarify the relationship between crash severity and various risk factors [7], and it has been generally recognized that weather has a significant influence on crash severity [8].
Severe weather can affect the traffic environment and is more likely to result in serious crashes [9,10]. In particular, low visibility caused by fog and haze has been considered as a crash-concentration environment [11]. Such condition will affect the judgment of traffic obstacles, which has a negative influence on traffic safety [12,13]. In the existing research on rural SV crashes, low visibility has been exhibited as a variable of weather factors; there is a lack of separate research on rural SV crashes under such conditions. For example, Wen et al. [14] investigated the risk factors of SV crash severity in mountainous areas and suggested that compared with sunny weather, the probability of serious and fatal crashes in foggy weather increased by 21.2% and 15.7%, respectively. Visibility conditions were divided into four categories-greater than 200 m, 100 -200 m, 50 -100 m, and less than 50 m-to evaluate the safety performance of rural areas. It was found that the probability of fatal crashes is substantially increased in visibility less than 100 m [15].
Further, research on rural SV crashes under specific inclement weather mainly has focused on rainfall and snowfall. A mixed logit model was established using rain-related rural SV crashes. The regression results show that the influence of male drivers on crash severity exhibited a random effect [5]. Based on the study of snow-related rural SV crashes, it was found that older drivers were 101.7% more likely to be involved in a serious crash compared to middle-aged drivers [6]. Recently, a preliminary analysis of SV crashes associated with low visibility was conducted [16]; however, the research did not separately analyze rural crashes. In urban and rural areas, some substantial differences in crash characteristics have been found [17,18]; thus modeling results cannot be transferred directly [9]. Hence, it is necessary to investigate the effect of contributing factors on the severity of low-visibility-related rural SV (LVR-SV) crashes to provide trustworthy measures.
The rest of this paper is organized as follows. Section 2 presents a comprehensive background review for rural SV crash analysis. Details of the collision dataset are described in the subsequent section. The model framework of this research is exhibited in Section 4. The analysis and discussion of the regression results are given in Section 5. Conclusion of this paper is given in Section 6. The potential limitations, together with future work, are described at the end.

Safety Covariates of Rural Single-Vehicle Crashes
Various risk factors will affect the severity of rural SV crashes, and corresponding research was conducted to clarify the relationship between risk factors and crash severity [8,9]. According to the characteristics of risk factors, three components can be roughly identified: driver characteristics, crash characteristics, and environmenta characteristics.
There is a significant correlation between driver characteristics and rural SV crash severity, which has been widely recognized by transportation professionals. Several driverrelated risk factors, such as gender, age, seatbelt usage, drunk driving, and speeding, were investigated in the existing literature. For example, a positive correlation between male driver and serious crashes was revealed [6]. This phenomenon is related to the aggressive driving behavior of male drivers. The study also noted that older and younger drivers were significantly more likely to have severe SV crashes than middle-aged drivers. Older and younger drivers are susceptible to distracted driving [19]. The impact of distracted driving on crash risk was analyzed, and a positive correlation between distracted driving and crash risk was found [20]. Further, the importance of using seatbelts and maintaining a safe speed has been widely emphasized, as they can effectively prevent drivers from being involved in fatal crashes [14,21]. A latent class logit function was established, and three risk variablesseatbelt not used, driving under the influence of alcohol, and fatigue driving-are expected to result in a significant increase in the probability of severe crashes [22].
Crash characteristics mainly include two components: crash type and vehicle type. Specifically, three crash types-collision with fixed object, run-off-road crashes, and collision with animals-have an important impact on the severity of SV crashes [23]. It was found that these variables are negatively correlated with crash severity. However, this finding could not be supported by Haq et al. [24]. A hierarchical Bayesian random inter-cept approach was established to link collision with fixed object and crash severity. The regression results show that (1) collision with fixed object will cause a significant increase in the possibility of serious crashes and (2) one of the serious consequences of collision with fixed object is rollover, which will increase the risk of serious crashes in some extent [25].
In addition, existing studies have found that the severity of rural SV crashes varies across vehicle types. Three vehicle types-passenger vehicle, motorcycle, and large vehicle-were classified, and the large vehicle was used as a reference variable to establish an ordered logistic regression. The results indicate that compared with large-vehicle crashes, the risk of serious crashes caused by passenger vehicle reduced significantly (odds ratio was 0.553) and caused by motorcycle increased significantly (odds ratio was 3.907) [26]. Considering the substantial differences between different vehicle types, crossvehicle-types modeling was recommended. For example, a crash severity regression function related to motorcycle and truck crashes was established, respectively, to explore the risk factors [27,28]. Establishing a regression model across vehicle types can obtain targeted findings and improve the model fit; this may be due to the fact that a crash dataset containing a specific vehicle type has more homogeneity compared with a crash dataset containing all vehicle types [29].
Many significant risk variables are included in the environmental characteristics [30], such as road surface, weather, light condition, and intersection, etc. Previous studies have found that there is a negative correlation between adverse road surface conditions and crash severity [23]. It indicates that the reduction in crash severity under adverse conditions is due to drivers being more cautious. A more detailed finding related to adverse road surface was drawn by Yu et al. [10]. Compared with dry road surface, the probability of fatal crashes caused by wet, snow, and ice is reduced by 2.73%, 0.85%, and 2.78%, respectively. A severity prediction function was established by Hou et al. [31] to investigate the influence of environmental factors on crash severity. It was found that the probability of severe injury is expected to be significantly reduced under the conditions of visibility less than 200 m and wet road surface. Meanwhile, both roads with intermediate barriers and dark unlighted conditions will cause a significant increase in the tendency of serious crashes. Further, illumination conditions were divided into five categories to establish a logistic regression. Compared with the daylight condition, the probability of serious collision under dusk, dawn, dark lighted roadway, and dark unlighted roadway conditions increased by 1.57 times, 2.02 times, 1.63 times, and 2.61 times, respectively [32]. This shows that the possibility of serious injury in the early morning is of the most concern, and the provision of street lighting can reduce the occurrence of serious traffic collisions.

Statistical Techniques for Unobserved Heterogeneity and Spatial Correlation
Unobserved heterogeneity has been recognized as a primary issue in crash severity analysis [33][34][35] because several risk factors affecting the possibility or severity of the crash cannot be observed [36,37]. Such fact has driven the development of statistical techniques to identify the unobserved heterogeneity. To accommodate the discrete nature of crash severity (no injury, slight injury, serious injury, and fatality), various regression approaches-random parameters logit (RP-logit) model [38,39], random parameters probit model [40], random intercept logit model [41], latent class logit model [10], and finite mixture random parameters model [16,42]-have been widely recommended due to their high flexibility [43][44][45]. Alternatively, random parameters ordered logit model [46] and random parameters ordered probit model [47] were applied to handle the intuitive ordering of crash severity. For example, Wu et al. [8] established the RP-logit model to analyze the risk factors of single-and multi-vehicle crash severity on rural highways. The results show that the RP-logit model can accommodate the unobserved heterogeneity satisfactorily, and some substantial differences in the risk factors between urban and rural were clarified.
Rural SV crash data in the United States were extracted to calibrate the RP-logit model and the latent class logit model. The regression results indicate that both of these statistical models can effectively identify risk factors, but the latent class logit model has a slightly better fit performance than the RP-logit model [5]. Similar findings were obtained by Cerwick et al. [48]. A hierarchical random intercept function was proposed to accommodate cross-level interaction in crash data. The regression results indicate that the fitting performance of the proposed function is comparable to the RP-logit model, but its generalization is expected to be limited by high complexity [49]. Meanwhile, a random effect tobit model and a random parameters tobit model were established to accommodate the unobserved heterogeneity in crash rate analysis. The importance of capturing the unobserved heterogeneity and the superiority of the random parameters function were confirmed [50].
In addition, most crash severity functions were developed using a whole-dataset; however, this modeling approach has some drawbacks. The crash analysis model can identify unobserved heterogeneity, but it cannot reduce them. In addition, extensive unobserved heterogeneity may cause the regression results to deviate from the real situation [34,51]. If the whole-dataset is divided into several sub-datasets, called clustering, and both the homogeneity effect within the sub-dataset and the heterogeneity effect between the sub-datasets are maximized, then the severity prediction functions are established for each sub-dataset. This two-step modeling approach is expected to mitigate the impact of unobserved heterogeneity on model estimation.
Currently, the two-step analysis approach is receiving increasing attention. A latent class clustering (LCC) model and an RP-logit model were combined to investigate motorcycle crash severity. It was found that the LCC model is an efficient crash data clustering technology and can effectively reduce the unobserved heterogeneity in the sub-dataset [52]. Based on crash data from New Mexico, Li et al. [53,54] modeled not only intersectionrelated crashes but also SV crashes and found that the two-step approach can improve the fitting performance of the severity prediction function. The severities of rain-related rural SV crashes and head-on crashes were investigated respectively, and both of them suggested that the two-step analysis approach can reveal more valuable risk factors [5,55]. A similar conclusion can be found in Yu et al. [10].
Further, the severity prediction functions mentioned above can not only provide valuable risk factors but also effectively identify unobserved heterogeneity in crash data. However, these functions can hardly identify spatial correlation across crashes. The existence of spatial correlation is reasonable. Some characteristics, such as road width and road surface conditions, may be shared in adjacent crashes. This phenomenon has been confirmed by much research [56][57][58]. Yang et al. [59] considered the spatial correlation of crashes in road safety assessment; the study focused on investigating the interrelationship among crash frequencies. The results show that the model simultaneously considering both spatial correlation and unobserved heterogeneity outperforms the model considering only the unobserved heterogeneity. Risk factors of freeway crashes were explored using a spatial generalized ordered logit model. It was found that the model fit can be improved by accounting for spatial effect and the spatial error term can be effectively estimated by introducing the Gaussian conditional autoregressive (CAR) technique [60]. Klassen et al. [61] obtained the same conclusion by proposing a spatial random intercept logit model and pointed out that ignoring spatial correlation may lead to biased inferences. To examine pedestrian injury severity in bicyclist-pedestrian crashes, a Gaussian CAR spatial Poissonlognormal model was adopted. The regression results highlight the effectiveness of jointly modeling multiple crash severities to improve fit performance [62].
The spatial functions mentioned above are evolved from traditional regression models by considering a spatial error term, including the multinomial logit model, ordered logit model, random intercept logit model, etc. The spatial function evolved from random parameters models is less common. Several studies have proved that the random parameters model outperforms the fixed-parameter model and the random intercept model. Identifying the spatial correlation by extending the random parameters model is expected to exhibit satisfactory fitting performance. Recently, a spatial random parameters Poisson-lognormal model, a spatial random parameters tobit model, and a spatial random parameters Poisson-lognormal model with a mixture component were proposed to capture the spatial correlation across crashes [63][64][65]. These three advanced crash prediction functions are employed to predict crash count, crash rate, and crash frequency, respectively. However, no spatial function evolved from the random parameters model, especially the RP-logit model, for predicting crash severity was found. This is a methodology gap and corresponding studies should be conducted to supplement existing severity prediction functions.

The Current Research
As discussed above, unobserved heterogeneity and spatial correlation may have a substantial contribution to crash severity. However, these two crucial issues are not clear in the analysis of LVR-SV crash severity. Hence, this research was conducted to comprehensively investigate unobserved heterogeneity and spatial correlation across LVR-SV crashes. A two-step modeling approach combining the LCC model and a novel spatial random parameters logit (SRP-logit) model was proposed to complement the severity prediction function. Specifically, the LCC model was first developed to divide the whole-dataset into several relative homogeneous sub-datasets. Then, the SRP-logit model was established for each sub-dataset to capture the spatial correlation and unobserved heterogeneity. Based on the two-step modeling approach, the following questions will be clarified: (1) Does the SRP-logit model outperform the RP-logit model? (2) Which variables have significant impacts on the severity of LVR-SV crashes? (3) Are there differences in the spatial correlation and the unobserved heterogeneity between the sub-dataset and the whole-dataset? A comprehensive research framework of this paper is exhibited in Figure 1.

Data
A rural SV crash dataset of 110 two-lane undivided road segments for a five-year period (2014-2019) was extracted from the Shandong Department of Transportation. The GIS map of the region under study is shown in Figure 2. The crash dataset records the characteristics of the driver, vehicle, time and location, road conditions, weather, visibility, etc. As visibility and weather are estimated by traffic police, some errors may exist; thus a real-time meteorological dataset was collected from the Shandong Climate Center to calibrate row data. Crash data related to low visibility-less than 100 m-was extracted because the rate of fatal crashes in rural areas greatly increases when visibility is less than 100 m [15]. After excluding abnormal data, 19,187 LVR-SV crashes were sorted. Of note is that rain and snow may also cause low visibility; crashes impacted under these conditions are mainly due to unsatisfactory anti-skid performance and were excluded.
More details of road characteristics were extracted from the Traffic Information System maintained by the Shandong Department of Transportation. All road segments at crash locations were verified as approximately linear horizontally and vertically. By checking the curvature and slope of each site, crash data at sharp turns or ramps were eliminated. In total, 19,014 crashes were used to calibrate the model parameters.
During the modeling process, driver injury severity was regarded as a dependent variable. In China, driver injury severity is divided into four categories-no injury (70.0%), slight injury (20.1%), serious injury (8.6%), and fatality (1.2%). A driver passing away within seven days is regarded as a fatal event. It can be seen that fatality cases are very limited, which may lead to incorrect inference. Considering that the two adjacent injury categories are similar, the combination of fatality and serious injury was named FS injuries. It was not expected to have a substantial influence on model estimation. Hence, the dependent variable contains three categories-no injury, slight injury, and FS injuries. No injury was taken as a reference variable.
Independent variables were processed to optimize the model structure; specifically, continuous variables such as time and driver age were discretized, and the classification of discrete variables was simplified. For example, there were more than 20 crash types in the raw data that were simplified into four categories-collision with non-fixed object, collision with fixed object, collision with pedestrian, and other crashes. Motorcyclists must wear helmets, and passenger car drivers must use seatbelts; because they have similar functions, these data were collectively classified as seatbelts/helmets. Nearly 80 types of motor vehicles were recorded, and the vehicle types, less than 1%, were combined into "other" vehicle type. Five categories-passenger car, motorcycle, pickup, truck, and other vehicle type-were included in vehicle type. Traffic control includes several modes, such as police direction, roadway guide markings, and traffic lights. These control methods were marked as traffic control, otherwise marked as no traffic control. The specific variables classification and proportion statistics are shown in Table 1. Note: * indicates that variable is reference variable in model. Table 1, the proportion of FS injuries of older drivers (>50) (19.2%) and the time of 00:00-7:00 (15.4%) were higher than other categories of factors. In addition, the proportion of FS injuries caused by collision with a fixed object is the highest (46.9%), a finding that is inconsistent with the general rule that the number of crashes decreases with the increase of injury degree. Such a phenomenon is related to the topic of this research. Generally, the traffic environment of rural areas is complex, and the driver safety awareness is not satisfactory. In low-visibility conditions, drivers may not be able to avoid obstacles in time, which is more likely to result in a serious crash.

Latent Class Clustering Model
The LCC model is a probability-based multivariate clustering approach and consists of two types of variables-exogenous and latent class. The exogenous variable is risk factors included in the observation, such as gender and age, and the latent class variable is a kind of dummy variable obtained by estimating clustering probabilities of the exogenous variable. Homogeneous clustering was conducted using latent class variables to reduce unobserved heterogeneity across observations. A complete LCC approach includes four-step theory construction, parameter estimation, fitness evaluation, and result analysis.
For a given LVR-SV crash dataset, it is assumed that R latent class variables are estimated using J categorical exogenous variables. Each of the exogenous variable has K j possible values, for crash i = 1, . . . , I . The number of outcomes may vary across the different exogenous variables, hence indexing by j. Let Y ijk represent the value of the J exogenous variable. Y ijk = 1 denotes the response of crash i for the jth variable is k, and Y ijk = 0 otherwise, where j = 1, . . . , J and k = 1, . . . , K j .
In the LCC model, the class-conditional probability is represented by π jrk , and its meaning is the kth outcome related to jth variable is exhibited by a crash in class r, where r = 1, . . . , R. Further, within each latent class, for each exogenous variable, the sum of the conditional probabilities for all categories is 1, which can be derived as Hence, the probability of a specific set of J outcomes was produced by the variable j in crash i for latent class r and can be obtained as: Further, the prior probability of latent class membership is denoted by p r , which represents the unconditional probability that crash i will belong to each latent class before considering the variable response outcome Y ijk provided by the exogenous variables. The probability density function of all latent classes is a form of weighted sum, which can be expressed as (see Linzer et al. [66]): (2) p r and π jrk are unknown model parameters and are determined by maximizing the log-likelihood function.p r andπ jrk are used to represent the estimated results of p r and π jrk . Based on the outcome of the exogenous variables, the Bayesian inference is used to calculate the posterior probability of each crash belonging to each latent class: Model fitness was evaluated by AIC (Akaike information criterion), BIC (Bayesian information criterion), and CAIC (consistent Akaike information criterion). By introducing a penalty term of model complexity, such diagnostic methods provide the criteria to measure complexity and goodness of fit. The smaller the value of AIC, BIC, and CAIC, the better the fitting performance of the model. The calculation formulas of AIC, BIC, and CAIC are as follows: where p denotes the number of model parameters, LL(β) denotes the log-likelihood convergence value, and I is the number of observations.

Spatial Random Parameters Logit Model
The RP-logit model can capture unobserved heterogeneity by allowing model parameters to vary across observations. Such function provides a flexible framework for the RP-logit model. In theory, it can fit any observations with random effects. However, the generality of the model is limited by the inability to capture the spatial correlation across observations [59]. Since adjacent segments have similar geometric characteristics and environments, some factors will be shared across adjacent crashes [60,61]. By considering the spatial correlation, the fitting performance of the RP-logit model is expected to be improved.
In order to better illustrate the modeling approach, the derivation starts from the RP-logit model, which is the original approach of the SRP-logit model. For a given LVR-SV crash at road segment m (m = 1, . . . , M), the probability of ith crash exhibits a severity level k, marked as P m i,k , which can be given as: U m i,k is a propensity function to determine the severity k of ith crash. i = 1, . . . , I is the ith crash event where I denotes the total number of crashes. k = 1, . . . , K is the crash severity. K denotes the number of categories of crash severity; K = 3 in this research. For such a ternary-dependent variable, P m i,k is determined by a set of covariates representing specific attributes and a set of unknown parameters. The propensity function is assumed to be linear, which can be expressed as: The term β i,k = (β i,k,0 , β i,k,1 , . . . , β i,k,L ) represents the vector of model coefficients to be estimated. Among them, β i,k,l is the coefficient of the lth covariate when the severity of ith crash is k. β i,k,0 is the model intercept. X m i = (x m i,1 , . . . , x m i,L ) represents a set of covariates related to crash severity. x m i,l is the lth covariate related to ith crash. ε m i,k denotes the random error term, which obeys Gumbel distribution.
The response variable Y m i takes one of three values: Y m i = 0 denotes no injury and was used as the reference variable, while Y m i = 1 and Y m i = 2 represent slight injury and FS injuries, respectively. Let Assuming that the probabilities P m i,0 , P m i,1 , P m i,2 obey the multinomial logistic distribution, the structure of RP-logit model is shown as follows.
where all variables in Equations (10) and (11) are the same as those defined earlier.
The regression coefficients to be estimated in RP-logit model may vary across observations. It is assumed that the normal distribution, marked as N(·), is followed by the random coefficients β i,k = (β i,k,0 , β i,k,1 , . . . , β i,k,L ) T and its distribution form is as follows.
It is necessary to point out that the random parameter β l i,k is determined when its variance ϕ k w,l is significantly greater than zero. Otherwise, it will be regarded as a fixed parameter across observations. If no random parameter is found, the RP-logit model will degenerate to a multinomial logit model.
To capture the spatial correlation of LVR-SV crashes, a spatial structure is constructed by adding a spatial error term δ m k to the RP-logit model, called the spatial random parameters logit (SRP-logit) model. The spatial term is interpreted by Gaussian CAR techniques, which have been widely accepted in spatial statistical analysis. Based on this modification, unobserved heterogeneity across observations can be captured by random parameters and the spatial correlation among adjacent crashes can be captured by the spatial structure. The specific model structure is as follows.
An effective joint density specification often used for the spatial term δ k = δ 1 k , δ 2 k , . . . , δ M k is in terms of pairwise differences in errors and a variance term σ 2 δ k [67,68], thus: A normal distribution is implied in this joint density for δ m k conditioning on the effect of δ m k on the remaining observation road segments. The corresponding conditional form is: With where w m, m denotes the un-normalized weight between road segment m and m, and it is defined as a proximity structure based on distance attenuation. The weight is calculated using exponential decay function of the distance between road segments. In addition, let (t m1 , t m2 ) represent the crash coordinate in segment m (the latitude and longitude). The Euclidean distance between road segment m and m is represented by Hence, we have: The term α is a weight coefficient, which controls the decline rate of correlation. At the same time, if the potential spatial correlation of all observed road segments is considered, this will reduce the model fit performance [69]. Inspired by Klassen et al. [61] and Aguero-Valverde et al. [70] and combined with the actual length of road segment, d m,m = 3 km was used as the threshold to identify the spatial correlation. Specifically, the minimum distance between the observed road segments is greater than this threshold, and a weight with a value of 0 is considered, otherwise it is 1. As pointed out by El-Basyouny and Sayed [71], due to the complexity of traffic interaction around the specific crash site, random variations across road segments may be structured spatially, and the SRP-logit model can be calibrated using the PROC GLIMMIX procedure in SAS version 9.3.
In addition to AIC and BIC, McFadden R 2 is often used to evaluate model fit. The calculation approach is as follows: where LL(β) and LL(0) denote the convergence value and the initial value of log-likelihood, respectively. The range of McFadden R 2 is 0 to 1-the larger the value, the better the fitting performance.

Average Marginal Effect
The average marginal effect of the significant risk variables was calculated to quantify the impact of the independent variables on crash severity. Since the independent variables in this research are binary variables, the calculation paradigm of the average marginal effect, marked as E P k X l , is as follows: where X i,l denotes the value of the lth independent variable in ith crash. P i,k (X i,l = 1) represents the probability that the severity of crash i is k under the premise of X i,l = 1. For a given risk variable, all the observations have a marginal effect, and the average marginal effect was obtained by calculating the average of all marginal effects. The mathematical meaning of the average marginal effect refers to the probability change of a certain crash severity when a variable changes by one unit while the value of other variables remains stable.

Analysis of Latent Class Clustering Model
The LCC model was performed based on the exogenous variable shown in Table 1 in Latent GOLD 4.5 software. An exploratory analysis approach was used to determine the optimal number of latent class variables. Starting with the latent class variable of 1, the number of latent class variables was gradually increased and model fitness was checked one by one to select the best model. In total, 12 clustering models were constructed; the specific fitting comparison was shown in Figure 3. As shown in Figure 3, the values of AIC, BIC, and CAIC decrease with the increase of the number of latent class variables. However, the percentage reduction in the values of AIC, BIC, and CAIC is less than 1% starting from a latent class variable of 4. This indicates that dividing the observation into four clusters (sub-datasets) can satisfactorily identify the latent class variable. The distribution statistics of different variables in the different clusters are shown in Appendix A Table A1. Since the proportion of variables in different clusters is different, variables with a high percentage can be identified as characteristic variables; hence, the latent class variables can be described by the characteristic variables. The characteristic variables (nine groups in total) and simple size of different clusters are shown in Table 2. As can be seen from Table 2, the percentages of seatbelt/helmet used and passenger car in cluster 1 are 83.1% and 81.2%, respectively, which are significantly higher than those in other clusters. Hence, these two variables are the characteristic variables in cluster 1. In cluster 2, the percentage of truck is 68.2%, which is less than 10% in other clusters. Further, among the different vehicle types in cluster 2, truck has the highest proportion. Hence, truck is one of the characteristic variables of cluster 2. In addition, the proportion of midage driver  in cluster 2 is significantly higher than that in other clusters. Thus, truck and mid-age driver are the characteristic variables of cluster 2. In cluster 3, the proportion of motorcycles (68.4%) and collision with fixed object (40.2%) is significantly higher than that in other clusters. Therefore, cluster 3 has two characteristic variables, motorcycle and collision with fixed object. Similarly, three characteristic variables are contained in cluster 4, which are older driver, seatbelt/helmet not used, and drunk driving.
In addition, the proportional distribution of crash severity for different clusters is shown in Table 3. Interestingly, the crash severity in cluster 1 contains only two categories, and they are no injury and slight injury. Other clusters include three categories-no injury, slight injury, and FS injuries. This phenomenon can be confirmed by the characteristic variables of cluster 1. As is known, the use of a seatbelt/helmet can provide superior protection for drivers, which will significantly reduce the probability of a serious crash [15]. Moreover, the percentage of FS injuries in cluster 3 is 41.4%, which is significantly higher than that in other clusters. The characteristic variables of cluster 3 is motorcycle and collision with fixed object; both of these variables are risk factors related to high mortality [52]. These findings preliminarily demonstrate the superiority of the LCC model in identifying latent class variables.

Analysis of Spatial Random Parameters Logit Model
Establishment of the SRP-logit model was a progressive process. The initial variables were added to the model one by one and were retained or deleted depending on the significance of the parameters (90% confidence level). The modeling was repeated five or more times for each dataset, and the best-fitting model was employed.
Similar to the RP-logit model, the distribution form of random parameters needs to be pre-set. To verify the optimal distribution hypothesis of random parameters, three commonly used probability density distributions-normal distribution, uniform distribution, and log-normal distribution-were tested. Among them, the log-normal distribution provides a limitation, that is, the model parameters need to maintain the same sign (all positive or all negative). However, signs of the parameters varying across the risk factors are often observed [39,55]. Hence, the log-normal distribution is not appropriate in this research. The fitting performance of random parameters obeying normal and uniform distributions was compared. The results are shown in Table 4. The optimal distribution hypothesis was determined by comparing the BIC and McFadden R 2 values. As shown in Table 5, for the whole-dataset, cluster 1, cluster 3, and cluster 4, the random parameters in the SRP-logit model conforming to the normal distribution outperform those conforming to the uniform distribution because the normal distribution in such dataset has a lower BCI value and a higher McFadden R 2 value. However, no random parameter was captured in cluster 2. Hence, the SRP-logit model established by cluster 2 degenerates to the spatial multinomial logit model. According to the optimal distribution hypothesis of random parameters in the SRPlogit model, the LL(β) value of the model built by the whole-dataset was −13,762.75, which was less than the sum of the LL(β) values of the model built by the different sub-dataset (−13,747.80). The BIC value of the whole-dataset model was greater than the sum of the BIC values of the sub-dataset model (the difference is about 75.49). Meanwhile, the chi-square test of both the whole-dataset model and the sub-dataset models was significant (p < 0.001). It was shown that the fitting performance of the SRP-logit model can be improved by using the relatively homogeneous sub-dataset to calibrate the parameters.
In addition, the RP-logit model was established for each dataset, and the distribution of random parameters was consistent with the SRP-logit model. Then, the fitting performance was compared between the RP-logit model and the SRP-logit model by the BIC criteria. As shown in Table 5, the BIC values of the PR-logit model and the SRP-logit model established by cluster 1 were 9109.08 and 9084.12, respectively, which shows a difference of 24.96. Meanwhile, the RP-logit model and the SRP-logit model calibrated by cluster 2 showed a BCI difference of 18.31. Similarly, the SRP-logit models calibrated by cluster 3 and cluster 4 were accompanied by more satisfactory BIC values, which were 12.58 and 16.17 lower than the RP-logit model, respectively. The same findings can be confirmed for the model built by the whole-dataset.
These statistics showed that the SRP-logit model outperforms the RP-logit model in terms of fitting crash severity, and this finding remained stable across different datasets. Hence, spatial correlation is a recommended modeling approach. In addition, this study also proved that the Gaussian CAR technology is an effective method to identify spatial correlation in modeling crash severity, which further enriches the research on traffic safety. Guo et al. [72] and Klassen et al. [61] reached a similar conclusion and pointed out that a highly significant spatial error component can reveal more crash information compared to a model without spatial structure.
The estimation results of the variance term σ 2 δ k in the SRP-logit model are shown in Table 6, and the independent variable estimation and average marginal effect analysis are shown in Tables 7 and 8, respectively. As shown in Table 6, the spatial correlation was captured in all crash datasets because the spatial variance term was highly significant. Of note is that the significance of variance term varies across the different crash severities and the different datasets. Specifically, in the whole-dataset, a significant spatial correlation was identified for the three severity levels-no injury, slight injury, and FS injuries-under the distance threshold of 3 km. In cluster 1, the spatial parameters of no injury and slight injury were significant. In cluster 2, there is a significant spatial correlation among no-injury crashes. The spatial correlation of slight injury and FS injuries was captured in cluster 3 and cluster 4. These findings indicate that there is a spatial correlation between the severities of LVR-SV crashes. As can be seen in Tables 7 and 8, there are significant difference in the parameter estimation and the average marginal effect between the whole-dataset model and the sub-dataset model. For example, some variables are not significant in the wholedataset model but are significant in the sub-dataset model. By comparing these models, it is possible to reveal some important variable information hidden in the whole-dataset model [67].

Discussion
Based on the regression results in Tables 7 and 8, a total of 18 factors that have significant impact on crash severity were identified. Among them, three factors (seatbelt not used, motorcycle, and collision with fixed object) lead to a significant increase in the probability of serious crashes, and these findings are consistent across different datasets. It is suggested that these three risk factors should be considered emphatically by the policy makers to reduce the severity of rural crashes. In most cases, the other three factors-drunk driving, old drivers, and the time of 21:00-7:00 (21:00-24:00 and 0:00-7:00)-also have a positive correlation with crash severity. In addition, truck and collision with pedestrian have a negative correlation with crash severity. These findings are consistent with the official statistics [3] and previous studies [4,6,10,54]. A detailed discussion of the risk factors is shown below.

Driver Characteristics
A significant correlation was found between male driver and the crash severity. The parameter of this variable (FS injuries) follows a normal distribution, with a mean of −0.28 and a standard deviation of 0.84 in the whole-dataset model, showing that 37.0% of male drivers are more likely to cause FS injuries. Further, the regression coefficient of male driver varies across different sub-datasets. For FS injuries, the parameters of male driver in cluster 3 and cluster 4 were negative (fixed parameter) and positive (random parameter), respectively. This variable in clusters 1 and 2 was not significant. It is shown that the unobserved heterogeneity related to male driver (FS injuries) presenting in the whole-dataset was divided into cluster 3 and cluster 4 by the LCC model. Among these, the heterogeneity of male driver in cluster 3 was effectively eliminated, but the same finding was not supported by cluster 4. Hence, a conclusion can be drawn (consistent with Liu and Fan [55]), under the specific risk factor, that the LCC model can effectively eliminate the heterogeneity in some of the sub-datasets, but this function cannot be supported by all the sub-datasets.
Young driver (under age 25) was significantly related to crash severity, which is consistent with Feng et al. [73]. However, the coefficient of such variable varies across different datasets. In the whole-dataset model, the parameter of young driver (FS injuries) follows a normal distribution with a mean of −0.39 and a standard deviation of 1.06. However, no random effects were found in sub-dataset models. The coefficient of young driver (FS injuries) in cluster 2 and cluster 4 was negative and positive, respectively; this variable was not significant in cluster 1 and cluster 2. This finding shows that the unobserved heterogeneity of young driver in the whole-dataset can be effectively captured by the LCC model and divided into cluster 2 and cluster 4. Further, these coefficients in the sub-dataset can be explained by the characteristic variables. In cluster 2, one characteristic variable was truck, which can provide satisfactory protection measures for a driver [48]. In cluster 4, both seatbelt/helmet not used and drunk driving were the characteristic variables, and there is a significant positive correlation between these two variables and serious crashes [52]. These findings reflect the infinite potential of the LCC model to mine data heterogeneity.
Older driver (above age 50) shows a significant influence in most datasets. In the whole-dataset, cluster 3, and cluster 4, the likelihood of FS injuries caused by older drivers increased by 5.3%, 21.8%, and 14.7%, respectively. Wu et al. [9] reached a similar conclusion. Therefore, older drivers should drive carefully and maintain safe driving habits to reduce crash severity. However, the coefficient of this variable in cluster 2 suggested that older driver was expected to significantly reduce the probability of FS injuries (the average marginal effect was −0.021). A characteristic variable in cluster 2 was truck. Under low-visibility driving conditions, more driving experience may be a dominant factor in determining crash severity. This conclusion reflects the advantages of the LCC model to mine the potential variable information; it can maximize the heterogeneity across different sub-datasets and reveal the important variable information hidden in the whole-dataset.
The influence of seatbelt/helmet on crash severity cannot be ignored. The probability of FS injuries caused by seatbelt/helmet not used increases by 5.5%, 0.1%, and 30.6% in the whole-dataset, cluster 1, and cluster 3, respectively, which is consistent with previous studies [54]. The impact of seatbelt/helmet not used on cluster 3 is significantly higher than that in other datasets. Cluster 3 has two characteristic variables-motorcycle and collision with fixed object. Rollovers often occur in motorcycle crashes, which can cause serious injury, especially head injury, to drivers; the available literature confirms that a significant cause of motorcyclists being killed is head injury [52]. Not wearing a helmet leaves a rider's head unprotected, which leads to a significant increase in the probability of a serious injury. However, 30.4% of drivers still do not use seatbelts/helmets in the whole-dataset. It is necessary to take some mandatory management measures to strengthen the safety awareness of drivers.
In the whole-dataset, cluster 3, and cluster 4, drunk driving causes a significant increase in the probability of serious crashes (marginal effects 0.007, 0.109, and 0.028, respectively) [55]. These results demonstrate the pernicious effects of drunk driving on traffic safety to policy makers and the general public. Driving behavior can be affected by alcohol, which can lead to serious collisions. In cluster 2, however, there was a significant negative correlation between drunk driving and severe crashes (marginal effect−0.043). This may be due to the high proportion of trucks in cluster 2 (68.2%), and some risk compensation operations will be carried out by experienced truck drivers to reduce the influence of alcohol [74].

Vehicle Characteristics
According to the model calibration results, motorcycles, pickups, and trucks have significant impacts on the severity of LVR-SV crashes. As there are some differences between various vehicle types, which cause crash severity to vary across different vehicle types, this finding was proved by Rezapour et al. [26]. Specifically, there was a significant positive correlation between motorcycle and serious crashes. In the whole-dataset, cluster 3, and cluster 4, the probability of FS injuries caused by a motorcycle increased by 12.8%, 23.8%, and 12.6%, respectively. Wei and Cai [15] obtained a similar finding and pointed out that motorcycles cannot provide superior protection for riders, and high mortality is inevitable. Meanwhile, judgment related to the driving environment is influenced by low-visibility conditions, which increase the probability of serious crashes [75].
In the whole-dataset, the parameter of pickup (FS injuries) obeys a normal distribution, with a mean of −1.12 (0.11) and a standard deviation of 2.24 (0.13). The heterogeneity exhibited in the whole-dataset was identified and divided into different sub-datasets by the LCC model. In cluster 2 and cluster 4, the coefficient of pickup (FS injuries) was fixed (−1.22) and random (mean 1.42, standard deviation 1.35), respectively; it was not significant in clusters 1 and 3. This result shows that the LCC model can effectively eliminate the heterogeneity of pickup in cluster 2, but this finding cannot be supported by cluster 4. The residual heterogeneities need to be identified by the SRP-logit model. This finding indicates that the LCC model and the SRP-logit model can compensate for the shortcomings of each other, and combining the two models can provide a superior modeling approach.
There is a significant negative correlation between truck and the severity of LVR-SV crashes, which is consistent with previous research [76]. In the whole-dataset and cluster 2, the probability of FS injuries caused by truck was reduced by 3.0% and 5.9%, respectively, compared with passenger car. In addition, truck (slight injury) was not significant in the whole-dataset model, but a significant influence was shown in cluster 1 and cluster 2. This finding suggests that the significance of truck (slight injury) was concealed by the wholedataset; this potential impact can be demonstrated by using the sub-dataset to calibrate the SRP-logit model.

Other Characteristics
In the whole-dataset and cluster 4, the probability of slight injury related to the time of 17:00-21:00 increased by 2.6% and 2.2%, respectively, compared with 7:00-10:00. Chang et al. [52] reached a similar conclusion and pointed out that this phenomenon was caused by aggressive driving caused by traffic chaos during the rush hour [77] because the sample in this research was rural crashes and there is no evening peak phenomenon in rural China. Therefore, a more suitable explanation is needed for the findings of this research. Under low-visibility conditions, it is difficult to maintain a satisfactory driving environment, and the infrastructure of rural roads is lacking. Furthermore, drivers are fatigued after a day's work and may not be able to avoid a crash under such conditions. This phenomenon is particularly significant for drivers who are not familiar with the local traffic environment [14].
In the whole-dataset, cluster 3, and cluster 4, the probability of FS injuries related to the 21:00-24:00 increased by 5.2%, 10.5%, and 5.1%, respectively; this variable corresponding to FS injuries was not significant in cluster 1 and cluster 2. Further, 0:00-7:00 (FS injuries) was significant in the whole-dataset, cluster 2, and cluster 3 (average marginal effect 0.062, 0.003, and 0.097, respectively). These findings suggested that serious crashes are more likely to occur at night and satisfactory driving conditions should be maintained to reduce crash severity.
In the whole-dataset, cluster 2, and cluster 3, the probability of FS injuries caused by collision with fixed object increased by 33.2%, 1.2%, and 13.2%, respectively. Zhu et al. [78] reached a similar conclusion and pointed out that this finding may be related to the high probability of rollover caused by collision with a fixed object. In addition, the parameter of collision with fixed object corresponding to FS injuries was fixed across observations in the whole-dataset, but it was random across observations in cluster 2 (mean 1.64, standard deviation 1.51). This finding suggests that the model calibrated using the sub-dataset, as obtained by the LCC model, can reveal the variable information hidden by the whole-dataset.
There was a significant negative correlation between collision with pedestrian and crash severity. In the whole-dataset and cluster 2, the probability of serious injury caused by collision with pedestrians was reduced by 4.9% and 0.1%, respectively. Pedestrians are vulnerable compared with motor vehicles, and a collision with a pedestrian is not expected to cause serious injury to motor vehicle occupants but could seriously threaten the lives of pedestrians.

Recommendations
The method demonstrated in this study can be used to identify risk factors associated with crash severity under low-visibility conditions. For areas where low-visibility conditions occur frequently, such as Chongqing and Zibo, China, the proposed method can be used to extract significant influencing factors. Then, effective solutions can be proposed according to the factor characteristics to improve traffic safety performance and promote sustainable development.
Based on the findings in this article and previous engineering experience, the greatest benefit received by policy makers is the clarification of risk factors of the LVR-SV crash severity. Some measures can be adopted to improve traffic safety.
(1) Certain risky driving behaviors (such as seatbelt not used, drunk driving, overspeed, and fatigue driving) have significant influence on crash severity. Legal measures are effective means to overcome risky driving behavior. Several electronic policies can be installed on the roadside in rural areas to monitor risky driving behavior and impose penalties. Meanwhile, an in-vehicle crash warning system that identifies such risky behaviors in real time may contribute to improving traffic safety performance. The primary purpose of these measures is to remind drivers to maintain safe driving habits.
In fact, these dangerous behaviors are closely related to the lack of adequate safety awareness among drivers. Improving the safety awareness of drivers in rural areas is fundamental and important for improving road safety. Appropriate publicity and education are necessary. In addition, an on-board navigation system can report the low-visibility conditions ahead and advise drivers to slow down appropriately.
(2) A conclusion was drawn by this study that collision with fixed object will lead to a significant increase in the risk of fatal collisions. Hence, the traffic management department should focus on improving hard guardrails or other pavement fixtures on both sides of the road (such as concrete wall pillars), replacing hard guardrails with soft guardrails that can absorb collision energy, or installing a buffer energy-absorbing device on the surface of a fixed object to reduce crash severity.
(3) The impact of motorcycles on traffic safety must be seriously improved. The supervision, management, and education of motorcycle riders should be strengthened. Dangerous driving and abnormal driving license status should be given severe punishment to reduce high-risk travel behavior. The government should encourage residents to choose alternative modes of transportation instead of motorcycles.
(4) In rural China, road infrastructure is commonly inadequate. The Department of Transportation should install sufficient roadway ancillary facilities, including road condition signs, speed limit signs, pavement marking, and streetlights. Among them, installation of streetlights is most essential because nights without street lighting will lead to a dramatic increase in the risk of fatal collisions [48]. Meanwhile, to ameliorate the effects of low-visibility conditions on crashes, we recommend installing blinking lights on the roadside to alert drivers.
Among the safety measures mentioned above, the countermeasures corresponding to seatbelt not used, motorcycle, and collision with fixed object should be implemented first because the positive correlation between these variables and crash severity remains stable across different datasets. Then, other measures could be implemented one after another.
It is necessary to point out that we have received strong support from the Zibo Department of Transportation and several corresponding measures have been implemented in some of the studied road segments. In the future, traffic crash data before and after the implementation of safety measures will be collected and compared to evaluate the effectiveness of the measures. This part is the focus of our future research.

Conclusions
The primary objective of this research was to investigate the risk factors of LVR-SV crash severity. The injury severity of drivers was used as the dependent variable-no injury, slight injury, and FS injuries. A two-step modeling approach was proposed. First, the LCC model was used to divide the whole-dataset into several sub-datasets. By comparing the fitting indicators of different numbers of sub-datasets, the whole-dataset was divided into four sub-datasets. Then, the SRP-logit model was established for the whole-dataset and the sub-datasets to identify the risk factors of crash severity and the spatial correlation among adjacent crashes. In total, 18 significant factors were identified-male driver, young age (<25, >50), seatbelt/helmet not used, drunk driving, career (self-employed and farmer), vehicle type (motorcycle, pickup, truck), time of day (10:00-17:00, 17:00-21:00, 21:00-24:00, 0:00-7:00), collision type (collision with fixed object, collision with pedestrian), and traffic control.
The statistical comparison showed that the SRP-logit model outperforms the RP-logit model and is a superior modeling technique. The significance of the spatial error term in the SRP-logit model indicates that there is a spatial correlation between the severity of LVR-SV crashes. Meanwhile, the significance of the spatial error term varies across different datasets. In the whole-dataset, the spatial correlation was captured among all the three severity levels; however, the spatial correlation in the sub-dataset was less than the whole-dataset. For example, the spatial parameter of no injury in cluster 2 was significant, but slight injury and the FS injuries were not significant. This finding suggests that the LCC model can reduce the spatial correlation in the sub-dataset, but it cannot be completely eliminated.
In addition, dividing the whole-dataset into sub-datasets through the LCC approach and calibrating the model by the sub-dataset can further improve the fitting performance of the SRP-logit model. Hence, a two-step analysis approach is advocated to model crash severity.
Some interesting findings were obtained. First, the LCC model can effectively eliminate the unobserved heterogeneity across observations in some sub-datasets, but it cannot eliminate them in all sub-datasets. For example, male drivers showed random effects in the whole-dataset and cluster 4, but no random effects were found in cluster 3. This finding suggests that the LCC model can effectively eliminate the heterogeneity of male drivers in cluster 3, but there was residual heterogeneity in cluster 4. Second, the sub-dataset model can reveal some important variable information that was hidden by the whole-dataset model. Truck (slight injury) was not significant in the whole-dataset model, but a significant influence was captured in cluster 1 and cluster 2.

Limitations of the Study
This research has some limitations.
(1) The risk factors considered in this study are not rich enough. If variables such as traffic volume and vehicle speed can be considered, it is expected to further improve the fitting performance of the SRP-logit model, although it is difficult to obtain such information. Therefore, in the future, a millimeter-wave radar could be installed on several rural road segments to collect valuable data.
(2) The crash extracted in this research occurred in China; the crash characteristics may vary with different cultural backgrounds and traffic conditions. Thus, the transferability of the model proposed in this research across countries should be studied to further enrich the existing literature.
(3) This study used Gaussian CAR technology to account for the spatial correlation error of crashes. We will continue to study the decrease of the spatial correlation error by using other alternative methods.
Author Contributions: Study conception and design, Z.C., F.W., Y.G. and Z.W.; data collection, F.W., X.L. and L.C.; data analysis and interpretation of results, Z.W. and Z.C.; draft manuscript preparation, F.W., Y.G. and Z.C.; manuscript revision, Y.G., F.W. and Z.W. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.