Identifying Factors that Influence the Patterns of Road Crashes Using Association Rules: A Case Study from Wisconsin, United States

Road traffic injury is currently the leading cause of death among children and young adults aged 5–29 years all over the world. Measures must be taken to avoid accidents and promote the sustainability of road safety. The current study aimed to identify risk factors that are significantly associated with the severity in crash accidents; therefore, traffic crashes could be reduced, and the sustainable safety level of roadways could be improved. The Apriori algorithm is carried out to mine the significant association rules between the severity of the crash accidents and the factors influencing the occurrence of crash accidents. Compared to previous studies, the current study included the variables more comprehensively, including environment, management, and the state of drivers and vehicles. The data for the current study comes from the Wisconsin Transportation crash database that contains information on all reported crashes in Wisconsin in the year 2016. The results indicate that male drivers aged 16–29 are more inclined to be involved in crashes on roadways with no physical separation. Additionally, fatal crashes are more likely to occur in towns while property damage crashes are more likely to occur in the city. The findings can help government to make efficient policies on road safety improvement.


Introduction
The number of road traffic deaths in the world remains unacceptably high and increases continuously, reaching 1.35 million in 2016 [1]. However, the fact is, every one of those deaths and injuries is avertible. Improving traffic safety levels is one of the great opportunities to save lives around the world, which does not receive anywhere near the attention it deserves [2].
Traffic crashes can be decreased significantly, and identifying the causes of a traffic crash is the most critical procedure in adopting precautionary measures to reduce the severity and quantity of traffic crashes. However, some previous studies estimated a model of crash frequency and severity using only the volume of traffic as an explanatory variable, while clearly many other factors affect the frequency and severity of crashes, such as environmental conditions, roadway geometrics, driver characteristics, and so on. Due to the complex nature of traffic crashes, policy decision makers must consider numerous contributory factors when making decisions on the improvement of safety [3]. It is vital for decision makers to find the most significant factors that affect the occurrence and consequence of traffic crashes. After years of research, it is generally accepted that the impact of crashes can be significantly reduced by recognizing the risk factors that affect the severity of a crash, as shown in Figure 1, and by adopting corresponding coping strategies [4][5][6]. Some previous studies have been devoted to identifying the contributing factors that affect the occurrence and severity of traffic crashes through traffic data. Various approaches were proposed by these studies, such as binary logit/probit models [7,8], multinomial logit models [9,10], nested logit models [11,12], log-linear models [13], artificial neural networks [14,15], spatial and temporal correlations [16], Markov switching models [17], and genetic algorithms [18]. Meanwhile, various contributing factors to the frequency and severity of traffic crashes have been identified in the above literature, such as weather, gender and age of drivers, posted speed, roadway geometrics, condition of drivers, and so on.
In recent years, the analysis of various types of data using data mining techniques has been attracting more and more attention among researchers. Data mining technology has been employed in traffic crash analysis and achieved satisfactory results in areas such as assessing the inherent connection between crashes and road geometry [19][20][21], critical points identification [22], factors that contribute to the severity of traffic crashes [23], and the relationship between driver characteristics and traffic crashes [24]. Many studies have analyzed crash data with data mining techniques. Agrawal et al. utilized the data mining technique of association analysis for crash data analysis [25]. Golob and Recker used clustering analysis to relate prevailing traffic conditions on freeways to the type of collision most likely to occur [26]. Prati et al. applied a decision tree technique and a Bayesian network to predict the severity of bicycle crashes [27]. However, some of these studies are based on the hypothesis that these factors are independent of one another, which might misrepresent the contribution of every single factor.
Among these data mining techniques, association rules mining is a valid technique to analyze traffic crashes since data mining methods do not rely on any hypothesis and can discover meaningful connections hidden in large datasets. There are three kinds of basic algorithms for association rules mining, which are the Apriori algorithm, an algorithm based on partition, and the Frequent Pattern tree algorithm. The Apriori algorithm is succinct and clear, and it adopts an iterative method of layer-by-layer search. Compared to the other two algorithms, the Apriori algorithm is more capable of processing large-scale datasets. In the current study, the Apriori algorithm was used to discover the significant rules between the factors and crashes in Wisconsin.

Raw Data and Study Area
The raw crash data for the current study was collected from the Wisconsin Transportation crash database that contains information about all reported crashes in Wisconsin in 2016. A reportable crash was a crash leading to injury or death of any person, total damage to property owned by any one person to an apparent extent of $1000 or more, or any damage to government-owned non-vehicle property to an apparent extent of $200 or more.
The crash data included 129,051 crashes that occurred in Wisconsin and were described by 49 variables, including the calendar date on which the crash occurred, crash severity, type of crash, age of the driver, etc. However, not all the reported crashes listed in the database are described by all 49 variables, and not all the variables were necessarily significant for the crashes. Therefore, in the current study the dataset was pretreated with the process shown in Figure 2.

Crash Data Processing
First, a k-means clustering algorithm was used to clean the noise data, which were erroneous or abnormal [28]. Meanwhile, each reported crash needed to be checked for missing values. A reported crash would have to be removed if it had noise data or lacked key information, such as the reasons for the crash, the condition of the road, weather condition, injury condition, driver information, etc.
Because the data for the current study came from meticulously compiled crash and on-the-spot investigations, variables in the dataset were independent and the problem of data conflict did not exist. There was no need to clean up redundant data or integrate the data. In order to mine association rules more efficiently, variables that contributed little to the traffic crash were removed, such as the calendar date on which the crash occurred, the name of the street, the name of the highway, and house, fire, railroad, or other numbers. Some variables that had the same range of value, such as NTFYHOUR (the one-hour range in which the enforcement agency was notified of the crash) and POSTSPD (posted speed), were converted into a different range of value as shown in Table 1. Boolean variables or discrete numeric variables were required to mine association rules using the Apriori algorithm, so the continuous numerical variable AGE needed to be discretized as shown in Table 2. Since residents can get a driver's license at the age of 16 in the United States, the age range of the first group was set to (0, 15).
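The discretization step described above can be illustrated with a minimal sketch. Only the first group (0, 15) and the group A2 covering ages 16-25 are stated in the text; the remaining cut points, the labels A3 to A7, and the function name `discretize_age` are illustrative assumptions and are not the values of Table 2.

```python
from bisect import bisect_left

# Inclusive upper bound of each age group. Only 15 (first group) and 25 (group A2)
# come from the text; the other bounds are hypothetical placeholders.
AGE_UPPER_BOUNDS = [15, 25, 35, 45, 55, 65]
# One more label than bounds: the last label catches every age above 65.
AGE_LABELS = ["A1", "A2", "A3", "A4", "A5", "A6", "A7"]


def discretize_age(age):
    """Map a driver's age in years to a categorical label usable as an item."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_left returns the index of the first upper bound that is >= age.
    return AGE_LABELS[bisect_left(AGE_UPPER_BOUNDS, age)]


# Example: three ages mapped to their groups.
example_labels = [discretize_age(a) for a in (15, 16, 70)]
```

Once every continuous variable is mapped to a label in this way, each reported crash can be represented as a set of categorical items, which is the input form that association rule mining expects.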

Structured Dataset Construction
Twenty-one variables and 63,325 reported crashes were filtered from 129,051 reported crashes by data processing. The description and range of value of the twenty-one variables are cataloged in Table 3.
Table 3. Description and information field of corresponding variables.
AGE: The age of the driver who causes the crash
OW = One-way traffic

Basic Conceptions
In the current study, the item set is a set of items and it includes at least one reported crash. An item is one element of an item set, which represents a reported crash. A k-item set is defined as an item set consisting of k items. A frequent pattern means that the same combination of eigenvalues occurs a certain number of times in the dataset [29]. The association pattern represents the association and correlation between several items. Association rules are association patterns that satisfy user-specified support [30].
Let $I = \{i_1, i_2, \ldots, i_m\}$ be a finite set of items, and let D be a dataset including plenty of transactions that are subsets of I [31]. An extracted association rule is an implication of the form X ⇒ Y, where X is the antecedent and Y is the consequent. X and Y are item sets, which belong to D, and X ∩ Y = ∅. Support and confidence are the two most commonly used criteria for measuring the importance of association rules. The support indicates the frequency of the association rule in the transaction set containing X and Y, which is defined as
$$\mathrm{Sup}(X \Rightarrow Y) = P(X \cap Y) = \frac{|X \cup Y|}{|D|}$$
where |D| is the total number of transactions, while |X ∪ Y| is the number of transactions that include both item sets X and Y.
The confidence indicates the credibility of the association rule X ⇒ Y, which is defined as
$$\mathrm{Conf}(X \Rightarrow Y) = \frac{|X \cup Y|}{|X|}$$
where |X| is the number of transactions that contain item set X, while |X ∪ Y| is the number of transactions that include both item sets X and Y. The association rules whose support and confidence values are equal to or greater than the thresholds defined by users are valid rules, which deserve to be analyzed.
To avoid generating a great number of uninteresting association rules, many algorithms for mining association rules use criteria based on minimum support and minimum confidence. Because these criteria do not consider the correlation between X and Y, useless association rules may still be generated when the support value of the consequent is too high. In order to solve this problem, previous researchers have proposed several valid measures. Lift is the most widely used of these measures, which is defined as
$$\mathrm{Lift}(X \Rightarrow Y) = \frac{\mathrm{Conf}(X \Rightarrow Y)}{\mathrm{Sup}(Y)}$$
where Conf(X ⇒ Y) is the confidence of the association rule X ⇒ Y, while Sup(Y) is the support value of item set Y. There is no correlation between item sets X and Y when lift = 1, while the occurrence of item set X is negatively correlated with that of item set Y when lift < 1. Only if lift > 1 are the association rules recognized as valuable rules.
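As a small worked illustration of support, confidence, and lift, the sketch below evaluates the three measures on four made-up transactions. The toy data and the function names are assumptions introduced for illustration only; the item labels FTC, REAR, FTY, PD, and INJ follow the codes used in this paper, while ANGLE is a hypothetical label for an angle collision.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Sup(X => Y) divided by Sup(X)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Conf(X => Y) divided by Sup(Y); a value above 1 means positive association."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)


# Toy transactions (not real crash records): each set is one reported crash.
toy = [
    frozenset({"FTC", "REAR", "PD"}),
    frozenset({"FTC", "REAR", "INJ"}),
    frozenset({"FTY", "ANGLE", "PD"}),
    frozenset({"REAR", "PD"}),
]

sup_rule = support(toy, {"FTC", "REAR"})        # 2 of 4 transactions
conf_rule = confidence(toy, {"FTC"}, {"REAR"})  # every FTC crash is also REAR
lift_rule = lift(toy, {"FTC"}, {"REAR"})        # confidence divided by Sup(REAR) = 3/4
```

In this toy example the rule FTC ⇒ REAR has a lift above 1, so it would count as a valuable rule under the criterion stated above.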

Association Rule Mining
Extracting important and hidden information from a large dataset by mining association rules is one of the most common tasks in data mining [32].The association rule mining can be described as a two-step process [33]:

• Generating frequent item sets: find all frequent item sets whose support value is equal to or greater than the minimum support value;
• Generating association rules: generate association rules from frequent item sets under the condition of minimum confidence.
Figure 3 shows the process of association rule mining. The association rules mining algorithms include Apriori, SETM [34], ECLAT [35], Pincer Search [36], and MAFIA [37], which are based on a support-confidence framework proposed by Agrawal and Srikant. The Apriori algorithm is succinct and clear, and it adopts an iterative method of layer-by-layer search. In the current study, the Apriori algorithm was used to discover the significant rules between the factors and crashes in Wisconsin.
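The two-step process and the layer-by-layer search can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names `apriori` and `generate_rules`, the toy transactions, and the thresholds in the usage lines are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Step 1: return {item set: support} for all frequent item sets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Layer 1 candidates: every single item that appears in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep the candidates of size k whose support reaches the threshold.
        level = {}
        for cand in current:
            sup = sum(cand <= t for t in transactions) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        k += 1
        # Join: unite pairs of frequent (k-1)-item sets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-item subset.
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def generate_rules(frequent, min_confidence, min_lift=1.0):
    """Step 2: derive rules (X, Y, support, confidence, lift) from frequent sets."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                x = frozenset(antecedent)
                y = itemset - x
                conf = sup / frequent[x]   # subsets of a frequent set are frequent
                lift = conf / frequent[y]
                if conf >= min_confidence and lift > min_lift:
                    rules.append((x, y, sup, conf, lift))
    return rules


# Toy usage with made-up transactions and thresholds.
toy = [{"FTC", "REAR", "PD"}, {"FTC", "REAR", "INJ"},
       {"FTY", "ANGLE", "PD"}, {"REAR", "PD"}]
toy_frequent = apriori(toy, min_support=0.5)
toy_rules = generate_rules(toy_frequent, min_confidence=0.6)
```

The prune step is what makes the search layer-by-layer: a k-item candidate is only counted against the data if all of its (k-1)-item subsets were already found to be frequent in the previous layer.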

Validity Test of Association Rules
An extreme risk of type-I error exists because of the large number of association rules, which needs a process of validity tests to evaluate the statistical significance of the rules obtained [38].
The validation process is generally distinguished in two ways. The first approach is the direct adjustment approach, which requires all association rules to pass statistical tests at the adjusted critical value. The second approach is the holdout approach, which divides the data into exploratory data for generating association rules without regard for the problem of multiple testing and holdout data for statistical tests.
In the current study, a direct adjustment approach was applied to test the validity of association rules, as it has the advantage of using all the data for both association rule discovery and statistical evaluation [38]. Meanwhile, no more statistical tests will be required under this approach than under the holdout approach. A number of direct adjustment approaches have been employed to perform multiple hypothesis tests, such as the Bonferroni correction [39], the sequentially rejective Bonferroni [40], the adaptive Benjamini-Hochberg algorithm [41], and so on. The Bonferroni correction states that if an experimenter is testing n independent hypotheses on a set of data, then the statistical significance level that should be used for each hypothesis separately is 1/n times what it would be if only one hypothesis was tested. Because of the principle and characteristics of the Bonferroni correction, it made the results more rigorous with a tight upper bound. Thus, the method of Bonferroni correction was applied in the current study. The definition of Bonferroni correction is as follows: Let $H_1, H_2, \ldots, H_n$ be a family of hypotheses and $p_1, p_2, \ldots, p_n$ be their corresponding p-values. Here $n$ is the total number of null hypotheses, while $n_0$ is the number of true null hypotheses. The familywise error rate (FWER) is the probability of rejecting at least one true $H_i$; in other words, of making at least one type I error. The Bonferroni correction rejects the null hypothesis for each $p_i \le \alpha/n$, where $\alpha$ is the global significance level. Proof of this control follows from Boole's inequality, as follows:
$$\mathrm{FWER} = P\left\{\bigcup_{i=1}^{n_0}\left(p_i \le \frac{\alpha}{n}\right)\right\} \le \sum_{i=1}^{n_0} P\left(p_i \le \frac{\alpha}{n}\right) \le n_0\,\frac{\alpha}{n} \le \alpha$$

Results and Discussions
Through the procedure of data pretreatment, 63,325 valid reported crashes were retained. Among them, there were 43,239 property damage only (PD) crashes, 19,766 injury (INJ) crashes, and 320 fatal (FAT) crashes, as shown in Figure 4. Based on the dataset, the current study then used the programming software Python 3.5 on a Lenovo laptop with an Intel Core i5-5200U 2.20 GHz CPU and 8 GB RAM to generate association rules. A total of 766 association rules were obtained with filter criteria of minimum support equal to 0.1, minimum confidence equal to 0.14, and minimum lift greater than 1.0, as shown in Figure 5.
The current study set the critical p-value for the association rules at the upper bound of 0.1/766, which equals 1.3 × 10^−4, given that 766 association rules were obtained with a minimum support value of 0.1 [42]. Only two rules had p-values higher than 1.3 × 10^−4: the p-value of rule WET, MALE ⇒ ND is 0.012 and the p-value of rule LT TRN ⇒ PD is 0.029. The reason for the extremely low number of false discoveries is that the support, confidence, and lift thresholds already do an excellent job of pruning out most rules that are not statistically significant.
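The arithmetic behind this threshold can be reproduced with a short sketch. The function names are hypothetical, and reading the numerator 0.1 in the bound 0.1/766 as the global significance level is an assumption; the two p-values are the ones reported in the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-hypothesis critical value under the Bonferroni correction."""
    return alpha / n_tests


def bonferroni_reject(p_values, alpha):
    """Return one flag per p-value: True if its null hypothesis is rejected."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


ALPHA = 0.1     # assumed global significance level (numerator of 0.1/766)
N_RULES = 766   # number of association rules obtained in the study

threshold = bonferroni_threshold(ALPHA, N_RULES)   # about 1.3e-4

# The two rules reported in the text with p-values above the threshold.
reported = {"WET, MALE => ND": 0.012, "LT TRN => PD": 0.029}
not_significant = [rule for rule, p in reported.items() if p > threshold]
```

Both reported rules end up in `not_significant`, which matches the statement that only these two rules failed the adjusted test.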
High support indicates a high frequency of an association rule (i.e., events that occur frequently in a crash), while high confidence indicates a high probability of occurrence of the consequent event when the antecedent item has occurred (i.e., the consequent event is more likely to occur when the antecedent event happens in a crash). Rules with a lift value greater than 1.0 are valid rules and indicate strong associations between the factors (i.e., there is a strong positive correlation between the two events in a crash). The current study screened out the 20 association rules with the highest support as in Table 4, the 20 association rules with the highest confidence as in Table 5, and the 20 association rules with the highest lift as shown in Table 6.
The following is the analysis of the results from Table 4:

• Because PD (property damage only) crashes account for 68.3% of the whole dataset, the 20 association rules with the highest support are all related to PD. This indicates that most of the crashes are not related to injuries and fatalities, which is consistent with the findings of the Global status report on road safety 2018 [1].

• The significant factors for the high-support association rules are the type of road, the extent of the worst vehicle damage, posted speed, male drivers, a roadway with no physical separation, weather, location, and age of drivers.

• It is obvious that the extent of vehicle damage is more likely to be moderate (MOD) in a property damage only crash (rules 5, 9, 11, 13, 14, 18, and 20).

• The crashes mostly occurred in urban areas (rules 11, 12, and 17), on roadways with no physical separation (rules 3 and 6), and at a lower posted speed (rule 15); Abdel-Aty and Radwan likewise found that highway geometry is the second most important factor in the occurrence of traffic crashes [24]. In particular, the rule PD → S25, U CITY, ND (support = 0.68, confidence = 0.18, lift = 1.07) clearly expresses the relationship between them. Based on the above rules, decision makers can reduce the occurrence of crashes by setting up physical separations on crash-prone sections.

• Male drivers are more prone to be associated with property damage only traffic crashes than female drivers, which can be observed from rules 4, 6, 8, 16, 17, and 20 and rules 12 and 13. On the one hand, male drivers are more likely to drive drunk and/or speed than female drivers [43]. On the other hand, it is probable that male drivers are less likely to comply with traffic rules and are generally overconfident while driving [44].

The following is the analysis of the results from Table 5:

• The rule with the highest confidence value, FTC (following too close) → REAR (rear end) (support = 0.14, confidence = 0.95, lift = 3.17), indicates that following too close leads to rear-end collisions between cars, which is widely known.

• As in the results from Table 4, low posted speed and roadways with no physical separation (rules 3, 4, 5, 6, etc.) are significant elements that affect the occurrence of crashes. The large deviation of speed, which is generated by drivers who ignore the posted speed and greatly exceed it, is perhaps the reason why crashes happen in locations with a low posted speed. Elvik found that a lower posted speed is prone to lead to a crash as a result of a high deviation of speed [45]. On roadways with no physical separation, drivers sometimes occupy the opposite lanes, which probably leads to a collision.

• In comparison with other drivers, the drivers aged 16-25, who are represented by A2 in Tables 4 and 5, are most likely to be involved in crashes (rules 5 and 16 in Table 4, rule 17 in Table 5), because drivers aged 16-25 make up a large proportion of all drivers and they are more likely to violate driving rules. Decision makers can strengthen traffic safety education for drivers aged 16-25 to reduce the occurrence of traffic crashes.

• '0' indicates that the crash occurred at an intersection. Four rules (rules 7, 13, 14, and 15) show that crashes are more likely to occur at an intersection. The intersection is a convergence area of city traffic flow and pedestrian flow, which has complex traffic conditions and is more likely to lead to a crash. Wang et al. found that a crash is more prone to occur at an intersection [46]. An appropriate organization of intersection flow might help decision makers control the occurrence of crashes effectively.

• Following too close (FTC), failure to yield (FTY), and failure to keep the vehicle under control (FVC) are perhaps the significant driver-contributing circumstances in a crash (rules 1, 14, 15, 16, and 19). Abdel-Aty and Radwan found that driver conditions were the most important factors in the occurrence of traffic crashes [24].

The following is the analysis of the results from Table 6:

• High lift values suggest a strong interdependence between the antecedent and the consequent. Three rules with high lift values indicate that drivers failing to yield, a crash occurring at an intersection, and the angle collision type have a strong connection [24]. The support value shows that 15% of crashes result from failing to yield at an intersection [46]. The confidence value shows that 78% of these crashes occurred due to angle collision. The ratio of angle collision crashes was 3.17 times the ratio of other types of collision.

•
Crashes are more likely to happen when drivers go straight (rules 8, 18, and 19), because drivers may be less vigilant when going straight than when negotiating a curve.

•
Nine rules have NO C (no collision) as the consequent, which indicates that many crashes involved no collision between vehicles, because most of these vehicles collided with a physical barrier.

•
Male drivers are more prone to fail to keep the vehicle under control. Das et al. also found that a higher number of males are associated with crashes [47].
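The support, confidence, and lift values discussed above for Table 6 can be illustrated with a small worked example. The sketch below is not the computation used in the study: the eight records are invented for illustration, the labels INT, ANGLE, and REAR are hypothetical codes, and only FTY, FTC, FVC, and NO C follow the abbreviations in the text.

```python
# Toy crash records; each record is a set of attribute labels.
# FTY = failure to yield, FTC = following too closely,
# FVC = failure to keep the vehicle under control, NO C = no collision.
# INT (at an intersection), ANGLE, and REAR are hypothetical labels.
# These eight records are invented for illustration only.
records = [
    {"FTY", "INT", "ANGLE"},
    {"FTY", "INT", "ANGLE"},
    {"FTY", "INT", "ANGLE"},
    {"FTY", "INT", "REAR"},
    {"FTC", "REAR"},
    {"FTC", "INT", "REAR"},
    {"FVC", "NO C"},
    {"FVC", "ANGLE"},
]


def support(itemset, data):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= record for record in data) / len(data)


def rule_metrics(antecedent, consequent, data):
    """Return (support, confidence, lift) for antecedent -> consequent."""
    sup = support(set(antecedent) | set(consequent), data)
    conf = sup / support(antecedent, data)   # P(consequent | antecedent)
    lift = conf / support(consequent, data)  # ratio to the baseline rate
    return sup, conf, lift


sup, conf, lift = rule_metrics({"FTY", "INT"}, {"ANGLE"}, records)
print(f"support={sup:.3f} confidence={conf:.3f} lift={lift:.3f}")
```

In this toy data, the rule {FTY, INT} -> {ANGLE} has a lift above 1, meaning angle collisions are more frequent among failure-to-yield crashes at intersections than among all records, which is the sense in which a high lift signals interdependence.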
Because the percentage of fatal crashes (0.5%) is so small, fatal crashes cannot produce high values of support and confidence in the full dataset. To discuss the factors influencing fatal crashes, the dataset applied here included only fatal crashes. Table 7 shows the twelve association rules obtained with the filter criteria of a minimum support of 0.5, a minimum confidence of 0.5, and a lift greater than 1.0.

The following is the analysis of the results from Table 7:

•
The significant factors for fatal crashes are location, male drivers, the extent of the worst vehicle damage, roadways with no physical separation, weather, and road surface condition.

•
Unlike property-damage-only crashes, fatal crashes are more likely to occur in towns than in cities. Compared with city roads, town roads have fewer vehicles, fewer police, and less supervision. Drivers tend to be less vigilant and more prone to speeding.

•
Male drivers are prone to be involved in fatal crashes, for the same reasons as in other types of crashes.

•
Drivers are more likely to be involved in fatal crashes when the weather is clear and the road surface is dry. This is perhaps because drivers pay more attention to driving when the weather and road surface conditions are dangerous. Karlaftis and Yannis suggest a negative relationship between adverse weather and crash occurrence, mainly because drivers are not used to driving under adverse weather conditions and consequently adjust their behavior by driving more carefully [48].

•
Roadways with no physical separation have always been a problem threatening traffic safety.
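The fatal-crash analysis above restricts the data to fatal crashes and then filters rules by support, confidence, and lift. A minimal sketch of that procedure follows, assuming a simple level-wise frequent-itemset search in the spirit of Apriori. It is not the implementation used in the study; the records and the labels TOWN, CITY, MALE, FEMALE, CLEAR, RAIN, DRY, WET, FATAL, and PDO are invented for illustration.

```python
from itertools import combinations

# Hypothetical records: (severity, attributes). Invented for illustration.
crashes = [
    ("FATAL", {"TOWN", "MALE", "CLEAR", "DRY"}),
    ("FATAL", {"TOWN", "MALE", "CLEAR", "DRY"}),
    ("FATAL", {"TOWN", "FEMALE", "CLEAR", "DRY"}),
    ("FATAL", {"CITY", "MALE", "RAIN", "WET"}),
    ("PDO", {"CITY", "MALE", "CLEAR", "DRY"}),
    ("PDO", {"CITY", "FEMALE", "RAIN", "WET"}),
]

# Keep only fatal crashes so that rare events can reach the thresholds.
fatal = [attrs for severity, attrs in crashes if severity == "FATAL"]


def support(itemset, data):
    """Fraction of records that contain every item in itemset."""
    return sum(itemset <= record for record in data) / len(data)


def frequent_itemsets(data, min_support):
    """Level-wise search: build larger candidates only from frequent itemsets."""
    items = sorted({item for record in data for item in record})
    level = [frozenset([i]) for i in items if support({i}, data) >= min_support]
    found = {}
    while level:
        for itemset in level:
            found[itemset] = support(itemset, data)
        # Join step: unions of two frequent k-itemsets that have k + 1 items.
        candidates = {a | b for a in level for b in level
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates if support(c, data) >= min_support]
    return found


def mine_rules(data, min_support=0.5, min_confidence=0.5, min_lift=1.0):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    freq = frequent_itemsets(data, min_support)
    rules = []
    for itemset, sup in freq.items():
        for k in range(1, len(itemset)):
            for ante in map(frozenset, combinations(sorted(itemset), k)):
                cons = itemset - ante
                conf = sup / freq[ante]
                lift = conf / freq[cons]
                # Lift must be strictly greater than the threshold.
                if conf >= min_confidence and lift > min_lift:
                    rules.append((ante, cons, sup, conf, lift))
    return rules


rules = mine_rules(fatal)
for ante, cons, sup, conf, lift in rules:
    print(sorted(ante), "->", sorted(cons),
          f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

The default thresholds mirror the filter criteria stated for Table 7. Mining the fatal subset alone is what makes a minimum support of 0.5 attainable, since in the full dataset no rule involving fatal crashes could exceed the overall share of fatal crashes.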

Conclusions
Due to the complicated interaction among different factors (the situation of the driver, the condition of the vehicle and road, environment, and management), a traffic crash is a complex and systemic problem. In order to decrease the number of traffic crashes, the fundamental causes, which are the basis for developing countermeasures, need to be systematically analyzed. A large number of researchers have made efforts in recent years to identify the vital factors that influence the severity and frequency of traffic crashes, in order to formulate effective safety countermeasures to enhance traffic sustainability [47].
In the current study, the Apriori algorithm was implemented to identify characteristics and factors impacting traffic crashes in Wisconsin, United States. By setting appropriate threshold values of support and confidence, essential information on traffic crash characteristics can be gained to analyze the fundamental causes of a traffic crash. The association rules generated in the current study suggest several significant factor groups: posted speed, driver condition, weather condition, road surface condition, distance from the intersection, roadways with no physical separation, the administrative grade of the crash location, male drivers, and the age of drivers. Taking these factors into account, the government can take countermeasures to improve the sustainable level of traffic safety. The majority of the findings are consistent with previous studies. The variables considered are more comprehensive, including environment, management, and the state of drivers and vehicles, which is the critical contribution of the current study.
Note that the present study did not tune the parameters with any optimization method, because objective and significant results were obtained at the current database size. In future work, genetic algorithms and particle swarm optimization could be incorporated with the Apriori algorithm to optimize the parameter values and to obtain significant results with high efficiency when analyzing large-scale databases.

Figure 1. The causative mechanisms of traffic incidents/accidents.

Figure 2. The procedure of data pretreatment.

Figure 4. The proportion of accident category.

Figure 5. Seven hundred and sixty-six association rules.

Table 4. Top 20 association rules with the highest support values.

Table 5. Top 20 association rules with the highest confidence values.

Table 6. Top 20 association rules with the highest lift values.

Table 7. Association rules related to fatal crashes.