Driver Behavior Classification at Stop-Controlled Intersections Using Video-Based Trajectory Data

Understanding how drivers behave at stop-controlled intersections is of critical importance for the control and management of an urban traffic system. It is also a critical consideration in the burgeoning field of smart infrastructure and connected and autonomous vehicles (CAV). A number of past efforts have been devoted to investigating driver behavioral patterns at stop-controlled intersections. However, the majority of these studies have been limited to qualitative descriptions and analyses of driver behavior, owing to the unavailability of high-resolution vehicle data and of a sound methodology for classifying driver behaviors. In this paper, we introduce a methodology that uses computer-vision-based vehicle trajectory data and unsupervised clustering techniques to classify different types of driver behavior, infer the underlying mechanisms and compare their impacts on safety. Two major types of behavior are investigated, vehicle stopping behavior and vehicle approaching patterns, using two clustering algorithms: a bisecting K-means algorithm for classifying stopping behavior, and an improved density-based spatial clustering of applications with noise (DBSCAN) algorithm for classifying vehicle approaching patterns. The methodology is demonstrated using a case study involving five stop-controlled intersections in Montreal, Canada. The results show that there are five distinctive classes of driver behavior, representing different levels of risk, in both the vehicle stopping and approaching processes. This finding suggests that the proposed methodology could be applied to develop new surrogate safety measures and risk analysis methods for network screening and countermeasure analyses of stop-controlled intersections.


Introduction
Stop signs are one of the most commonly applied traffic control devices around the world, especially in North American and European countries. At stop-controlled intersections, vehicles are required to come to a full stop before the stop bar, regardless of whether conflicts are present [1]. However, evidence suggests that there is a large variation in how drivers behave at stop-controlled intersections, and that many drivers tend to just slow down and roll through instead of coming to a full stop [2,3]. For example, a field study conducted in the US showed that roughly 45% of the 25,660 vehicles observed did not fully stop at stop signs and 7% of the vehicles did not slow down at all when passing the stop sign [4].
Driver behavior variation and traffic violations at stop-controlled intersections have become a major cause of collisions and safety concerns in many municipalities. For instance, in 2008, 34,017 fatal collisions occurred at various intersections and 2750 of these fatal collisions occurred at stop-controlled intersections [5], which was more than what had occurred at signalized intersections. In 2011, about 673,000 traffic collisions occurred at stop-controlled intersections, of which 2433 were fatal and 208,000 were injury-causing [6]. About 72% of fatal crashes occur at unsignalized intersections because the drivers failed to stop at stop signs [7]. Considering the high rate of collisions and the increased probability of severe injuries and fatalities at stop-controlled intersections, driver compliance at stop-controlled intersections has become a critical concern.
A comprehensive understanding of driver behavior at stop-controlled intersections is also critical for the development of connected and automated vehicles (CAV) and intelligent transportation systems (ITS), which are emerging as the most promising solution for addressing various transportation-related challenges, such as safety, environmental impact and congestion [8]. These technologies promise to improve safety by providing pre-emptive warnings to drivers to prevent collisions [9] and to reduce vehicle delay with real-time information and cooperative control [10]. The importance of driver behavior in the development of CAV and ITS has stimulated a large number of past studies investigating the complex variation in driver behavior under a full range of operating environments [11]. However, most past studies on driver behavior have focussed on car-following and lane-changing behavior, or behavior at signalized intersections [12]. There has been little focus on driver behavior at stop-controlled intersections or on developing a system that can provide a warning to drivers at stop signs [13].
A few past studies have focused on understanding driver behavior at stop-controlled intersections. For example, McKelvie [14] investigated the stopping behavior of drivers through surveys and noted that unclear definitions of full compliance had led to confusion even among law enforcement. In another study by Langton, L. and Matthew, D. R. [15], it was found that most drivers believed that the police did not have a legitimate reason for stopping them. Woldeamanuel [16] classified drivers' stopping behavior into three different types: full stop (0 mph), rolling stop (5 mph) and no stop (less and above 5 mph). However, the definitions of the non-compliance maneuvers were largely subjective. Lastly, it should also be noted that our literature review did not find any past studies focusing on the motion pattern of approaching vehicles, which could have important safety implications at stop-controlled intersections.
This paper describes a study on vehicle stopping behavior and vehicle approaching patterns at stop-controlled intersections using video-based trajectory data. A case study involving five stop-controlled intersections is conducted. Furthermore, vehicle approaching patterns are clustered and analyzed using an improved density-based spatial clustering of applications with noise (DBSCAN) algorithm.

Literature Review
Despite the lack of advanced methods to automatically detect the microscopic behavior of drivers at stop-controlled intersections, researchers have conducted many manual observations of different behavior types and statistical analyses of drivers at stop signs [14][17][18][19][20]. Shaaban et al. [20] investigated driver compliance behavior at stop sign intersections in Qatar using field observations. McKelvie et al. [19] observed the behavior of 600 male and female drivers when they approached stop signs during the day and night, and analyzed the full stop, slow stop, and no slow or stop behavior of drivers by percentage. Beanland et al. [17], who observed the driving behavior of twenty-two volunteers, found that 59% of drivers stopped completely at rail level crossings, 27% of drivers made a rolling stop, and 14% of drivers neither slowed nor stopped. DeVeauuse et al. [18] investigated the rate of driver compliance with stop signs on college campuses using observers. Statistical analyses to define the types of driver behavior have been proposed [16,17]. For instance, Beanland et al. [17] defined a rolling stop as a speed below 10 km/h and above 0 km/h. However, the method used to study the microscopic behavior in front of the stop sign remains subjective.
Most studies that assess driver behavior at stop-controlled intersections use the braking behavior of drivers, such as full stop, rolling stop, initial brake point and maximum deceleration [13,21]. Bao et al. [21] investigated the driving behavior of test drivers of various age groups at stop signs using four dependent measures: brake pedal differential time, maximum deceleration, initial brake point and full stop, which were determined using video footage from cameras inside the vehicle, a global positioning system (GPS) receiver and surveys of the test drivers. Doerzaph et al. [13] conducted an experiment at stop-controlled intersections based on vehicle kinematics using speed, distance and acceleration to explain driver behavior. A cluster analysis was used to classify stopping behavior. Studies on driver behavior at stop-controlled intersections remain limited.
Microscopic trajectory data, providing detailed positioning and speed information of drivers, show promise. The use of trajectory data in road intersection studies is widespread [22][23][24][25][26][27]. Fu et al. [24] proposed an automated video-based methodology to analyze the safety of pedestrians at nighttime crossings, based on trajectory data extracted from thermal camera systems using an open source computer vision-based traffic intelligence project [28]. Beitel et al. [22] used computer vision software to automatically extract user trajectories to analyze pedestrian-cyclist collisions in shared spaces. St-Aubin et al. [27] provided a methodology to process large-scale trajectory data to proactively analyze traffic safety at roundabouts. Essa et al. [23] proposed a video analysis procedure to evaluate the safety of six signalized intersections, using the trajectory video data recorded by cameras located in two cities in Canada. Some of these works have looked at the behavior of drivers at non-signalized intersections using video-based trajectories [29,30]. For instance, Fu et al. [29] investigated the microscopic behavior of secondary interactions of pedestrians and vehicles at non-signalized intersections, based on the trajectory data extracted from videos. With the help of trajectory data, driving behavior can be better understood, and the application of such data to stop-controlled intersections in particular shows great promise.
State-of-the-art machine learning techniques have great potential to solve complex problems, including understanding complex road user behavior [31][32][33][34][35][36]. Different studies have investigated the pattern of driver behavior using the machine learning approach [32,37,38]. For example, Mohamed and Saunier [32] introduced a multi-level motion pattern learning framework for understanding driver behavior in an unsupervised pattern recognition approach. Ferreira et al. [36] investigated the performance of four machine learning algorithms to detect various driving event types, such as aggressive braking, aggressive acceleration, aggressive left or right turning, aggressive left or right lane changing and non-aggressive events. Qi et al. [34] extracted latent driving states to analyze the behavior characteristics of drivers using an ensemble clustering method based on the kernel fuzzy C-means algorithm and the modified latent Dirichlet allocation model to deal with the longitudinal driving behavior data. Osman et al. [33] proposed a methodology using driving behavior parameters to identify what drivers are engaged in while driving. The analysis of the data was based on K nearest neighbor, random forest, support vector machine, decision trees, Gaussian neighborhood, multilayer perceptron, adaptive boost, and quadratic discriminant analysis. Some of the studies considered road user behaviors using trajectory data based on machine learning methods [37,39]. Aoude et al. [39] used a support vector machines classifier with a Bayesian filter and hidden Markov models to classify driver behaviors, including compliant drivers and violators at intersections. Lu et al. [37] investigated the behavioral characteristics based on the support vector regression method.
Given the successful attempts in previous studies, machine learning techniques appear to be an effective means of analyzing trajectory data and extracting driver behavior indicators. Applying them to trajectory data is therefore a promising approach to investigating driver behavior at stop-controlled intersections.

Methodology
The methodology, as presented in Figure 1, consists of three main steps: (i) video data collection and processing, (ii) vehicle stopping behavior definition, (iii) vehicle approaching pattern analysis.

Video Data Collection and Trajectory Data Extraction
The data used in this paper were collected by mobile cameras at different sites under the same environmental conditions. Trajectory data were extracted using the open source video processing application Traffic Intelligence [28]. Detailed steps of the data collection and processing are shown in Figure 1. The extracted trajectories were further post-processed to address some of the issues [29] using the shared-source traffic vision analysis platform TvaLib [40]. After the extraction and correction of the vehicle trajectories, accurate speed and position data were obtained. A script was used to retrieve individual vehicles' position and speed data, based on the frame numbers.

Vehicle Analysis Zone
The prevailing rule of the stop-controlled intersections is that all the vehicles must come to a full stop behind the stop line and proceed when it is safe. Moreover, if the stop sign is set on the minor approach of the intersection, vehicles on the minor approach must stop and watch the vehicles on the major road. To study the driver behavior when approaching the intersection with a stop sign, the stop line is first identified for analysis. At stop-controlled intersections, all vehicles must stop behind the stop line so as to avoid a traffic violation. Most vehicles are expected to moderately brake or hard brake when they approach the stop line to ensure the vehicle can stop at the stop line [41]. However, some vehicles may not fully stop at the stop line because either their approaching speed is too high or they do not intend to stop at all. To study the full spectrum of these driver behaviors using video data, an analysis zone is defined for each location based on the consideration of the limited coverage of the video camera and the extent of the driver's expected response, as well as vehicle length [42]. The analysis zone includes the segment 5 meters before the stop line to 1 meter after the stop line, as shown in Figure 2.
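As an illustration of the analysis zone, the following minimal sketch filters a speed profile down to the samples that lie between 5 m before and 1 m after the stop line. The sign convention (negative distances upstream of the stop line), the function name and the sample values are assumptions made for this example and are not taken from the study's scripts.

```python
from typing import List, Tuple

# Zone limits from the text: 5 m before the stop line to 1 m after it.
# Assumed sign convention: negative = upstream of the stop line.
ZONE_START_M = -5.0
ZONE_END_M = 1.0


def clip_to_analysis_zone(
    profile: List[Tuple[float, float]]
) -> List[Tuple[float, float]]:
    """Keep the (distance_m, speed_mps) samples that fall inside the zone."""
    return [(d, v) for d, v in profile if ZONE_START_M <= d <= ZONE_END_M]


# Hypothetical profile of one vehicle: (signed distance to stop line, speed).
profile = [(-8.0, 6.1), (-5.0, 4.0), (-2.5, 1.2), (0.0, 0.4), (1.0, 1.5), (2.0, 3.0)]
zone = clip_to_analysis_zone(profile)
```

Both zone boundaries are treated as inclusive here; whether the original analysis includes the boundary samples is not stated in the text.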

Identifying and Classifying Vehicle Stopping Behaviors
The vehicle trajectory data extracted from the video at stop-controlled intersections indicate that violation behavior is quite common. The stopping behavior of a vehicle can be identified on the basis of the degree to which the vehicle tires had stopped rolling, varying from full stop to rolling through. There is therefore a need to develop an objective way of identifying and classifying driver behaviors at stop signs.
In this research, we propose to apply an improved variant of the K-means clustering method, the bisecting K-means algorithm, to explore and categorize the vehicle stopping behavior at a stop sign. The bisecting K-means algorithm is a partitional unsupervised learning algorithm designed to classify large data into subsets, such that the data within each subset share the largest common trait [43]. Its clustering objective is to minimize the distance between every data point and the center of the corresponding cluster. Compared to the conventional K-means clustering, this algorithm has the advantage of being less sensitive to outliers, having higher computational efficiency, and being less susceptible to falling into the local optimum [44].
The speed of a vehicle at the stop sign may be indicative of the vehicle stopping behavior. As a result, it is considered as one of the measures used for classifying the vehicle behavior in this research. When a vehicle passes an intersection controlled by a stop sign, its lowest speed is of the most interest and is considered as the critical measure for classifying stopping behavior. A computer script is used to extract the lowest speed of each vehicle as it crosses the stop sign, based on the data in the analysis zone. The bisecting K-means algorithm requires pre-specifying the number of clusters to be created and the distance measure to be used. Given the low dimensionality of the clustering problem, this paper uses the squared Euclidean distance as the distance metric for optimization. The number of clusters is selected based on the within-cluster sum of squared errors and on the stopping behavior groups described in previous research.
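The bisecting idea can be sketched for one-dimensional minimum speeds as follows. This is a simplified illustration rather than the implementation used in the study: the rule of always splitting the cluster with the largest SSE, the min/max seeding of each 2-means split, the function names and the sample speeds are all assumptions.

```python
from typing import List, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the cluster mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def two_means_1d(values: List[float], iters: int = 50) -> Tuple[List[float], List[float]]:
    """Split one cluster in two with plain 2-means, seeded at min and max."""
    lo, hi = min(values), max(values)
    left, right = list(values), []
    for _ in range(iters):
        left = [v for v in values if abs(v - lo) <= abs(v - hi)]
        right = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not right:  # all values identical: nothing to split
            break
        new_lo, new_hi = sum(left) / len(left), sum(right) / len(right)
        if (new_lo, new_hi) == (lo, hi):  # centers stopped moving
            break
        lo, hi = new_lo, new_hi
    return left, right


def bisecting_kmeans_1d(values: List[float], k: int) -> List[List[float]]:
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [sorted(values)]
    while len(clusters) < k:
        worst = max(range(len(clusters)), key=lambda i: sse(clusters[i]))
        left, right = two_means_1d(clusters.pop(worst))
        if not right:  # cannot split further
            clusters.append(left)
            break
        clusters.extend([left, right])
    return sorted(clusters, key=lambda c: sum(c) / len(c))


# Hypothetical minimum speeds (m/s) of vehicles that did not fully stop.
min_speeds = [0.3, 0.4, 0.5, 1.4, 1.5, 1.6, 3.0, 3.1, 3.2, 5.0, 5.1, 5.2]
clusters = bisecting_kmeans_1d(min_speeds, k=4)
```

With these made-up speeds the four resulting groups are the four visually separate bands; on real data the boundaries depend on the observed speed distribution.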
In general, the potential driver behavior can be categorized into one to five groups, as summarized in Table 1. Note that vehicles with full stop behavior were a specific type which was not considered in the clustering, as by law this is the action the driver is supposed to take. Therefore, it is considered as an independent type. To evaluate the appropriate number of driver-stopping behavior clusters, according to the potential groups shown in Table 1, bisecting K-means clustering analysis was repeated for two groups, three groups, four groups, and five groups. The clusters were evaluated through the within-cluster sum of squared errors (SSE), which served as the clustering objective in this study. The smaller the value of the SSE, the better the clustering result. The within-cluster sum of squared errors is computed as in Equation (1):

$$SSE_{TRJ} = \sum_{i=1}^{n} \sum_{j=1}^{k} w(i, j) \, \lVert x(i) - \theta(j) \rVert^{2} \qquad (1)$$

where $SSE_{TRJ}$ is the within-cluster sum of squared errors; x(i) is the i-th data point; θ(j) is the center point of cluster j; n is the number of data points; and k is the number of clusters. If x(i) belongs to cluster j, w(i, j) = 1; otherwise, w(i, j) = 0.
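The within-cluster sum of squared errors can be computed directly from cluster labels and centers, as in the short sketch below. The function name, the label encoding (the index of the cluster each point belongs to, which plays the role of w(i, j)) and the sample values are assumptions for illustration.

```python
from typing import List, Sequence


def within_cluster_sse(
    points: List[Sequence[float]],
    labels: List[int],
    centers: List[Sequence[float]],
) -> float:
    """Sum of squared Euclidean distances from each point to its own center."""
    total = 0.0
    for x, j in zip(points, labels):
        # Only the term of the cluster the point belongs to contributes,
        # which corresponds to w(i, j) = 1 for that cluster and 0 elsewhere.
        total += sum((a - b) ** 2 for a, b in zip(x, centers[j]))
    return total


# Hypothetical one-dimensional minimum speeds (m/s).
points = [(0.3,), (0.5,), (1.4,), (1.6,)]

# One cluster versus two clusters on the same data.
sse_one = within_cluster_sse(points, [0, 0, 0, 0], [(0.95,)])
sse_two = within_cluster_sse(points, [0, 0, 1, 1], [(0.4,), (1.5,)])
```

Comparing such values across candidate numbers of clusters gives the curve used to judge where additional clusters stop reducing the error appreciably.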

Identifying Vehicle Approaching Pattern
In addition to the vehicle stopping behavior at the stop sign, it is important to consider the vehicle approaching pattern when the driver traverses a stop-controlled intersection at the analysis zone. Typical vehicle trajectories, which are in x-y coordinates in the zenith angle of view, include just vehicle positional information. However, the positions of vehicles are reasonably fixed at stop-controlled locations; that is, most of them follow a fixed path which is in line with the lane, or in other words their lateral displacement is quite limited. Therefore, position-based trajectory data may not provide detailed information to understand the approaching pattern of vehicles. When investigating vehicle approaching patterns at such locations, their speeds should be a major consideration, as adaptations in their maneuvers are mostly the changes in speed along their path to the stop sign position. Therefore, to analyze the approaching pattern of a vehicle, its speed profile representing the relationship between the speed of the vehicle and its distance to the stop line is used for clustering.
In order to cluster the vehicles based on their speed profiles, we applied one of the most common clustering algorithms, the density-based spatial clustering of applications with noise (DBSCAN) method developed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu (1996) [45]. DBSCAN is a data clustering algorithm for clustering a set of points in some space by grouping the points that are closely packed together (points with many nearby neighbors). The DBSCAN algorithm can be used to discover clusters of arbitrary shape with noise observations [46,47]. However, the DBSCAN algorithm needs to be modified to cluster the vehicle trajectories, which include different numbers of points and different lengths of intervals.
To group similar vehicle trajectories at stop signs as a whole using the DBSCAN algorithm, we made the following improvements. The most important improvement was to design a distance function to define the density of the line segments. The lengths of the vehicle trajectories differ from each other, and every trajectory contains a different number of points with known speed and distance data. As a result, we proposed to employ mathematical differentiation and the vector product method to calculate the distance between two trajectories. The calculation method is shown in Figure 4, where all the points of the shorter trajectory are projected onto the longer trajectory to obtain the perpendicular distance from every point of the shorter trajectory to the longer trajectory. Then, the Lehmer mean is used to aggregate these point distances into a single distance between the two trajectories.
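One possible reading of this distance function is sketched below. Several details are assumptions, because the text does not specify them: the "shorter" trajectory is taken to be the one with fewer points, the perpendicular distance is computed with a cross product against each segment of the longer trajectory (falling back to the nearest endpoint when the projection lies outside a segment), distance and speed axes are used unscaled, and the Lehmer mean exponent is set to p = 2.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (distance to stop line in m, speed in m/s)


def point_to_segment(p: Point, a: Point, b: Point) -> float:
    """Distance from p to segment a-b, using the cross product when p projects onto it."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    apx, apy = p[0] - a[0], p[1] - a[1]
    seg_len_sq = abx * abx + aby * aby
    if seg_len_sq == 0.0:  # degenerate segment
        return math.hypot(apx, apy)
    t = (apx * abx + apy * aby) / seg_len_sq
    if 0.0 <= t <= 1.0:
        return abs(abx * apy - aby * apx) / math.sqrt(seg_len_sq)
    return min(math.hypot(apx, apy), math.hypot(p[0] - b[0], p[1] - b[1]))


def lehmer_mean(values: List[float], p: float = 2.0) -> float:
    """Lehmer mean sum(v^p) / sum(v^(p-1)); p >= 1 assumed, 0.0 if all values are zero."""
    den = sum(v ** (p - 1) for v in values)
    return sum(v ** p for v in values) / den if den > 0 else 0.0


def trajectory_distance(t1: List[Point], t2: List[Point], p: float = 2.0) -> float:
    """Project points of the shorter trajectory onto the longer one and aggregate."""
    shorter, longer = (t1, t2) if len(t1) <= len(t2) else (t2, t1)
    if len(longer) < 2:
        raise ValueError("the longer trajectory needs at least two points")
    dists = [
        min(point_to_segment(pt, a, b) for a, b in zip(longer, longer[1:]))
        for pt in shorter
    ]
    return lehmer_mean(dists, p)


# Hypothetical speed profiles: constant 4 m/s versus constant 3 m/s.
long_traj = [(-5.0, 4.0), (-3.0, 4.0), (-1.0, 4.0), (1.0, 4.0)]
short_traj = [(-4.0, 3.0), (0.0, 3.0)]
d = trajectory_distance(long_traj, short_traj)
```

A Lehmer mean with p = 2 weights large point distances more heavily than the arithmetic mean does, so trajectories that diverge strongly in one part of the zone are treated as further apart.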
Vehicle trajectory clustering analysis at a stop sign using the improved DBSCAN algorithm also requires the parameter values of the algorithm to be chosen, namely the density-reachability parameter D and the minimum number of trajectories required to form a cluster, MinL. The density-reachability parameter is used to decide which cluster a trajectory belongs to. The minimum number MinL is used to determine whether the trajectories selected can form a new cluster. The analysis is shown in Figure 5, in which cluster 2 is shown to meet the requirement for a cluster. According to the above analysis of driver behavior, suitable clusters are identified by setting proper values for the clustering parameters D and MinL.
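The role of the two parameters can be illustrated with a generic DBSCAN over whole trajectories, where D acts as the neighbourhood radius and MinL as the minimum neighbourhood size. This is a textbook-style sketch, not the authors' improved algorithm: the stand-in distance (largest speed difference between equally sampled profiles), the convention that a trajectory counts as its own neighbour, and the sample profiles are assumptions.

```python
from typing import Callable, List, Optional, Sequence

NOISE = -1


def dbscan(
    items: Sequence,
    dist: Callable[[object, object], float],
    d_max: float,    # plays the role of the density-reachability parameter D
    min_items: int,  # plays the role of MinL (neighbourhood includes the item itself)
) -> List[int]:
    """Return one cluster label per item; NOISE (-1) marks unclustered items."""
    labels: List[Optional[int]] = [None] * len(items)

    def neighbors(i: int) -> List[int]:
        return [j for j in range(len(items)) if dist(items[i], items[j]) <= d_max]

    cluster = 0
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_items:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border item reached from a core item
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_items:  # j is a core item: keep expanding
                queue.extend(nb)
        cluster += 1
    return [NOISE if lab is None else lab for lab in labels]


def max_speed_gap(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in distance: largest speed difference at matching sample positions."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical speed profiles (m/s) sampled at the same three positions.
profiles = [
    [1.0, 0.5, 1.0], [1.1, 0.6, 1.1], [0.9, 0.4, 0.9],  # decelerating vehicles
    [5.0, 5.2, 5.5], [5.1, 5.3, 5.6], [4.9, 5.1, 5.4],  # vehicles running through
    [3.0, 0.0, 6.0],                                    # isolated profile
]
labels = dbscan(profiles, max_speed_gap, d_max=0.3, min_items=3)
```

A larger D merges neighbouring patterns into one cluster, while a larger MinL turns sparsely populated patterns into noise, which is why both values have to be tuned against the observed behavior.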

Analysis of Vehicle Trajectory Patterns
Based on the definition of vehicle stopping behavior and the clustering result of the different vehicle approaching patterns, the driver behavior when approaching the stop-controlled intersection is analyzed. The stopping behavior is defined according to the minimum and maximum clustering values. To determine the distribution of each type of stopping behavior, the following speeds are considered: the minimum speed, the 75th percentile, the median speed, and the 25th percentile. The analysis evaluates the vehicles' approaching patterns from the perspective of whether or not the vehicles have come to a full stop or decelerated, based on the definition of vehicle stopping behavior. Furthermore, the statistical summary of vehicle stopping behaviors within each vehicle approaching pattern is determined, and the speed trends of vehicles approaching stop-controlled intersections are used to differentiate vehicle approaching patterns. Finally, the proportion of each vehicle approaching pattern is used to understand which pattern drivers prefer to follow, based on the 85th percentile, median and 15th percentile values.
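The percentile summaries can be sketched as follows. The interpretation used here is an assumption: for each cluster, the 15th percentile, median and 85th percentile of speed are computed across vehicles at each sampled position, and the minimum of each resulting profile is then reported. The linear-interpolation percentile rule, the function names and the sample values are also illustrative choices.

```python
from typing import List


def percentile(values: List[float], q: float) -> float:
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def percentile_profile(profiles: List[List[float]], q: float) -> List[float]:
    """q-th percentile of speed across vehicles at each sampled position."""
    return [percentile(list(column), q) for column in zip(*profiles)]


# Hypothetical cluster of three vehicles, speeds (m/s) at three positions.
cluster_profiles = [
    [4.0, 2.0, 3.0],
    [5.0, 1.0, 4.0],
    [6.0, 3.0, 5.0],
]
median_profile = percentile_profile(cluster_profiles, 50)
p15_profile = percentile_profile(cluster_profiles, 15)
p85_profile = percentile_profile(cluster_profiles, 85)

# Minimum of each percentile profile along the approach.
summary = {
    "min_median": min(median_profile),
    "min_p15": min(p15_profile),
    "min_p85": min(p85_profile),
}
```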

Study Sites and Data Description
This study focusses on the responses of vehicles when they approach a stop sign, and distinguishes whether the vehicles fail to comply with traffic rules. To investigate the vehicle stopping behavior and vehicle approaching pattern, a total number of 2909 observations were made at five stop-controlled intersections in Montreal, including 317, 1482, 378, 439 and 293 observations from Guizot-Henri Julien, Fleury-Millen, St Georges-Notre Dame, Dutrisac-duRuisseau and 13e-Belair, respectively. In order to ensure the driver's visual environment is consistent when entering the stop-controlled intersection, all five stop-controlled intersections selected have an unobstructed view of the stop sign. Moreover, the videos were collected when the weather was good with clear sight. Conditions that may affect the behavior of drivers as they approach the stop sign, such as inclement weather, were avoided. Details about the sites, video data and video snapshots are given in Table 2. As the trajectory extracted from the video is in the format of time-stamped latitude and longitude coordinates, it is necessary to transform the latitude-longitude trajectory into speed-distance trajectory for implementing the proposed approach. As presented below, Figure 6a presents sample trajectory data plotted on the aerial photo for a site involved in this study, and Figure 6b presents the converted speed-distance trajectories for a sample of 50 vehicles. A script was used to automatically extract the instantaneous speed and distance of all the observations in the analysis zone from the videos [30].
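The conversion from time-stamped latitude-longitude samples to a speed-distance profile can be sketched as below. This is an assumed procedure, not the script used in the study: great-circle (haversine) distances between consecutive samples give the speed, and the signed distance to the stop line is taken as the along-path distance relative to the sample closest to a given stop-line point. The function names, the stop-line coordinates and the sample track are hypothetical.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude-longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def to_speed_distance(
    samples: List[Tuple[float, float, float]],  # (time_s, lat, lon), ordered in time
    stop_lat: float,
    stop_lon: float,
) -> List[Tuple[float, float]]:
    """Return (signed distance to stop line in m, speed in m/s) per time step."""
    cumulative = [0.0]
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(samples, samples[1:]):
        step = haversine_m(la0, lo0, la1, lo1)
        cumulative.append(cumulative[-1] + step)
        speeds.append(step / (t1 - t0))
    # Sample closest to the stop line defines distance zero.
    k = min(
        range(len(samples)),
        key=lambda i: haversine_m(samples[i][1], samples[i][2], stop_lat, stop_lon),
    )
    # Each step speed is attributed to the sample at the end of the step.
    return [(cumulative[i + 1] - cumulative[k], speeds[i]) for i in range(len(speeds))]


# Hypothetical track heading north at roughly 2.2 m/s, sampled every 0.5 s.
track = [
    (0.0, 45.50000, -73.60000),
    (0.5, 45.50001, -73.60000),
    (1.0, 45.50002, -73.60000),
    (1.5, 45.50003, -73.60000),
]
speed_distance = to_speed_distance(track, stop_lat=45.50002, stop_lon=-73.60000)
```

Negative distances are upstream of the stop line and positive distances downstream, which matches the way the speed profiles are read from entry into the analysis zone to the exit.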

Analysis of Vehicle Stopping Behavior
Bisecting K-means analysis was applied to cluster the vehicle stopping behavior. As discussed before, full stopping is considered as an independent type, and only the rest of the stopping behaviors were clustered. To determine the proper number of clusters for defining types of driver behavior, the SSEs of a varying number of clusters were compared and the results are shown in Figure 7. As expected, the within-cluster sum of squared errors decreases as the number of clusters increases. However, the reduction starts to level off when the number of clusters reaches four or over. As a result, we chose four as the optimal number of clusters. Therefore, with full stopping as another class, the vehicle stopping behaviors at the analysis zone are classified into five different types based on the clustering results. Statistical summaries of the five types of vehicle stopping behavior are provided in Table 3. Figure 8 shows the boxplots of the speeds within each type of behavior, while Figure 9 illustrates the proportions of these behavior types. The five types of behavior discussed in this part are full stop, slight rolling stop, rolling stop, slow down without stop, and running through.
As compared in Figure 9, up to 37% of the vehicles tended to make a slight rolling stop while approaching the stop sign. These drivers either were confused about the rules at a stop-controlled intersection, or they knew the rules but did not intend to come to a complete stop in order to save time. Rolling stop behavior is the second most prevalent, and accounts for 33% of the observations. Many vehicles showed a tendency to stop; however, before the tires stopped completely the vehicle accelerated to drive through the intersection. Furthermore, about 14% of the vehicles slowed down but did not try to stop. The proportion of running through behavior among all drivers was 5%.
Despite the relatively low percentage of vehicles approaching the stop-controlled intersection with this running through behavior, this type of behavior is also the most dangerous compared to all the other categories.

Analyzing Vehicle Approaching Pattern
Figure 10a provides the clustering results of different vehicle approaching patterns using the improved DBSCAN algorithm based on the speed-distance trajectory in the analysis zone. The statistical summary of the five patterns of vehicle approaching process is provided in Table 4, including the number of observations in each type of cluster, the proportion of each type, the mean of the 15th percentile speed, the minimum value of all the 15th percentile speeds, the maximum value of all the 15th percentile speeds, the mean of the 85th percentile speed, the minimum value of all the 85th percentile speeds, and the maximum value of all the 85th percentile speeds.
Cluster 1: The vehicles in this cluster exhibited a noticeable deceleration in the analysis zone, and then sped up through the stop-controlled intersection, as shown in Figure 10b. On the way to the stop line, the speeds of vehicles dropped to a minimum value (the minimum values of the median, 15th percentile and 85th percentile are 0.75 m/s, 0.24 m/s and 2.81 m/s, respectively). Among these drivers, about 15% of them made a full stop or slight rolling stop, roughly 50% of the vehicles approaching the stop-controlled intersection decelerated to a slight rolling stop, and approximately 35% of the vehicles made rolling stops or slowed down without stopping, which means that most of the drivers displaying this pattern understand the requirement of the stop-controlled intersection, and they tended to make a full stop. Vehicles rolling or slowing down account for a small part. Thus, drivers with this approaching pattern had a strong awareness of obeying traffic rules. Unfortunately, the vehicles with this pattern accounted for only 1.56% of all the observations.
Cluster 2: This is the dominant cluster which accounts for over 86% of all observations. The vehicles in this cluster generally had a relatively low speed, with little noticeable deceleration or acceleration, as shown in Figure 10c. The minimum values of the median, 15th percentile and 85th percentile are 1.64 m/s, 0.67 m/s and 2.43 m/s, respectively. Approximately 15% of the vehicles approaching the stop engaged in slight rolling stop behavior. Moreover, about 35% of vehicles initiated a rolling stop when approaching the stop-controlled intersection. Furthermore, nearly 35% of the vehicles exhibited the slow down without stop behavior. According to the statistical results, some drivers with this vehicle approaching pattern had a full stop tendency, and the majority of drivers approaching the stop-controlled intersection tried to stop in a rolling way but never stopped completely. Therefore, it can be concluded that the drivers with this pattern approached the stop-controlled intersection cautiously, but had no intention to come to a full stop, probably because there were no conflicting pedestrians and vehicles at the intersection when they approached.
Cluster 3: The vehicles of this pattern tended to accelerate through the analysis zone of the stop-controlled intersection, as shown in Figure 10d. However, the speeds at which the vehicles entered the analysis zone were the lowest as compared to the other clusters (the minimum values of the median, 15th percentile and 85th percentile are 3.46 m/s, 2.82 m/s and 4.01 m/s, respectively). Therefore, it is possible that the vehicle had decelerated before entering the analysis zone, and sped up after recognizing that there were no conflicting vehicles or pedestrians. The data suggest that roughly 15% of the vehicles approached the stop-controlled intersections with slow down without stop behavior, and about 70% of the vehicles showed running through behavior. Furthermore, most vehicles showed no indication of eventually coming to a full stop, but rather the majority of vehicles did not slow down. It can be conjectured that the drivers with this type of pattern were aware that there may be conflicts at the stop-controlled intersection; however, their safety awareness was not strong enough, as they just made a slight deceleration to observe the situation of the stop-controlled intersection and then accelerated through the intersection. Lastly, vehicles with this pattern of approaching accounted for only a small part of all the observations (5.63%).
Cluster 4: The vehicles of this pattern displayed a slight deceleration lasting for about 2 to 5 meters, and then sped up through the stop-controlled intersection. Additionally, the vehicles had a relatively high speed when entering the analysis zone. No large fluctuation of speed appeared in this pattern, as shown in Figure 10e. The minimum values of the median, 15th percentile and 85th percentile speeds are 4.29 m/s, 3.56 m/s and 4.82 m/s, respectively. According to the statistical results, all of the observations in this pattern showed that the driver approached the stop-controlled intersection with a running through behavior, which reveals that the drivers of this pattern did not see, or ignored, the stop sign. Furthermore, the safety awareness of these drivers was lower than that of all those exhibiting the above vehicle approaching patterns. The proportion of this type of pattern is roughly 5.60% among all the observations; therefore, this dangerous vehicle approaching pattern occupies a small portion of observations, as most of the drivers approaching a stop-controlled intersection would likely reduce their speed to ensure safety.
Cluster 5: The speed trajectories of this pattern reveal that the vehicles travelled at high speeds (the minimum values of median, 15th percentile and 85th percentile are 5.11 m/s, 3.99 m/s and 5.79 m/s, respectively) and sped up through the analysis zone and the stop-controlled intersection, as shown in Figure 10f. The statistical result shows that all the vehicles of this pattern had running through behavior. In comparison with cluster 4, the vehicles of this pattern accelerated slightly when approaching the stop line (instead of reducing their speeds). Therefore, the drivers of this pattern not only ignored the rule of the stop sign, but also crossed the stop-controlled intersection even more aggressively. Furthermore, the awareness of these drivers in terms of traffic rules was the lowest when compared to all of the above patterns. About 0.69% of the vehicles followed this pattern when approaching the stop-controlled intersection. Despite the low proportion, serious concerns need to be raised for vehicles in this pattern.

Main Findings
Based on the case study results, the following observations and suggestions can be made.
Most drivers do not follow the rules at stop-controlled intersections: two-thirds of drivers made rolling stops. This is probably because most drivers consider a rolling stop to be safe. Rolling stop violations can be further divided into two types, since slight rolling behavior (rolling with reduced speeds) and rolling behavior were clustered separately.
Among the five approaching patterns, cluster 1 appears to be the safest. Cluster 2 ranks second; its speeds are relatively low, although higher than those of cluster 1. Cluster 3 ranks third, as both its speed statistics and its behavioral characteristics are about average. Cluster 4 ranks fourth, and cluster 5 is the most dangerous of all.
Drivers in cluster 1 made full stops or, if not, were moving at quite a low speed before entering the intersection. Vehicles in cluster 2 reduced their speeds, but their speeds were still higher than those of cluster 1, indicating a potentially higher risk. It can therefore be suggested to the local police that the behaviors seen in cluster 2 be treated as violations. The remaining patterns are serious violations.

Conclusions
This paper has attempted to address the critical methodological need for quantitative analyses of risky driver behaviors at stop-controlled intersections. A new methodology was proposed, which focuses on two key aspects of the driving process when a vehicle is traversing a stop-controlled intersection: (a) vehicle stopping behavior: to what extent had vehicles slowed down before the stop line? and (b) vehicle approaching patterns: with what speed patterns were vehicles approaching the intersection? Both of these behavior types can be determined using high-resolution vehicle trajectories extracted from video data. Two clustering algorithms were proposed for classifying the underlying driver behaviors: the bisecting K-means algorithm for vehicle stopping behavior, and the improved DBSCAN algorithm for vehicle approaching patterns.
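For readers unfamiliar with the bisecting strategy, the following Python sketch shows a simplified, deterministic variant of bisecting K-means on a single feature. It is not the implementation used in this study: the choice of feature (a hypothetical minimum speed near the stop line, in m/s), the rule of always splitting the cluster with the largest within-cluster sum of squared errors, the min/max initialization and the sample values are all assumptions made for illustration.

```python
def two_means(points, iterations=50):
    """Split 1-D values into at most two groups with plain 2-means."""
    centers = [min(points), max(points)]  # deterministic initialization
    for _ in range(iterations):
        groups = ([], [])
        for x in points:
            nearer = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearer].append(x)
        new_centers = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break  # assignments no longer change
        centers = new_centers
    return [g for g in groups if g]


def sse(cluster):
    """Within-cluster sum of squared errors."""
    mean = sum(cluster) / len(cluster)
    return sum((x - mean) ** 2 for x in cluster)


def bisecting_kmeans(points, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters."""
    clusters = [list(points)]
    while len(clusters) < k:
        worst = max(clusters, key=sse)
        parts = two_means(worst)
        if len(parts) < 2:
            break  # remaining clusters cannot be split any further
        clusters.remove(worst)
        clusters.extend(parts)
    # Order clusters from lowest to highest mean speed.
    return sorted(clusters, key=lambda c: sum(c) / len(c))


# Hypothetical minimum speeds (m/s) of seven vehicles near the stop line.
min_speeds = [0.0, 0.1, 0.2, 2.0, 2.2, 5.0, 5.2]
for group in bisecting_kmeans(min_speeds, 3):
    print(group)
```

Because each step only ever splits one existing cluster in two, the result is a hierarchy of splits, which is the property that distinguishes the bisecting approach from standard K-means.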
A case study was conducted to assess the effectiveness of the proposed methodology. The case study used video data from five stop-controlled intersections in Montreal, Canada. A tracker in the open-source traffic intelligence project and a shared-source traffic vision analysis platform, TvaLib, were used to extract vehicle trajectory data from the videos. The extracted trajectory data were then used as inputs for the two clustering algorithms. The results showed that there exist five distinctive classes of driver behavior in both the stopping and approaching processes, corresponding to varying levels of risk.
It should be pointed out that this study represents the first step toward the development of a comprehensive methodology for understanding complex driver behaviors and assessing the risk of stop-controlled facilities. Distinguishing vehicle approaching patterns provides a novel way of understanding vehicle behavior on the approach to the intersection, which should be a serious concern but remains broadly unaddressed. This provides a practical reference for improving behavior models in safety analysis, in human-factor research and in traffic simulations (for both traffic safety and efficiency evaluation). Meanwhile, the identification of different types of vehicle stopping behaviors and approaching patterns can be used in intelligent vehicle systems to detect erroneous driver maneuvers and make appropriate interventions in the fields of CAV and ITS. In practice, the model could further help police in issuing tickets by automatically and objectively determining the level of violation (and risk) in terms of the stopping behavior and approaching pattern of the vehicle.
The proposed methodology should be further evaluated using data from a larger pool of study sites with a wider range of road, traffic and environmental conditions. Another step for future research is the development of surrogate safety measures based on the classification results from the proposed clustering algorithms. Lastly, future efforts should focus on developing predictive models, similar to safety performance functions, that link the frequencies of different classes of driver behavior to various contributing factors, such as traffic exposure, intersection geometry and weather conditions.

Author Contributions: Conceptualization, X.W. and T.F.; data curation, X.W.; formal analysis, X.W. and T.F.; funding acquisition, X.W., T.F. and L.F.; investigation, X.W. and T.F.; methodology, X.W. and T.F.; project administration, X.W., T.F., J.K., L.F. and M.Z.; resources, T.F. and L.F.; software, X.W. and T.F.; supervision, X.W., T.F., L.F. and M.Z.; validation, X.W., T.F., J.K. and L.F.; visualization, X.W., T.F. and L.F.; writing-original draft, X.W., T.F., J.K. and L.F.; writing-review and editing, X.W., T.F., J.K. and L.F. All authors have read and agreed to the published version of the manuscript.