Article

Micro-Mobility Safety Assessment: Analyzing Factors Influencing the Micro-Mobility Injuries in Michigan by Mining Crash Reports

Department of Civil and Construction Engineering, Western Michigan University, Kalamazoo, MI 49008, USA
*
Author to whom correspondence should be addressed.
Future Transp. 2024, 4(4), 1580-1601; https://doi.org/10.3390/futuretransp4040076
Submission received: 11 October 2024 / Revised: 26 November 2024 / Accepted: 5 December 2024 / Published: 10 December 2024
(This article belongs to the Special Issue Emerging Issues in Transport and Mobility)

Abstract

The emergence of micro-mobility transportation in urban areas has led to a transformative shift in mobility options, yet it has also brought about heightened traffic conflicts and crashes. This research addresses these challenges by integrating image-processing techniques with machine learning methodologies to analyze crash diagrams. The study aims to extract latent features from crash data, specifically focusing on understanding the factors influencing injury severity in crashes between vehicles and micro-mobility devices in Michigan’s urban areas. Micro-mobility devices analyzed in this study are bicycles, e-wheelchairs, skateboards, and e-scooters. The AlexNet Convolutional Neural Network (CNN) was utilized to identify various attributes from crash diagrams, enabling the recognition and classification of micro-mobility device collision locations into three categories: roadside, shoulder, and bicycle lane. This study utilized the 2023 Michigan UD-10 crash reports comprising 1174 diverse micro-mobility crash diagrams. Subsequently, the Random Forest classification algorithm was utilized to pinpoint the primary factors and their interactions that affect the severity of micro-mobility injuries. The results suggest that roads with speed limits exceeding 40 mph are the most significant factor in determining the severity of micro-mobility injuries. In addition, micro-mobility rider violations and motorists’ left-turning maneuvers are associated with more severe crash outcomes. The findings also emphasize the combined effect of many different variables, such as improper lane use, violations, and hazardous actions by micro-mobility users. These factors are more prevalent among younger micro-mobility users and are found to be associated with distracted motorists, elderly motorists, and nighttime riding.

1. Introduction

In recent years, the micro-mobility concept has emerged, offering environmentally favorable and convenient alternatives to conventional modes of transportation to address urban transportation challenges. Micro-mobility comprises an extensive array of compact and lightweight vehicles, such as bicycles, skateboards, and personal mobility devices, including electric scooters, segways, and electric wheelchairs [1,2]. Although micro-mobility options provide exceptional convenience and environmental advantages, their incorporation into urban environments has brought about novel difficulties, notably increased traffic congestion and crashes. As metropolitan areas increasingly adopt these novel forms of transportation, it is crucial to comprehend the determinants that impact the severity of injuries caused by crashes involving vehicles and micro-mobility [1].
In numerous cities, there is no defined space for micro-mobility vehicles, and sharing infrastructure with other road users is a necessity in many urban environments, thereby increasing the likelihood of traffic conflicts [3]. Moreover, a comprehensive analysis of the shared road space has yet to be conducted [2]. Evaluating micro-mobility safety is paramount in comprehending the hazards linked to these modes of transportation and formulating efficacious mitigation strategies. By considering many elements, including infrastructure design, user behavior, vehicle technology, and regulatory frameworks, safety assessments offer significant perspectives on the obstacles and prospects associated with the integration of micro-mobility in urban environments [1,2,3].
Recent studies have highlighted critical factors influencing micro-mobility safety, complementing this research’s focus. For instance, a Geographic Weighted Regression (GWR) analysis of bike crashes in Budapest (2017–2022) revealed that built environment features, such as traffic signals, road crossings, and bus stops, significantly impact crash patterns and severity, with suburban traffic signals contributing to safer conditions [4]. The study also found that commercial activity and public transportation stops increased crash prevalence in certain districts. At the same time, one-way roads and higher speed limits were linked to more severe crashes [4]. Similarly, a study of e-scooter crashes in Bari, Italy (2020–2022) emphasized the risks associated with higher-speed roads, off-peak daytime crashes, and non-use of cycle paths [5]. The research further revealed that off-peak daytime hours led to more frequent crashes, although nighttime crashes tended to be more severe. Non-use of cycle paths emerged as a significant safety concern, reinforcing the need for targeted awareness campaigns and infrastructure improvements [5]. These findings underscore the importance of considering the built environment, user behavior, and infrastructure characteristics when designing interventions to improve micro-mobility safety, aligning with this study’s goal of understanding injury determinants and proposing data-driven safety strategies.
The severity of injuries sustained in micro-mobility crashes can differ significantly, although the most reported injuries include incapacitating and minor injuries [1]. Additionally, the rider’s gender, age, intent, the location of the crash, and whether the incident occurred on a weekday or weekend may be associated with the severity of injuries [6]. Research on crashes involving e-scooters or wheelchairs is more limited, and their safety is not as well understood as that of other modes of transportation [7,8]. The most frequently observed incident in single e-scooter and wheelchair crashes is a fall, which predominantly results in severe injuries [9,10].
Detailed crash data are essential for such safety studies, as comprehensive descriptions of crash scenes and micro-mobility rider’s behavior are predominantly available within crash narratives and diagrams [1,11]. However, extracting and applying this information from traffic crash reports poses significant challenges [12,13]. Various studies have employed diverse methodologies to analyze the severity of crash injuries. A long-standing practice in crash severity analysis, statistical modeling is a conventional method that yields dependable insights into the probability of crashes and produces straightforward results. Despite this, statistical modeling necessitates predefined relationships between dependent and independent variables and specific assumptions regarding underlying data distribution [14].
On the contrary, machine learning methodologies are increasingly embraced in this field due to their independence from presumptive associations among variables [14,15,16]. Consistently, numerous studies have found that the analytical methods presented—namely Random Forest (RF), Support Vector Machine, and Decision Tree—exhibit superior performance. Furthermore, in a comparative analysis of RF and other methods, RF demonstrated superior accuracy in identifying significant variables associated with injury severity categories and in classifying and predicting the severity of injuries sustained by vehicle drivers [15,16].
Although several studies have been conducted on a restricted sample of micro-mobility-related crashes and the resulting injuries, numerous unexplored features must be uncovered to understand these crashes comprehensively [2]. This research has, therefore, undertaken a comprehensive examination of the variables that may impact the extent of injuries caused by a single micro-mobility crash. Four distinct categories of micro-mobility devices were incorporated into the study: bicycles, e-wheelchairs, skateboards, and e-scooters.
The image-analysis methodologies utilized in safety-related research originate in the computer vision discipline [17]. As part of the conventional computer vision process, feature extraction continues to be used for object detection and classification. A suitable feature model is necessary to implement precise object detection and classification [18,19]. Such a model automatically extracts an extensive set of image characteristics, which are subsequently utilized in image classification. Convolutional Neural Networks (CNNs) and Histogram of Oriented Gradients (HOGs) are a few examples. The capability of these algorithms to self-learn from a provided dataset is an advantage [19]. A separate feature extraction step is not required due to the end-to-end nature of the deep neural network process. Deep neural networks have emerged as the preeminent technology for image processing and resolving computer vision-related challenges, coinciding with the progress made in computer hardware and software [19].
This study represents the first utilization of CNNs to detect features related to micro-mobility users from crash diagrams extracted from police crash reports (Michigan UD-10). AlexNet CNN architecture will serve as a highly skilled investigator, capable of discerning intricate patterns within photographs [20,21]. It is exceptionally well-suited for identifying various attributes from crash diagrams. The feature classification technology provided by AlexNet will be employed to extract multiple features in a tabulated format, facilitating the development of an automated system for safety data extraction based on crash diagrams. Furthermore, this study aims to evaluate the effectiveness of AlexNet CNN architecture in classifying the location of micro-mobility device crashes using crash report diagrams, including three categories: roadside, shoulder, and bicycle lane.
The selection of machine learning models, specifically AlexNet CNN and Random Forest, over traditional statistical methods is driven by their capability to address the complexities inherent in crash data. Conventional methods, such as regression models, depend on predefined relationships between variables and assume specific data distributions, which may not adequately capture crash data’s nonlinear interactions and heterogeneity. In contrast, machine learning models excel in identifying complex patterns without requiring strict assumptions about variable relationships or distributions. AlexNet CNN is particularly well-suited for image-based data, such as crash diagrams, as it automatically extracts and learns spatial and contextual features through its deep architecture. Similarly, Random Forest handles diverse datasets effectively, modeling nonlinear interactions while providing interpretable rankings of variable importance. The unique value of this integration lies in its ability to combine the strengths of AlexNet CNN and Random Forest, enabling the automated analysis of unstructured data from crash diagrams while concurrently ranking the most critical factors influencing injury severity. This innovative approach enhances the understanding of crash dynamics across various micro-mobility devices. It yields actionable insights that traditional or single-model approaches may fail to uncover, thereby significantly strengthening the study’s analytical framework.
While previous studies have explored factors influencing micro-mobility crashes, significant gaps remain in understanding how crash diagrams and textual data can be systematically leveraged to analyze injury severity outcomes. Most research relies on structured crash datasets and overlooks the rich, unstructured data in police reports and diagrams. Additionally, studies focusing on specific micro-mobility devices, such as e-scooters or bicycles, usually do not consider a broader range of devices, such as skateboards and e-wheelchairs, and their unique crash characteristics. Furthermore, previous studies have yet to integrate advanced machine learning techniques such as CNNs to extract latent features from crash diagrams, and few have comprehensively evaluated the spatial and contextual attributes of crash locations.
This study addresses these gaps by employing AlexNet CNN for automated feature extraction and Random Forest for injury severity analysis, providing a novel methodology that combines image processing with machine learning to offer a more nuanced understanding of micro-mobility crashes. The integration of AlexNet CNN and Random Forest allows for a deeper analysis of crash diagrams and injury severity, offering new insights into micro-mobility safety. In addition, this research bridges the gap between academic studies and real-world applications, providing actionable recommendations for policymakers and urban planners to enhance safety for micro-mobility users.
The structure of this paper is as follows: Section 2 delves into the research data and methodology. Section 3 presents the results and discusses the findings. Finally, Section 4 concludes the research and offers suggestions for future work.

2. Materials and Methods

2.1. Crash Data

The present study employs UD-10 crash data gathered by the Michigan State Police (MSP) as the primary source of information regarding traffic crashes. The UD-10 report provides a systematic way to record and document many facts about traffic crashes. These details encompass driver characteristics, vehicle information, environmental conditions, and contributing factors. UD-10 crash data offer valuable insights into the causes and consequences of traffic crashes due to their extensive utilization and comprehensive nature in traffic safety analysis. We aim to discern patterns, trends, and risk factors linked to micro-mobility users by examining the 2023 UD-10 crash reports of Michigan’s urban areas.
The crash reports, which include crash diagrams and narratives, were downloaded in Portable Document Format (PDF) from the Michigan Traffic Crash Facts website [22]. A Python script iterated through the texts and images of the crash reports to extract the crash narratives, diagrams, and crash IDs in a tabulated format. The crash narratives and diagrams were subsequently merged with crash metadata using the crash ID. To accommodate the diverse contexts and variations in which micro-mobility devices may appear in crash narratives, patterns were established using regular expressions. These patterns encompassed a range of spelling variations, plural forms, and commonly used terms associated with each type of device. Subsequently, the information extraction technique leveraged Excel tools to identify matches in the input text and generate a structured list of extracted entities, including the corresponding micro-mobility device type. Micro-mobility devices analyzed in this study are bicycles, e-wheelchairs, skateboards, and e-scooters. A total of 1254 crashes were selected from the 2023 crash database, all of which involved at least one micro-mobility device. Data quality was ensured through a comprehensive cleaning procedure that eliminated records containing missing or insufficient information. As a consequence, the final database consists of 1174 diverse micro-mobility crash diagrams.
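The pattern-matching step described above can be sketched in Python. The study does not publish its regular expressions, so the patterns and the `detect_devices` helper below are hypothetical illustrations of how spelling variations and plural forms might be covered, not the authors' implementation.

```python
import re

# Hypothetical patterns: the paper does not list its actual expressions.
DEVICE_PATTERNS = {
    "bicycle": re.compile(
        r"\b(?:bicycles?|bikes?|bicyclists?|cyclists?)\b", re.IGNORECASE),
    "e-scooter": re.compile(
        r"\b(?:e[- ]?scooters?|electric\s+scooters?)\b", re.IGNORECASE),
    "e-wheelchair": re.compile(
        r"\b(?:e[- ]?wheelchairs?|(?:electric|power(?:ed)?|motorized)"
        r"\s+wheelchairs?)\b", re.IGNORECASE),
    "skateboard": re.compile(
        r"\b(?:skateboards?|skateboarders?|skate\s+boards?)\b", re.IGNORECASE),
}


def detect_devices(narrative):
    """Return the sorted device types whose pattern matches the narrative."""
    return sorted(
        device for device, pattern in DEVICE_PATTERNS.items()
        if pattern.search(narrative)
    )
```

Word boundaries (`\b`) keep terms such as "motorbike" from being counted as a bicycle; a production pattern set would need review against real narratives.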

2.2. Crash Diagrams Preprocessing

The crash diagrams database underwent adjustments during preprocessing to enhance the model’s performance and ensure uniformity. An initial resizing was performed using the ‘squish’ method to maintain consistent resolution across all diagrams [23]. Although this compromises the aspect ratio of the original images, it guarantees that the entire collection has consistent dimensions. Subsequently, the pixel values of the images were normalized, a critical stage in the training of deep learning models. Typically, this normalization procedure rescales the pixel values to a standardized range of [0, 1]. By reducing disparities in image dimensions and pixel scale, these preprocessing steps improve CNN convergence during training [24]. The ultimate objective of these steps was to produce more precise and robust outcomes in classifying micro-mobility crash locations.
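As a rough illustration of the two preprocessing steps, the sketch below squishes a grayscale image (a nested list of 0 to 255 values) to a target size and rescales it to [0, 1]. It is a minimal sketch under stated assumptions: nearest-neighbour sampling and the function names are simplifications chosen here, whereas the study used the resizing provided by its deep learning framework.

```python
def squish_resize(image, out_h, out_w):
    """Resize to (out_h, out_w) ignoring aspect ratio ('squish').

    Uses nearest-neighbour sampling for simplicity (an assumption; real
    pipelines usually interpolate).
    """
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def normalize(image):
    """Map 8-bit pixel values from [0, 255] to [0, 1]."""
    return [[px / 255.0 for px in row] for row in image]
```

For example, a 2 x 4 image squished to 2 x 2 keeps every second column, so wide diagrams are compressed horizontally rather than cropped.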

2.3. K-Fold Cross-Validation

This research involves a systematic data collection and preparation process, which employs the k-fold cross-validation and AlexNet CNN architecture to classify micro-mobility crash locations accurately. Figure 1 illustrates the data extraction framework. The extracted information regarding micro-mobility crash locations was also organized into structured data formats and merged with the original metadata. This facilitated the analysis of patterns influencing injury severity outcomes. Micro-mobility crash locations were classified into three categories: roadside, shoulder, and bicycle lane.
The model’s capacity to generalize to unseen data was assessed using k-fold cross-validation, with a value of k = 5. Using this method, the dataset is partitioned into five subsets or “folds” of comparable size [25]. One fold is designated as the validation set during each iteration, while the remaining four folds are used for model training. This process is repeated five times, so that every fold serves as the validation set exactly once. K-fold cross-validation enables a more robust evaluation of the model by reducing the impact of data variability and providing a more reliable assessment of its performance across multiple subsets of the data [25].
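The partitioning can be sketched as follows. This is an illustrative helper, not the study's code: the name `k_fold_indices`, the shuffling, and the fixed seed are assumptions made here.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation
```

With the 1174 diagrams of this study and k = 5, four folds hold 235 diagrams and one holds 234.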

2.4. AlexNet CNN Architecture

This study employed the AlexNet CNN architecture as its data extraction methodology. Loosely inspired by the human brain, this architecture can discern intricate patterns within images, much as a skilled investigator would [20,21]. The model is proficient at differentiating complex components, boundaries, and shapes depicted in images, making it well-suited for identifying various features from crash diagrams.
AlexNet, an innovative CNN architecture, has made significant advancements in deep learning and image recognition [26]. Its exceptional performance was illustrated by its victory in the 2012 ImageNet Large Scale Visual Recognition Challenge [27]. This achievement marked a pivotal moment in neural networks’ ability to identify objects within images accurately. AlexNet comprises eight layers: five convolutional layers and three fully connected layers. Implementing rectified linear units (ReLU) as activation functions was a significant innovation that considerably accelerated convergence and training. Furthermore, the network’s efficacy was improved by incorporating dropout regularization, which prevented overfitting. The foundation for subsequent advancements in deep learning, particularly in tasks involving image categorization, was established by the innovative achievements and profound architecture of AlexNet [27]. The architecture of AlexNet is depicted in Table 1.
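To make the layer structure concrete, the sketch below traces feature-map sizes through the five convolutional layers of the commonly cited AlexNet configuration (227 x 227 input). The kernel, stride, and padding values are those of the widely published architecture and are an assumption here; they may differ from the configuration in Table 1 of this study.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, output channels or None for pooling, kernel, stride, padding)
# Values follow the widely published AlexNet layout (assumed, not from Table 1).
LAYERS = [
    ("conv1", 96, 11, 4, 0),
    ("pool1", None, 3, 2, 0),
    ("conv2", 256, 5, 1, 2),
    ("pool2", None, 3, 2, 0),
    ("conv3", 384, 3, 1, 1),
    ("conv4", 384, 3, 1, 1),
    ("conv5", 256, 3, 1, 1),
    ("pool5", None, 3, 2, 0),
]


def trace_shapes(input_size=227, in_channels=3):
    """Return (name, channels, height, width) after each layer."""
    size, channels, shapes = input_size, in_channels, []
    for name, out_channels, kernel, stride, padding in LAYERS:
        size = conv_out(size, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes
```

The final 256 x 6 x 6 feature map is flattened into 9216 values that feed the three fully connected layers; for a three-category task such as crash location, the last layer would output three scores instead of the 1000 ImageNet classes.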

2.5. Micro-Mobility Crash Variables

The dependent variable in this research is the severity of micro-mobility user injuries, which are classified as “Minor/No Injury” or “Fatal/Serious Injury”. The “Minor/No Injury” category encompasses micro-mobility users who have sustained minimal injuries in a crash involving a micro-mobility device. Conversely, the “Fatal/Serious Injury” category pertains to micro-mobility users who have suffered severe injuries or fatalities as a result of these crashes. The independent variables in this study are characterized as binary variables, with the presence of a condition represented by a value of 1 and the absence represented by a value of 0. However, the gender variables are an exception to this rule, where 2 denotes women and 1 signifies males. Table 2 displays the final micro-mobility crash database, listing the original and extracted variables along with their respective categories.
Some variables in Table 2 are derived from the original databases, while others were developed using the available data. For example, the “Weekend” variable was created by classifying the crash date as a weekend or weekday. In the same vein, the “Intersection” variable differentiates between mid-block crashes and intersection crashes. Furthermore, the age ranges of vehicle drivers or micro-mobility device riders were examined in three age groups: the young, middle, and elderly. It is crucial to recognize that less than 1% of the analyzed crashes are affected by the scenario in which the rider’s or driver’s age is unknown, as all age variables are zero.
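A derived variable such as “Weekend” can be computed directly from the crash date. The helper below is a minimal sketch with a hypothetical name; it returns a boolean that would then be mapped to the binary coding used in Table 2.

```python
from datetime import date


def is_weekend(crash_date):
    """True when the crash date falls on a Saturday or Sunday."""
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    return crash_date.weekday() >= 5
```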
In addition, other variables encompass hazardous actions and violations by micro-mobility users or vehicle drivers, such as disobeying traffic control devices (TCDs), careless driving, failure to yield, and more. On the other hand, the “driver intent” variable includes the driver’s intentions before the crash occurred, such as going straight, turning left, turning right, being stopped on the road, etc. The “distracted by” variables were also developed to identify distractions specific to vehicle drivers or micro-mobility users. Distractions exclusive to micro-mobility users include using navigation systems, smoking, and engaging in concurrent activities like eating or drinking while riding.
The micro-mobility device type was identified using crash narratives written by police officers. Consequently, this study’s micro-mobility device types include bicycles, e-scooters, e-wheelchairs, and skateboards. Additionally, the micro-mobility crash location variables, denoted as ‘MC Location’, were extracted from crash diagrams and classified into roadside, shoulder, and bicycle lane. All these variables are characterized as binary, with the presence of a variable represented by a value of 1 and its absence represented by a value of 0.

2.6. Random Forest (RF)

The referenced variables were employed to train and validate a classification model using the RF approach. The dependent variable in this model was ‘crash severity’. The rationale for using the RF method is supported by its strong classification capabilities, particularly its exceptional performance in predicting the severity of injuries sustained in road crashes [15,16]. An issue arises from the imbalance in the dependent variable ‘crash severity’: the number of crashes leading to fatalities/severe injuries is comparatively lower than the number of crashes resulting in minor/no injuries. This data imbalance may introduce bias into classifiers due to their tendency to prioritize the majority class [28]. To address the imbalance between fatal/severe injury crashes and no/minor injury crashes, a widely recognized oversampling method, the Synthetic Minority Oversampling Technique (SMOTE), was applied.
The SMOTE is a highly influential and widely recognized data sampling and balancing algorithm in machine learning [28,29]. It is an oversampling technique that was devised to address data imbalances. By generating synthetic examples, this method generates new minority instances [29]. These new instances are generated by interpolating between adjacent minority class instances while maintaining the original features.
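The interpolation idea can be sketched in a few lines. This is a simplified illustration of the SMOTE principle, not the implementation used in the study; the function name, the parameter defaults, and the use of squared Euclidean distance are assumptions made here.

```python
import random


def smote_sample(minority, k=2, n_new=1, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    instance and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (squared Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic
```

Each synthetic point lies on the line segment between two existing minority instances. With binary predictors such as those in this study, plain interpolation produces fractional values, so a practical pipeline would need a variant or a rounding step suited to categorical features.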
RF is a highly recognized and effective supervised learning technique in machine learning [30]. It is employed to generate predictions and resolve classification or regression issues [30]. A classification tree framework is typically employed to evaluate the discriminatory capability of each predictor variable in the model. RF’s superior performance compared to simpler methods such as classification and regression trees arises from its capability of constructing and aggregating the predictions of numerous individual Decision Trees [31,32]. The RF model is characterized by its randomness, a key source of its robustness and variability [32]. A bootstrap sample of the original data was employed to train each tree in the collection. Additionally, each division of the tree nodes is determined by evaluating a distinctive set of independently selected variables [32]. The explained variance for the dependent variable is optimized by partitioning the sample data into two offspring nodes. As a result, each partition of the RF employs a random sample of the data and the predictor variables [32].
Predictions are generated by an RF model using the Out-of-Bag (OOB) sample, which is a subset of the original dataset that was omitted from the tree construction process during the bootstrap sample selection [33]. The OOB error is computed to determine the accuracy of the model.
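The relationship between a bootstrap sample and its OOB sample can be illustrated as follows. The helper name and seed are hypothetical; RF libraries handle this step internally.

```python
import random


def bootstrap_and_oob(n_samples, seed=0):
    """Draw a bootstrap sample (with replacement) and return it together
    with the out-of-bag indices that were never drawn."""
    rng = random.Random(seed)
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    oob = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, oob
```

On average roughly 37% of the observations are left out of each tree's bootstrap sample, which is what makes the OOB error a usable internal estimate of prediction error.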
RF offers two distinct importance measures: Mean Decrease Accuracy (MDA) and Mean Decrease Gini (GINI) [33]. These measures serve the purpose of variable selection and ranking. Normalized by the number of trees, GINI is the sum of all decreases in Gini impurity caused by a given variable used to form a split in the RF [33]. MDA measures the significance of a variable by calculating the change in prediction accuracy (OOB error) that occurs when the variable’s values are permuted at random compared to the initial observations [33].
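The Gini-based measure rests on the impurity decrease achieved by a single split, which can be sketched as below. This is a simplified per-split calculation with hypothetical function names; RF implementations additionally accumulate these decreases over all splits on a variable and average over trees.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def gini_decrease(parent, left, right):
    """Impurity decrease when 'parent' is split into 'left' and 'right'."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children
```

A split that perfectly separates a balanced two-class node reduces impurity from 0.5 to 0, whereas a split that leaves both children as mixed as the parent yields a decrease of 0.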

3. Results and Discussion

3.1. AlexNet CNN Model Configuration and Performance Metrics

An essential aim of this research is to evaluate the efficacy of the AlexNet CNN-trained model described in the previous section. The dataset consists of 1174 diverse micro-mobility crash diagrams. Three features associated with the location of the micro-mobility crashes have been identified and classified as roadside, bicycle lane, and shoulder. The K-fold Cross-Validation method was implemented, with k being set at five. The FastAI platform and Python 3.10.0 were employed to develop and assess our approach on a GPU server (Kaggle personal notebook) [34].
This study established a consistent training environment for the AlexNet CNN architecture. Throughout the optimization process, a singular learning rate of 0.01 regulated the magnitude of each step [35,36]. Additionally, the model’s performance was evaluated using accuracy and F-score as the principal metrics, allowing us to determine the percentage of correctly classified features [35,36,37]. Accuracy denotes the proportion of features that are classified correctly. It includes True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN) [33]. Equation (1) denotes accuracy. The F-score is a metric that assesses the precision of positive predictions and the capacity to identify all positive cases. It combines recall (the capability to identify all positive cases) and precision (the accuracy of positive predictions) into a single value. By effectively balancing these two metrics, the F-score provides an integrated performance metric [33]. A higher value on a scale of 0 to 1 signifies superior performance, making it a valuable tool for assessing the efficacy of classification models [37]. Equation (2) denotes the F-score.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
$$F\text{-score} = \frac{2 \times \text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}} \tag{2}$$
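Equations (1) and (2) translate directly into code. The sketch below works from confusion-matrix counts for a single positive class; the function names and the zero-division guards are choices made here, and for the three location categories the per-class scores would still need to be averaged in some way that the text does not specify.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (1): share of all cases classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f_score(tp, fp, fn):
    """Equation (2): harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)
```

For instance, with TP = 8, TN = 9, FP = 2, and FN = 1, accuracy is 17/20 = 0.85 and the F-score is 16/19, about 0.84.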
The assessment used a 5-fold cross-validation methodology to improve the reliability and consistency of the results. The dataset was partitioned into five discrete subsets. In each iteration, one subset served as the validation set while the model was trained on the remaining four, so that each fold was used for validation exactly once over the five iterations. The model’s effectiveness across different subsets was then summarized by averaging the outcomes of these five iterations. The systematic application of this methodology reinforced the reliability of our model in detecting micro-mobility crash locations and allowed us to draw strong inferences about the efficiency of the AlexNet CNN architecture, particularly for transportation safety research. Table 3 presents the training and validation accuracy and F-score results of the AlexNet CNN model.

3.2. RF Model Configuration and Validation

The RF approach was employed to develop a model for analyzing individual micro-mobility crashes. The model investigates the relationship between crash severity outcomes (the dependent variable) and various features related to crashes, micro-mobility riders and devices, drivers, and infrastructure (the independent variables). A hold-out split was used to develop and subsequently validate the RF model: model training utilized 75% of the final database, followed by validation with the remaining 25%. Previous studies comparing tree-based machine learning models have reported high performance scores with a training-to-test ratio of 75:25 [38,39,40], so this ratio is commonly recommended.
The variables mentioned earlier were used in the development of the RF model. An analysis was conducted to assess the model’s effectiveness by observing the evolution of OOB errors in relation to the number of trees generated (ntree). The OOB errors and their variations are visually depicted in Figure 2. The red classification curve illustrates the OOB error that occurred when crashes involving minor or no injuries were misclassified, while the green classification curve depicts the OOB error for crashes involving fatal or severe injuries. The black-colored curve represents the average progression of both OOB errors for the RF model.
As depicted in Figure 2, the OOB errors stabilize at approximately 250 trees. For this investigation, the selected hyperparameter was ntree = 300, which lies within the stable region. With an aggregate OOB error rate of 17.1%, the model demonstrated an estimated accuracy of 83% when applied to the training set. The red curve represents a classification error of 21.2% for micro-mobility crashes involving minor or no injuries, while the green curve represents a classification error of 12.3% for micro-mobility crashes involving fatal or severe injuries. While both errors are relatively low, the RF model demonstrates superior performance in classifying micro-mobility crashes that result in fatal or severe injuries, which is the more critical class for ensuring road safety.
The validation procedure predicted the model’s outcome for micro-mobility crash-related injuries using the validation set. To evaluate the model’s performance, a confusion matrix and Receiver Operating Characteristic (ROC) curve were generated based on these predictions. Figure 3 depicts the confusion matrix. The RF model’s classification errors are indicated by values outside the diagonal, while the correctly classified data are represented along the diagonal. The model’s validation data indicates a significant level of predictive accuracy, 86.4%. In addition, the confusion matrix in Figure 3 suggests that the classification errors are marginally lower for micro-mobility crashes that result in fatal or severe injuries than those that result in minor or no injuries.
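The ROC curve was used to evaluate the model's performance further. It plots the RF model's sensitivity (true positive rate) against its false positive rate, which equals 1 − specificity, across classification thresholds, as illustrated in Figure 4. Equations (3) and (4) define sensitivity and specificity, where TP, FN, FP, and TN denote true positives, false negatives, false positives, and true negatives, respectively. The false positive rate on the horizontal axis is therefore FP/(FP + TN). The AUC is a metric used to summarize the model's classification capability [41].
$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{3}$$
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{4}$$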
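The relationship between these quantities and the ROC curve can be sketched as follows. The example computes sensitivity and specificity at each threshold over a set of predicted probabilities and converts them to (false positive rate, true positive rate) points. The labels (1 = fatal or severe, 0 = minor or none) and scores are hypothetical, as are the function names.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(y_true, scores):
    """One (FPR, TPR) point per distinct threshold, strictest first."""
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= thr)
        fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < thr)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= thr)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < thr)
        sens, spec = sensitivity_specificity(tp, fn, fp, tn)
        points.append((1.0 - spec, sens))  # x = false positive rate, y = sensitivity
    return points


# Hypothetical labels and predicted probabilities of a severe outcome.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```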
A classifier with an AUC of 1 classifies perfectly, whereas an AUC of 0.5 corresponds to random guessing; the classifier's efficacy declines as the AUC decreases. The RF model in this study achieved an AUC of 0.84, which indicates good discrimination between the two injury severity classes. This result supports the RF model's ability to classify crashes by injury severity.
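One way to read the AUC is as the probability that a randomly chosen severe crash receives a higher predicted score than a randomly chosen minor crash. The sketch below computes the AUC directly from that pairwise definition; it is an illustration on hypothetical scores, not the routine used to produce Figure 4.

```python
def auc_pairwise(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = fatal or severe) and predicted probabilities.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

auc = auc_pairwise(y_true, scores)
print(f"AUC = {auc:.4f}")
```

In this toy data, only one of the 16 positive-negative pairs is ranked in the wrong order, so the AUC is 15/16. A model that assigns the same score to every crash would score 0.5 under this definition.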
Furthermore, an analysis was performed to rank the importance of the variables in the RF model. Both Gini and Mean Decrease Accuracy (MDA) rankings were computed to improve the reliability of the findings, given the randomized nature of the model. The analysis determined that the Gini and MDA rankings offer consistent information and that the Gini index yields more stable results than the MDA criterion, which aligns with prior research [42]. Consequently, the Gini index was employed to rank the importance of the variables in this study, as illustrated in Figure 5.
The Gini index indicates that roads with high speed limits have the greatest effect on crash classification by injury severity outcomes. Additionally, primary factors related to micro-mobility users include sex, middle-aged riders, and whether the micro-mobility rider was considered to be in violation or distracted. Furthermore, significant variables related to drivers include elderly age, driver's sex, intent to turn left, and failure to yield, particularly at stop-controlled intersections. Moreover, intersection, weekend, and nighttime crashes are considered highly influential variables.
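For intuition about the Gini criterion, the sketch below computes the Gini impurity of a node and the impurity decrease produced by a single split. Gini-based importance aggregates such decreases over the splits that use a given variable across the trees of the forest. The class counts and the split on a speed-limit indicator are hypothetical and are not taken from the study's data.

```python
def gini_impurity(counts):
    """Gini impurity 1 - sum(p_k^2) for a node with the given class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = sum(parent)
    weighted_children = (
        sum(left) / n * gini_impurity(left)
        + sum(right) / n * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Hypothetical node: (minor_or_none, fatal_or_severe) crash counts.
parent = (60, 40)
low_speed = (50, 10)   # crashes on roads at or below the speed threshold
high_speed = (10, 30)  # crashes on roads above the speed threshold

decrease = gini_decrease(parent, low_speed, high_speed)
print(f"parent impurity = {gini_impurity(parent):.3f}, decrease = {decrease:.4f}")
```

A variable whose splits repeatedly separate severe from minor crashes in this way accumulates a large total decrease and therefore ranks high.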
Resource managers and governing bodies must comprehend the cumulative impact of a variety of factors on the severity of micro-mobility injuries. As valuable as it is to identify the most significant variables, the combined impact of these factors offers critical insights [16]. The decision rules produced by the RF model facilitate this understanding by enabling a comprehensive analysis of the concurrent impacts of multiple factors [16]. This information can help governing bodies develop and implement targeted strategies and countermeasures to reduce injuries in micro-mobility-related crashes.

3.3. Decision Rules from the RF Model

An analysis of decision rules can be conducted to understand the impact of various factors and their combinations on the severity of injuries in micro-mobility-related crash incidents. These rules unveil the overall impact of various components, a key element for understanding the implications of these crashes. The insights provided by the model’s predictions and decision rules are not only beneficial for traffic authorities but also play a significant role in guiding governmental decision-making on road safety policies and measures [16].
The RF model produced over 900 decision rules. However, only some of these rules are distinct, and their misclassification rates and frequencies of occurrence vary. From the dataset, a total of 540 distinct decision rules were extracted. This study focuses on the 25 rules that exhibited the highest frequency and lowest error rates. These prominent decision rules are displayed in Table 4. The ‘Decision Rule’ column lists the combinations of variables, while the ‘Prediction’ column shows the RF model’s predictions based on these combinations. Furthermore, the table displays the estimated error (%) and frequency (%) of each decision rule produced by the RF model.
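The frequency and error of a decision rule can be illustrated with a short sketch. It assumes that frequency is the share of crash records satisfying all conditions of the rule and that error is the share of those matching records whose observed severity differs from the rule's prediction; the exact definitions used to build Table 4 may differ. The records, variable names, and the rule itself are hypothetical.

```python
def rule_stats(records, conditions, prediction, target="severity"):
    """Return (frequency, error) of a rule over a list of record dicts.

    frequency: share of records matching every condition of the rule.
    error: share of matching records whose target differs from `prediction`
           (None when no record matches).
    """
    matched = [
        r for r in records
        if all(r.get(key) == value for key, value in conditions.items())
    ]
    if not matched:
        return 0.0, None
    frequency = len(matched) / len(records)
    error = sum(1 for r in matched if r[target] != prediction) / len(matched)
    return frequency, error


def rec(device, location, driver_distracted, severity):
    """Build one hypothetical crash record."""
    return {
        "device": device,
        "location": location,
        "driver_distracted": driver_distracted,
        "severity": severity,
    }


records = [
    rec("bicycle", "bicycle_lane", False, "minor_or_none"),
    rec("bicycle", "bicycle_lane", False, "minor_or_none"),
    rec("bicycle", "bicycle_lane", False, "fatal_or_severe"),
    rec("bicycle", "road", True, "fatal_or_severe"),
    rec("e_scooter", "road", True, "fatal_or_severe"),
    rec("e_scooter", "road", False, "minor_or_none"),
    rec("wheelchair", "shoulder", True, "fatal_or_severe"),
    rec("bicycle", "bicycle_lane", False, "minor_or_none"),
]

# Hypothetical rule: bicycle in the bicycle lane, driver not distracted.
rule = {"device": "bicycle", "location": "bicycle_lane", "driver_distracted": False}
frequency, error = rule_stats(records, rule, prediction="minor_or_none")
print(f"frequency = {frequency:.0%}, error = {error:.0%}")
```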
The type of micro-mobility device and crash location emerged as significant factors. According to Table 4, decision rules (1, 2, 6, 11, 24 and 25) offer substantial evidence regarding the micro-mobility device type ‘Bicycle’ and its location ‘in the bicycle lane’, shedding light on their influence on injury severity. The likelihood of minor or no injuries in crashes is higher for bicyclists cycling in designated bicycle lanes, free from distractions, and in environments where drivers comply with traffic control devices. Conversely, those who deviate from bicycle lanes face a higher risk of sustaining fatal or severe injuries, particularly if they are elderly, riding ‘on the shoulder’ or ‘on the road’, traversing high-speed roads, commuting during nighttime, exhibiting distracted behavior, failing to yield at uncontrolled intersections, or encountering trucks, buses, or careless drivers. Similar results have been reported in other research regarding single-vehicle crashes involving bicycles [43,44,45].
In contrast, decision rules (8, 13, 15, 16 and 17) reveal a significant correlation with injury severity outcomes for e-scooter users when they are on the road. E-scooter riders navigating through mixed-traffic environments face a heightened risk of severe or fatal injuries, especially within work zones and when encountering distracted drivers, traffic violators, frequent lane changers, and young drivers. Conversely, riders on roads with attentive, law-abiding drivers are more likely to sustain minor or no injuries. These results are consistent with another study that identified e-scooter activity as a risk factor in single-vehicle crashes involving e-scooters; however, the specific riding locations were not specified [8].
Furthermore, ‘wheelchair’ users face a higher risk of sustaining severe or fatal injuries when encountering distracted drivers, whereas encounters with non-distracted drivers are associated with minor to no injuries, as indicated by decision rules (18 and 19). Additionally, limited evidence from decision rule (22) suggests that ‘skateboard’ users who flout traffic regulations and traverse low-speed roads face an elevated risk of sustaining fatal or severe injuries, particularly when confronted with left-turning events involving middle-aged drivers. While the results may be less conclusive due to the limited sample size, it is evident that further research focusing on e-wheelchair and skateboard users is warranted.
Specific conditions, as identified by decision rules (3, 14, 24), indicate that micro-mobility users who ride during the nighttime are at a heightened risk of sustaining severe injuries, particularly if they are young, riding on the road shoulder, traversing high-speed roads, navigating through rainy weather, and encountering left-turning drivers at signal-controlled intersections. These findings align with an additional study that specifically investigated micro-mobility users at night [46]. Furthermore, decision rules (12 and 23) underscore a critical correlation between severe injury outcomes and elderly micro-mobility users. Those who fail to yield, encounter traffic violators and lane-changing drivers, and negotiate yield-controlled intersections face an increased risk of severe injuries.
Crashes involving micro-mobility devices at intersections pose a higher risk of fatalities and severe injuries compared to those at midblock locations. Decision rules (5, 14 and 20) indicate that severity is influenced by additional factors, such as elderly drivers, left-turning drivers, and micro-mobility users who ride in mixed traffic, violate traffic regulations, or improperly use lanes. This aligns with other research findings, which suggest that the likelihood of crashes increases at intersections, particularly when motor vehicles are turning across bike lanes [1]. Although decision rules include heavy vehicles such as trucks or buses (7, 11, 21), their effects are ambiguous when linked to other significant factors, such as rider age and sex. Further research is needed to study the effects of these variables’ interactions with various factors.
Several decision rules (4, 6, 7, 10, 16, 17, 18, 19 and 21) shed light on the significant influence of distracted actions on injury severity outcomes in micro-mobility crashes. Distracted actions, whether exhibited by micro-mobility users or vehicle drivers, significantly elevate the risk of severe injuries in these incidents. For micro-mobility users, distractions manifest in various forms, including but not limited to mobile phone usage, adjusting music players, eating or drinking, and engaging in conversations with fellow riders. Such distractions divert their attention away from the road environment and compromise their ability to react swiftly to potential hazards, increasing their vulnerability to crashes and subsequent injuries. Similarly, distracted actions among drivers, such as texting while driving, using electronic devices, eating, or engaging in complex conversations, significantly impair their attentiveness to the road and surrounding traffic, thereby elevating the likelihood of collisions with micro-mobility users. In such scenarios, micro-mobility users become more susceptible to severe injuries due to the greater force of impact resulting from crashes with larger vehicles. These conclusions are consistent with several studies on micro-mobility safety [2,47], further emphasizing the critical importance of addressing distracted driving and riding behaviors for enhancing road safety.
It is essential to recognize that the impact of particular factors may be contingent upon their interaction with other factors. The micro-mobility crash severity outcomes cannot be determined solely by examining the effects of individual variables; instead, the cumulative effect of all contributing factors is more substantial. By considering the combined impacts of numerous factors, decision rules make significant contributions, enhancing the value of this research.
The findings of this study hold significant real-world implications for improving micro-mobility safety. By identifying critical factors influencing injury severity and leveraging unstructured crash data, the methodology developed in this research can aid policymakers and urban planners in designing targeted safety interventions. For instance, the insights into spatial and contextual crash attributes can guide the placement and design of dedicated micro-mobility lanes, reduce conflicts with other road users, and enhance infrastructure for vulnerable road users. Furthermore, the automated feature extraction approach can streamline crash data analysis, enabling transportation agencies to analyze large volumes of crash reports more efficiently and develop data-driven strategies to mitigate crash risks. These contributions address current challenges in micro-mobility safety and provide a scalable framework for future applications in diverse urban environments.

4. Conclusions

Micro-mobility devices, including but not limited to scooters, wheelchairs, skateboards, and bicycles, substantially impact accessibility and urban mobility. Analyzing unstructured data for occurrences of these devices can yield significant insights regarding their usage patterns, trends, and contexts. This study presents an image processing methodology for the automated extraction of features from crash diagrams related to micro-mobility devices.
The AlexNet CNN architecture proved to be a resilient method in this research. Drawing inspiration from the human brain, this sophisticated computer program functions as a highly skilled investigator adept at discerning intricate patterns within photographs. Demonstrating exceptional proficiency in differentiating intricate components, boundaries, and shapes present in images, it is exceptionally well-suited for identifying various attributes from crash diagrams. Our results confirmed that the micro-mobility crash location could be detected accurately in crash diagrams. In our model, a validation accuracy of 82% and an F-score of 86% were achieved. Additionally, this study addresses the integration of extracted micro-mobility crash data into structured data formats, such as initial metadata. This stage involves consolidating the extracted features into a tabular structure, incorporating supplementary attributes or contextual data, and integrating them with the pre-existing metadata linked to textual data sources. Overall, our approach to image processing offers a feasible and scalable solution for extracting micro-mobility attributes from crash diagrams. This aids in examining, representing, and determining matters related to assistive technology development, urban planning, and transportation administration. It is expected to contribute to understanding micro-mobility utilization patterns and facilitate data-driven interventions to improve accessibility and mobility in urban settings.
This study embarks on a comprehensive investigation into the determinants of injury severity in micro-mobility-related crashes within urban areas of Michigan. Bicycles, e-wheelchairs, skateboards, and e-scooters, analyzed in this study, emerge as pivotal components of urban transportation, necessitating a nuanced understanding of their interactions with vehicular traffic and infrastructure. By analyzing a dataset comprising 1174 diverse micro-mobility crash diagrams sourced from the 2023 Michigan UD-10 crash reports and utilizing the RF classification algorithm, the research elucidates primary factors and their interactions contributing to the severity of micro-mobility-related injuries. The RF model assessed the combined influence of the considered factors on crash severity outcomes and offered insights into their individual effects through decision rules. These rules significantly contribute to this research by assessing the severity of the injuries and incorporating the interactions of multiple factors, which is common in crashes.
In conclusion, this study identifies roads with speed limits exceeding 40 mph as a predominant factor influencing the severity of micro-mobility-related injuries. Additionally, micro-mobility rider violations and vehicular left-turning maneuvers significantly contribute to crash severity outcomes. The study highlights the importance of micro-mobility device type and crash location in determining injury severity. Bicyclists in designated lanes with minimal distractions and adherence to traffic rules are less likely to sustain severe injuries. Conversely, straying from designated lanes increases the risk of severe injuries, particularly under specific conditions such as nighttime riding or encounters with careless drivers. In contrast, e-scooter riders face a heightened risk of severe or fatal injuries when navigating mixed-traffic environments, particularly in work zones or when encountering distracted or careless drivers. In addition, wheelchair users are more vulnerable to severe injuries when encountering distracted drivers, and skateboard users face higher risks when violating traffic regulations. Further research on wheelchair and skateboard user groups is warranted to validate these findings and inform safety interventions.
This study also highlights the significant impact of distracted actions on injury severity in micro-mobility crashes. Crashes in which either the micro-mobility user or the driver is distracted carry a higher risk of severe injury, so reducing distraction among all road users is crucial for lowering injury severity. Strategies to raise awareness about the dangers of distracted driving and riding, together with stringent enforcement of laws prohibiting such behaviors, are essential to enhance road safety for all.
Furthermore, this research underscores the multifaceted nature of variables impacting micro-mobility safety, such as improper lane use, violations, and hazardous actions by users. These factors are notably prevalent among younger micro-mobility users and are associated with distracted or elderly motorists, particularly at night. In light of these findings, the study advocates for a holistic approach to enhancing road safety in micro-mobility.
In light of the results, numerous suggestions can be made, such as modifying infrastructure to separate micro-mobility devices from vehicular traffic and widening road shoulders. This could involve creating dedicated lanes and pathways for micro-mobility devices and ensuring clear signage and road markings to separate them from traditional vehicular traffic. These changes would help reduce the likelihood of collisions and enhance the overall safety of micro-mobility users. Also, comprehensive education and training initiatives targeting micro-mobility users are essential for equipping them with the knowledge and skills necessary for safe riding practices. Such initiatives could include community workshops, school programs, and online resources that emphasize the importance of following traffic rules, proper use of protective gear, and awareness of potential hazards. Additionally, intensified enforcement actions are crucial in addressing risky behaviors among both micro-mobility users and other drivers. Implementing stricter penalties for violations, increasing the presence of law enforcement officers in high-risk areas, and utilizing technology such as traffic cameras to monitor compliance can help deter unsafe practices and promote a culture of safety.
Overall, this research highlights the need for a multifaceted approach to improving micro-mobility safety. By addressing infrastructure, education, and enforcement simultaneously, cities can create a safer and more inclusive environment for all road users. These efforts are expected to contribute to a better understanding of micro-mobility utilization patterns and facilitate data-driven interventions to improve accessibility and mobility in urban settings.
This study has several limitations that warrant consideration. First, the analysis focused exclusively on crash data from Michigan, which may limit the generalizability of the findings to other regions with different infrastructure, policies, and user behavior. Second, while the RF model provided robust insights into crash severity determinants, the study did not include comparative analyses with other machine learning techniques, which could provide additional perspectives on model performance. Additionally, this research relied on crash reports, which may contain biases or inconsistencies in how data are recorded, potentially influencing the analysis.
Furthermore, several methodological assumptions underpin this study. It is assumed that crash diagrams and narratives in police reports are accurate and comprehensive representations of the crash scene, though these sources may contain omissions or inaccuracies. AlexNet CNN assumes that the visual features extracted from crash diagrams adequately capture the spatial and contextual factors contributing to injury severity. However, nuanced details not represented in the diagrams may be overlooked. Similarly, the RF model assumes robustness in variable importance rankings, yet its performance may depend on the quality and representativeness of the training data. These limitations highlight the need for future research to validate the findings across diverse contexts, explore additional machine learning techniques for comparative analysis, and further refine the use of unstructured crash data to address potential biases.
Future research could expand upon this work by incorporating data from multiple regions or countries to enhance generalizability and exploring additional machine learning models, such as Support Vector Machines, Decision Trees, Gradient Boosting Machines, or advanced neural network architectures, to compare their performance with the RF model. Such analyses could provide deeper insights into the strengths and limitations of different methods in modeling crash severity outcomes, refining predictive capabilities, and identifying the most suitable techniques for specific micro-mobility crash scenarios. Moreover, integrating dynamic data, such as real-time traffic conditions or user behavior from GPS devices, could offer a more comprehensive understanding of the factors influencing micro-mobility safety. Finally, future studies could investigate the impact of emerging micro-mobility technologies and infrastructure design on crash risks to develop more targeted safety interventions.

Author Contributions

Conceptualization, B.Q.; methodology, B.Q.; software, B.Q.; validation, B.Q.; formal analysis, B.Q.; investigation, B.Q.; resources, B.Q.; data curation, B.Q.; writing—original draft preparation, B.Q.; writing—review and editing, B.Q., J.-S.O. and V.K.; visualization, B.Q.; supervision, J.-S.O. and V.K.; project administration, J.-S.O. and V.K.; funding acquisition, B.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available from Michigan Traffic Crash Facts (MTCF) at https://www.michigantrafficcrashfacts.org/data/querytool/#q1;0;2023;; (accessed on 1 June 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Pérez-Zuriaga, A.M.; Dols, J.; Nespereira, M.; Garcia, A.; Sajurjo-de-No, A. Analysis of the consequences of car to micromobility user side impact crashes. J. Saf. Res. 2023, 87, 168–175.
  2. Zhang, C.; Du, B.; Zheng, Z.; Shen, J. Space sharing between pedestrians and micro-mobility vehicles: A systematic review. Transp. Res. Part D Transp. Environ. 2023, 116, 103629.
  3. Ferenchak, N.N.; Marshall, W.E. Traffic safety for all road users: A paired comparison study of small & mid-sized US cities with high/low bicycling rates. J. Cycl. Micromobil. Res. 2024, 2, 100010.
  4. Jaber, A.; Csonka, B. Towards a sustainable and safe future: Mapping bike accidents in urbanized context. Safety 2023, 9, 60.
  5. Longo, P.; Berloco, N.; Coropulis, S.; Intini, P.; Ranieri, V. Analysis of E-Scooter Crashes in the City of Bari. Infrastructures 2024, 9, 63.
  6. Eriksson, J.; Niska, A.; Forsman, Å. Injured cyclists with focus on single-bicycle crashes and differences in injury severity in Sweden. Accid. Anal. Prev. 2022, 165, 106510.
  7. Anke, J.; Ringhand, M.; Petzoldt, T.; Gehlert, T. Micro-mobility and road safety: Why do e-scooter riders use the sidewalk? Evidence from a German field study. Eur. Transp. Res. Rev. 2023, 15, 29.
  8. Gao, D.; Zhang, X. Injury severity analysis of single-vehicle and two-vehicle crashes with electric scooters: A random parameters approach with heterogeneity in means and variances. Accid. Anal. Prev. 2024, 195, 107408.
  9. Bennett, C.; Ackerman, E.; Fan, B.; Bigham, J.; Carrington, P.; Fox, S. Accessibility and the crowded sidewalk: Micromobility’s impact on public space. In Proceedings of the 2021 ACM Designing Interactive Systems Conference, Virtual, 28 June–2 July 2021; pp. 365–380.
  10. Goralzik, A.; König, A.; Alčiauskaitė, L.; Hatzakis, T. Shared mobility services: An accessibility assessment from the perspective of people with disabilities. Eur. Transp. Res. Rev. 2022, 14, 34.
  11. Yang, H.; Ma, Q.; Wang, Z.; Cai, Q.; Xie, K.; Yang, D. Safety of micro-mobility: Analysis of E-Scooter crashes by mining news reports. Accid. Anal. Prev. 2020, 143, 105608.
  12. Zhang, X.; Green, E.; Chen, M.; Souleyrette, R.R. Identifying secondary crashes using text mining techniques. J. Transp. Saf. Secur. 2020, 12, 1338–1358.
  13. Kwayu, K.M.; Kwigizile, V.; Lee, K.; Oh, J.-S. Discovering latent themes in traffic fatal crash narratives using text mining analytics and network topology. Accid. Anal. Prev. 2021, 150, 105899.
  14. Qawasmeh, B.; Oh, J.S.; Kwigizile, V.; Qawasmeh, D.; Al Tawil, A.; Aldalqamouni, A. Analyzing Daytime/Nighttime Pedestrian Crash Patterns in Michigan Using Unsupervised Machine Learning Techniques and their Potential as a Decision-Making Tool. Open Transp. J. 2024, 18.
  15. Azhar, A.; Ariff, N.M.; Bakar, M.A.A.; Roslan, A. Classification of driver injury severity for accidents involving heavy vehicles with decision tree and random forest. Sustainability 2022, 14, 4101.
  16. Ijaz, M.; Lan, L.; Zahid, M.; Jamal, A. A comparative study of machine learning classifiers for injury severity prediction of crashes involving three-wheeled motorized rickshaw. Accid. Anal. Prev. 2021, 154, 106094.
  17. Hou, L.; Chen, H.; Zhang, G.; Wang, X. Deep learning-based applications for safety management in the AEC industry: A review. Appl. Sci. 2021, 11, 821.
  18. Nixon, M.; Aguado, A. Feature Extraction and Image Processing for Computer Vision; Academic Press: Cambridge, MA, USA, 2019.
  19. O’Mahony, N.; Campbell, S.; Carvalho, A.; Harapanahalli, S.; Hernandez, G.V.; Krpalkova, L.; Riordan, D.; Walsh, J. Deep learning vs. traditional computer vision. In Advances in Computer Vision: Proceedings of the 2019 Computer Vision Conference (CVC), Las Vegas, NV, USA, 25–26 April 2019, Volume 11; Springer: Berlin/Heidelberg, Germany, 2020; pp. 128–144.
  20. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 1–74.
  21. Yuan, Z.-W.; Zhang, J. Feature extraction and image retrieval based on AlexNet. In Proceedings of the Eighth International Conference on Digital Image Processing (ICDIP 2016), Chengdu, China, 20–22 May 2016; SPIE: Bellingham, WA, USA, 2016; pp. 65–69.
  22. MTCF. Michigan Traffic Crash Facts (MTCF). Available online: https://www.michigantrafficcrashfacts.org/ (accessed on 1 June 2024).
  23. Calhoun, B.C.; Uselman, H.; Olle, E.W. Development of Artificial Intelligence Image Classification Models for Determination of Umbilical Cord Vascular Anomalies. J. Ultrasound Med. 2024, 43, 881–897.
  24. Qawasmeh, B.S. Safety Assessment for Vulnerable Road Users Using Automated Data Extraction with Machine-Learning Techniques. Ph.D. Thesis, Western Michigan University, Kalamazoo, MI, USA, 2024.
  25. Samir, S.; Emary, E.; El-Sayed, K.; Onsi, H. Optimization of a pre-trained AlexNet model for detecting and localizing image forgeries. Information 2020, 11, 275.
  26. Fang, A.; Kornblith, S.; Schmidt, L. Does progress on ImageNet transfer to real-world datasets? Adv. Neural Inf. Process. Syst. 2024, 36.
  27. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
  28. Morris, C.; Yang, J.J. Effectiveness of resampling methods in coping with imbalanced crash data: Crash type analysis and predictive modeling. Accid. Anal. Prev. 2021, 159, 106240.
  29. Skryjomski, P.; Krawczyk, B. Influence of minority class instance types on SMOTE imbalanced data oversampling. In First International Workshop on Learning with Imbalanced Domains: Theory and Applications; PMLR: Skopje, Macedonia, 2017; pp. 7–21.
  30. Boateng, E.Y.; Otoo, J.; Abaye, D.A. Basic tenets of classification algorithms K-nearest-neighbor, support vector machine, random forest and neural network: A review. J. Data Anal. Inf. Process. 2020, 8, 341–357.
  31. Demir, S.; Sahin, E.K. Comparison of tree-based machine learning algorithms for predicting liquefaction potential using canonical correlation forest, rotation forest, and random forest based on CPT data. Soil Dyn. Earthq. Eng. 2022, 154, 107130.
  32. Walker, A.M.; Cliff, A.; Romero, J.; Shah, M.B.; Jones, P.; Gazolla, J.G.F.M.; Jacobson, D.A.; Kainer, D. Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data. Comput. Struct. Biotechnol. J. 2022, 20, 3372–3386.
  33. Schonlau, M.; Zou, R.Y. The random forest algorithm for statistical learning. Stata J. 2020, 20, 3–29.
  34. Kaggle. Available online: https://www.kaggle.com (accessed on 8 June 2024).
  35. Wang, S.-H.; Xie, S.; Chen, X.; Guttery, D.S.; Tang, C.; Sun, J.; Zhang, Y.-D. Alcoholism identification based on an AlexNet transfer learning model. Front. Psychiatry 2019, 10, 454348.
  36. Kalaiarasi, P.; Rani, P.E. A comparative analysis of AlexNet and GoogLeNet with a simple DCNN for face recognition. In Advances in Smart System Technologies: Select Proceedings of ICFSST 2019; Springer: Berlin/Heidelberg, Germany, 2021; pp. 655–668.
  37. Singh, I.; Goyal, G.; Chandel, A. AlexNet architecture based convolutional neural network for toxic comments classification. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 7547–7558.
  38. Al Tawil, A.; Almazaydeh, L.; Qawasmeh, D.; Qawasmeh, B.; Alshinwan, M.; Elleithy, K. Comparative Analysis of Machine Learning Algorithms for Email Phishing Detection Using TF-IDF, Word2Vec, and BERT. Comput. Mater. Contin. 2024, 81, 3395.
  39. Hardalaç, F.; Akmal, H.; Ayturan, K.; Acharya, U.R.; Tan, R.-S. Fetal Status Classification Based on Feature Elimination and Hyperparameter Optimization Using Cardiotocographic Data. SSRN 2020.
  40. Mali, N.; Restrepo, F.; Abrahams, A.; Ractham, P. Implementation of mars metrics and Mars charts for evaluating classifier exclusivity: The comparative uniqueness of binary classifier predictions. Softw. Impacts 2022, 12, 100259.
  41. Muppalaneni, N.B.; Ma, M.; Gurumoorthy, S.; Kannan, R.; Vasanthi, V. Machine learning algorithms with ROC curve for predicting and diagnosing the heart disease. In Soft Computing and Medical Bioinformatics; Springer: Berlin/Heidelberg, Germany, 2019; pp. 63–72.
  42. Nembrini, S.; König, I.R.; Wright, M.N. The revival of the Gini importance? Bioinformatics 2018, 34, 3711–3718.
  43. Houten, R.V.; Kwigizile, V.; Oh, J.S.; Mwende, S.; Qawasmeh, B. Effective Pedestrian/Non-Motorized Crossing Enhancements Along Higher Speed Corridors; No. SPR-1734; Michigan. Dept. of Transportation, Research Administration: Lansing, MI, USA, 2023.
  44. Prati, G.; Pietrantoni, L.; Fraboni, F. Using data mining techniques to predict the severity of bicycle crashes. Accid. Anal. Prev. 2017, 101, 44–54.
  45. Wang, T.; Yu, J.; Chen, Y.; Ma, C.; Ye, X.; Chen, J. Factors associated with the severity of motor vehicle crashes involving electric motorcycles and electric bicycles: A random parameters logit approach with heterogeneity in means. Transp. Res. Rec. 2023, 2677, 691–704.
  46. Yitzhak Acosta-Carrascal, H. Gender-Based Motivations for Usage and Avoidance of Shared Micro-Mobility During Night-Time in Stockholm, Sweden. Master’s Thesis, KTH Royal Institute of Technology, Stockholm, Sweden, 2023. Available online: https://diva-portal.org/smash/record.jsf?pid=diva2%3A1825417&dswid=-5395 (accessed on 8 June 2024).
  47. Karpinski, E.; Bayles, E.; Sanders, T. Safety analysis for micromobility: Recommendations on risk metrics and data collection. Transp. Res. Rec. 2022, 2676, 420–435.
Figure 1. Data Extraction Framework.
Figure 2. The OOB Classification Errors for the RF Model (Red: crashes with minor or no injuries, Green: crashes with fatal or severe injuries, and Black: average progression of both Red and Green).
Figure 3. The Confusion Matrix of the RF Model.
Figure 4. The ROC Curve.
Figure 5. Variable Importance Ranking from the RF Model by Gini Index.
Table 1. AlexNet CNN Architecture [27].
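| Layer Type | Output Shape | Number of Filters | Kernel Size | Stride |
|---|---|---|---|---|
| Input | 227 × 227 × 3 | - | - | - |
| Convolutional 1 | 55 × 55 × 96 | 96 | 11 × 11 | 4 |
| Max Pooling 1 | 27 × 27 × 96 | - | 3 × 3 | 2 |
| Convolutional 2 | 27 × 27 × 256 | 256 | 5 × 5 | 1 |
| Max Pooling 2 | 13 × 13 × 256 | - | 3 × 3 | 2 |
| Convolutional 3 | 13 × 13 × 384 | 384 | 3 × 3 | 1 |
| Convolutional 4 | 13 × 13 × 384 | 384 | 3 × 3 | 1 |
| Convolutional 5 | 13 × 13 × 256 | 256 | 3 × 3 | 1 |
| Max Pooling 3 | 6 × 6 × 256 | - | 3 × 3 | 2 |
| Fully Connected 1 | 4096 | - | - | - |
| Fully Connected 2 | 4096 | - | - | - |
| Fully Connected 3 | 1000 | - | - | - |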
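
The output shapes in Table 1 can be checked with the usual convolution size relation, out = floor((in + 2·padding − kernel) / stride) + 1. The sketch below is only an illustration: Table 1 does not list padding, so the padding values here (0, 2, 1, 1, 1 for the five convolutional layers and 0 for pooling) are assumptions taken from the commonly described AlexNet configuration, and the names `out_size`, `LAYERS`, and `trace_shapes` are hypothetical.

```python
def out_size(in_size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (in_size + 2 * padding - kernel) // stride + 1


# (layer name, kernel, stride, padding, filters or None for pooling).
# Kernel, stride, and filter counts come from Table 1; padding is ASSUMED.
LAYERS = [
    ("Convolutional 1", 11, 4, 0, 96),
    ("Max Pooling 1", 3, 2, 0, None),
    ("Convolutional 2", 5, 1, 2, 256),
    ("Max Pooling 2", 3, 2, 0, None),
    ("Convolutional 3", 3, 1, 1, 384),
    ("Convolutional 4", 3, 1, 1, 384),
    ("Convolutional 5", 3, 1, 1, 256),
    ("Max Pooling 3", 3, 2, 0, None),
]


def trace_shapes(size: int = 227, channels: int = 3):
    """Propagate a square input through LAYERS and collect output shapes."""
    shapes = []
    for name, kernel, stride, padding, filters in LAYERS:
        size = out_size(size, kernel, stride, padding)
        if filters is not None:  # pooling keeps the channel count unchanged
            channels = filters
        shapes.append((name, (size, size, channels)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes():
        print(f"{name}: {shape[0]} x {shape[1]} x {shape[2]}")
```

Under these padding assumptions the traced shapes agree with the Output Shape column, and the final 6 × 6 × 256 feature map flattens to 9216 values before the first fully connected layer.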
Table 2. Final Database of Micro-mobility Crash Variables.
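| Category | Variable | Variable Code | Value | Fatal/Serious Injury | Minor/No Injury | Total |
|---|---|---|---|---|---|---|
| General Crash Characteristics | Weekend | Weekend | 1 = Weekend | 213 | 239 | 452 |
| | | | 2 = Weekday | 329 | 393 | 722 |
| | Intersection | Intersection | 1 = Intersection | 329 | 432 | 761 |
| | | | 2 = Midblock | 213 | 200 | 413 |
| | Wet Pavement | WetPav | 1 = Yes | 59 | 81 | 140 |
| | | | 2 = No | 483 | 551 | 1034 |
| | Nighttime | Nighttime | 1 = Yes | 140 | 164 | 304 |
| | | | 2 = No | 402 | 468 | 870 |
| | Truck/Bus involved | TruckBus | 1 = Yes | 19 | 16 | 35 |
| | | | 2 = No | 523 | 616 | 1139 |
| | Work Zone Present | WorkZonePrsnt | 1 = Yes | 9 | 6 | 15 |
| | | | 2 = No | 533 | 626 | 1159 |
| | High SpeedLimit | HighSpeedLimit | 1 = “≥40 MPH” | 184 | 190 | 374 |
| | | | 2 = “<40 MPH” | 358 | 442 | 800 |
| | Signal control | Signal_control | 1 = Yes | 207 | 257 | 464 |
| | | | 2 = No | 335 | 375 | 710 |
| | Stop control | Stop_control | 1 = Yes | 129 | 178 | 307 |
| | | | 2 = No | 413 | 454 | 867 |
| | Yield control | Yield_control | 1 = Yes | 9 | 5 | 14 |
| | | | 2 = No | 533 | 627 | 1160 |
| | Uncontrolled | Uncontrolled | 1 = Yes | 197 | 192 | 389 |
| | | | 2 = No | 345 | 440 | 785 |
| Driver Characteristics | Driver Sex | driverSex | 1 = Male | 347 | 383 | 730 |
| | | | 2 = Female | 195 | 249 | 444 |
| | Driver age | Driverage_Lessthan25 | 1 = Yes | 85 | 92 | 177 |
| | | | 2 = No | 457 | 540 | 997 |
| | | Driverage_between25_60 | 1 = Yes | 253 | 279 | 532 |
| | | | 2 = No | 289 | 353 | 642 |
| | | Driverage_geaterthan60 | 1 = Yes | 204 | 261 | 465 |
| | | | 2 = No | 338 | 371 | 709 |
| | Driver Distracted By | driverDistractedBy | 1 = Yes | 191 | 234 | 425 |
| | | | 2 = No | 351 | 398 | 749 |
| | Driver Violator | driverViolator | 1 = Yes | 242 | 345 | 587 |
| | | | 2 = No | 300 | 287 | 587 |
| | Driver Hazardous Action | driverHazdAction_carelessDriving | 1 = Yes | 39 | 45 | 84 |
| | | | 2 = No | 503 | 587 | 1090 |
| | | driverHazdAction_Disobeyded_TCD | 1 = Yes | 42 | 67 | 109 |
| | | | 2 = No | 500 | 565 | 1065 |
| | | driverHazdAction_Failed_to_yield | 1 = Yes | 166 | 239 | 405 |
| | | | 2 = No | 376 | 393 | 769 |
| | Driver Intent | driverIntent_GoingStraight | 1 = Yes | 246 | 278 | 524 |
| | | | 2 = No | 296 | 354 | 650 |
| | | driverIntent_TurningLeft | 1 = Yes | 66 | 81 | 147 |
| | | | 2 = No | 476 | 551 | 1027 |
| | | driverIntent_TurningRight | 1 = Yes | 113 | 162 | 275 |
| | | | 2 = No | 429 | 470 | 899 |
| | | driverIntent_Stopped_on_road | 1 = Yes | 66 | 66 | 132 |
| | | | 2 = No | 476 | 566 | 1042 |
| | | driverIntent_Backing | 1 = Yes | 10 | 15 | 25 |
| | | | 2 = No | 532 | 617 | 1149 |
| | | driverIntent_Changing_Lanes | 1 = Yes | 41 | 30 | 71 |
| | | | 2 = No | 501 | 602 | 1103 |
| Micro-mobility Device | Bicycle | Bicycle | 1 = Yes | 497 | 593 | 1090 |
| | | | 2 = No | 45 | 39 | 84 |
| | e_scooter | e_scooter | 1 = Yes | 25 | 20 | 45 |
| | | | 2 = No | 517 | 612 | 1129 |
| | Wheelchair | Wheelchair | 1 = Yes | 14 | 13 | 27 |
| | | | 2 = No | 528 | 619 | 1147 |
| | Skateboard | Skateboard | 1 = Yes | 6 | 6 | 12 |
| | | | 2 = No | 536 | 626 | 1162 |
| Micro-mobility Rider (MR) Characteristics | MR Sex | MSex | 1 = Male | 429 | 513 | 942 |
| | | | 2 = Female | 113 | 119 | 232 |
| | MR age | Mage_Lessthan25 | 1 = Yes | 219 | 226 | 445 |
| | | | 2 = No | 323 | 406 | 729 |
| | | Mage_between25_60 | 1 = Yes | 219 | 281 | 500 |
| | | | 2 = No | 323 | 351 | 674 |
| | | Mage_geaterthan60 | 1 = Yes | 104 | 125 | 229 |
| | | | 2 = No | 438 | 507 | 945 |
| | MR Distracted By | MDistractedBy | 1 = Yes | 149 | 166 | 315 |
| | | | 2 = No | 393 | 466 | 859 |
| | MR Violator | MViolator | 1 = Yes | 220 | 194 | 414 |
| | | | 2 = No | 322 | 438 | 760 |
| | MR Hazardous Action | MHazdAction_Improper_lane_use | 1 = Yes | 91 | 93 | 184 |
| | | | 2 = No | 451 | 539 | 990 |
| | | MHazdAction_Disobeyded_TCD | 1 = Yes | 92 | 84 | 176 |
| | | | 2 = No | 450 | 548 | 998 |
| | | MHazdAction_Failed_to_yield | 1 = Yes | 74 | 69 | 143 |
| | | | 2 = No | 468 | 563 | 1031 |
| Micro-mobility Crash (MC) Location | MC Location | M_on_the_road | 1 = Yes | 491 | 557 | 1048 |
| | | | 2 = No | 51 | 75 | 126 |
| | | M_on_the_shoulder | 1 = Yes | 20 | 20 | 40 |
| | | | 2 = No | 522 | 612 | 1134 |
| | | M_in_bicycle_lane | 1 = Yes | 9 | 18 | 27 |
| | | | 2 = No | 533 | 614 | 1147 |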
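
As a reading aid for Table 2, the share of fatal/serious injuries within each level of a variable can be computed directly from the two count columns. The sketch below does this for two variables; the counts are copied from Table 2, while the helper names (`severe_share`, `crude_odds_ratio`) are hypothetical. These are unadjusted descriptive figures, not the Random Forest results reported in the article.

```python
# (fatal/serious, minor/no injury) counts copied from Table 2.
COUNTS = {
    "HighSpeedLimit": {1: (184, 190), 2: (358, 442)},
    "M_in_bicycle_lane": {1: (9, 18), 2: (533, 614)},
}


def severe_share(severe: int, minor: int) -> float:
    """Fraction of crashes in a group that were fatal or serious."""
    return severe / (severe + minor)


def crude_odds_ratio(variable: str) -> float:
    """Unadjusted odds of a fatal/serious outcome for level 1 versus level 2."""
    severe_1, minor_1 = COUNTS[variable][1]
    severe_2, minor_2 = COUNTS[variable][2]
    return (severe_1 / minor_1) / (severe_2 / minor_2)


if __name__ == "__main__":
    for variable, levels in COUNTS.items():
        for level, (severe, minor) in levels.items():
            share = severe_share(severe, minor)
            print(f"{variable} = {level}: {share:.1%} fatal/serious")
        print(f"{variable}: crude odds ratio = {crude_odds_ratio(variable):.2f}")
```

For example, 184 of the 374 crashes on roads with speed limits of 40 mph or more were fatal or serious, compared with 358 of 800 on lower-speed roads.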
Table 3. The AlexNet CNN Model’s Training and Validation Accuracy, Precision, Recall, and F-score Values for 10 Epochs.
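| Outputs | #Epoch | Learning Rate | Fold | Accuracy | Precision | Recall | F-Score |
|---|---|---|---|---|---|---|---|
| Training Outputs | 10 | 0.01 | 1 | 0.8260 | 0.8516 | 0.7723 | 0.8100 |
| | | | 2 | 0.7517 | 0.8033 | 0.6981 | 0.7470 |
| | | | 3 | 0.7724 | 0.8236 | 0.7252 | 0.7713 |
| | | | 4 | 0.8258 | 0.8583 | 0.7925 | 0.8241 |
| | | | 5 | 0.8323 | 0.8609 | 0.8087 | 0.8340 |
| | | | Mean | 0.80 | 0.84 | 0.76 | 0.80 |
| Validation Outputs | 10 | 0.01 | 1 | 0.8065 | 1.0000 | 0.7500 | 0.8571 |
| | | | 2 | 0.8710 | 0.9444 | 0.8500 | 0.8947 |
| | | | 3 | 0.8065 | 0.7770 | 0.8750 | 0.8231 |
| | | | 4 | 0.8065 | 0.9444 | 0.7727 | 0.8500 |
| | | | 5 | 0.8065 | 1.0000 | 0.7500 | 0.8571 |
| | | | Mean | 0.82 | 0.93 | 0.80 | 0.86 |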
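
The F-score column in Table 3 is consistent with the standard harmonic mean of precision and recall, F = 2PR / (P + R), and the Mean rows with a simple average over the five folds. The sketch below recomputes both from the validation rows; the values are copied from Table 3, and the function and variable names are hypothetical. Because the published precision and recall are rounded to four decimals, recomputed values are expected to match only up to rounding.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (accuracy, precision, recall, reported F-score) per validation fold, Table 3.
VALIDATION_FOLDS = [
    (0.8065, 1.0000, 0.7500, 0.8571),
    (0.8710, 0.9444, 0.8500, 0.8947),
    (0.8065, 0.7770, 0.8750, 0.8231),
    (0.8065, 0.9444, 0.7727, 0.8500),
    (0.8065, 1.0000, 0.7500, 0.8571),
]


def fold_mean(column: int) -> float:
    """Average one column of VALIDATION_FOLDS across the five folds."""
    return sum(row[column] for row in VALIDATION_FOLDS) / len(VALIDATION_FOLDS)


if __name__ == "__main__":
    for fold, (_, precision, recall, reported) in enumerate(VALIDATION_FOLDS, 1):
        computed = f_score(precision, recall)
        print(f"fold {fold}: computed {computed:.4f}, reported {reported:.4f}")
    print(f"mean validation accuracy: {fold_mean(0):.2f}")
```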
Table 4. The Top 25 Decision Rules from the RF Model, Ranked by Lower Error and Higher Frequency.
| No. | Decision Rule | Prediction | %Frequency | Error |
|---|---|---|---|---|
| 1 | If [Bicycle = 1 & M_in_bicycle_lane = 1 & driverHazdAction_Disobeyded_TCD = 2 & MViolator = 2] | Minor/No Injury | 9.1 | 0.000 |
| 2 | If [Bicycle = 1 & M_in_bicycle_lane = 2 & driverHazdAction_carelessDriving = 1 & MViolator = 2] | Fatal/Serious Injury | 12.5 | 0.000 |
| 3 | If [Nighttime = 1 & M_on_the_shoulder = 1 & HighSpeedLimit = 1 & Mage_between25_60 = 1] | Fatal/Serious Injury | 6.2 | 0.000 |
| 4 | If [M_on_the_road = 1 & HighSpeedLimit = 2 & driverDistractedBy = 2 & Mage_between25_60 = 1] | Minor/No Injury | 3.1 | 0.000 |
| 5 | If [Intersection = 1 & M_on_the_road = 1 & Driverage_geaterthan60 = 1 & Mage_between25_60 = 2] | Fatal/Serious Injury | 4.7 | 0.000 |
| 6 | If [M_in_bicycle_lane = 2 & Uncontrolled = 1 & Mage_between25_60 = 1 & MDistractedBy = 1 & MHazdAction_Failed_to_yield = 1] | Fatal/Serious Injury | 1.6 | 0.000 |
| 7 | If [driverViolator = 1 & MSex = 2 & MDistractedBy = 2 & MHazdAction_Failed_to_yield = 1 & TruckBus = 1] | Fatal/Serious Injury | 7.8 | 0.125 |
| 8 | If [HighSpeedLimit = 2 & driverHazdAction_Disobeyded_TCD = 1 & driverIntent_Changing_Lanes = 1 & e_scooter = 1 & WorkZonePrsnt = 1] | Fatal/Serious Injury | 0.700 | 0.125 |
| 9 | If [Uncontrolled = 1 & driverSex = 1 & driverHazdAction_Disobeyded_TCD = 1 & MHazdAction_Failed_to_yield = 1] | Fatal/Serious Injury | 6.2 | 0.250 |
| 10 | If [M_on_the_road = 1 & driverSex = 2 & driverDistractedBy = 2 & driverViolator = 1] | Minor/No Injury | 10.9 | 0.273 |
| 11 | If [M_on_the_road = 1 & Driverage_geaterthan60 = 2 & Mage_geaterthan60 = 1 & Bicycle = 1 & TruckBus = 1] | Fatal/Serious Injury | 9.4 | 0.308 |
| 12 | If [driverViolator = 1 & Mage_geaterthan60 = 1 & MHazdAction_Failed_to_yield = 1] | Fatal/Serious Injury | 14.1 | 0.312 |
| 13 | If [M_on_the_road = 1 & Driverage_Lessthan25 = 1 & driverHazdAction_Disobeyded_TCD = 1 & e_scooter = 1] | Fatal/Serious Injury | 23.4 | 0.333 |
| 14 | If [Intersection = 1 & WetPav = 1 & Nighttime = 1 & Signal_control = 1 & driverIntent_TurningLeft = 1 & Mage_Lessthan25 = 1] | Fatal/Serious Injury | 1.6 | 0.333 |
| 15 | If [M_on_the_road = 1 & driverViolator = 2 & e_scooter = 1] | Minor/No Injury | 12.5 | 0.385 |
| 16 | If [M_on_the_road = 1 & Driverage_Lessthan25 = 1 & driverSex = 1 & driverDistractedBy = 1 & e_scooter = 1] | Fatal/Serious Injury | 3.1 | 0.400 |
| 17 | If [M_on_the_road = 1 & Driverage_Lessthan25 = 1 & driverSex = 1 & driverDistractedBy = 2 & e_scooter = 1] | Minor/No Injury | 6.2 | 0.417 |
| 18 | If [driverDistractedBy = 1 & driverIntent_TurningRight = 1 & MViolator = 2 & Wheelchair = 1] | Fatal/Serious Injury | 17.2 | 0.429 |
| 19 | If [driverDistractedBy = 2 & driverIntent_TurningRight = 1 & MViolator = 2 & Wheelchair = 1] | Minor/No Injury | 1.6 | 0.435 |
| 20 | If [Intersection = 1 & driverIntent_TurningLeft = 1 & MViolator = 1 & MHazdAction_Improper_lane_use = 1] | Fatal/Serious Injury | 3.1 | 0.444 |
| 21 | If [HighSpeedLimit = 2 & driverDistractedBy = 2 & Mage_Lessthan25 = 1 & TruckBus = 2] | Minor/No Injury | 20.3 | 0.446 |
| 22 | If [HighSpeedLimit = 2 & Stop_control = 1 & Driverage_between25_60 = 1 & driverIntent_TurningLeft = 1 & MViolator = 1 & Skateboard = 1] | Fatal/Serious Injury | 4.7 | 0.462 |
| 23 | If [Yield_control = 1 & driverIntent_Changing_Lanes = 1 & Mage_geaterthan60 = 1] | Fatal/Serious Injury | 3.1 | 0.500 |
| 24 | If [Nighttime = 1 & MHazdAction_Failed_to_yield = 1 & Bicycle = 1] | Fatal/Serious Injury | 7.8 | 0.500 |
| 25 | If [M_on_the_shoulder = 1 & HighSpeedLimit = 1 & Bicycle = 1] | Fatal/Serious Injury | 14.1 | 0.523 |
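
Each rule in Table 4 is a conjunction of conditions on the coded variables from Table 2 (1 = Yes, 2 = No for the binary indicators). The sketch below shows one possible way to apply such rules to a coded crash record; rules 3 and 4 are transcribed from Table 4, whereas the dictionary layout, the function names, and the example record are hypothetical and do not reproduce the authors' rule-extraction procedure.

```python
# Rules 3 and 4 from Table 4, written as variable -> required code.
RULES = [
    {
        "no": 3,
        "conditions": {
            "Nighttime": 1,
            "M_on_the_shoulder": 1,
            "HighSpeedLimit": 1,
            "Mage_between25_60": 1,
        },
        "prediction": "Fatal/Serious Injury",
    },
    {
        "no": 4,
        "conditions": {
            "M_on_the_road": 1,
            "HighSpeedLimit": 2,
            "driverDistractedBy": 2,
            "Mage_between25_60": 1,
        },
        "prediction": "Minor/No Injury",
    },
]


def rule_matches(conditions: dict, record: dict) -> bool:
    """True when the record satisfies every condition of the rule."""
    return all(record.get(var) == code for var, code in conditions.items())


def matching_predictions(record: dict, rules=RULES) -> list:
    """Return (rule number, prediction) for every rule the record satisfies."""
    return [
        (rule["no"], rule["prediction"])
        for rule in rules
        if rule_matches(rule["conditions"], record)
    ]


if __name__ == "__main__":
    # Hypothetical record: nighttime crash on the shoulder of a >= 40 mph road,
    # rider aged between 25 and 60.
    record = {
        "Nighttime": 1,
        "M_on_the_shoulder": 1,
        "M_on_the_road": 2,
        "HighSpeedLimit": 1,
        "Mage_between25_60": 1,
        "driverDistractedBy": 2,
    }
    print(matching_predictions(record))
```

A record can satisfy several rules or none, so in practice the Frequency and Error columns would be needed to decide how much weight a matching rule deserves.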
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Qawasmeh, B.; Oh, J.-S.; Kwigizile, V. Micro-Mobility Safety Assessment: Analyzing Factors Influencing the Micro-Mobility Injuries in Michigan by Mining Crash Reports. Future Transp. 2024, 4, 1580-1601. https://doi.org/10.3390/futuretransp4040076
