Predicting Segment-Level Road Traffic Injury Counts Using Machine Learning Models: A Data-Driven Analysis of Geometric Design and Traffic Flow Factors

Hamdan, Noura; Sipos, Tibor

doi:10.3390/futuretransp5040197


by Noura Hamdan ¹,* and Tibor Sipos ¹,²

¹ Department of Transport Technology and Economics, Faculty of Transportation Engineering and Vehicle Engineering, Budapest University of Technology and Economics, Megyetem rkp. 3., H-1111 Budapest, Hungary

² KTI Hungarian Institute for Transport Sciences Nonprofit Ltd., Than Károly Street 3-5., H-1119 Budapest, Hungary

* Author to whom correspondence should be addressed.

Future Transp. 2025, 5(4), 197; https://doi.org/10.3390/futuretransp5040197

Submission received: 27 August 2025 / Revised: 5 December 2025 / Accepted: 11 December 2025 / Published: 12 December 2025


Abstract

Accurate prediction of road traffic crash severity is essential for developing data-driven safety strategies and optimizing resource allocation. This study presents a predictive modeling framework that utilizes Random Forest (RF), Gradient Boosting (GB), and K-Nearest Neighbors (KNN) to estimate segment-level frequencies of fatalities, serious injuries, and slight injuries on Hungarian roadways. The model integrates an extensive array of predictor variables, including roadway geometric design features, traffic volumes, and traffic composition metrics. To address class imbalance, each severity class was modeled using resampled datasets generated via the Synthetic Minority Over-sampling Technique (SMOTE), and model performance was optimized through grid-search cross-validation for hyperparameter optimization. For the prediction of serious- and slight-injury crash counts, the Random Forest (RF) ensemble model demonstrated the most robust performance, consistently attaining test accuracies above 0.91 and coefficient of determination (R²) values exceeding 0.95. In contrast, for fatalities count prediction, the Gradient Boosting (GB) model achieved the highest accuracy (0.95), with an R² value greater than 0.87. Feature importance analysis revealed that heavy vehicle flows consistently dominate crash severity prediction. Horizontal alignment features primarily influenced fatal crashes, while capacity utilization was more relevant for slight and serious injuries, reflecting the roles of geometric design and operational conditions in shaping crash occurrence and severity. The proposed framework demonstrates the effectiveness of machine learning approaches in capturing non-linear relationships within transportation safety data and offers a scalable, interpretable tool to support evidence-based decision-making for targeted safety interventions.

Keywords: traffic crash severity; random forest; gradient boosting; k-nearest neighbors; ensemble learning; geometric design; imbalanced data

1. Introduction

Road traffic crashes (RTCs) remain a persistent global challenge, exacting substantial human, social, and economic tolls. These incidents result in millions of fatalities and injuries annually, straining healthcare systems and disrupting economic productivity due to emergency responses, long-term rehabilitation, and lost labor contributions [1,2,3]. From an infrastructure planning perspective, the ability to accurately predict the severity of road traffic incidents at a granular level, such as the roadway segment, is fundamental to enabling data-driven safety interventions. Effective prediction informs not only where but also how safety measures such as geometric redesign, enforcement strategies, and traffic control systems should be deployed to maximize reductions in crash severity and frequency [4,5,6]. Conventional statistical approaches, including Poisson and negative binomial regression models, have historically served as the analytical backbone for crash frequency and severity modeling [7,8]. While these models offer transparency and statistical rigor, they are frequently limited in their capacity to account for the complex, non-linear interactions that characterize real-world crash dynamics [9,10]. Variables such as road geometry, traffic flow composition, and temporal patterns often interact in ways that violate the assumptions underpinning traditional parametric techniques. In response to these limitations, the transportation safety field has increasingly adopted machine learning (ML) methodologies, which offer enhanced flexibility in uncovering hidden patterns within high-dimensional datasets [11]. Techniques such as support vector machines, artificial neural networks, and ensemble methods like gradient-boosted trees have shown improved performance in classifying crash severity [12,13,14,15]. 
However, a substantial portion of existing ML-based research tends to either aggregate crash outcomes or divide severity levels into simplistic binary categories (e.g., fatal vs. non-fatal) [16,17,18]. This oversimplification undermines the specificity needed for granular policy design and weakens the alignment between predictive insights and real-world safety interventions.

To address these gaps, the present study introduces a comprehensive ensemble-learning framework utilizing three machine-learning algorithms, namely Random Forest, Gradient Boosting, and K-Nearest Neighbors classifiers, to model segment-level counts of three crash severity outcomes: fatalities, serious injuries, and slight injuries. The predictive models are constructed using a rich set of cross-sectional covariates, incorporating roadway design characteristics (e.g., horizontal and vertical alignment), traffic-volume attributes, and disaggregated vehicle-class flow metrics [19]. To counter the pronounced class imbalance that often characterizes crash data, the study implements the Synthetic Minority Over-sampling Technique (SMOTE) [20,21]. Furthermore, a stratified sampling strategy is adopted to maintain proportional distributions of severity classes within both the training and testing subsets, thereby improving the statistical robustness and reliability of model evaluation. Hyperparameter optimization is achieved through grid-search cross-validation to enhance generalizability and predictive robustness [22,23]. Model performance is comprehensively assessed using a suite of evaluation metrics, including accuracy, precision, F1-score, the Matthews correlation coefficient (MCC), and the geometric mean (G-Mean), to provide a rigorous and balanced appraisal of predictive quality across severity classes. Unlike prior efforts that prioritize either model interpretability or predictive accuracy, this study seeks to harmonize both objectives. By leveraging the inherent feature-ranking capability of ensemble algorithms, particularly Random Forest and Gradient Boosting, the developed framework not only achieves high classification performance but also produces interpretable insights regarding the relative importance of predictor variables. This dual capacity ensures that the framework remains both practically applicable and theoretically grounded.

The remainder of the paper is organized as follows: Section 2 presents a review of related literature, situating this work within the broader context of crash severity modeling. Section 3 details the data sources, preprocessing procedures, and modeling methodology. Section 4 reports the empirical findings and discusses the implications of the results. Finally, Section 5 outlines practical applications, policy considerations, and potential directions for future research.

2. Related Work

The modeling of road traffic injury severity has traditionally relied on classical count-data regression techniques, particularly Poisson and negative binomial models. These methods have been extensively used to estimate crash frequency and severity as functions of roadway geometry, traffic exposure, and environmental conditions [24,25]. However, such models often struggle to capture the complex, non-linear interactions and threshold effects that characterize real-world traffic incidents, particularly when modeling varying levels of injury severity.

In response to these limitations, the traffic safety literature has increasingly embraced machine learning (ML) techniques as more flexible, data-driven alternatives. Numerous studies employing algorithms such as support vector machines (SVMs), gradient-boosted decision trees (GBDTs), and other ensemble methods have demonstrated enhanced capabilities in uncovering latent patterns and modeling non-linear relationships in crash data [26,27,28]. Despite their improved predictive performance, most existing ML-based models approach crash severity as a binary classification task, typically distinguishing fatal from non-fatal crashes, thereby oversimplifying a multifaceted outcome and limiting the granularity required for targeted policy interventions [29].

Another persistent methodological challenge concerns class imbalance. Severe and fatal crashes comprise a small proportion of total incidents, which are typically dominated by minor injuries or property-damage-only events. This imbalance skews model training, often resulting in biased classifiers that underperform in predicting rare but critical outcomes. To mitigate this, techniques such as the Synthetic Minority Over-sampling Technique (SMOTE) have been applied with success in binary classification contexts, enabling improved sensitivity to minority classes without distorting the underlying data distribution [30,31]. However, their use in multi-class or count-based severity prediction remains limited, thereby constraining broader applicability in multi-output modeling frameworks.

Among ensemble learning methods, Random Forest (RF) has emerged as a particularly robust and interpretable approach for traffic crash modeling. As a bagging-based technique, RF constructs an ensemble of decision trees using bootstrapped subsets of the data and aggregates their outputs to improve generalization performance. The algorithm is well-suited for handling high-dimensional inputs, non-linear relationships, multicollinearity, and noisy data, all common characteristics of transportation safety datasets [32,33]. Furthermore, RF models offer intrinsic measures of variable importance, facilitating post hoc interpretability and yielding insights into the relative influence of predictor variables. Comparative studies have consistently highlighted the superior performance of Random Forest relative to both traditional statistical models and other ML algorithms in crash severity classification tasks. Gains in key evaluation metrics, including accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic (ROC) curve, have been reported even under conditions of class imbalance and data heterogeneity [34,35].
Nevertheless, much of the existing literature continues to treat crash severity as a classification problem and either aggregates injury categories or disregards the need for predicting real-valued severity counts. This limits the operational utility of such models in segment-level resource allocation and proactive safety planning. Moreover, the integration of detailed roadway design features and high-resolution traffic composition variables in machine learning models remains underdeveloped. While individual geometric attributes such as curvature or horizontal alignment have been explored in isolation, the combined use of geometric variables and vehicle-type-specific flow metrics is still underrepresented in the literature, despite their demonstrated relevance in determining injury severity outcomes. Table 1 summarizes the methods and the key findings in the relevant studies.

To address these research gaps, this study develops a comprehensive ensemble-learning framework comprising Random Forest, Gradient Boosting, and K-Nearest Neighbors models to predict segment-level counts of fatalities, serious injuries, and slight injuries. A key limitation in prior crash-severity research is the restricted resolution and geographic diversity of commonly used datasets, which often rely on aggregated traffic measures and coarse roadway classifications [18,19,26,46,47,48,49]. In contrast, this study introduces a novel contribution by utilizing a uniquely detailed national dataset from Hungary that integrates high-resolution geometric attributes with disaggregated traffic-flow indicators for multiple vehicle classes, enabling a more precise examination of how vehicle-type composition interacts with segment-level design features. Building on this rich feature set, the modeling framework incorporates roadway geometry, traffic volume, and vehicle-class distributions, while addressing class imbalance through SMOTE and ensuring statistical consistency via stratified sampling. Hyperparameter tuning through grid-search cross-validation further enhances model robustness, resulting in an interpretable and scalable approach to predicting segment-level crash severity counts.

3. Materials and Methods

This study implements a data-driven ensemble-learning framework to model segment-level road traffic injury counts using multiple supervised machine-learning algorithms. Specifically, three independent predictive models are developed—Random Forest (RF), Gradient Boosting (GB), and K-Nearest Neighbors (KNN), as shown in Figure 1. Each model is tailored to predict the number of injury outcomes per road segment across three distinct severity categories: (1) fatalities, (2) serious injuries, and (3) slight injuries. The models are trained on a rich set of covariates that characterize roadway geometric design, traffic volume, and vehicle composition, enabling nuanced, high-resolution injury forecasting grounded in empirical roadway conditions.

3.1. Data Description and Feature Engineering

Crash severity classification in this study follows the Hungarian national road crash database, which adheres to the European Union CARE (Community Road Accident Database) definition. In this classification, a fatal crash is one in which at least one person dies within 30 days of the crash; a serious (severe) injury crash involves at least one person who sustains injuries requiring hospital treatment for more than 24 h; and a slight injury crash involves injuries requiring medical attention but less than 24 h of hospitalization [50]. Furthermore, in this study, a road segment is defined according to the specifications provided by the Hungarian Institute for Transport Sciences and Logistics Non-profit Ltd. (KTI). Each segment is uniquely identified by the road number and precise start and end locations, including kilometer and meter coordinates, as well as segment codes. This definition provides spatial consistency and enables accurate aggregation of traffic, geometric, and crash data for modeling purposes. This research utilizes a detailed segment-level crash dataset compiled from the Hungarian national road network for the year 2023. Each record represents the actual number of persons injured or fatally wounded in traffic crashes along a specific roadway segment. The dataset integrates geometric design characteristics, traffic exposure indicators, and vehicle composition metrics, forming a robust basis for developing machine learning-based models of crash injury severity.

The dataset comprises 12,216 crash-related injury records covering individuals with slight, serious, or fatal injuries across the Hungarian road network. The recorded injury outcomes comprise 9571 slight injuries (≈67%), 4071 serious injuries (≈29%), and 626 fatalities (≈4%). This proportional distribution reflects the empirical injury burden within the Hungarian network, where slight injuries dominate, yet fatal and serious crashes remain concentrated on specific high-risk corridors [51,52]. The inclusion of segment-level geometric and operational indicators enables a comprehensive representation of crash exposure conditions across diverse roadway environments.

To ensure analytical rigor and reproducibility, multiple preprocessing steps were undertaken. Alphanumeric road identifiers were numerically encoded using a LabelEncoder, ensuring consistency across all feature types and eliminating non-numeric symbols from feature names to facilitate model processing. Missing entries were imputed using median substitution, while extreme values were filtered using the interquartile range (IQR) method to reduce statistical distortion [53,54].
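The three preprocessing steps named above (label encoding, median imputation, and IQR filtering) can be illustrated with a minimal, dependency-free sketch. The toy road identifiers, the AADT values, the 1.5 × IQR fence, and the choice to drop flagged rows are assumptions made for illustration; the study itself reports using LabelEncoder with pandas and numpy, and does not state its fence multiplier.

```python
from statistics import median, quantiles

def label_encode(values):
    """Map each distinct label to an integer code, using sorted label order."""
    classes = sorted(set(values))
    index = {label: code for code, label in enumerate(classes)}
    return [index[v] for v in values], classes

def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]

def iqr_mask(column, k=1.5):
    """Flag values inside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is an assumed default."""
    q1, _, q3 = quantiles(column, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [low <= v <= high for v in column]

# Hypothetical toy records: road identifier and AADT (veh/day).
road_ids = ["M1", "4", "M1", "86", "4", "M7"]
aadt = [12000, None, 13500, 11800, 250000, 12600]

codes, classes = label_encode(road_ids)
aadt_filled = impute_median(aadt)
keep = iqr_mask(aadt_filled)
filtered = [(c, v) for c, v, ok in zip(codes, aadt_filled, keep) if ok]
print(classes, codes, filtered)
```

In this toy example the missing AADT entry is filled with the median of the observed values, and the single implausibly large value falls outside the fences and is dropped.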

A summary of key variables is presented in Table 2, highlighting the geometric and operational distinctions across injury severity levels. Fatal and serious crashes were predominantly concentrated on rural, undivided two-lane roads (over 80%), emphasizing the influence of high-speed, bidirectional traffic with limited separation. Primary and secondary main roads accounted for the majority of severe outcomes, while expressways exhibited a relatively higher share of slight injuries, reflecting improved geometric standards and access control.

Traffic exposure indicators further underscore the risk disparity. The Annual Average Daily Traffic (AADT) averaged approximately 13,000 veh/day for fatal segments, marginally higher than for serious and slight injury segments (12,640 and 11,849 veh/day, respectively). Heavy vehicle presence was a notable differentiator, with truck and bus flows constituting a higher proportion of traffic on high-severity segments. Table 3 provides an overview of the dataset variables used in this study.

3.2. Machine Learning Models

Three machine learning classification algorithms were selected based on their proven utility in transportation safety modeling and ability to handle mixed-type, high-dimensional datasets:

3.2.1. Random Forest (RF)

Random Forest (RF) is a popular ensemble learning method for classification tasks. It was first introduced by Ho (1995) [55] and later improved by Breiman (2001) [56]. This method leverages the bagging technique, which builds multiple independent decision trees during the training phase and uses feature randomness to create a diverse set of uncorrelated trees. Each tree in the forest is trained on a different bootstrap sample, where bootstrapping refers to the sampling of data with replacement, ensuring that each tree is exposed to a slightly different subset of the data [57]. The underlying principle of this ensemble method is that combining multiple models leads to improved generalization and, consequently, higher prediction accuracy. Mathematically, the prediction for a given instance $X$ is determined by Equation (1):

$$\hat{y} = \frac{1}{T} \sum_{t=1}^{T} f_{t}(X) \tag{1}$$

where $T$ represents the total number of trees in the forest, $f_{t}(X)$ is the prediction of the $t$-th tree for the input $X$, and $\hat{y}$ is the final predicted output, typically determined by majority vote in classification tasks.

For classification tasks, each tree in the forest produces a class label $\hat{y}_{t} = f_{t}(X)$, and the final prediction is given by $\hat{y} = \operatorname{mode}(\hat{y}_{1}, \hat{y}_{2}, \dots, \hat{y}_{T})$, where the mode function returns the most frequent class label from the predictions of all the trees. This ensemble approach allows the random forest to significantly improve the model’s accuracy compared to individual decision trees, especially in terms of generalization to unseen data.
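The two aggregation rules described for Random Forest, the average in Equation (1) and the mode over per-tree labels, can be sketched as follows. The per-tree outputs are hypothetical values, and the tie-breaking rule (smallest label wins) is an assumption, not something the study specifies.

```python
from collections import Counter

def mean_prediction(tree_predictions):
    """Equation (1): average of the T per-tree outputs f_t(X)."""
    return sum(tree_predictions) / len(tree_predictions)

def majority_vote(tree_predictions):
    """Mode of the per-tree class labels; ties go to the smallest label (assumed rule)."""
    counts = Counter(tree_predictions)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)

# Hypothetical outputs of T = 5 trees for one road segment.
votes = [0, 1, 1, 2, 1]
print(mean_prediction(votes), majority_vote(votes))
```

Note that scikit-learn's RandomForestClassifier combines trees by averaging their class-probability estimates rather than counting hard votes, so this sketch mirrors the textbook formulation above rather than that library's internals.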

3.2.2. Gradient Boosting Machines (GBM)

Gradient Boosting Machines (GBM) constitute a class of powerful ensemble learning algorithms that construct predictive models through the sequential integration of weak learners, most commonly shallow decision trees, via gradient-based optimization in function space [58]. At each iteration $m$, the algorithm fits a new tree $h_{m}(x)$ to the negative gradient of the selected loss function $L(y, F(x))$, evaluated at the current model $F_{m-1}$, yielding pseudo-residuals defined as:

$$r_{im} = -\frac{\partial L(y_{i}, F(x_{i}))}{\partial F(x_{i})} \tag{2}$$

The ensemble model is then updated using:

$$F_{m}(x) = F_{m-1}(x) + \nu \cdot \gamma_{m} h_{m}(x) \tag{3}$$

where $\nu \in (0, 1]$ is the learning rate, and $\gamma_{m}$ is a step-size parameter optimized at each iteration by minimizing the loss function:

$$\gamma_{m} = \arg\min_{\gamma} \sum_{i=1}^{N} L\left(y_{i}, F_{m-1}(x_{i}) + \gamma h_{m}(x_{i})\right) \tag{4}$$

To enhance generalization and reduce overfitting, GBM incorporates regularization through shrinkage (via the learning rate $\nu$) and stochastic subsampling of the training data at each boosting iteration. In this study, GBM was applied to the classification of segment-level injury counts in each severity category, capturing complex nonlinear dependencies among geometric, traffic-volume, and vehicle-composition variables. The logistic loss function was employed as the objective criterion for model optimization.
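The boosting loop in Equations (2)–(4) can be traced with a small, dependency-free sketch. It assumes a binary target, logistic loss (for which the negative gradient is the label minus the predicted probability), single-feature decision stumps as weak learners, and a coarse grid in place of an exact line search for the step size. The toy data and all function names are hypothetical, and stochastic subsampling is omitted for brevity.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(y, scores):
    """Total logistic loss for labels y in {0, 1} and raw scores F(x)."""
    eps = 1e-12
    total = 0.0
    for yi, fi in zip(y, scores):
        p = min(max(sigmoid(fi), eps), 1 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total

def fit_stump(x, residuals):
    """Least-squares decision stump on one feature; returns the fitted h(x)."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda xi: lmean if xi <= threshold else rmean

def boost(x, y, n_rounds=20, learning_rate=0.1):
    scores = [0.0] * len(x)                      # F_0(x) = 0
    losses = [log_loss(y, scores)]
    for _ in range(n_rounds):
        # Equation (2): pseudo-residuals of the logistic loss
        residuals = [yi - sigmoid(fi) for yi, fi in zip(y, scores)]
        h = fit_stump(x, residuals)
        steps = [h(xi) for xi in x]
        # Equation (4): coarse grid search standing in for the line search
        gamma = min((g / 2 for g in range(1, 21)),
                    key=lambda g: log_loss(y, [f + g * s for f, s in zip(scores, steps)]))
        # Equation (3): shrunken update of the ensemble
        scores = [f + learning_rate * gamma * s for f, s in zip(scores, steps)]
        losses.append(log_loss(y, scores))
    return scores, losses

# Hypothetical one-feature toy data with one overlapping label.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]
scores, losses = boost(x, y)
print(round(losses[0], 3), round(losses[-1], 3))
```

The training loss should fall from its initial value as rounds accumulate; the learning rate controls how much of each fitted step is actually applied.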

3.2.3. K-Nearest Neighbors (KNN)

The K-Nearest Neighbors (KNN) method [59] determines the class label of a test sample based on the labels of its $k$ nearest neighbors. Suppose a distance metric is predefined (such as Euclidean distance, Mahalanobis distance, etc.). For any test sample $x$, its $k$ nearest neighbors can be identified and denoted as $N_{k}(x)$. The class label of $x$ is then determined by the labels of the training samples in $N_{k}(x)$, which can be expressed as:

$$f(x) = \underset{c_{j} \in C}{\arg\max} \sum_{x_{i} \in N_{k}(x)} I(y_{i} = c_{j}) \tag{5}$$

where $C = \{c_{1}, c_{2}, \dots, c_{m}\}$, $y_{i}$ is the class label of $x_{i}$, and $m$ is the number of classes. Here, $f(x)$ represents the decision function, and $I(\cdot)$ is the indicator function. For each class $c_{j}$, the indicator function is defined as:

$$I(y_{i} = c_{j}) = \begin{cases} 1, & \text{if } y_{i} = c_{j} \\ 0, & \text{otherwise} \end{cases} \tag{6}$$

In the KNN model, the form of the decision function may vary depending on the chosen strategy. In this study, the decision function is derived according to a voting strategy, where the majority label among the $k$ nearest neighbors determines the predicted class.
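The voting rule in Equations (5) and (6) can be sketched in a few lines. Euclidean distance, the toy training points, and the tie-breaking rule (smallest label wins) are assumptions for illustration.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    neighbor_labels = [label for _, label in ranked[:k]]   # labels in N_k(x)
    counts = Counter(neighbor_labels)                       # sums of I(y_i = c_j)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)

# Hypothetical two-feature training samples with class labels.
train_x = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
train_y = [0, 0, 1, 1, 1]

print(knn_predict(train_x, train_y, (0.2, 0.1)))
print(knn_predict(train_x, train_y, (1.0, 0.9)))
```

Because KNN relies on raw distances, features measured on very different scales (for example, AADT versus curve radius) would normally need rescaling before a rule like this is applied.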

3.3. Hyperparameter Tuning and Model Optimization

For each injury severity category, feature matrices and label vectors were constructed by isolating the corresponding target count variable and excluding it from the predictor set. Due to the inherent class imbalance, particularly for fatal and serious injury outcomes, the Synthetic Minority Over-sampling Technique (SMOTE) was applied. This procedure generates synthetic observations for minority classes in the training data, thereby improving model sensitivity without overfitting to overrepresented outcomes [60,61].
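The core idea of SMOTE, placing synthetic minority samples on line segments between a minority sample and one of its nearest minority neighbors, can be sketched without external libraries. This is a simplified illustration and not the imblearn implementation used in the study; the function name, the toy points, and the parameter values are hypothetical.

```python
import math
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbors of the chosen sample, excluding itself.
        neighbors = sorted((p for p in minority if p is not base),
                           key=lambda p: math.dist(p, base))[:k]
        other = rng.choice(neighbors)
        gap = rng.random()   # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Hypothetical minority-class feature vectors (two features each).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like(minority, n_new=6)
print(new_points)
```

Every synthetic point lies between two existing minority samples, which is why the technique adds no observations outside the region already covered by the minority class.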

To preserve outcome distribution proportions, an 80/20 stratified train/test split was implemented using scikit-learn's StratifiedShuffleSplit. This ensured that the severity classes remained proportionally represented in both training and testing datasets, a crucial consideration for reliable model evaluation.

All computational procedures were conducted using Python 3.10. Data preprocessing and manipulation were performed using the pandas and numpy libraries. Model development and evaluation leveraged scikit-learn, while oversampling of minority classes was handled with imblearn. The complete modeling pipeline, including feature engineering, class rebalancing via SMOTE, model training, hyperparameter tuning, and performance evaluation, was encapsulated within a modular and fully reproducible script. Final outputs, including performance metrics and ranked feature importance, were programmatically exported into structured Excel workbooks using pandas.ExcelWriter, ensuring transparency, auditability, and ease of downstream analysis.

Each model (Random Forest, Gradient Boosting, and K-Nearest Neighbors) was instantiated using the corresponding scikit-learn class, with a fixed random seed for reproducibility. To optimize model complexity and generalization, a grid search over each model's hyperparameters (for example, n_estimators for the tree ensembles) was conducted. Five-fold cross-validation on the training set was used to identify the hyperparameter combination that maximized classification accuracy. This method was selected for its balance between computational efficiency and statistical reliability, providing a stable estimate of each model’s performance. By averaging results over multiple folds, it mitigates overfitting and enhances the robustness and generalizability of the final models [62,63,64].
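The 80/20 stratified split described above can be illustrated with a short standard-library sketch. It is not the StratifiedShuffleSplit implementation; the class proportions, the seed, and the function name are hypothetical, and per-class rounding is an assumed simplification.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) so that each class is split in the same ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical imbalanced labels: 70 / 20 / 10 samples in three classes.
labels = [0] * 70 + [1] * 20 + [2] * 10
train_idx, test_idx = stratified_split(labels)
print(Counter(labels[i] for i in test_idx))
```

Each class contributes one fifth of its samples to the test set, so rare classes remain represented in both partitions.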

The hyperparameter grids are summarized in Table 4. Grid-search cross-validation was thus instrumental in reducing overfitting risk and enhancing model robustness across all severity categories [22,65].
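Grid-search cross-validation amounts to scoring every candidate setting by its mean validation accuracy across folds and keeping the best one. The sketch below is a simplified stand-in for the scikit-learn grid search the study describes; it tunes only the number of neighbors of a small KNN classifier, and the toy data, fold assignment, and grid values are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Label of the majority among the k training rows nearest to query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_accuracy(data, k_neighbors, n_folds=5):
    """Mean validation accuracy over n_folds interleaved folds."""
    scores = []
    for fold in range(n_folds):
        val = data[fold::n_folds]
        train = [row for i, row in enumerate(data) if i % n_folds != fold]
        hits = sum(knn_predict(train, x, k_neighbors) == y for x, y in val)
        scores.append(hits / len(val))
    return sum(scores) / n_folds

def grid_search(data, grid, n_folds=5):
    """Score every candidate and return the best one with all scores."""
    results = {k: cross_val_accuracy(data, k, n_folds) for k in grid}
    best = max(results, key=results.get)
    return best, results

# Hypothetical, well-separated two-class data; rows alternate between classes
# so that every fold contains both.
data = []
for i in range(10):
    offset = i * 0.05
    data.append(((offset, offset), 0))
    data.append(((3.0 + offset, 3.0 + offset), 1))

grid = [1, 3, 5, 7]
best_k, results = grid_search(data, grid)
print(best_k, results)
```

On real crash data the candidate settings would score differently, and the fold-averaged accuracy would decide among them.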

3.4. Model Evaluation and Performance Metrics

Model performance was assessed using accuracy, weighted precision, F1-score, G-Mean, MCC, and the coefficient of determination (R²) [66,67]. The accuracy score provides the proportion of correct predictions, while precision and F1-score offer a more detailed view of how well each model performs across different classes [68]. These metrics are computed for each classifier. Equations (7)–(9) represent mathematical formulations used for metrics evaluation, as follows:

$$\text{Precision} = \frac{TP}{TP + FP} \tag{7}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{8}$$

$$F1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \tag{9}$$

where true positive (TP) denotes positive samples predicted by the model as the positive class, true negative (TN) denotes negative samples predicted as the negative class, false positive (FP) denotes negative samples predicted as the positive class, and false negative (FN) denotes positive samples predicted as the negative class. Recall, which appears in Equation (9), is defined as TP / (TP + FN).
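Equations (7)–(9) can be checked numerically with a short sketch. The confusion-matrix counts are hypothetical and treat one severity class as "positive" against all others; the weighted multi-class averaging used in the study is not reproduced here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, accuracy, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)                               # Equation (7)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)               # Equation (8)
    f1 = 2 * precision * recall / (precision + recall)       # Equation (9)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}

# Hypothetical one-vs-rest counts for a single class.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print({name: round(value, 4) for name, value in metrics.items()})
```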

The coefficient of determination (R²) measures the proportion of variance in the observed outcome (crash counts) that is explained by the model's predictions. R² values typically range from zero to one: zero indicates that the model predicts crash counts no better than the mean of the recorded cases, while one implies perfect prediction. It is given by Equation (10):

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}} \tag{10}$$

where $\hat{y}_{i}$ is the predicted value of the $i$-th sample, $y_{i}$ is the corresponding true value, $\bar{y}$ is the mean of the true values, and $n$ is the total number of samples.
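A direct transcription of Equation (10) makes the two reference points concrete: perfect predictions give 1, and predicting the mean for every sample gives 0. The count vectors below are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination as in Equation (10)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical observed and predicted injury counts for six segments.
y_true = [0, 1, 2, 1, 0, 3]
y_pred = [0, 1, 1, 1, 0, 3]
mean_only = [sum(y_true) / len(y_true)] * len(y_true)

print(round(r_squared(y_true, y_pred), 4))
print(r_squared(y_true, y_true), r_squared(y_true, mean_only))
```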

The G-mean score, shown in Equation (11), is also important in the context of imbalanced data, as it measures the geometric mean of sensitivity and specificity, providing a balanced assessment of the model’s performance across classes.

$$\text{G-mean} = \sqrt{\left(\frac{TP}{TP + FN}\right) \times \left(\frac{TN}{TN + FP}\right)} \tag{11}$$

The Matthews correlation coefficient (MCC) is highly valuable for imbalanced datasets as it considers all four quadrants of the confusion matrix (TP, TN, FP, FN), providing a more comprehensive evaluation of predictive performance that is robust to class distribution, as given in Equation (12).

$$\text{MCC} = \frac{(TP \times TN) - (FP \times FN)}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{12}$$

The MCC quantifies the correlation between predicted and actual classifications, yielding a value between −1 and +1. A coefficient of +1 signifies perfect prediction, 0 indicates performance no better than random guessing, and −1 indicates total disagreement between predictions and observations.

To enhance interpretability, feature importance scores were extracted from each trained model. These scores quantify the relative contribution of each predictor to the model’s output and provide practical insights into which geometric or traffic flow characteristics most influence injury counts at the segment level.
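Equations (11) and (12) can be evaluated on the same kind of confusion-matrix counts. The counts below are hypothetical, and returning 0 when the MCC denominator vanishes is a common convention assumed here rather than something the study states.

```python
import math

def g_mean(tp, tn, fp, fn):
    """Equation (11): geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)

def mcc(tp, tn, fp, fn):
    """Equation (12): Matthews correlation coefficient."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0   # assumed convention for a degenerate confusion matrix
    return (tp * tn - fp * fn) / denominator

# Hypothetical one-vs-rest counts for a single class.
print(round(g_mean(40, 45, 5, 10), 4), round(mcc(40, 45, 5, 10), 4))
print(mcc(50, 50, 0, 0), mcc(0, 0, 50, 50))
```

The last line shows the two extremes of the MCC scale: a perfect classifier and one whose predictions are always inverted.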

4. Results and Discussion

This section presents a comprehensive evaluation of the predictive performance of the models developed for estimating road traffic injury counts at the segment level across three severity categories: fatal, serious, and slight injuries. The analysis emphasizes both predictive fidelity and interpretability by examining model evaluation metrics, including accuracy, precision, F1-score, MCC, G-Mean, MSE, RMSE, and coefficient of determination (R²), as well as feature importance rankings derived from the trained models. These insights collectively inform a thematic synthesis of risk factors and policy implications for data-driven road safety interventions.

4.1. Model Performance Overview

The analysis that follows delineates the predictive behavior of the developed models across the three crash-severity categories, providing a coherent and rigorous examination of model performance for fatal, serious, and slight injury counts.

4.1.1. Fatal Injury Model

The optimal hyperparameter configurations were obtained through the grid-search cross-validation process. For the fatal-injury model, the Random Forest achieved its best performance with max_features = sqrt, max_depth = 10, min_samples_leaf = 3, min_samples_split = 5, and n_estimators = 300, while Gradient Boosting reached optimal results with max_depth = 3, min_samples_split = 5, n_estimators = 200, learning_rate = 0.05, and subsample = 0.9. The KNN model yielded its best predictive performance with n_neighbors = 3. These optimal settings reflect a balance between model complexity and generalization capacity, ensuring that the models capture the nonlinear relationships inherent in fatal crash occurrence data without overfitting.

The Random Forest (RF) model demonstrated robust predictive capability for segment-level fatal crash occurrence, outperforming both the Gradient Boosting (GB) and K-Nearest Neighbors (KNN) models in the training phase (Table 5), where it achieved exceptionally high accuracy (0.991), precision (0.992), and F1-score (0.992), accompanied by a Matthews Correlation Coefficient (MCC) of 0.988 and a G-Mean of 0.988. These results indicate a high level of internal consistency and model stability, with minimal bias-variance tradeoff effects. On the test set, however, Gradient Boosting achieved the highest accuracy (0.946), precision (0.947), and F1-score (0.946), outperforming the RF model (accuracy = 0.937) and substantially exceeding KNN (accuracy = 0.722).

The GB model also yielded superior statistical indicators of fit, including a test (R²) of 0.867, signifying that approximately 87% of the variance in fatal crash frequency was successfully explained by the model. Its error measures, namely MSE (0.0896) and RMSE (0.2993), were notably lower than KNN (MSE = 0.4836, RMSE = 0.6954), further reinforcing the model’s precision in capturing nonlinear relationships within the data. The RF model exhibited comparable results in terms of (R²) (0.839) and RMSE (0.328), suggesting competitive performance, though the GB model’s interpretability and stability favored its selection as the optimal classifier.

The feature-importance assessment yielded substantive insights into the variables contributing most strongly to the predictive performance of the fatality-count model (Figure 2). Geometric design characteristics emerged as the most influential predictors within the model structure, with the direction of horizontal alignment (0.0988), the radius of horizontal curvature (0.0962), and the urban–rural section classification (0.0938) exhibiting the highest contributions to classification accuracy. This ordering indicates that roadway alignment and the contextual operating environment provide substantial discriminative information for distinguishing among fatality count levels, underscoring the relevance of geometric–operational context in shaping the model’s internal decision structure. The prominence of alignment direction and curvature is consistent with established safety literature, although in this case the model’s importance ranking reflects their predictive value rather than their causal effect on crash outcomes.

Bicycle traffic (0.0805) also demonstrated considerable importance to the model’s predictive capability. The relatively high contribution of this variable suggests that segments with elevated bicycle exposure display distinctive patterns that assist the classifier in differentiating fatal-count categories. Within motorized traffic classes, total single-unit bus flow (0.0788), tractor-trailer activity (0.0744), and roadway category classification (0.0744) collectively exhibited strong predictive contributions. These findings indicate that heavy-vehicle activity and functional roadway hierarchy provide meaningful signals to the model regarding the likelihood of observing higher fatality counts, aligning with broader empirical trends but remaining interpretive rather than causal within this modeling framework.

Additional predictors—including the number of traffic lanes (0.0633), articulated bus traffic (0.0550), motorcycle and moped volumes (0.0479), and trailer-based truck flows (0.0463)—further enhanced model accuracy through their representation of cross-sectional characteristics, traffic heterogeneity, and vulnerable-road-user dynamics. Heavy-truck (0.0410) and light-truck traffic (0.0391) displayed moderate importance, suggesting that while they contribute meaningfully to the predictive task, their informational value is secondary relative to the higher-ranked geometric and heavy-vehicle variables.

Lower-ranking features, such as capacity utilization (0.0327) and slow-vehicle or agricultural-tractor traffic (0.0271), still contributed to model performance but with diminished influence. These variables likely capture localized effects associated with variability in traffic flow and speed dispersion. Medium-heavy two-axle truck volumes (0.0237) and track configuration (0.0168) displayed similarly modest predictive contributions. In contrast, vertical-alignment features—including slope or gradient, type of vertical curve, and vertical-curve radius—exhibited minimal importance, indicating that these attributes provided only limited discriminative value for the fatality-count classification task within the context of this roadway network.

Overall, the feature-importance analysis highlights that the model predominantly relies on horizontal geometric configuration, roadway environment classification, and the composition and intensity of mixed traffic flows to differentiate between fatality-count levels. The relevance of heavy-vehicle exposure and vulnerable-road-user presence is evident in their predictive contributions, reflecting their influence on the underlying data patterns.
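
The importance values listed above can be tallied to see how much of the total they account for. The following is a minimal sketch, assuming the importances are impurity-based scores normalised to sum to 1 (the convention used by common tree-ensemble libraries; this section does not state it). The dictionary keys are shortened labels chosen here for illustration.

```python
# Feature importances for the fatality-count model, as reported in the text.
fatal_importance = {
    "direction of horizontal alignment": 0.0988,
    "radius of horizontal curvature": 0.0962,
    "urban-rural section classification": 0.0938,
    "bicycle traffic": 0.0805,
    "single-unit bus flow": 0.0788,
    "tractor-trailer activity": 0.0744,
    "roadway category": 0.0744,
    "number of traffic lanes": 0.0633,
    "articulated bus traffic": 0.0550,
    "motorcycle and moped volumes": 0.0479,
    "trailer-based truck flows": 0.0463,
    "heavy-truck traffic": 0.0410,
    "light-truck traffic": 0.0391,
    "capacity utilization": 0.0327,
    "slow-vehicle or agricultural-tractor traffic": 0.0271,
    "medium-heavy two-axle truck volumes": 0.0237,
    "track configuration": 0.0168,
}

# Rank features from most to least important.
ranked = sorted(fatal_importance.items(), key=lambda kv: kv[1], reverse=True)

listed_total = sum(fatal_importance.values())

# Assumption: importances sum to 1, so the remainder is what is left for the
# vertical-alignment features whose individual values are not reported.
residual = 1.0 - listed_total

# Combined share of the two horizontal-alignment features.
horizontal_share = (
    fatal_importance["direction of horizontal alignment"]
    + fatal_importance["radius of horizontal curvature"]
)

print(f"top feature: {ranked[0][0]}")
print(f"listed total: {listed_total:.4f}, residual: {residual:.4f}")
print(f"horizontal alignment share: {horizontal_share:.4f}")
```

If the normalisation assumption holds, the seventeen listed features account for roughly 0.99 of total importance, leaving about 0.01 for the three vertical-alignment features combined, which matches their description as minimally important.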

4.1.2. Serious Injury Model

In the serious-injury category, the Random Forest model achieved optimal performance with n_estimators = 300, max_depth = 20, max_features = sqrt, min_samples_leaf = 3, and min_samples_split = 5. The best-performing Gradient Boosting model used learning_rate = 0.05, max_depth = 3, min_samples_split = 5, n_estimators = 200, and subsample = 0.9, while KNN again demonstrated the highest stability with n_neighbors = 7. Table 6 summarizes the comparative results across the training and testing phases. During training, the RF model achieved an accuracy of 0.950, precision of 0.956, and F1-score of 0.951, with a Matthews Correlation Coefficient (MCC) of 0.944 and G-Mean of 0.971, indicating strong predictive coherence across class distributions. Its generalization capability remained robust in the test phase, with accuracy of 0.938, precision of 0.943, and F1-score of 0.939; its test accuracy exceeded that of both KNN (0.882) and GB (0.898).
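
For readers less familiar with the MCC and G-Mean statistics reported above, the sketch below computes both from a confusion matrix. It is a hedged illustration, not the authors' code: it assumes the multiclass generalisation of MCC and takes G-Mean to be the geometric mean of per-class recall, since this section does not specify which definitions were used. The confusion matrix is invented purely for demonstration.

```python
import math

def per_class_recall(cm):
    # cm[i][j] = number of samples with true class i predicted as class j.
    # Assumes every class has at least one true sample.
    return [row[i] / sum(row) for i, row in enumerate(cm)]

def g_mean(cm):
    # Geometric mean of the per-class recalls.
    recalls = per_class_recall(cm)
    return math.prod(recalls) ** (1.0 / len(recalls))

def mcc(cm):
    # Multiclass Matthews correlation coefficient from a confusion matrix.
    k = len(cm)
    s = sum(sum(row) for row in cm)                            # total samples
    c = sum(cm[i][i] for i in range(k))                        # correct predictions
    t = [sum(cm[i]) for i in range(k)]                         # true count per class
    p = [sum(cm[i][j] for i in range(k)) for j in range(k)]    # predicted count per class
    num = c * s - sum(pk * tk for pk, tk in zip(p, t))
    den = math.sqrt((s * s - sum(pk * pk for pk in p)) *
                    (s * s - sum(tk * tk for tk in t)))
    return num / den if den else 0.0

# Hypothetical three-class confusion matrix (illustrative numbers only).
example_cm = [
    [90, 8, 2],
    [5, 80, 15],
    [1, 9, 90],
]

print(f"MCC: {mcc(example_cm):.3f}, G-Mean: {g_mean(example_cm):.3f}")
```

Unlike accuracy, both statistics penalise a model that performs well on the majority class while neglecting minority classes, which is why they are informative for imbalanced crash-count data.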

Error-based measures further substantiate the Random Forest model’s superior performance. The test mean squared error (MSE = 0.186) and root mean squared error (RMSE = 0.431) were markedly lower than those of KNN (MSE = 0.578, RMSE = 0.761), reflecting more precise estimation of serious-injury counts. The coefficient of determination (R²) for the RF model was 0.99 on the test set, demonstrating that nearly 99% of the variability in serious-injury crash frequency was explained by the selected predictors. This degree of explanatory power emphasizes the model’s effectiveness in capturing nonlinear and interaction effects between geometric design and traffic flow variables.

Although Gradient Boosting produced slightly higher training R² (0.978) and test R² (0.973) values, its marginal advantage came at the cost of increased computational intensity. Therefore, considering predictive fidelity, generalization stability, and interpretability, the Random Forest model was identified as the most effective classifier for serious-injury crash prediction.

The feature-importance evaluation derived from the Gradient Boosting model (Figure 3) indicates that serious-injury counts are predicted through a multifaceted combination of traffic composition, operational characteristics, and roadway geometry. The model identified heavy truck traffic (0.080) as the most influential contributor to predictive performance, suggesting that segments with substantial high-mass vehicle presence exhibit distinctive data patterns that aid the classifier in differentiating serious-injury count levels. Light truck traffic (0.075) and trailer-based truck flows (0.071) also provided strong predictive value, reflecting the structural relevance of freight-related vehicle activity within the model, rather than implying a direct causal relationship with injury severity.

Road category (0.070) demonstrated similarly high importance, indicating that roadway functional classification offers meaningful discriminative information regarding the operating environment in which serious-injury crashes occur. Variables of moderate influence—including articulated bus traffic (0.063), slow-vehicle and agricultural-tractor flows (0.063), direction of horizontal alignment (0.062), tractor-trailer with semitrailer activity (0.060), total single-unit bus traffic (0.059), and capacity utilization (0.058)—collectively highlight the informational role of vehicle heterogeneity, horizontal geometry, and demand-related operational conditions in enhancing the model’s classification accuracy.

Additional predictors such as medium-heavy two-axle truck volumes (0.056), section type (0.056), motorcycle and moped traffic (0.056), bicycle traffic (0.050), and the radius of horizontal curves (0.045) contributed moderately to the model structure. Their importance indicates that roadway configuration, vulnerable road-user exposure, and curvature characteristics provide supplementary discriminative cues for distinguishing serious-injury levels, consistent with their roles in shaping the variability observed in the dataset.

Variables associated with administrative attributes and basic geometric features—including the number of lanes (0.036), slope or gradient (0.023), track code (0.018), and vertical-alignment parameters—exhibited minimal importance. Their limited contribution suggests that these features provided comparatively weak informational value for predicting serious-injury counts within the context of the studied network, though this result reflects model-based discrimination rather than the intrinsic safety relevance of the variables.

Overall, the feature-importance patterns indicate that the Gradient Boosting model relies predominantly on freight-related traffic activity, horizontal-alignment characteristics, operational demand, and vulnerable road-user presence to differentiate between serious-injury count levels. Conversely, vertical-alignment features and foundational cross-sectional descriptors played a secondary role in the predictive structure. These findings offer model-based insights into the types of roadway and traffic characteristics that most effectively support the classification of serious-injury outcomes.

4.1.3. Slight Injury Model

For the slight-injury model, the Random Forest model achieved optimal performance with n_estimators = 300, max_depth = 25, max_features = sqrt, min_samples_leaf = 2, and min_samples_split = 5, whereas Gradient Boosting performed best with max_depth = 3, n_estimators = 200, learning_rate = 0.05, min_samples_split = 5, and subsample = 0.9. The KNN classifier maintained its consistency with an optimal configuration of n_neighbors = 7.
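
To make the grid-search step concrete, the sketch below enumerates a candidate grid that contains the reported Random Forest optimum and counts the model fits an exhaustive search would need. Only the values in `rf_best` come from the text; the other candidate values and the fold count are assumptions made for illustration, as the actual search grid is not given in this section.

```python
from itertools import product

# Reported optimum for the slight-injury Random Forest (from the text above).
rf_best = {
    "n_estimators": 300,
    "max_depth": 25,
    "max_features": "sqrt",
    "min_samples_leaf": 2,
    "min_samples_split": 5,
}

# Hypothetical candidate grid; only the values in rf_best are from the paper.
rf_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [15, 20, 25],
    "max_features": ["sqrt", "log2"],
    "min_samples_leaf": [1, 2, 3],
    "min_samples_split": [2, 5],
}

def enumerate_grid(grid):
    # Cartesian product of all candidate values, one dict per configuration.
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]

candidates = enumerate_grid(rf_grid)

n_folds = 5  # assumed number of cross-validation folds
total_fits = len(candidates) * n_folds

print(f"{len(candidates)} configurations x {n_folds} folds = {total_fits} fits")
```

Even this small illustrative grid requires several hundred model fits, which shows why the search cost grows quickly as more hyperparameters or candidate values are added.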

The comparative evaluation of the three algorithms (Table 7) highlights the Random Forest model’s notable ability to balance predictive accuracy and generalization stability. During training, RF achieved an accuracy of 0.971, precision of 0.976, and F1-score of 0.972, accompanied by a Matthews Correlation Coefficient (MCC) of 0.968 and a G-Mean of 0.984. These values reflect the model’s strong discriminatory power across both classes of slight-injury crash presence and absence.

When applied to the test dataset, the RF model maintained high generalization capability, with accuracy (0.906), precision (0.911), and F1-score (0.907). Its performance was consistently superior to both KNN (accuracy = 0.839) and GB (accuracy = 0.716), reaffirming the robustness of the ensemble-based approach in managing the inherent nonlinearity and multicollinearity present in the data.

In terms of goodness-of-fit measures, RF attained a test R² of 0.949, indicating that approximately 95% of the variance in slight-injury crash counts was explained by the model. This is a substantial predictive capability given the stochastic nature of crash occurrence data. The model also achieved relatively low error indicators (MSE = 0.602, RMSE = 0.776), markedly outperforming KNN (MSE = 1.883, RMSE = 1.372) and GB (MSE = 3.315, RMSE = 1.821). Collectively, these findings affirm the Random Forest model’s capacity to provide stable and interpretable predictions for the counts of slight-injury crashes, a category often characterized by higher data variability and less distinct patterning than fatal or serious-injury cases.

The feature-importance results derived from the Random Forest model (Figure 4) reveal that slight-injury counts are primarily differentiated by operational load indicators and traffic-composition variables, supplemented by contributions from horizontal geometric characteristics. Capacity utilization exhibited the highest importance value (0.087), indicating that variation in demand intensity and near-capacity flow conditions provides substantial discriminative information for predicting slight-injury levels. This prominence reflects the model’s sensitivity to operational states in which increased interaction frequency and reduced speed variance are characteristic of environments where lower-severity crash outcomes are more prevalent. It is important to note, however, that this relationship emerges from predictive structure rather than from a mechanistic or causal interpretation.

Traffic-composition attributes formed the next major tier of influential predictors. Total single-unit bus traffic (0.073), heavy truck volumes (0.071), light truck traffic (0.071), and medium-heavy two-axle truck flows (0.070) all contributed markedly to the model’s performance. Their importance suggests that roadway segments with pronounced heavy-vehicle presence exhibit distinguishable patterns in slight-injury crash occurrences, enabling the classifier to more accurately separate injury-count classes. Similar predictive value was observed for tractor-trailer with semitrailer activity (0.065), road-category classification (0.064), and truck-trailer or semitrailer traffic (0.063), underscoring the informational relevance of freight-vehicle heterogeneity and roadway functional hierarchy in describing the operational contexts associated with lower-severity outcomes.

Vulnerable-road-user variables, including bicycle traffic (0.059) and motorcycle or moped activity (0.057), displayed moderate yet meaningful importance. Their contribution reflects the classifier’s ability to extract discriminative patterns from exposure to unprotected users, even when such interactions tend to align with lower-energy collisions characteristic of slight-injury events. Horizontal geometric attributes—the radius of horizontal curves (0.059) and the direction of horizontal alignment (0.054)—also demonstrated notable predictive value. These results indicate that lateral curvature and alignment orientation continue to inform the classification process, though their relative importance is reduced compared with their influence in more severe crash models.

Operational and traffic-mix variables, such as articulated bus traffic (0.050) and slow-vehicle or agricultural-tractor flows (0.048), provided additional discriminative insight. Their importance suggests that speed differentials introduced by slow-moving vehicles contribute to the variability that the model uses to differentiate slight-injury levels. Roadway environment indicators—including section type (0.041), number of lanes (0.038), and track code (0.022)—offered moderate predictive contribution, highlighting the supporting role of cross-sectional configuration and administrative classification in the model’s internal decision structure. Slope or gradient displayed limited importance (0.009), consistent with the tendency of vertical grade to exert a comparatively minor influence on the classification of low-severity outcomes within this context.

Vertical alignment descriptors, including the radius and type of vertical curves, exhibited negligible importance. Their minimal predictive contribution suggests that vertical geometry provides little discriminative value for slight-injury count prediction compared with the dominant effects of operational demand, heavy-vehicle composition, vulnerable-user exposure, and horizontal curvature characteristics.

Overall, the feature-importance analysis demonstrates that the slight-injury model predominantly relies on operational conditions, freight-vehicle activity, and selected horizontal geometric factors to distinguish between injury-count categories.

4.2. Cross-Model Insights and Thematic Synthesis

A comparative assessment of the three severity-specific models reveals several consistent predictive patterns, alongside notable differences in the variables contributing to the discrimination of fatal, serious-injury, and slight-injury count levels. Across all models, heavy-vehicle flows, including buses, tractor-trailers, and various truck classes, emerged as dominant predictors. Their recurrent importance reflects the strong discriminative value that freight-related and large-vehicle traffic volumes provide in distinguishing severity. These findings are consistent with previous studies indicating that heavy vehicles contribute disproportionately to crash severity due to their mass and kinetic energy, corroborating the predictive trends observed in European and North American road networks [69,70,71,72].

Horizontal alignment features, particularly curve radii and alignment direction, exhibited varying degrees of importance across severity levels. These features demonstrated higher predictive contribution in the slight-injury model and progressively lower contributions in the serious-injury and fatal-injury models. This gradient suggests that the informational value of geometric complexity differs across severity types, enabling the models to extract distinct patterns from horizontal-alignment variability. This observation is in agreement with prior empirical studies that report sharper curves and alignment irregularities as key contributors to minor injury crashes, whereas severe outcomes tend to be influenced more by traffic composition and operational context [73,74].

Capacity utilization displayed a comparatively strong contribution in the slight-injury and serious-injury models, indicating that interactions linked to traffic demand and congestion offer meaningful predictive information for moderate-severity outcomes. This result highlights the relevance of dynamic operational states, captured indirectly through demand-related variables, in providing supplementary discriminatory signals beyond those embedded in static geometric or traffic-volume indicators [62,65,75].

Additional observations show that vulnerable road users, including motorcycles, mopeds, and bicycles, are of moderate importance across all models, with a greater influence in fatal crashes. These patterns align with previous research indicating that crashes involving unprotected road users disproportionately contribute to fatalities, while functional road hierarchy is consistently associated with exposure-related severity trends [18,76]. Road category likewise appeared as a consistently informative predictor across severity levels, suggesting that functional classification encapsulates multiple contextual attributes (such as access control, expected speed environments, and modal interactions) that collectively enhance predictive accuracy [77,78,79,80].

Furthermore, segment-level counts of articulated buses and slow-moving or agricultural vehicles exhibited notable importance in both the Random Forest and Gradient Boosting models, despite their limited prevalence in the traffic flow. These findings indicate that, although rare, these vehicle classes carry substantial predictive information for severe crash counts. Ensemble methods can exploit such low-frequency yet informative patterns, where a small number of high-information splits significantly enhance predictive performance [81,82]. This behavior is consistent with previous traffic safety studies, which show that even relatively infrequent vehicle types can be involved in severe crashes when operating in particular contexts. For instance, articulated buses account for 39% of bus-involved collisions with cyclists in Germany, second only to city buses [83]. Their unique articulated design also poses challenges for autonomous vehicles: in March 2023, a Cruise self-driving car rear-ended an articulated municipal bus in San Francisco because the AI mispredicted the motion of the bus’s rear section, despite detecting it with sensors [84]. These cases demonstrate that rare or less common vehicle types can still contribute disproportionately to severe outcomes [85,86]. Similarly, studies of rural corridors and arterial systems in developing regions show that agricultural vehicles, although infrequent, frequently co-occur with roadway conditions predictive of high-severity crashes [87,88]. These insights collectively affirm the ability of ensemble models (RF and GB) to disentangle nuanced interactions among geometric, operational, and vehicular factors in the prediction of crash severity counts.

4.3. Policy and Planning Implications

The findings offer actionable insights for traffic safety policymakers and transportation engineers. The consistent association of heavy vehicle flows with all injury severities points to the need for:

Dedicated freight lanes and time-of-day restrictions for heavy vehicles, particularly along high-capacity corridors and primary arterials. Similar measures implemented in Germany and the Netherlands through dedicated freight corridors have demonstrated measurable reductions in heavy-vehicle conflict points and crash rates.
Enhanced driver visibility infrastructure, particularly at curves and intersections, where line-of-sight limitations exacerbate heavy-vehicle crash risks. Such interventions have proven effective in improving driver response times and reducing side-impact collisions on European rural arterials.
Vehicle-type-specific speed enforcement, which has shown positive outcomes in several European contexts. For instance, differentiated speed limits for heavy vehicles on rural expressways in Sweden and Finland have effectively reduced crash frequency and injury severity by mitigating speed dispersion between heavy and light vehicles.

Additionally, the strong influence of horizontal alignment supports geometric enhancements such as wider shoulders, super-elevation adjustments, and better signage at curves [89,90,91,92]. These design-oriented countermeasures are consistent with proven practices in European Vision Zero programs, where localized geometric corrections have contributed to substantial safety gains. The sensitivity of the prediction models to capacity utilization underscores the importance of real-time traffic management strategies, such as adaptive signal control, incident response systems, and dynamic lane allocation, to mitigate mid-severity crashes. Similar systems, implemented along European TEN-T (Trans-European Transport Network) corridors, have shown demonstrable reductions in mid-severity crashes by stabilizing flow conditions during congestion periods [93]. Collectively, the high performance of the Random Forest and Gradient Boosting models indicates their practical utility as decision-support tools in proactive road safety management. By integrating these evidence-based interventions within Hungary’s strategic safety framework, policymakers can enable data-informed resource allocation, enhance network resilience, and achieve measurable reductions in crash severity across diverse roadway environments.

5. Conclusions

This study developed and evaluated a data-driven predictive framework for estimating segment-level road traffic injury counts on the Hungarian national road network using three supervised machine learning algorithms: Random Forest (RF), Gradient Boosting (GB), and K-Nearest Neighbors (KNN). The framework integrated a comprehensive set of explanatory variables encompassing roadway geometric design, traffic flow characteristics, and vehicle-type composition, while addressing class imbalance using the Synthetic Minority Over-sampling Technique (SMOTE) and optimizing model configurations through grid-search cross-validation.

For the prediction of serious- and slight-injury crash counts, the Random Forest (RF) ensemble model demonstrated the most robust performance, consistently attaining test accuracies above 0.91 and coefficient of determination (R²) values exceeding 0.95. In contrast, for fatalities count prediction, the Gradient Boosting (GB) model achieved the highest accuracy (0.95), with an R² value greater than 0.87. The feature importance analysis revealed consistent and interpretable patterns across severity levels. Heavy vehicle flows, including buses and truck combinations, emerged as key predictors in all models, reaffirming their central role in shaping severe crash outcomes. Horizontal alignment features were most influential for fatal crashes, indicating that geometry governs crash occurrence, while capacity utilization was more prominent for slight and serious injuries, reflecting operational and congestion-related risk. Collectively, these findings highlight that traffic composition determines crash severity, geometry influences crash frequency, and operational factors modulate crash exposure dynamics.

From a practical standpoint, the results underscore the applicability of ensemble learning approaches as decision-support instruments for proactive road safety management. The models provide interpretable outputs that can inform infrastructure design, network monitoring, and safety audits, particularly for high-risk segments with elevated heavy vehicle exposure or geometric complexity.

Despite its robust performance, this study is subject to several limitations. First, the analysis relied on segment-level aggregated data, which constrains the capacity to capture driver-level behavioral heterogeneity, weather effects, or temporal dynamics. Second, while the Hungarian national road network provides a diverse dataset, the transferability of the model to other countries may be affected by contextual differences in traffic composition, enforcement practices, geometric design standards, and reporting accuracy.

Future research should aim to enhance model generalizability and spatial transferability by integrating multi-regional or cross-country datasets, enabling comparative validation across distinct roadway and traffic contexts. Incorporating temporal and environmental variables such as weather conditions, time-of-day, and seasonal patterns could further refine crash risk prediction accuracy. The integration of real-time traffic sensing data, such as probe vehicle trajectories or connected vehicle telemetry, would also allow for the development of dynamic crash risk forecasting systems. Moreover, the application of explainable methods, such as SHAP, is recommended to elucidate causal pathways and enhance model transparency for policy interpretation.

In conclusion, this study demonstrates that machine learning-based predictive models, particularly ensemble techniques, provide an effective, scalable, and interpretable foundation for understanding and mitigating road traffic injuries. The proposed framework contributes both methodologically and empirically to the advancement of data-driven road safety analytics, supporting the global pursuit of Vision Zero and the Sustainable Development Goal of reducing road fatalities and serious injuries through intelligent, evidence-based transport system design.

Author Contributions

Conceptualization, N.H. and T.S.; methodology, N.H.; software, N.H.; validation, N.H. and T.S.; formal analysis, N.H.; investigation, N.H.; resources, N.H.; data curation, N.H. and T.S.; writing—original draft preparation, N.H.; writing—review and editing, N.H. and T.S.; visualization, N.H.; supervision, T.S.; project administration, T.S.; All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available on request from the corresponding author.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT-Open AI 5.1, and Grammarly for the purposes of improving the language of this research. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

Author Tibor Sipos was employed by the company KTI Hungarian Institute for Transport Sciences Nonprofit Ltd. All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Al-lami, A.; Török, Á. Assessing sustainability indicators of public transportation using PAHP. Sustain. Futures 2025, 9, 100500.
Sipos, T.; Afework Mekonnen, A.; Szabó, Z. Spatial Econometric Analysis of Road Traffic Crashes. Sustainability 2021, 13, 2492.
Ötvös, V.; Török, Á. Measurement of Accident Risk and a Case Study from Hungary. Period. Polytech. Transp. Eng. 2024, 52, 159–165.
Al-lami, A.; Török, Á. Regional forecasting of driving forces of CO2 emissions of transportation in Central Europe: An ARIMA-based approach. Energy Rep. 2025, 13, 1215–1224.
Jima, D.; Sipos, T. The Impact of Road Geometric Formation on Traffic Crash and Its Severity Level. Sustainability 2022, 14, 8475.
Pei, Y.; Hou, L. Safety Assessment and Risk Management of Urban Arterial Traffic Flow Based on Artificial Driving and Intelligent Network Connection: An Overview. Arch. Comput. Methods Eng. 2024, 31, 2925–2943.
Lord, D.; Mannering, F. The statistical analysis of crash-frequency data: A review and assessment of methodological alternatives. Transp. Res. Part A Policy Pract. 2010, 44, 291–305.
Mannering, F.L.; Bhat, C.R. Analytic methods in accident research: Methodological frontier and future directions. Anal. Methods Accid. Res. 2014, 1, 1–22.
Maji, A.; Ghosh, I. A systematic review on roundabout safety incorporating the safety assessment methodologies, data collection techniques, and driver behavior. Saf. Sci. 2025, 181, 106661.
Khattak, M.W.; De Backer, H.; De Winne, P.; Brijs, T.; Pirdavani, A. Analysis of Road Infrastructure and Traffic Factors Influencing Crash Frequency: Insights from Generalised Poisson Models. Infrastructures 2024, 9, 47.
Al-Mahamid, H.; Al-Nabulsi, D.; Torok, A. Developing safety performance functions incorporating pavement roughness using Poisson regression and Machine learning models on Jordan’s Desert Highway. Transp. Res. Interdiscip. Perspect. 2025, 34, 101659.
Hamdan, N.; Sipos, T. Classification of Traffic Accident Severity Using Machine Learning Models. In Proceedings of the 2nd Cognitive Mobility Conference, Budapest, Hungary, 19–20 October 2025; pp. 177–186.
Hamdan, N.; Sipos, T. Traffic Accidents Severity Prediction Using Support Vector Machine Models. In Proceedings of the 3rd Cognitive Mobility Conference, Budapest, Hungary, 7–8 October 2024; pp. 153–161.
Jamal, A.; Zahid, M.; Tauhidur Rahman, M.; Al-Ahmadi, H.M.; Almoshaogeh, M.; Farooq, D.; Ahmad, M. Injury severity prediction of traffic crashes with ensemble machine learning techniques: A comparative study. Int. J. Inj. Control Saf. Promot. 2021, 28, 408–427.
Yu, R.; Abdel-Aty, M. Utilizing support vector machine in real-time crash risk evaluation. Accid. Anal. Prev. 2013, 51, 252–259.
Altaf, I.; Kaul, A. Classifying victim degree of injury in road traffic accidents: A novel stacked DCL-X approach. Multimed. Tools Appl. 2024, 83, 66691–66723.
Wen, X.; Xie, Y.; Jiang, L.; Pu, Z.; Ge, T. Applications of machine learning methods in traffic crash severity modelling: Current status and future directions. Transp. Rev. 2021, 41, 855–879.
Macioszek, E.; Granà, A. The Analysis of the Factors Influencing the Severity of Bicyclist Injury in Bicyclist-Vehicle Crashes. Sustainability 2021, 14, 215.
Santos, D.; Saias, J.; Quaresma, P.; Nogueira, V.B. Machine Learning Approaches to Traffic Accident Analysis and Hotspot Prediction. Computers 2021, 10, 157.
Boo, Y.; Choi, Y. Comparison of mortality prediction models for road traffic accidents: An ensemble technique for imbalanced data. BMC Public Health 2022, 22, 1476.
Kuo, P.-F.; Hsu, W.-T.; Lord, D.; Putra, I.G.B. Classification of autonomous vehicle crash severity: Solving the problems of imbalanced datasets and small sample size. Accid. Anal. Prev. 2024, 205, 107666.
Almahdi, A.; Al Mamlook, R.E.; Bandara, N.; Almuflih, A.S.; Nasayreh, A.; Gharaibeh, H.; Alasim, F.; Aljohani, A.; Jamal, A. Boosting Ensemble Learning for Freeway Crash Classification under Varying Traffic Conditions: A Hyperparameter Optimization Approach. Sustainability 2023, 15, 15896.
Aziz, K.; Chen, F.; Khan, I.; Hussain Khahro, S.; Malik, M.A.; Ahmed Memon, Z.; Khattak, A. Road Traffic Crash Severity Analysis: A Bayesian-Optimized Dynamic Ensemble Selection Guided by Instance Hardness and Region of Competence Strategy. IEEE Access 2024, 12, 139540–139559.
Azimian, A.; Dimitra Pyrialakou, V.; Lavrenz, S.; Wen, S. Exploring the effects of area-level factors on traffic crash frequency by severity using multivariate space-time models. Anal. Methods Accid. Res. 2021, 31, 100163.
Mussone, L.; Bassani, M.; Masci, P. Analysis of factors affecting the severity of crashes in urban road intersections. Accid. Anal. Prev. 2017, 103, 112–122.
Manirul Islam, S.; Washington, S.; Kim, J.; Haque, M. A comprehensive analysis on the effects of signal strategies, intersection geometry, and traffic operation factors on right-turn crashes at signalised intersections: An application of hierarchical crash frequency model. Accid. Anal. Prev. 2022, 171, 106663.
Grigorev, A.; Mihaita, A.-S.; Chen, F.; Truong, L. Traffic Incident Duration Prediction: A Systematic Review of Techniques. J. Adv. Transp. 2024, 2024, 3748345.
Kitali, A.E.; Mokhtarimousavi, S.; Kadeha, C.; Alluri, P. Severity analysis of crashes on express lane facilities using support vector machine model trained by firefly algorithm. Traffic Inj. Prev. 2020, 22, 79–84.
Hamdan, N.; Sipos, T. Advancements in Machine Learning for Traffic Accident Severity Prediction: A Comprehensive Review. Period. Polytech. Transp. Eng. 2025, 53, 347–355.
Cai, Q.; Abdel-Aty, M.; Yuan, J.; Lee, J.; Wu, Y. Real-time crash prediction on expressways using deep generative models. Transp. Res. Part C Emerg. Technol. 2020, 117, 102697.
Chen, J.; Pu, Z.; Zheng, N.; Wen, X.; Ding, H.; Guo, X. A novel generative adversarial network for improving crash severity modeling with imbalanced data. Transp. Res. Part C Emerg. Technol. 2024, 164, 104642.
Al-Yarimi, F.A.M. Enhancing road safety through advanced predictive analytics in V2X communication networks. Comput. Electr. Eng. 2024, 115, 109134.
Yang, J.; Han, S.; Chen, Y.; Ghosh, I. Prediction of Traffic Accident Severity Based on Random Forest. J. Adv. Transp. 2023, 2023, 7641472.
Islam, M.K.; Reza, I.; Gazder, U.; Akter, R.; Arifuzzaman, M.; Rahman, M.M. Predicting Road Crash Severity Using Classifier Models and Crash Hotspots. Appl. Sci. 2022, 12, 11354.
Rahim, M.A.; Hassan, H.M. A deep learning based traffic crash severity prediction framework. Accid. Anal. Prev. 2021, 154, 106090.
Dadashova, B.; Arenas-Ramires, B.; Mira-McWillaims, J.; Dixon, K.; Lord, D. Analysis of crash injury severity on two trans-European transport network corridors in Spain using discrete-choice models and random forests. Traffic Inj. Prev. 2020, 21, 228–233. [Google Scholar] [CrossRef]
Yassin, S.S.; Pooja. Road accident prediction and model interpretation using a hybrid K-means and random forest algorithm approach. SN Appl. Sci. 2020, 2, 1576. [Google Scholar] [CrossRef]
Atumo, E.A.; Fang, T.; Jiang, X. Spatial statistics and random forest approaches for traffic crash hot spot identification and prediction. Int. J. Inj. Control Saf. Promot. 2021, 29, 207–216. [Google Scholar] [CrossRef]
Akin, D.; Sisiopiku, V.P.; Alateah, A.H.; Almonbhi, A.O.; Al-Tholaia, M.M.H.; Al-Sodani, K.A.A. Identifying Causes of Traffic Crashes Associated with Driver Behavior Using Supervised Machine Learning Methods: Case of Highway 15 in Saudi Arabia. Sustainability 2022, 14, 16654. [Google Scholar] [CrossRef]
Gatera, A.; Kuradusenge, M.; Bajpai, G.; Mikeka, C.; Shrivastava, S. Comparison of random forest and support vector machine regression models for forecasting road accidents. Sci. Afr. 2023, 21, e01739. [Google Scholar] [CrossRef]
Nikolaou, D.; Ziakopoulos, A.; Dragomanovits, A.; Roussou, J.; Yannis, G. Comparing Machine Learning Techniques for Predictions of Motorway Segment Crash Risk Level. Safety 2023, 9, 32. [Google Scholar] [CrossRef]
Ahmed, S.; Hossain, M.A.; Ray, S.K.; Bhuiyan, M.M.I.; Sabuj, S.R. A study on road accident prediction and contributing factors using explainable machine learning models: Analysis and performance. Transp. Res. Interdiscip. Perspect. 2023, 19, 100814. [Google Scholar] [CrossRef]
Wang, X.; Su, Y.; Zheng, Z.; Xu, L. Prediction and interpretive of motor vehicle traffic crashes severity based on random forest optimized by meta-heuristic algorithm. Heliyon 2024, 10, e35595. [Google Scholar] [CrossRef] [PubMed]
Uzunov, H.V.; Matzinski, P.G.; Uzunov, V.H.; Dechkova, S.V. Comparative Analysis of the Proportional Distribution Method and the Random Forest Algorithm for Predicting Pedestrian Traffic Accident Risk. IEEE Access 2025, 13, 129828–129844. [Google Scholar] [CrossRef]
Daoud, R.; Vechione, M.; Gurbuz, O.; Sundaravadivel, P.; Tian, C.J.V. Comparison of machine learning models to predict nighttime crash severity: A case study in Tyler, Texas, USA. Vehicles 2025, 7, 20. [Google Scholar] [CrossRef]
AlKheder, S.; Gharabally, H.A.; Mutairi, S.A.; Mansour, R.A. An Impact study of highway design on casualty and non-casualty traffic accidents. Injury 2022, 53, 463–474.
Li, J.; Li, C.; Zhao, X. Optimizing crash risk models for freeway segments: A focus on the heterogeneous effects of road geometric design features, traffic operation status, and crash units. Accid. Anal. Prev. 2024, 205, 107665.
Vayalamkuzhi, P.; Amirthalingam, V. Influence of geometric design characteristics on safety under heterogeneous traffic flow. J. Traffic Transp. Eng. 2016, 3, 559–570.
Zhao, J.; Guo, Y.; Liu, P. Safety impacts of geometric design on freeway segments with closely spaced entrance and exit ramps. Accid. Anal. Prev. 2021, 163, 106461.
Jaber, A.; Juhász, J.; Csonka, B. An analysis of factors affecting the severity of cycling crashes using binary regression model. Sustainability 2021, 13, 6945.
Jaber, A.; Csonka, B. Towards a sustainable and safe future: Mapping bike accidents in urbanized context. Safety 2023, 9, 60.
Sánta, E.; Szűcs, P.; Patocskai, G.; Lakatos, I. Prevalence and Characteristics of Traffic Accidents Endangering Vulnerable Pedestrians in Hungary. Eng. Proc. 2024, 79, 94.
Cantisani, G.; Del Serrone, G.; Mauro, R.; Peluso, P.; Pompigna, A. From Radar Sensor to Floating Car Data: Evaluating Speed Distribution Heterogeneity on Rural Road Segments Using Non-Parametric Similarity Measures. Sci 2024, 6, 52.
Faruga, Ł.; Filapek, A.; Kraszewska, M.; Baranowski, J. Dataset for Traffic Accident Analysis in Poland: Integrating Weather Data and Sociodemographic Factors. Appl. Sci. 2025, 15, 7362.
Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995; pp. 278–282.
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
Zhang, W.; Wang, K.; Wang, S.; Jiang, Z.; Mondschein, A.; Noland, R.B. Synthesizing neighborhood preferences for automated vehicles. Transp. Res. Part C Emerg. Technol. 2020, 120, 102774.
Li, K.; Xu, H.; Liu, X. Analysis and visualization of accidents severity based on LightGBM-TPE. Chaos Solitons Fractals 2022, 157, 111987.
Gou, J.; Du, L.; Zhang, Y.; Xiong, T. A new distance-weighted k-nearest neighbor classifier. J. Inf. Comput. Sci. 2012, 9, 1429–1436.
Adeel, M.; Khattak, A.J.; Mishra, S.; Thapa, D. Enhancing work zone crash severity analysis: The role of synthetic minority oversampling technique in balancing minority categories. Accid. Anal. Prev. 2024, 208, 107794.
Alrumaidhi, M.; Farag, M.M.G.; Rakha, H.A. Comparative Analysis of Parametric and Non-Parametric Data-Driven Models to Predict Road Crash Severity among Elderly Drivers Using Synthetic Resampling Techniques. Sustainability 2023, 15, 9878.
Gong, X.; Bo, W.; Chen, F.; Wu, X.; Zhang, X.; Li, D.; Gou, F.; Ren, H. Safety Evaluation of Highways with Sharp Curves in Highland Mountainous Areas Using an Enhanced Stacking and Low-Cost Dataset Production Method. Sustainability 2025, 17, 5857.
Khan, W.A.; Moomen, M.; Rahman, M.A.; Terkper, K.A.; Codjoe, J.; Gopu, V. Predicting Crash-Related Incident Clearance Time on Louisiana’s Rural Interstate Using Ensemble Tree-Based Learning Methods. Appl. Sci. 2024, 14, 10964.
Vincent, A.M.; Jidesh, P. An improved hyperparameter optimization framework for AutoML systems using evolutionary algorithms. Sci. Rep. 2023, 13, 4737.
Alotaibi, J. Enhancing Traffic Accident Severity Prediction: Feature Identification Using Explainable AI. Vehicles 2025, 7, 38.
Aldhari, I.; Almoshaogeh, M.; Jamal, A.; Alharbi, F.; Alinizzi, M.; Haider, H. Severity Prediction of Highway Crashes in Saudi Arabia Using Machine Learning Techniques. Appl. Sci. 2022, 13, 233.
Kim, S.; Lym, Y.; Kim, K.-J. Developing Crash Severity Model Handling Class Imbalance and Implementing Ordered Nature: Focusing on Elderly Drivers. Int. J. Environ. Res. Public Health 2021, 18, 1966.
Skaug, L.; Nojoumian, M.; Dang, N.; Yap, A. Road Crash Analysis and Modeling: A Systematic Review of Methods, Data, and Emerging Technologies. Appl. Sci. 2025, 15, 7115.
Chen, M.-M.; Chen, M.-C. Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest. Information 2020, 11, 270.
Du, X.; Wang, G. Analysis of Operating Safety of Tractor-Trailer under Crosswind in Cold Mountainous Areas. Appl. Sci. 2022, 12, 12755.
Montoya-Alcaraz, M.; Mungaray-Moctezuma, A.; Calderón-Ramírez, J.; García, L.; Martinez-Lazcano, C. Road Safety Analysis of High-Risk Roads: Case Study in Baja California, México. Safety 2020, 6, 45.
Shahdah, U.E.; Alanazi, F.; Azam, A.; Elbany, M. Safety and Mobility Performance Comparison of Two-Plus-One and Two-Lane Two-Way Roads: A Simulation Study. Appl. Sci. 2024, 14, 4352.
Al-Sheikh, O.; Ghasemi, S.H.; Jalayer, M. Reliability-based analysis of horizontal curve design by evaluating the impact of vehicle automation on roadway departure crashes and safety performance. Heliyon 2024, 10, e25346.
Ma, Q.; Yang, H.; Wang, Z.; Xie, K.; Yang, D. Modeling crash risk of horizontal curves using large-scale auto-extracted roadway geometry data. Accid. Anal. Prev. 2020, 144, 105669.
Samerei, S.A.; Aghabayk, K.; Montella, A. Analyzing Pile-Up Crash Severity: Insights from Real-Time Traffic and Environmental Factors Using Ensemble Machine Learning and Shapley Additive Explanations Method. Safety 2024, 10, 22.
Wang, M.-H. Investigating the Difference in Factors Contributing to the Likelihood of Motorcyclist Fatalities in Single Motorcycle and Multiple Vehicle Crashes. Int. J. Environ. Res. Public Health 2022, 19, 8411.
Champahom, T.; Se, C.; Aryuyo, F.; Banyong, C.; Jomnonkwao, S.; Ratanavaraha, V. Crash Severity Analysis of Young Adult Motorcyclists: A Comparison of Urban and Rural Local Roadways. Appl. Sci. 2023, 13, 11723.
Huang, H.; Ding, X.; Yuan, C.; Liu, X.; Tang, J. Jointly analyzing freeway primary and secondary crash severity using a copula-based approach. Accid. Anal. Prev. 2023, 180, 106911.
Kumara, S.D.D.R.; Walgampaya, C.K. Identification of Severity Factors and Risk Areas of Southern Expressway Accidents. Eng. J. Inst. Eng. Sri Lanka 2021, 54, 61–75.
Wei, X.; Tian, S.; Dai, Z.; Li, P. Statistical Analysis of Major and Extra Serious Traffic Accidents on Chinese Expressways from 2011 to 2021. Sustainability 2022, 14, 15776.
Rajbahadur, G.K.; Wang, S.; Oliva, G.A.; Kamei, Y.; Hassan, A.E. The Impact of Feature Importance Methods on the Interpretation of Defect Classifiers. IEEE Trans. Softw. Eng. 2022, 48, 2245–2261.
Zhang, X.; Waller, S.T.; Jiang, P. An ensemble machine learning-based modeling framework for analysis of traffic crash frequency. Comput.-Aided Civ. Infrastruct. Eng. 2019, 35, 258–276.
Schindler, R.; Jeppsson, H. In-depth analysis of scenarios and injuries in crashes between cyclists and commercial vehicles in Germany. Traffic Saf. Res. 2024, 7, e000067.
Cummings, M.L. Identifying AI Hazards and Responsibility Gaps. IEEE Access 2025, 13, 54338–54349.
Liu, Q.; Zhang, C.; Gordon, T.J.; Wang, J. Dynamics and control of articulated passenger vehicles on roads. Veh. Syst. Dyn. 2025, 63, 1395–1457.
Useche, S.A.; Cendales, B.; Alonso, F.; Montoro, L. Multidimensional prediction of work traffic crashes among Spanish professional drivers in cargo and passenger transportation. Int. J. Occup. Saf. Ergon. 2020, 28, 20–27.
Franklin, R.C.; King, J.C.; Riggs, M. A Systematic Review of Large Agriculture Vehicles Use and Crash Incidents on Public Roads. J. Agromed. 2019, 25, 14–27.
McFalls, M.; Ramirez, M.; Harland, K.; Zhu, M.; Morris, N.L.; Hamann, C.; Peek-Asa, C. Farm vehicle crashes on public roads: Analysis of farm-level factors. J. Rural Health 2021, 38, 537–545.
De Santos-Berbel, C.; Ferreira, S.; Couto, A.; Lobo, A. Development of Motorway Horizontal Alignment Databases for Accurate Accident Prediction Models. Sustainability 2024, 16, 7296.
Jeon, H.; Benekohal, R.F. Speed and Lane Change Management Strategies for CAV in Mixed Traffic for Post-Incident Operation. Future Transp. 2025, 5, 51.
Pei, Y.-L.; He, Y.-M.; Ran, B.; Kang, J.; Song, Y.-T. Horizontal Alignment Security Design Theory and Application of Superhighways. Sustainability 2020, 12, 2222.
Wu, X.; Chen, F.; Bo, W.; Shuai, Y.; Zhang, X.; Da, W.; Liu, H.; Chen, J. Analysis of Factors Influencing Driving Safety at Typical Curve Sections of Tibet Plateau Mountainous Areas Based on Explainability-Oriented Dynamic Ensemble Learning Strategy. Sustainability 2025, 17, 7820.
Rehak, D.; Vlkovsky, M.; Manas, P.; Apeltauer, J.; Apeltauer, T.; Hromada, M. Sustainability of the Trans-European Transport Networks Land Infrastructure to Address Large-Scale Disasters: A Case Study in the Czech Republic. Sustainability 2025, 17, 2509.

Figure 1. Research flowchart.

Figure 2. Feature Importance (Gradient Boosting Fatalities Model).

Figure 3. Feature Importance (Serious Injuries Random Forest Model).

Figure 4. Feature Importance (Slight Injuries Model).

Table 1. Previous Relevant Studies.

Authors	Description	Methods	Key Findings
Dadashova et al. (2020) [36]	Investigated crash injury severity using discrete-choice models and Random Forest on Spanish trans-European corridors.	Logistic regression and RF, with crash types disaggregated by roadway, driver, and environmental variables.	Roadway design elements (curvature, superelevation, lane width) were significant predictors. Logistic regression highlighted conditional effects by crash type, whereas RF identified critical factors across crash categories.
Yassin (2020) [37]	Developed a hybrid framework integrating K-means clustering with Random Forest for crash severity prediction	K-means was used to extract hidden features; RF was employed for classification against alternative classifiers.	The hybrid approach achieved outstanding accuracy (99.86%). Driver experience, lighting conditions, driver age, and vehicle service year emerged as dominant factors in predicting severity outcomes.
Atumo et al. (2021) [38]	Applied spatial statistics and Random Forest to traffic crash hot spot identification.	Getis-Ord statistics for spatial clustering complemented with RF-based crash prediction using 2010–2017 data.	Identified crash hot spots on interstate routes; RF achieved validation and prediction accuracy of 76.7% and 74%, respectively. Results highlighted the spatial dependence of crash distributions and confirmed predictive robustness.
Akin et al. (2022) [39]	Identified behavioral causes of crashes on Saudi Arabian highways using supervised ML.	Logistic regression, RF, and KNN applied to driver error-related crashes on the highway.	RF and logistic regression achieved the highest accuracy (78.7%), with RF attaining the largest AUC (0.712). Findings revealed that traffic flow speed and lane count reduced driver error–related crashes, while higher AADT and curve sections increased risk.
Gatera et al. (2023) [40]	Compared Random Forest and Support Vector Machine regression models for short-term road crash forecasting	RF and SVM were evaluated using error indices (MAE, MSE, RMSE) and R² values.	RF demonstrated superior predictive capacity (R² = 0.91) compared to SVM (R² = 0.86), reinforcing the promise of ML in traffic crash forecasting.
Nikolaou et al. (2023) [41]	Compared ML techniques for motorway segment crash risk prediction	Logistic Regression, Decision Tree, RF, SVM, and kNN applied to road design and naturalistic driving datasets.	RF achieved the highest accuracy (89.3%) and superior precision-recall-F1 balance. Shapley additive explanations enhanced interpretability, highlighting RF’s effectiveness for crash risk assessment.
Yang et al. (2023) [33]	Proposed a Random Forest based framework for predicting crash severity using enriched feature sets	RF compared with BP neural network, SVM, and radial basis neural network; feature importance ranking applied to 12 variables.	RF outperformed other models, achieving higher recall (0.83) and F1 scores, with a lower false alarm rate. Results confirmed its reliability and stability in severity prediction.
Ahmed et al. (2023) [42]	Examined crash prediction and contributing factors using explainable ensemble machine learning models.	Random Forest (RF), Decision Jungle, AdaBoost, XGBoost, LightGBM, and CatBoost, with interpretability through SHAP analysis.	RF achieved the highest predictive accuracy (81.45%), with road category and number of vehicles identified as the most influential factors affecting injury severity.
Wang et al. (2024) [43]	Developed a meta-heuristic optimized Random Forest model for traffic crash severity prediction.	Compared nine meta-heuristic RF variants (e.g., CPO-RF, SSA-RF) against standard ensemble and single classifiers using U.S. crash data.	CPO-RF yielded superior accuracy (95.2%) and F1 scores exceeding 90%. Application of inverse SMOTE improved accuracy to 99.6%. Key predictors included temperature, weather, pressure, GDP, population density, and time of day.
Uzunov et al. (2025) [44]	Comparative analysis of proportional distribution methods and Random Forest for pedestrian crash risk prediction	Proportional risk distribution and RF applied to data derived from court cases, with risk factors quantified through expert evaluation.	Both approaches were valid; however, RF provided superior accuracy and robustness. Significant correlation between methods confirmed validity, with graphical visualizations aiding the interpretability of risk severity.
Daoud et al. (2025) [45]	Examined nighttime crash severity and the role of roadway illumination	Developed and compared seven machine learning models (logistic regression, k-NN, naïve Bayes, random forest, ANN, XGBoost, LSTM) using TxDOT crash data.	The random forest model produced the most promising results by predicting severe crashes with 97.6% accuracy.

Table 2. Summary of Key Segment-Level Variables by Injury Severity.

Variable Category	Representative Variable	Fatal	Serious	Slight
Track Code	Undivided	83%	84%	85%
	Left track	9%	9%	8%
	Right track	8%	7%	7%
Road Category	Motorway	18%	16%	14%
	Expressway	4%	5%	5%
	Primary Main Road	22%	25%	26%
	Secondary Main Road	57%	54%	55%
Section Type	Rural	81%	62%	63%
Section Type	Urban	19%	38%	37%
Number of Lanes	Two lanes	85%	83%	83%
Horizontal Alignment	Straight	48%	44%	46%
Horizontal Alignment	Right/Left	34%	36%	35%

Table 3. Dataset Variables Overview.

Category	Variable Name	Description
Road Attributes	Track Code	(0: undivided, 1: left track, 2: right track).
	Road Category	(1: motorway, 2: expressway, 3: primary main road, 4: secondary main road).
	Section Type	(1: rural or 2: urban).
	Number of Traffic Lanes	Numeric
	Radius of Horizontal Curve	(In meters).
	Direction of Horizontal Curve	(1: right, 2: left, 3: straight).
	Slope/Gradient	(In %).
	Type of Vertical Curve	Type of vertical alignment curve (e.g., crest, sag).
	Radius of Vertical Curve	(In meters).
Traffic Attributes	Capacity Utilization	Ratio of AADT to road capacity (%).
	Heavy Truck Traffic	Numeric
	Medium Heavy (2-Axle) Truck Traffic	Numeric
	Truck with Trailer or Semi-Trailer Traffic	Numeric
	Tractor-Trailer with Semi-Trailer Traffic	Numeric
	Light Truck Traffic	Numeric
	Total Bus (Single) Traffic	Numeric
	Bus (Articulated) Traffic	Numeric
	Motorcycle and Moped Traffic	Numeric
	Bicycle Traffic	Numeric
	Slow Vehicle and Agricultural Tractor Traffic	Numeric
Derived Variables	Severity	Target variable: crash severity counts (slight injuries, serious injuries, fatalities).

Table 4. Hyperparameter optimization: search values per model.

Model	Hyperparameter	Type	Search Range/Values
Random Forest	n_estimators	Discrete	{100, 200, 300}
	max_depth	Discrete	{10, 15, 20, 25}
	min_samples_split	Discrete	{2, 5, 10}
	min_samples_leaf	Discrete	{2, 3, 4}
	max_features	Categorical	{‘sqrt’, ‘log2’}
K-Nearest Neighbors	n_neighbors	Discrete	{3, 5, 7}
Gradient Boosting	n_estimators	Discrete	{100, 150, 200}
	learning_rate	Continuous	{0.01, 0.05, 0.1}
	max_depth	Discrete	{2, 3, 5}
	min_samples_split	Discrete	{2, 5, 10}
	subsample	Continuous	{0.7, 0.9, 1.0}
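
The search spaces in Table 4 can be written down as scikit-learn parameter grids. The following is a minimal sketch rather than the authors' code: it assumes the models are fitted as classifiers over count labels (the tables report accuracy and F1), and the helper names `make_estimator` and `tune`, the 5-fold setting, the `accuracy` scoring, and the random seed are illustrative assumptions that do not come from the article.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Search values copied from Table 4.
PARAM_GRIDS = {
    "RF": {
        "n_estimators": [100, 200, 300],
        "max_depth": [10, 15, 20, 25],
        "min_samples_split": [2, 5, 10],
        "min_samples_leaf": [2, 3, 4],
        "max_features": ["sqrt", "log2"],
    },
    "KNN": {"n_neighbors": [3, 5, 7]},
    "GB": {
        "n_estimators": [100, 150, 200],
        "learning_rate": [0.01, 0.05, 0.1],
        "max_depth": [2, 3, 5],
        "min_samples_split": [2, 5, 10],
        "subsample": [0.7, 0.9, 1.0],
    },
}


def make_estimator(name, random_state=42):
    """Return an untuned estimator for a Table 4 model key (hypothetical helper)."""
    if name == "RF":
        return RandomForestClassifier(random_state=random_state)
    if name == "GB":
        return GradientBoostingClassifier(random_state=random_state)
    if name == "KNN":
        return KNeighborsClassifier()
    raise ValueError(f"unknown model key: {name}")


def tune(name, X, y, param_grid=None, cv=5, scoring="accuracy"):
    """Exhaustive grid search with k-fold cross-validation.

    cv and scoring are assumptions; the article only states that
    grid-search cross-validation was used.
    """
    grid = PARAM_GRIDS[name] if param_grid is None else param_grid
    search = GridSearchCV(make_estimator(name), grid, cv=cv, scoring=scoring)
    search.fit(X, y)
    return search
```

The full Random Forest grid has 3 × 4 × 3 × 3 × 2 = 216 combinations and the Gradient Boosting grid has 3⁵ = 243, so each full search fits that many models per fold. Passing a smaller `param_grid` is a convenient way to test the pipeline first; the fitted object exposes `best_params_` and `best_score_`.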

Table 5. Performance metrics of the prediction models for fatality counts.

	Train			Test
Metric	RF	KNN	GB	RF	KNN	GB
Accuracy	0.9918	0.8499	0.9895	0.9373	0.7224	0.9463
Precision	0.9920	0.8504	0.9897	0.9381	0.7296	0.9471
F1-Score	0.9918	0.8487	0.9895	0.9371	0.7168	0.9460
MCC	0.9878	0.7762	0.9844	0.9065	0.5899	0.9200
G-Mean	0.9938	0.8866	0.9922	0.9528	0.7886	0.9596
MSE	0.0082	0.2801	0.0149	0.1075	0.4836	0.0896
RMSE	0.0906	0.5292	0.1222	0.3278	0.6954	0.2993
R²	0.9877	0.5801	0.9776	0.8386	0.2735	0.8655
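
The metrics reported in Tables 5–7 mix classification scores (accuracy, precision, F1, MCC, G-mean) with error measures on the predicted counts (MSE, RMSE, R²). The sketch below shows one way to compute all eight from a vector of true and predicted count labels. It is not the authors' evaluation code: the `evaluate` helper is hypothetical, weighted averaging for precision and F1 is an assumption, and G-mean is taken here as the geometric mean of per-class recalls, which may differ from the definition used in the article.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    matthews_corrcoef,
    mean_squared_error,
    precision_score,
    r2_score,
    recall_score,
)


def evaluate(y_true, y_pred):
    """Compute the eight metrics of Tables 5-7 (hypothetical helper)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    labels = np.unique(y_true)

    # Per-class recall, used for the G-mean (assumed definition).
    recalls = recall_score(
        y_true, y_pred, labels=labels, average=None, zero_division=0
    )
    mse = mean_squared_error(y_true, y_pred)

    return {
        "accuracy": accuracy_score(y_true, y_pred),
        # Weighted averaging is an assumption, not stated in the tables.
        "precision": precision_score(
            y_true, y_pred, average="weighted", zero_division=0
        ),
        "f1": f1_score(y_true, y_pred, average="weighted", zero_division=0),
        "mcc": matthews_corrcoef(y_true, y_pred),
        "g_mean": float(np.prod(recalls) ** (1.0 / len(recalls))),
        # Error measures treat the class labels as numeric counts.
        "mse": float(mse),
        "rmse": float(np.sqrt(mse)),
        "r2": r2_score(y_true, y_pred),
    }
```

Because RMSE is the square root of MSE, the two rows of each table can be cross-checked; for the RF training column of Table 5, √0.0082 ≈ 0.0906, which matches the reported RMSE.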

Table 6. Performance metrics of the prediction models for serious injury counts.

	Train			Test
Metric	RF	KNN	GB	RF	KNN	GB
Accuracy	0.9503	0.9093	0.9109	0.9377	0.8815	0.8976
Precision	0.9559	0.9170	0.9155	0.9428	0.8858	0.9022
F1-Score	0.9512	0.9114	0.9104	0.9385	0.8830	0.8967
MCC	0.9438	0.8969	0.8989	0.9294	0.8648	0.8839
G-Mean	0.9714	0.9474	0.9483	0.9641	0.9309	0.9404
MSE	0.1057	0.4293	0.2364	0.1859	0.5784	0.2890
RMSE	0.3251	0.6552	0.4862	0.4311	0.7605	0.5376
R²	0.9899	0.9591	0.9775	0.9823	0.9449	0.9725

Table 7. Performance metrics of the prediction models for slight injury counts.

	Train			Test
Metric	RF	KNN	GB	RF	KNN	GB
Accuracy	0.9706	0.9074	0.7425	0.9064	0.8391	0.7155
Precision	0.9761	0.9117	0.7362	0.9108	0.8374	0.7041
F1-Score	0.9718	0.9072	0.7294	0.9068	0.8352	0.7003
MCC	0.9683	0.8994	0.7210	0.8983	0.8252	0.6917
G-Mean	0.9839	0.9485	0.8515	0.9480	0.9093	0.8349
MSE	0.2103	1.0197	3.1294	0.6024	1.8831	3.3148
RMSE	0.4586	1.0098	1.7690	0.7761	1.3723	1.8207
R²	0.9824	0.9144	0.7374	0.9494	0.8419	0.7218

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Hamdan, N.; Sipos, T. Predicting Segment-Level Road Traffic Injury Counts Using Machine Learning Models: A Data-Driven Analysis of Geometric Design and Traffic Flow Factors. Future Transp. 2025, 5, 197. https://doi.org/10.3390/futuretransp5040197

