1. Introduction
Urban traffic safety remains one of the most pressing public health challenges of the 21st century. The World Health Organization [
1] estimates that over 1.19 million people die annually in road traffic crashes, with a disproportionate burden falling on rapidly urbanizing cities in low- and middle-income countries. In cities like Jeddah, Saudi Arabia, rapid motorization, extreme climatic conditions, and complex road networks contribute to a high incidence of severe crashes [
2,
3], necessitating advanced modeling tools for risk assessment and mitigation.
Traditional crash severity models have relied on statistical regression techniques, such as multinomial logit (MNL), ordered probit, and logistic regression, to identify significant risk factors [
4]. While these models offer interpretability, they often assume linear relationships and independence among variables, which fails to capture the nonlinear, compound nature of real-world crash causation [
5]. For instance, the combined effect of over-speeding at night during a dust storm is not merely additive but synergistic, leading to a disproportionate escalation in injury severity risk.
The advent of machine learning (ML) has significantly improved predictive accuracy in crash severity modeling. Ensemble methods such as Random Forest, Extreme Gradient Boosting (XGBoost), and CatBoost have demonstrated superior performance over traditional models [
6,
7]. However, most ML applications in road safety remain “black-box” in nature, lacking statistical rigor in modeling dependence structures [
8]. Moreover, they often treat risk factors as independent inputs, ignoring the joint tail behaviour that characterizes high-severity crashes [
9].
A critical gap in the literature is the disconnect between dependence modeling and predictive analytics. On one hand, copula-based models have emerged as powerful tools for capturing nonlinear dependence and tail co-movements among risk factors. Vine copulas, in particular, enable flexible, high-dimensional dependence modeling and have been applied in finance, hydrology, and climate risk [
10,
11]. Early applications in wildlife-vehicle collision studies demonstrated that Gaussian copulas could effectively model underreporting biases, reducing hotspot identification errors compared to negative binomial models [
12]. Subsequent advancements employed vine copula constructions to disentangle the endogenous relationships between barrier types, shoulder widths, and crash severity, revealing that W-beam barriers increased fatal outcomes when installed on curved mountainous terrains [
13]. These techniques proved particularly valuable for quantifying extreme value dependencies.
On the other hand, interpretable ML, particularly SHAP (SHapley Additive exPlanations), has gained traction for identifying key risk drivers in crash severity models [
14,
15]. However, SHAP values reflect predictive importance within a specific model and may not fully capture underlying statistical dependencies that persist across models. This creates a methodological disconnect: ML approaches largely ignore dependence structure, and copula approaches largely ignore prediction.
To bridge this gap, we introduce the Multivariate Risk Index (MRI), a novel hybrid framework that fuses copula-based tail dependence with ML-driven feature importance into a unified, interpretable risk score. Our MRI-Copula framework leverages Vine copulas to quantify the joint extremal behaviour of domain-informed risk indices, Environmental (ERI), Behavioural (BRI), and Systemic (SRI), while using CatBoost-SHAP to calibrate their predictive weights. An optimized blend of these two sources yields a data-driven, statistically rigorous MRI that enhances crash severity prediction.
This study advances intelligent transportation safety through three primary contributions. First, it introduces the MRI-Copula framework, the first hybrid copula and machine learning risk index for crash severity prediction, which fuses statistical dependence modeling with predictive analytics. Second, it develops domain-informed Environmental (ERI), Behavioural (BRI), and Systemic (SRI) risk indices, with optimized and interpretable weighting that enhances both accuracy and transparency in risk assessment. Third, it delivers policy-ready outputs, including risk profiles, net benefit analysis, and threshold optimization, providing actionable insights for deployment within intelligent transportation systems. Taken together, these contributions mark a paradigm shift from traditional reactive crash modeling to proactive, multivariate risk indexing, positioning MRI-Copula at the forefront of next-generation road safety intelligence.
To the best of our knowledge, no prior study has fused copula-derived dependence weights with ML-based feature importance into a composite risk index for crash severity prediction. Existing hybrid approaches, such as stacking ensembles [
16], or deep learning frameworks with saliency-based interpretation [
17,
18], do not operationalize multivariate dependence in the form of a decomposable index. MRI-Copula fills this gap by creating a transparent, interpretable multivariate risk index.
The remainder of this paper is structured as follows.
Section 2 presents a comprehensive literature review, critically examining the evolution of crash severity modeling, the limitations of current machine learning approaches in capturing joint risk, and the underutilization of copula models in predictive safety analytics.
Section 3 details the methodology, introducing the MRI-Copula framework, its six-phase implementation, and the integration of Vine copulas with CatBoost-SHAP for multivariate risk indexing.
Section 4 reports the empirical results, including model performance, bootstrap confidence intervals, and interpretability analyses.
Section 5 provides a critical discussion of the findings, their policy implications, and the framework’s advantages over existing methods. Finally,
Section 6 concludes the paper, summarizing the contributions and suggesting directions for future research. The integration of statistical dependence and machine learning in the MRI-Copula framework offers a robust, interpretable, and scalable tool for intelligent transportation safety management.
2. Related Work
The methodological evolution of crash severity modeling reflects successive efforts to balance predictive accuracy with interpretability. Early research relied heavily on classical statistical methods such as multinomial logit (MNL), ordered probit, and binary logistic regression [
19,
20,
21]. These models provided interpretable coefficients and shaped early safety policies but were limited by assumptions of linearity, independence of irrelevant alternatives (IIA), and additivity, which often failed to capture the nonlinear and interdependent nature of crash causation [
22].
The emergence of machine learning (ML) in the early 2000s marked a paradigm shift. Methods such as Random Forest (RF) [
23], Gradient Boosting Machines (GBM) [
24], and Support Vector Machines (SVM) [
25] demonstrated superior predictive accuracy, outperforming regression-based approaches by 10–15% in AUC across multiple crash severity prediction tasks [
6,
26,
27,
28].
More recently, ML applications in transportation have expanded beyond crash analysis to diverse areas such as lane-changing behaviour modeling [
29], traffic air quality modelling [
30], sensor-driven traffic flow prediction [
31], and pavement subsurface imaging and condition assessment [
32]. These studies illustrate the breadth of data-driven approaches transforming traffic and infrastructure analytics. However, while such models excel at complex pattern recognition, they are often criticized as “black boxes,” lacking the interpretability and causal transparency required for policy-oriented safety analysis [
33].
Recent developments in interpretable ML (IML), such as SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), and Accumulated Local Effects (ALE), have improved transparency, enabling the identification of critical crash factors such as driver age, traffic violations, and environmental conditions [
14,
34]. Despite these advances, most IML applications remain post hoc and model-specific, providing explanations after training rather than shaping feature representation during modeling. This disconnect highlights the need for integrated frameworks where interpretability and prediction are co-designed [
35].
While ML models capture complex nonlinearities, a critical limitation lies in their treatment of risk factors as independent predictors within a flat feature space [
5]. Real-world crash causation is compound and synergistic rather than additive. For instance, a novice driver speeding during a dust storm at night faces an escalation of risk that exceeds the sum of individual factors [
36]. Conventional ML struggles to capture these joint, non-additive effects, particularly in the tails of distributions where rare but severe crashes occur.
Attempts to address interactions using polynomial expansions, decision trees, or attention-based neural networks have yielded improvements but remain prone to overfitting and spurious correlations [
37,
38]. Moreover, traditional ML models are trained on marginal distributions and rarely account for tail dependence, the statistical tendency for extreme values of multiple variables to co-occur [
39]. Since severe crashes often arise under joint extreme conditions (e.g., over-speeding in adverse weather), capturing these dependencies is essential. This methodological gap points toward the potential of copula-based modeling, which is designed to model complex multivariate dependence and tail behaviour.
2.1. Copulas in Transportation Safety: From Frequency to Severity
Copulas separate marginal distributions from dependence structures, offering a flexible tool for multivariate dependence modeling. Widely applied in finance, hydrology, and climate risk [
40,
41], copulas have more recently entered transportation safety analytics. Early applications used bivariate copulas to model dependence between crash counts [
42,
43] or between vehicle dynamic parameters [
44]. Vine copulas, which are flexible, high-dimensional structures, have since been applied to traffic flow modeling [
45] and driver risk profiling [
39].
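The separation of marginals from dependence described above can be stated through Sklar's theorem. As a brief illustration for three variables such as the risk indices used later, and with the symbols F, F_k, C, and c introduced here as notation rather than taken from the original text, the joint distribution and (for continuous marginals) the joint density factor as:

```latex
F(x_{1}, x_{2}, x_{3}) = C\bigl(F_{1}(x_{1}),\, F_{2}(x_{2}),\, F_{3}(x_{3})\bigr),
\qquad
f(x_{1}, x_{2}, x_{3}) = c\bigl(F_{1}(x_{1}),\, F_{2}(x_{2}),\, F_{3}(x_{3})\bigr)\,\prod_{k=1}^{3} f_{k}(x_{k})
```

Here C is the copula and c its density. A vine copula further decomposes c into a product of bivariate (pair) copula densities, which is what allows a different family to be chosen for each pair of variables.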
Despite this progress, applications of copulas to crash severity prediction remain limited and mostly descriptive. For example, [
46] applied a copula-based temporal probit model to jointly analyze crash type, injury severity, and driver error, but the framework primarily served descriptive analysis rather than predictive forecasting. This highlights a methodological divide: while copulas excel at dependence modeling, their predictive integration into severity models is rare, leaving their potential underutilized.
2.2. Risk Indexing in Road Safety: From Heuristic to Data-Driven
Risk indexing has long been central to road safety management. Tools such as Safety Performance Functions (SPFs), Crash Prediction Models (CPMs), and Road Safety Audits (RSAs) traditionally guided interventions [
47]. However, these indices were often heuristic, relying on expert-assigned weights or simple regressions. More recent efforts have adopted data-driven approaches, such as Principal Component Analysis (PCA) and Factor Analysis, to derive composite indices from correlated crash variables [
48,
49].
While useful, these methods assume linearity and Gaussian distributions, assumptions rarely met in crash data. More sophisticated methods include Bayesian networks for probabilistic indexing [
50] and fuzzy logic for handling uncertainty [
51]. Yet these approaches are computationally demanding and difficult to operationalize.
The Multivariate Risk Index (MRI) proposed in this study represents a third-generation risk indexing approach. The first generation relied on expert-based additive indices, such as Road Safety Audits (RSA), where subjective weights were assigned to risk factors. The second generation advanced to statistical indices, exemplified by methods like Principal Component Analysis (PCA), which derived composite measures from correlated crash variables but were constrained by assumptions of linearity and normality. The third generation, as demonstrated by MRI, introduces hybrid, dependence-aware indices that integrate copula-based tail dependence with machine learning–derived feature importance, thereby moving beyond marginal assessments to joint risk modeling.
2.3. The Copula–ML Divide: Toward Hybrid Approaches in Safety Analytics
Although both copula models and machine learning (ML) have been applied in traffic safety research, their development has largely proceeded in parallel rather than in integration. On the ML side, ensemble algorithms and interpretable models have demonstrated strong predictive performance. For example, [
14] developed a Self-Paced Ensemble–SHAP framework for work-zone crash severity classification, while [
52] compared classifiers such as Random Forest, XGBoost, and SVM, showing notable improvements in severity prediction. Similarly, [
53] used XGBoost-SHAP with heterogeneity modeling to capture the temporal patterns of multivehicle truck-involved crashes. These approaches illustrate the predictive strength of ML but generally treat features as independent and overlook multivariate dependence structures.
By contrast, copula-based methods focus on capturing dependence and joint distributions. [
54] applied copulas to correct underreporting in wildlife–vehicle collisions, while [
46] developed a multivariate copula temporal framework to jointly estimate severity, crash type, vehicle damage, and driver error. These studies highlight the value of copulas in modeling joint outcomes and tail dependencies, but their focus has largely been descriptive rather than predictive.
Outside the transportation domain, copula–ML hybrids have begun to emerge. Ref. [
55] combined copula-based finite mixture regression with insurance claim modeling, and ref. [
56] employed vine copulas to generate synthetic data for ML pipelines, enhancing generalization. In environmental risk modeling, Bayesian ML ensembles with copula-based uncertainty quantification have shown promise for robust groundwater forecasting [
57]. These applications demonstrate the feasibility of embedding copula-derived dependence structures into predictive frameworks.
Despite these advances, no study in road safety has explicitly fused copula-based dependence with ML-derived feature importance into a unified predictive index. This methodological gap, what we call the copula–ML divide, reflects the tendency of copula research to emphasize dependence without predictive integration, while ML research prioritizes accuracy but neglects joint risk structures. The MRI-Copula framework presented in this study bridges this divide by integrating vine copula tail dependence with CatBoost-SHAP feature importance into a multivariate, interpretable risk index for crash severity.
To situate our contribution,
Table 1 compares representative ML-only, copula-only, and hybrid approaches, highlighting their methodological scope, strengths, and limitations.
2.4. Interpretability and Policy Translation in Safety Analytics
Predictive accuracy alone is insufficient in road safety; decision-making requires interpretability and policy relevance. SHAP and ALE plots offer transparent explanations of nonlinear effects yet remain post hoc [
14,
34]. Moreover, few frameworks incorporate decision-analytic tools such as Decision Curve Analysis (DCA) [
58], which evaluate the net benefit of predictions across different thresholds. While common in healthcare, DCA remains underused in transportation safety [
59,
60].
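To make the net benefit quantity used in DCA concrete, the following is a minimal sketch of the standard definition, net benefit = TP/n − (FP/n) · p_t/(1 − p_t), where p_t is the decision threshold. The function names and the toy data are illustrative assumptions and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging cases whose predicted probability >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are discounted by the odds implied by the threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: flag every case as severe."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (illustrative only, not from the study).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.6, 0.1, 0.4, 0.3, 0.05]
nb_model = net_benefit(y_true, y_prob, 0.5)
nb_all = net_benefit_treat_all(y_true, 0.5)
```

A decision curve is obtained by evaluating both functions over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).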
MRI-Copula directly addresses this gap by embedding interpretability within feature engineering and by producing policy-ready outputs such as risk profiles, intervention thresholds, and cost-aware decision metrics.
2.5. Research Gaps in Crash Risk Modeling
Despite significant advances, current crash risk analytics continue to face several limitations. A persistent challenge is the unresolved trade-off between predictive performance and interpretability, where models tend to favour one at the expense of the other. Furthermore, dependence structures among critical risk factors, such as weather, speeding, and road geometry, are frequently oversimplified or ignored, despite their importance in shaping crash outcomes. Another limitation is the lack of generalizability, as many models are developed for localized datasets and fail to transfer effectively across diverse urban contexts. Finally, the translation of research into practice remains limited, with few models designed to inform real-world enforcement strategies, infrastructure design, or resource allocation.
2.6. Contribution of the Present Study
This study addresses these gaps through the development of the Multivariate Risk Index–Copula (MRI-Copula) framework, which makes three key contributions. Methodologically, it integrates vine copulas with gradient boosting to capture multivariate tail dependencies while maintaining predictive accuracy. In terms of interpretability, it translates crash risk into domain-driven indices—namely the Behavioural Risk Index (BRI), Environmental Risk Index (ERI), and Systemic Risk Index (SRI)—which provide clear, actionable profiles for practitioners. For policy translation, the framework embeds SHAP, ALE, clustering, and Decision Curve Analysis (DCA), enabling targeted and cost-aware interventions such as speed enforcement, driver education, and infrastructure risk audits. Taken together, these contributions position MRI-Copula as a paradigm shift in traffic safety analytics, advancing the field from reactive crash analysis to proactive, dependence-aware multivariate risk indexing.
3. Methodology
This study develops the MRI-Copula framework, a hybrid approach that integrates vine copula–based tail dependence modeling with CatBoost-SHAP interpretability to construct a Multivariate Risk Index (MRI) for predicting crash severity in urban settings. The framework is specifically designed to capture compound risks arising from the joint occurrence of environmental, behavioural, and systemic factors, while ensuring both predictive accuracy and interpretability.
Figure 1 illustrates the methodological flow, which is organized into six sequential phases: (i) data preprocessing and feature engineering, (ii) risk index construction, (iii) vine copula modeling of tail dependence, (iv) CatBoost-SHAP for predictive feature importance, (v) optimized MRI weighting and index construction, and (vi) model training, evaluation, and interpretability.
3.1. Data Preprocessing and Feature Engineering
This study employs a dataset of 877 police-reported crashes from Jeddah, Saudi Arabia (2019–2023), serving as a proof-of-concept case for testing the MRI-Copula framework in an urban setting characterized by rapid motorization and varied crash determinants. Although region-specific, the dataset provides a representative basis for assessing methodological feasibility before scaling to multi-regional applications.
Each record contains 27 attributes spanning temporal, environmental, infrastructural, vehicle, and driver domains, with Injury Severity defined as a binary target: 0 = minor and 1 = severe/fatal. The class distribution (66.6% minor, 33.4% severe/fatal) supports stratified binary classification.
Temporal features (month, day of week, and weekend indicator) were derived from crash dates. Categorical variables (e.g., weather, driver gender, road type) were processed natively in CatBoost and encoded for LightGBM, while numerical variables (e.g., age, speed limit) were median-imputed. Categorical missing values were imputed using the mode [
61]. The cleaned dataset contained no missing values and was normalized for downstream modeling.
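The preprocessing steps above can be sketched as follows. This is a minimal illustration using only the Python standard library; the helper names are hypothetical, and the Friday-Saturday weekend definition is an assumption based on the Saudi working week rather than something stated in the text.

```python
from datetime import date
from statistics import median, mode


def impute(values, strategy):
    """Fill None entries with the median (numerical) or mode (categorical)."""
    observed = [v for v in values if v is not None]
    fill = median(observed) if strategy == "median" else mode(observed)
    return [fill if v is None else v for v in values]


def temporal_features(crash_date):
    """Derive month, day of week, and weekend indicator from a crash date."""
    day_of_week = crash_date.weekday()  # Monday = 0 ... Sunday = 6
    # Assumption: weekend = Friday (4) and Saturday (5).
    return {
        "month": crash_date.month,
        "day_of_week": day_of_week,
        "is_weekend": int(day_of_week in (4, 5)),
    }


# Toy values (illustrative only).
speed_limit = impute([30, None, 50, 40], "median")
weather = impute(["clear", None, "dust", "clear"], "mode")
features = temporal_features(date(2023, 3, 3))
```

In a leakage-safe pipeline the fill values would be computed on the training folds only and then applied to the validation and test data.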
This preprocessing pipeline ensured data completeness, interpretability, and consistency across classifiers. Future extensions will validate the framework on larger, multi-regional datasets to enhance generalizability and capture cross-regional variations in crash causation.
3.2. Risk Index Construction (ERI, BRI, SRI)
To ensure domain-informed representation of multidimensional crash risk, three composite indices were constructed. The Environmental Risk Index (ERI) aggregates risks related to adverse weather, night conditions, and infrastructural deficiencies. The Behavioural Risk Index (BRI) reflects driver-centred risks, including over-speeding, traffic violations, and inexperience. The Systemic Risk Index (SRI) captures traffic system risks linked to congestion, road type, and vehicle characteristics. These indices are designed to be additive, enabling intuitive interpretation and direct translation into targeted policy interventions. The constituent variables for each index, along with their descriptions and coding schemes, are detailed in
Supplementary Table S1.
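As a rough illustration of what an additive index looks like in practice, the sketch below sums binary risk flags per domain. The flag names and the equal unit weights are hypothetical placeholders; the actual constituent variables and coding schemes are those in Supplementary Table S1 and may differ.

```python
# Hypothetical flag names, chosen only to mirror the domains described above.
ERI_FLAGS = ("adverse_weather", "night", "poor_infrastructure")
BRI_FLAGS = ("over_speeding", "traffic_violation", "inexperienced_driver")
SRI_FLAGS = ("congestion", "high_risk_road_type", "high_risk_vehicle")


def additive_index(record, flags):
    """Count how many of the given risk flags are present in a crash record."""
    return sum(int(bool(record.get(name, 0))) for name in flags)


# Toy crash record (illustrative only).
record = {"adverse_weather": 1, "night": 1, "over_speeding": 1, "congestion": 0}
eri = additive_index(record, ERI_FLAGS)
bri = additive_index(record, BRI_FLAGS)
sri = additive_index(record, SRI_FLAGS)
```

Because each index is a simple sum, a one-unit increase corresponds to one additional risk condition being present, which is what makes the indices directly readable by practitioners.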
3.3. CatBoost–SHAP and Vine Copula Integration for MRI Construction
To capture both predictive influence and extremal dependence in crash severity modeling, we designed a two-step integration of CatBoost–SHAP feature attribution and Vine copula-based dependence analysis. The process ensured that the Multivariate Risk Index (MRI) reflects both how strongly features predict severity and how risk factors co-escalate in extreme crash conditions.
CatBoost [
62], a gradient boosting algorithm robust to categorical variables and missing data, was trained on the crash dataset with hyperparameters (learning rate, depth, number of iterations) optimized using stratified 5-fold cross-validation to maximize AUC.
Feature importance was quantified via SHapley Additive exPlanations (SHAP) [
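63]. For each observation $i$ and feature $j$, the SHAP value $\phi_{ij}$ represents the marginal contribution of feature $j$ to the prediction for observation $i$. Global importance was obtained as the mean absolute SHAP value:
$$I_j = \frac{1}{n}\sum_{i=1}^{n}\left|\phi_{ij}\right|$$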
The SHAP weights were normalized to sum to 1 across ERI, BRI, and SRI, producing predictive importance scores that highlight which indices most strongly influence crash severity.
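The aggregation from per-observation SHAP values to normalized index weights can be sketched as below. The sketch assumes that ERI, BRI, and SRI enter the model directly as three features (columns in that order) and that a SHAP matrix has already been computed; the numbers are toy values, not results from the study.

```python
def mean_abs_shap(shap_rows):
    """Global importance per feature: mean absolute SHAP value over observations."""
    n = len(shap_rows)
    n_features = len(shap_rows[0])
    return [sum(abs(row[j]) for row in shap_rows) / n for j in range(n_features)]


def normalize(importances):
    """Rescale importances so that they sum to 1."""
    total = sum(importances)
    return [value / total for value in importances]


# Toy SHAP matrix: rows = observations, columns = (ERI, BRI, SRI).
shap_rows = [
    [0.2, -0.6, 0.1],
    [-0.4, 0.4, -0.1],
    [0.3, -0.5, 0.1],
]
importance = mean_abs_shap(shap_rows)
w_shap = normalize(importance)
```

Taking absolute values before averaging prevents positive and negative contributions from cancelling, so a feature that pushes predictions strongly in both directions still registers as important.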
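To capture nonlinear, asymmetric, and joint-tail dependencies, the indices (ERI, BRI, SRI) were first rank-transformed to the unit interval:
$$u_{ik} = \frac{\operatorname{rank}(x_{ik})}{n+1}, \qquad k \in \{\mathrm{ERI}, \mathrm{BRI}, \mathrm{SRI}\}$$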
A regular vine (R-vine) copula [
10] was then fitted to these uniform marginals, with pair-copula families automatically selected from a candidate set (Clayton, Gumbel, Student-t, Frank, Independence) based on the Akaike Information Criterion (AIC).
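The rank transform that precedes the copula fit can be sketched as follows. The sketch assumes the common pseudo-observation scaling rank/(n + 1), which keeps values strictly inside (0, 1), and assigns average ranks to ties; both choices are assumptions, since the text does not specify them.

```python
def pseudo_observations(values):
    """Map a sample to (0, 1) via average ranks divided by n + 1."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2.0 + 1.0  # 1-based average rank
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return [r / (n + 1) for r in ranks]


u_simple = pseudo_observations([3.0, 1.0, 2.0])
u_ties = pseudo_observations([1, 1, 2])
```

Tie handling matters here because additive indices take few distinct values, so many observations share the same rank; heavy ties can affect pair-copula selection and are worth checking in a sensitivity analysis.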
The upper tail dependence coefficient, $\lambda_U$, was estimated via Monte Carlo simulation ($n = 10{,}000$) and bootstrapped ($n = 100$) to construct 95% confidence intervals. Mean values across simulations provided copula-derived extremal weights, $w_{\mathrm{Copula}}$.
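The upper tail dependence coefficient is the limit, as q approaches 1, of P(U > q | V > q). A minimal sketch of a finite-threshold empirical estimator with a percentile bootstrap is given below. The threshold q = 0.95, the percentile indexing, and the function names are assumptions made for illustration; the samples here are synthetic and merely stand in for draws simulated from a fitted vine copula.

```python
import random


def upper_tail_dependence(u, v, q=0.95):
    """Empirical estimate of P(U > q, V > q) / (1 - q) on uniform-scale data."""
    n = len(u)
    joint = sum(1 for a, b in zip(u, v) if a > q and b > q)
    return (joint / n) / (1.0 - q)


def bootstrap_ci(u, v, q=0.95, n_boot=100, seed=0):
    """Approximate 95% percentile interval from paired resampling."""
    rng = random.Random(seed)
    n = len(u)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(upper_tail_dependence([u[i] for i in idx], [v[i] for i in idx], q))
    stats.sort()
    return stats[int(0.025 * (n_boot - 1))], stats[int(0.975 * (n_boot - 1))]


# Synthetic stand-ins for copula simulations (illustrative only).
rng = random.Random(1)
u_ind = [rng.random() for _ in range(10000)]
v_ind = [rng.random() for _ in range(10000)]
lam_independent = upper_tail_dependence(u_ind, v_ind)

u_com = [(i + 0.5) / 1000 for i in range(1000)]
lam_comonotone = upper_tail_dependence(u_com, u_com)

ci_low, ci_high = bootstrap_ci(u_ind, v_ind)
```

At a finite threshold an independent pair gives a value near 1 − q rather than exactly zero, so small estimates should be read against that baseline.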
To integrate predictive and dependence perspectives, we applied a convex blend:
$$w_{\mathrm{MRI}} = \alpha\, w_{\mathrm{SHAP}} + (1-\alpha)\, w_{\mathrm{Copula}}$$
where α balances predictive accuracy and dependence-awareness. The optimal α was selected via stratified 5-fold cross-validation over a grid search (0.0–1.0 in increments of 0.1), with AUC maximization as the criterion. A constraint $\alpha \le 0.8$ ensured that at least 20% weight derived from copula dependence, safeguarding against purely predictive weighting.
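A minimal sketch of the blending step, assuming the convex blend takes the form w = α · w_SHAP + (1 − α) · w_Copula with α capped at 0.8. The SRI values used below (SHAP weight 0.144, copula weight 0.370, α = 0.8) are the ones reported in the results; the scoring function passed to the grid search is a toy placeholder for a cross-validated AUC and is purely hypothetical.

```python
ALPHA_MAX = 0.8  # keeps at least 20% of the weight on the copula side


def blend(w_shap, w_copula, alpha):
    """Convex combination of a SHAP-derived and a copula-derived weight."""
    if not 0.0 <= alpha <= ALPHA_MAX:
        raise ValueError("alpha must lie in [0, 0.8]")
    return alpha * w_shap + (1.0 - alpha) * w_copula


def select_alpha(score_fn, grid):
    """Return the grid value with the highest score (first one on ties)."""
    return max(grid, key=score_fn)


# Admissible part of the 0.0-1.0 grid under the constraint.
alpha_grid = [round(0.1 * k, 1) for k in range(0, 9)]

# Worked check with the SRI weights reported in the results.
sri_blended = blend(0.144, 0.370, 0.8)

# Toy score that increases with alpha, mimicking an optimum on the boundary.
best_alpha = select_alpha(lambda a: a, alpha_grid)
```

The worked value, 0.8 × 0.144 + 0.2 × 0.370 = 0.1892, rounds to the blended SRI weight of 0.189 reported in the results.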
The final MRI scores were appended as an additional feature to the dataset, and a CatBoost classifier was retrained on this enhanced feature set. Performance was evaluated on a held-out test set using F1, AUC, Precision, Recall, Accuracy, and Matthews Correlation Coefficient (MCC), with 95% confidence intervals estimated via 1000 bootstrap replications.
For benchmarking, three alternative variants were implemented: (i) MRI-Corr (Spearman correlation in place of copula), (ii) MRI-Interact (explicit CatBoost interaction terms), and (iii) MRI-PCA (first principal component of indices).
3.4. Model Training, Evaluation, and Interpretability
The performance of the final model was evaluated using F1 score, area under the receiver operating characteristic curve (AUC), precision, recall, accuracy, and Matthews Correlation Coefficient (MCC), with bootstrap 95% confidence intervals computed to ensure robust statistical inference [
64]. Model interpretability was enhanced using SHAP plots to elucidate feature importance [
65] and Decision Curve Analysis (DCA) to assess operational net benefit across decision thresholds [
58]. In addition to bootstrap validation, model performance was evaluated using a repeated 5-fold cross-validation scheme (10 repeats) to ensure stability of the estimates across different data partitions.
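The threshold-based metrics and the bootstrap intervals can be sketched as below. This is an illustrative standard-library implementation of MCC, F1, and a percentile bootstrap, not the code used in the study; the toy labels and the percentile indexing are assumptions.

```python
import math
import random


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f1(y_true, y_pred):
    tp, _, fp, fn = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_ci(metric, y_true, y_pred, n_boot=1000, seed=0):
    """Approximate 95% percentile interval from resampled test cases."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Toy labels (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
mcc_value = mcc(y_true, y_pred)
f1_value = f1(y_true, y_pred)
mcc_low, mcc_high = bootstrap_ci(mcc, y_true, y_pred)
```

MCC uses all four cells of the confusion matrix, which makes it a useful complement to F1 when classes are imbalanced, as with the roughly 2:1 minor-to-severe split in this dataset.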
4. Results
This section presents the empirical findings of the MRI-Copula framework, evaluating its predictive performance, interpretability, and policy relevance in comparison to benchmark models.
4.1. Comparative Evaluation of MRI Variants
Table 2 presents the performance of the four MRI variants. All models demonstrated high predictive capability, with AUC values consistently around 0.988. The differentiation between models is thus more evident in metrics that balance precision and recall, such as the F1 score and Matthews Correlation Coefficient (MCC). Among the variants, MRI-Copula achieved the highest F1 score (0.914) and MCC (0.872), indicating a superior balance in predictive performance. The MRI-Interact variant performed comparably in MCC (0.872) but slightly lower in F1 score (0.912). The MRI-Corr and MRI-PCA variants delivered moderate results, reflecting their reliance on linear dependence and dimensionality reduction, which appear insufficient for optimally capturing the complex, synergistic patterns in crash data.
Figure 2 presents the upper tail dependence matrix, which quantifies the probability that one variable experiences an extreme high value given that another does. This analysis reveals asymmetric dependencies not captured by standard correlation. A moderate upper tail dependence exists between ERI and BRI (λ = 0.1468), indicating that extreme environmental conditions and extreme risky driving behaviours show some tendency to co-occur. In contrast, the tail dependence between ERI and SRI is weak (λ = 0.0665), suggesting that environmental and systemic risks are close to independent even in extreme scenarios. The strongest tail dependence is observed between BRI and SRI (λ = 0.1603), suggesting that corridors with extremely high systemic risk are coupled with a heightened probability of extreme risky driving behaviours, potentially reflecting concentrated danger in specific high-speed or complex road segments.
4.2. Vine Copula Structure and Extremal Dependencies
The AIC-based selection procedure identified a parsimonious Vine copula structure, revealing the underlying dependency network between the risk indices. The model found only one significant pairwise relationship: a Student-t copula linking the Environmental (ERI) and Behavioural (BRI) Risk Indices (Kendall’s τ = 0.06, correlation parameter = 0.099, df = 3.38). The choice of the Student-t family indicates the presence of symmetric tail dependence between ERI and BRI.
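For reference, the tail dependence implied by a bivariate Student-t copula has a known closed form. Assuming the reported "correlation parameter" corresponds to ρ and "df" to ν, and writing t_{ν+1} for the cumulative distribution function of a univariate Student-t with ν + 1 degrees of freedom, the coefficient and its argument for the fitted ERI-BRI pair are:

```latex
\lambda_{U} = \lambda_{L} = 2\, t_{\nu+1}\!\left(-\sqrt{\frac{(\nu+1)(1-\rho)}{1+\rho}}\right),
\qquad
\sqrt{\frac{(3.38+1)(1-0.099)}{1+0.099}} \approx 1.895
```

The equality of the upper and lower coefficients is what "symmetric tail dependence" refers to. The expression also shows that a low ν can produce non-trivial tail dependence even when ρ and Kendall's τ are close to zero, as is the case for this pair.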
To ensure that the three risk indices represent distinct and independent dimensions, a Variance Inflation Factor (VIF) analysis was conducted. As shown in
Table 3, all three indices recorded VIF values close to 1.0 (ERI = 1.007, BRI = 1.007, SRI = 1.001), indicating negligible multicollinearity. This statistical evidence confirms that the indices are not linearly redundant and can be jointly integrated within the MRI-Copula framework without inflating variance or bias in the subsequent model estimation.
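The VIF values can be translated into the share of variance each index shares linearly with the other two. Using the standard definition of VIF, where R_j^2 is the coefficient of determination from regressing index j on the remaining indices (notation introduced here for illustration):

```latex
\mathrm{VIF}_{j} = \frac{1}{1 - R_{j}^{2}}
\;\Longrightarrow\;
R_{j}^{2} = 1 - \frac{1}{\mathrm{VIF}_{j}},
\qquad
R_{\mathrm{ERI}}^{2} = R_{\mathrm{BRI}}^{2} = 1 - \frac{1}{1.007} \approx 0.0070,
\qquad
R_{\mathrm{SRI}}^{2} = 1 - \frac{1}{1.001} \approx 0.0010
```

In other words, less than 1% of the variance of any index is linearly explained by the other two. Note that VIF only measures linear redundancy, so it does not by itself rule out the nonlinear or tail dependence examined with the copula.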
Critically, the Systemic Risk Index (SRI) was found to be conditionally independent of both ERI and BRI. This indicates that once the shared influence of behaviour (BRI) is accounted for, there is no remaining statistical dependence between the road environment and its systemic design. This structure positions BRI as the central, pivotal variable in the network of risk.
The estimated tail dependence coefficients (
Table 4) confirm this structure. The probability of simultaneous extremes is meaningfully elevated only for the ERI-BRI pair ($\lambda_U \approx 0.167$). The overall tail-derived importance weights were balanced for ERI (0.407) and BRI (0.407), identifying them as central components in the extremal network, while SRI received a lower weight (0.370).
4.3. Multi-Risk Index Optimization and Model Performance
The optimization of the MRI blending parameter α under the α ≤ 0.8 constraint converged to the maximum allowed value (α = 0.8), yielding a training AUC of 0.6615. Because the optimum lies on the constraint boundary, this result indicates that predictive importance (SHAP) was the primary driver of performance, while the 20% contribution from extremal dependence (Copula) is retained by design through the constraint rather than selected by the cross-validated search.
The final blended weights reflect this synthesis (
Table 5). The Behavioural Risk Index (BRI) retained its dominance from the SHAP analysis (0.495), confirming its role as the foremost direct predictor of crashes. The influence of the Systemic Risk Index (SRI) rose by roughly 31% in the blended weights (0.189) compared to its pure SHAP weight (0.144), demonstrating that the copula framework captured its non-negligible role in the extremal risk structure, even though it is a weak average predictor.
The final model, which integrated the MRI score as a feature, demonstrated excellent and robust performance on the held-out test set (
Table 6), validating the entire framework.
A sensitivity analysis confirmed that the model’s high performance was robust to the choice of copula family, with AUC remaining consistently high across Gumbel (0.988), Clayton (0.986), and Student-t (0.985) families.
4.4. Risk Profiling Through Clustering
Unsupervised clustering based on the MRI scores and its components revealed three distinct risk profiles with escalating average injury severity (
Table 7).
Cluster 1 (Low Risk) is characterized by low environmental and behavioural risk but elevated systemic risk (SRI = 1.55), resulting in the lowest mean injury severity (0.18). This profile suggests crashes primarily linked to roadway design deficiencies, but without the compounding effect of bad weather or high-risk driving, the outcomes are less severe.
Cluster 2 (Medium Risk) shows a pronounced behavioural risk profile (BRI = 1.85) coupled with moderate environmental risk but occurs on roads with lower systemic risk. This yields a medium injury severity (0.38).
Cluster 3 (High Risk) represents the most hazardous scenario, with consistently high scores across all three domains (ERI = 1.22, BRI = 2.11, SRI = 2.51). This compound risk profile leads to the highest mean injury severity (0.45), underscoring the catastrophic potential when environmental stressors, risky behaviour, and inadequate infrastructure coincide.
It should be noted that the cluster labels showed low stability upon bootstrap resampling (Adjusted Rand Index = 0.001), a value close to the chance-level expectation of zero. This indicates that cluster assignments are not reproducible across resamples and that the boundaries between profiles in the feature space are not rigid. This supports the use of a continuous risk index (MRI) for robust prediction, while the clusters serve as useful, albeit fuzzy, archetypes for conceptualizing intervention strategies.
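To clarify how the stability figure is read, the sketch below implements the Adjusted Rand Index (ARI) between two labelings of the same observations using the standard pair-counting formula. It is an illustrative standard-library implementation, and the toy labelings are not from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items."""
    n = len(labels_a)
    pair_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions put every item in one cluster
    return (sum_pairs - expected) / (max_index - expected)


# Same partition with permuted label names: perfect agreement.
ari_same = adjusted_rand_index([0, 0, 1, 1, 2, 2], [1, 1, 2, 2, 0, 0])
# Partitions that cut across each other: agreement below chance.
ari_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

An ARI of 1 means identical partitions, and values near 0 mean agreement no better than random assignment, which is the regime the reported value of 0.001 falls into.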
4.5. Comparative Model Performance Across Classifiers
Table 7 presents a comparative evaluation of the MRI-Copula framework across four classifiers—Logistic Regression, CatBoost, LightGBM, and Histogram-based Gradient Boosting (HistGB)—under three modeling scenarios: Baseline, With Indices, and With MRI. The Baseline models rely solely on the original crash-related variables, the With Indices models incorporate interpretable sub-indices representing Environmental (ERI), Behavioural (BRI), and Systemic (SRI) risks, while the With MRI models further integrate the α-weighted Multi-Risk Index (MRI), which combines SHAP-derived feature importance with vine-copula-based dependence weights, optimized at α = 0.80.
Across all configurations, the CatBoost + MRI-Copula model achieved the best overall predictive performance (F1 = 0.904; AUC = 0.985), demonstrating superior discrimination of severe crashes while maintaining balanced precision (0.929) and recall (0.881). The marginal differences among the Baseline, With Indices, and With MRI configurations indicate that the introduction of interpretable risk structures and dependence weighting did not compromise accuracy, confirming the robustness of the MRI-Copula integration. In comparison, LightGBM achieved similarly high predictive performance (AUC ≈ 0.96; F1 ≈ 0.87–0.89) but required substantially less training time (approximately 0.07 s), underscoring its efficiency and scalability for large-scale or near–real-time deployment. HistGB yielded slightly lower but competitive accuracy (AUC ≈ 0.95; F1 ≈ 0.81–0.82) within a modest runtime of about 0.23 s, further confirming the generalizability of the MRI-Copula approach across gradient-boosting frameworks.
By contrast, Logistic Regression consistently exhibited the lowest predictive performance (AUC ≈ 0.90; F1 ≈ 0.78), reflecting the inherent limitations of linear models in capturing the nonlinear interactions and complex dependencies among risk indices. This clear contrast between linear and ensemble-based learners highlights the methodological advantage of embedding dependency-aware indices within nonlinear architectures. Moreover, the inclusion of the interpretable ERI, BRI, and SRI sub-indices enhanced transparency by revealing how domain-specific risks contribute to overall crash severity, while the α-weighted MRI provided a unified representation of predictive and dependence-based relationships.
Overall, the comparative results confirm that the MRI-Copula framework achieves a balanced integration of predictive accuracy, computational efficiency, and interpretability. CatBoost remains the most analytically robust and interpretable configuration, while LightGBM and HistGB offer practical alternatives for real-time or resource-constrained applications. Together, these findings position the MRI-Copula framework as a reliable, scalable, and theoretically grounded tool for data-driven road-safety management.
4.6. Interpretability: SHAP and ALE Analysis
Model interpretability was examined using SHAP for global feature attribution and ALE plots for marginal effects (Figure 3a–c and Figure 4a–c).
The SHAP summary plots (Figure 3a–c) reveal consistent patterns in feature importance across multiple gradient-boosting models (CatBoost, LightGBM, and HistGB), highlighting key drivers of crash severity. Across all models, temporal features, specifically Time and Month, are among the most influential predictors, indicating that diurnal and seasonal cycles significantly modulate crash risk. This suggests that high-risk periods (e.g., rush hours or winter months) are critical windows for targeted interventions.
Traffic Violations consistently emerge as a top-tier contributor to model output, reinforcing the role of human behaviour in crash severity. In the CatBoost model, High Violations exhibits strong positive SHAP values, suggesting that severe or repeated violations are strongly associated with higher injury outcomes. The presence of Weather Condition and Road Characteristics in the upper tiers further shows the influence of environmental and infrastructure factors.
Notably, the composite MRI Score appears within the top features across models, demonstrating its effectiveness as a synthesized predictor that captures complex interactions among environmental, behavioural, and systemic risks.
Interestingly, while Day of Week and Time of Day show modest influence in some models, their impact is less pronounced than that of Month and Time, suggesting that broader temporal trends (e.g., seasonal variations) may outweigh daily rhythms. Additionally, Vehicle Speed, Speed Limit, and Driver Age exhibit variable importance depending on the model architecture, indicating potential sensitivity to feature engineering and algorithmic assumptions.
These results collectively confirm that the MRI-Copula framework integrates both temporal dynamics and behavioural/environmental risk factors, enabling a comprehensive understanding of crash severity determinants. The consistent ranking of time-related and violation-based features across models strengthens confidence in their predictive relevance and supports their use in proactive safety planning.
ALE analysis provides granular insights into how key features influence crash risk:
Vehicle Speed (Figure 4a): The effect remains flat at lower speeds but starts to rise around 60 km/h, peaking sharply near 100 km/h. Beyond this point, the effect declines steeply, suggesting that risk escalates in the mid-to-high speed range but then drops once extreme speeds (>120 km/h) are reached, likely reflecting lower exposure at very high speeds (fewer vehicles or restricted conditions).
Driver Age (Figure 4b): Crash risk is highest among younger drivers in their early 20s, dips slightly for middle-aged drivers, rises again around the 40s, and then drops sharply after age 50. The overall trend suggests elevated vulnerability among young and mid-aged drivers, while older drivers (60+) show a negligible effect, possibly due to lower exposure or more cautious driving.
Traffic Violations (Figure 4c): The effect rises steeply with the first few violations, reaching its peak quickly, and then declines slightly as violations accumulate. This pattern implies that even a small number of violations is strongly associated with higher crash risk, but beyond a certain point, the incremental effect diminishes, possibly due to behavioural adaptation or enforcement interventions targeting repeat violators.
Together, SHAP and ALE plots move beyond identifying important features to reveal the precise, non-linear nature of their effects, providing actionable thresholds for targeted policy interventions.
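For readers unfamiliar with ALE, a first-order ALE curve for a single numeric feature can be computed as in the following sketch. It uses quantile bins and count-weighted centring; the function `ale_1d`, the toy risk surface, and the synthetic speed and age data are illustrative assumptions and do not reproduce the fitted models or Figure 4.

```python
import numpy as np


def ale_1d(predict, X, feature, n_bins=10):
    """First-order ALE of one numeric column (quantile bins, centred)."""
    X = np.asarray(X, dtype=float)
    x = X[:, feature]
    edges = np.unique(np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)))
    # Bin k covers (edges[k], edges[k+1]]; the minimum value falls into bin 0.
    bin_idx = np.clip(np.searchsorted(edges, x, side="left") - 1, 0, len(edges) - 2)
    local_effects = np.zeros(len(edges) - 1)
    counts = np.zeros(len(edges) - 1)
    for k in range(len(edges) - 1):
        rows = X[bin_idx == k]
        if len(rows) == 0:
            continue
        lower, upper = rows.copy(), rows.copy()
        lower[:, feature] = edges[k]
        upper[:, feature] = edges[k + 1]
        # Average prediction change when moving across the bin while the
        # other columns stay at their observed values.
        local_effects[k] = np.mean(predict(upper) - predict(lower))
        counts[k] = len(rows)
    ale = np.concatenate([[0.0], np.cumsum(local_effects)])
    # Centre so that the count-weighted mean effect is zero.
    bin_mid_ale = 0.5 * (ale[:-1] + ale[1:])
    ale -= np.sum(bin_mid_ale * counts) / counts.sum()
    return edges, ale


rng = np.random.default_rng(1)
speed = rng.uniform(20.0, 140.0, size=2000)   # synthetic speeds in km/h
age = rng.uniform(18.0, 70.0, size=2000)      # synthetic driver ages
X_demo = np.column_stack([speed, age])


def toy_model(X):
    # Hypothetical risk surface with a hump near 100 km/h; not the study's model.
    return np.exp(-((X[:, 0] - 100.0) / 20.0) ** 2) + 0.01 * X[:, 1]


edges, ale = ale_1d(toy_model, X_demo, feature=0, n_bins=20)
peak_speed = edges[np.argmax(ale)]
print(f"ALE peaks near {peak_speed:.0f} km/h")
```

Because the differences are taken only over rows that actually fall inside each bin, ALE avoids evaluating the model at unrealistic feature combinations, which is the usual reason for preferring it to partial dependence when predictors are correlated.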
4.7. Operational Utility: Decision Curve Analysis
To evaluate the operational relevance of the MRI-Copula framework, we conducted a decision curve analysis (DCA) to assess its net benefit across a range of threshold probabilities for risk-based intervention. As illustrated in Figure 5, the MRI-Copula model demonstrates a consistently positive net benefit across the majority of practically relevant probability thresholds, outperforming the “treat none” strategy and surpassing the “treat all” approach over a broad range (approximately 0.1 to 0.8).
The net benefit of the MRI-Copula model remains robust, indicating that its use in guiding preventive actions such as targeted traffic enforcement, adaptive signal timing adjustments, or location-specific public awareness campaigns would lead to improved outcomes compared to either universal intervention or no action. Notably, the model achieves its greatest relative advantage in the threshold range of 0.2–0.6, which aligns with realistic risk tolerance levels for urban traffic safety management.
In contrast, the “treat all” strategy yields diminishing net benefit as the threshold probability increases, reflecting the growing burden of unnecessary interventions. The crossing point near a threshold of 0.8 suggests that universal action would be preferable only under extremely risk-averse policies, conditions that are unlikely in resource-constrained operational environments.
These findings highlight the practical value of the MRI-Copula framework in supporting precision risk mitigation, enabling transportation authorities to allocate interventions more efficiently while maximizing public safety outcomes. By identifying high-risk scenarios with greater discriminatory accuracy, the model facilitates data-driven, cost-effective decision-making in intelligent transportation systems.
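The net benefit underlying a decision curve follows the standard definition, NB(p_t) = TP/n − (FP/n) · p_t/(1 − p_t), with “treat none” fixed at zero. The sketch below applies it to a small hypothetical set of labels and predicted risks; the function names and the data are illustrative only.

```python
import numpy as np


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when predicted risk >= threshold."""
    y_true = np.asarray(y_true)
    act = np.asarray(y_prob) >= threshold
    n = len(y_true)
    tp = np.sum(act & (y_true == 1))
    fp = np.sum(act & (y_true == 0))
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of intervening everywhere, regardless of predicted risk."""
    prevalence = np.mean(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical severe-crash labels and predicted risks; not study data.
y = np.array([1, 1, 0, 0, 0, 1, 0, 0, 0, 0])
p = np.array([0.9, 0.7, 0.6, 0.2, 0.1, 0.4, 0.3, 0.05, 0.15, 0.25])

for pt in (0.2, 0.5, 0.8):
    # "Treat none" has net benefit 0 at every threshold.
    print(pt, round(net_benefit(y, p, pt), 3), round(net_benefit_treat_all(y, pt), 3))
```

The odds term p_t/(1 − p_t) is what penalises false positives more heavily as the threshold rises, which is why the “treat all” curve declines with increasing threshold.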
5. Discussion
The MRI-Copula framework advances traffic safety analytics by integrating vine copula–based tail dependence with CatBoost-SHAP feature importance. This combination captures the compound mechanisms of crash causation while producing an interpretable and policy-relevant risk index, addressing a persistent gap in road safety research [4,19].
A central finding is the pivotal role of behavioural risk. The Behavioural Risk Index (BRI) emerged as both the strongest direct predictor (highest SHAP weight) and the central hub in the dependence network, showing significant tail dependence with Environmental Risk (ERI). This dual influence suggests that while environmental conditions set the stage, risky driving behaviours often act as the trigger that converts potential risk into severe crashes. Such evidence underscores the importance of behaviour-focused interventions, including speed enforcement and distracted driving campaigns [36].
In contrast, the Systemic Risk Index (SRI) was largely independent of both ERI and BRI and contributed less predictive weight. This suggests that infrastructure-related risks, such as those linked to expressways, may be partly insulated from environmental and behavioural triggers due to design features that reduce conflict points. This challenges the conventional assumption that systemic risk always amplifies crash severity, instead highlighting the compartmentalization effect of well-designed facilities.
To further validate the structural independence of the sub-indices, a Variance Inflation Factor (VIF) analysis was conducted. The results (ERI = 1.01, BRI = 1.01, SRI = 1.00) indicate very low multicollinearity, confirming that each index captures a distinct risk dimension. This statistical independence reinforces the conceptual design of the MRI-Copula framework, ensuring that the integrated α-weighted index combines complementary—rather than redundant—sources of risk information.
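The VIF for each index is 1/(1 − R²), where R² comes from regressing that index on the remaining ones. A minimal NumPy sketch follows, using synthetic columns in place of ERI, BRI, and SRI; the helper name and the data are assumptions for illustration.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the other columns."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(len(X)), others])  # add an intercept
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        r2 = 1.0 - resid.var() / y.var()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)


rng = np.random.default_rng(0)
independent = rng.normal(size=(877, 3))   # synthetic, mutually independent columns
collinear = independent.copy()
# Make the third column a noisy copy of the first to show an inflated VIF.
collinear[:, 2] = collinear[:, 0] + 0.1 * rng.normal(size=877)

vif_independent = variance_inflation_factors(independent)
vif_collinear = variance_inflation_factors(collinear)
print(np.round(vif_independent, 3))
print(np.round(vif_collinear, 1))
```

Values close to 1, as reported for the three sub-indices, mean that almost none of an index's variance is linearly explained by the others. VIF captures linear association only, which is why the copula analysis of tail dependence remains informative alongside it.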
The α-blend ensured that copula-based dependence contributed meaningfully (≥20%) to the composite index, even though pure SHAP weighting maximized AUC. This constraint balanced statistical optimality with theoretical robustness, allowing the MRI to capture SRI’s occasional role in extremal events while maintaining strong predictive performance (AUC = 0.987). The framework therefore provides both empirical precision and theoretical soundness in the construction of the risk index.
Comparative results across classifiers further confirmed the robustness of the MRI-Copula framework. As presented in Section 4.5, the CatBoost + MRI-Copula configuration achieved the highest predictive performance (AUC = 0.985; F1 = 0.904), demonstrating strong discriminative capability and balanced precision–recall trade-offs. LightGBM and HistGB models achieved comparably high accuracy (AUC ≈ 0.95–0.96; F1 ≈ 0.81–0.89) but required only fractions of a second for training, compared to approximately 32 s for CatBoost. This contrast shows an important trade-off between predictive power and computational efficiency, highlighting the framework’s flexibility for different deployment contexts. CatBoost remains preferable when analytical accuracy and interpretability are prioritized, while LightGBM and HistGB provide practical, time-efficient alternatives for real-time or large-scale traffic safety monitoring applications.
Additional insights came from the ALE analysis, which highlighted a sharp escalation in crash severity risk at around 100 km/h, followed by a decline at very high speeds. This suggests that the mid-to-high speed range represents the most critical zone for intervention, offering an empirical benchmark to guide urban speed-limit design and targeted enforcement [66]. Age-related effects showed elevated risk among young drivers in their 20s and again around the 40s, with a marked reduction after age 50, pointing to the need for both early driver safety education and mid-life refresher programs. The analysis of traffic violations revealed that even a small number of violations significantly increases crash risk, emphasizing the value of early enforcement and corrective measures before risky behaviour becomes entrenched.
Risk clustering further confirmed the heterogeneous nature of crash causation, revealing three distinct archetypes: structural (low severity, infrastructure-linked), behavioural (moderate severity, dominated by risky behaviour), and compound (high severity, combining multiple risk domains). These profiles reinforce the need for tailored countermeasures: for instance, infrastructure upgrades for structural risk, behaviour-focused enforcement for behavioural risk, and integrated weather-responsive plus behavioural interventions for compound risk [66].
At a systems level, vine copula analysis clarified the nuanced dependencies among risk indices. The weak but significant ERI–BRI tail link (Student-t copula, τ = 0.06) indicates that adverse weather or environmental conditions can exacerbate risky behaviour, echoing findings on weather–behaviour interactions [19]. The independence of SRI from both ERI and BRI illustrates the flexibility of copulas in capturing asymmetric and conditional relationships [41,67].
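To show how a small Kendall's τ can still imply non-zero tail dependence under a Student-t copula, the sketch below uses two standard relations: ρ = sin(πτ/2) for elliptical copulas, and λ = 2·t_{ν+1}(−√((ν+1)(1−ρ)/(1+ρ))) for the t-copula tail-dependence coefficient. The degrees of freedom ν are not reported in this section, so the values looped over are purely illustrative assumptions.

```python
import numpy as np
from scipy.stats import t as student_t


def tau_to_rho(tau):
    """Correlation parameter of an elliptical copula from Kendall's tau."""
    return np.sin(np.pi * tau / 2.0)


def t_copula_tail_dependence(rho, nu):
    """Upper (equal to lower) tail-dependence coefficient of a Student-t copula."""
    arg = -np.sqrt((nu + 1.0) * (1.0 - rho) / (1.0 + rho))
    return 2.0 * student_t.cdf(arg, df=nu + 1.0)


rho = tau_to_rho(0.06)      # tau reported for the ERI-BRI pair
for nu in (3, 5, 10, 30):   # nu is NOT given in the text; illustrative values only
    lam = float(t_copula_tail_dependence(rho, nu))
    print(f"nu = {nu:>2}: lambda = {lam:.4f}")
```

The coefficient shrinks as ν grows, approaching the Gaussian-copula limit of zero, so the strength of the tail link depends on the fitted ν as well as on τ.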
Decision Curve Analysis confirmed that MRI-Copula delivers greater net benefit across intervention thresholds than naïve “treat none” or “treat all” strategies [58]. This demonstrates that predictive gains can be directly translated into more efficient, targeted safety interventions [33,60]. Importantly, the MRI framework provides a practical basis for operational decision-making. Authorities could, for example, set speed enforcement thresholds around the 100 km/h inflection point identified by ALE, prioritize behaviour-focused campaigns for high-BRI clusters, or design adaptive traffic management strategies that allocate resources to environmental–behavioural compound risk zones. The MRI score itself can function as a prioritization tool: segments of the network, driver groups, or time periods exceeding a given MRI threshold can be flagged for proactive intervention. Embedding these thresholds into routine enforcement dashboards or intelligent transport management systems would allow safety agencies to operationalize predictive analytics in near-real time.
The conditional independence of the risk indices offers significant policy implications. It reveals that systemic or infrastructure-related risks, such as those linked to road geometry, surface quality, and access control, serve as stabilizing elements within the broader crash-risk ecosystem. Well-designed infrastructure can therefore buffer the effects of adverse environmental conditions and risky driving behaviour, preventing these from escalating into severe crashes.
This finding emphasises the complementary roles of long-term infrastructure planning and short-term behavioural interventions in traffic safety management. Behavioural risks (BRI) require continuous enforcement and awareness programs, whereas systemic risks (SRI) demand sustained investment in road design, maintenance, and access management to enhance network resilience. Improvements such as better geometric alignment, lane separation, and roadside protection can mitigate behavioural volatility and make safety outcomes less sensitive to temporary human or environmental disturbances.
The independence of SRI from ERI and BRI thus supports a multi-layered safety strategy: behavioural countermeasures should target immediate crash precursors, while infrastructure design should function as a long-term stabilizer that structurally reduces exposure to compounding risks. This evidence-based distinction provides actionable guidance for policymakers to balance resources effectively among enforcement, education, and engineering measures.
Beyond these policy implications, the MRI-Copula framework itself functions as a scalable decision-support tool for real-world safety applications. By integrating environmental, behavioural, and systemic components within an interpretable α-weighted structure, the model enables data-driven prioritization of high-risk road segments, driver categories, or time periods. The sub-indices (ERI, BRI, SRI) further indicate the type of intervention required: environmental, behavioural, or infrastructural. Moreover, the comparable accuracy of LightGBM and HistGB models, coupled with their lower computational cost, allows for real-time deployment in intelligent transportation system (ITS) dashboards. This integration bridges predictive analytics with operational decision-making, supporting adaptive and context-aware traffic safety management.
Several limitations should be acknowledged, and the findings should be interpreted within the scope of the study design. The analysis was based on 877 crash records from Jeddah, Saudi Arabia, which offers valuable insights into a Middle Eastern urban context but may not fully capture the diversity of cultural, infrastructural, and climatic conditions found elsewhere. The relatively modest sample size also places natural limits on statistical power, although this was partly mitigated by cross-validation and bootstrap replication. In addition, vine copula modeling remains computationally intensive, which may constrain its use in real-time or resource-limited settings. Finally, as with most observational crash data, the relationships observed are associative rather than strictly causal [68].
Future work can address these constraints by scaling the framework to larger, multi-regional datasets to enhance generalizability, integrating geospatial analytics for corridor-level risk mapping [50], and incorporating near-miss and exposure data to improve robustness [45]. Addressing fairness and equity concerns is also essential to prevent algorithmic bias [69]. Embedding MRI-Copula in human-in-the-loop decision systems could further bridge analytics with practitioner judgment, enabling adaptive and context-aware safety management.
6. Conclusions
This study developed the MRI-Copula framework, a novel hybrid model that integrates Vine Copula–based extremal dependence with CatBoost–SHAP explainable machine learning to predict road crash severity in complex urban environments. The framework bridges the methodological gap between dependence modeling and predictive analytics by combining tail dependence and feature importance into a unified, interpretable Multivariate Risk Index (MRI). Empirical results from 877 crash records in Jeddah, Saudi Arabia, demonstrate that behavioural risk is the most influential determinant of crash severity, interacting significantly with environmental conditions, while systemic risk functions as a stabilizing factor that mitigates compounding effects under well-designed infrastructure.
The MRI-Copula framework achieved state-of-the-art performance (AUC = 0.986; F1 = 0.904) and outperformed correlation and PCA-based alternatives while maintaining high interpretability. SHAP and Accumulated Local Effects (ALE) analyses revealed nonlinear escalation thresholds, particularly around 100 km/h and among younger and mid-aged drivers, providing clear guidance for targeted interventions. Decision Curve Analysis (DCA) further confirmed the framework’s practical utility, demonstrating greater net benefit than naïve “treat-all” or “treat-none” strategies across realistic intervention thresholds.
Beyond methodological innovation, MRI-Copula provides actionable policy relevance by linking predictive analytics to operational decision-making. The interpretable sub-indices, Environmental (ERI), Behavioural (BRI), and Systemic (SRI), enable policymakers to distinguish short-term behavioural countermeasures from long-term infrastructure strategies. The MRI score and its thresholds can be embedded into Intelligent Transportation Systems (ITS) to support proactive, real-time safety management through dynamic enforcement, adaptive zoning, and weather-responsive interventions.
In conclusion, through embedding statistical dependence modeling and explainable machine learning within a policy-ready architecture, the MRI-Copula framework advances traffic safety analytics from reactive assessment to proactive, precision risk management. Future extensions should explore real-time deployment, multi-regional validation, integration with geospatial and near-miss data, and fairness evaluation to enhance its generalizability and societal impact.