Analysis of Injury Severity in Elderly Pedestrian Traffic Accidents Based on XGBoost

Wang, Hongxiao; Liang, Guohua

doi:10.3390/app15189909


by Hongxiao Wang ¹,² and Guohua Liang ¹,*

¹ College of Transportation Engineering, Chang’an University, Xi’an 710064, China

² Department of Mechanical and Traffic Engineering, Ordos Institute of Technology, Ordos 017000, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(18), 9909; https://doi.org/10.3390/app15189909

Submission received: 19 August 2025 / Revised: 4 September 2025 / Accepted: 6 September 2025 / Published: 10 September 2025


Featured Application

This study introduces a novel application of the XGBoost algorithm to predicting pedestrian injury severity in road traffic accidents, specifically for elderly pedestrians. By integrating key features such as collision speed, injury location, and driver awareness, the model offers insight into the factors contributing to injury severity in this group. The model can be applied in road safety planning, policy-making, and targeted interventions aimed at reducing the risk to elderly pedestrians. In addition, SHAP analysis clarifies which factors most influence accident severity, which can guide the development of safety measures such as speed regulation, better pedestrian infrastructure, and enhanced driver awareness programs. This work has practical implications for improving traffic safety for vulnerable groups, particularly the aging population, in urban planning and public safety strategies.

Abstract

With declining physical functions, elderly pedestrians face a significantly higher risk of severe injuries and fatalities in traffic accidents. This study investigates the factors influencing injury severity among elderly pedestrians using traffic accident reports collected by the Shaanxi Chang’an University Traffic Accident Evidence Identification Center, covering nationwide cases from 2023 to 2024. By analyzing 2351 accident reports involving pedestrians aged 60 and above, 31 feature variables closely related to accident severity were selected to build a predictive model based on the XGBoost algorithm. Additionally, the SHAP method was employed to perform feature attribution analysis on the model’s key variables. The experimental results show that: (1) the model achieved 86% accuracy, 83% precision, 87% recall, and an F1 score of 85%, demonstrating the reliability of XGBoost in predicting injury severity among elderly pedestrians. (2) Global analysis identified collision speed, injury location, and driver awareness as the main factors influencing injury severity. However, the key factors differ across accidents of different severity levels. (3) The effect of the same factor also varies by severity level. For example, driver awareness reduces the likelihood of minor injuries but has less impact on severe injuries or fatalities. This study provides a theoretical foundation for developing traffic safety policies targeting elderly pedestrians and contributes to effectively reducing the severity of injuries in elderly pedestrian traffic accidents.

Keywords:

XGBoost model; elderly pedestrians; analysis of accident severity; traffic accidents; road safety

1. Introduction

Pedestrians are a vulnerable group in road traffic, with elderly individuals facing the greatest risk. Declining physical ability and slower reaction times place them at a much higher risk of severe injury or death, especially during moderate- to high-impact collisions [1,2]. The World Health Organization (WHO) reports that about 1.3 million people die in road accidents each year, nearly 100,000 of whom are elderly pedestrians [3]. In China, elderly pedestrians account for more than 30% of traffic-related deaths and injuries, a share that continues to rise with population aging [4]. To reduce these risks, many countries have implemented countermeasures. For example, the European Union’s “Road Safety Plan” promotes dedicated pedestrian paths, improved signal visibility, and targeted safety education [5], while the United States has enhanced infrastructure with speed bumps and safer crossings [6]. However, despite these measures, data indicate that injuries among elderly pedestrians remain a serious problem. Recent studies show that elderly pedestrians constitute a substantial share of fatal crashes in Europe, underscoring the urgent need for focused interventions [7]. Therefore, analyzing the factors influencing accident severity among elderly pedestrians is essential for guiding safety technologies and policies, and for reducing injury severity in this vulnerable group.

Research on pedestrian injury severity has long been a focus in traffic safety. Early studies used statistical approaches, such as the negative binomial regression [8,9,10] and the Logit model [11,12,13]. Naghawi et al. [14] applied negative binomial regression to classify severity into fatal and non-fatal categories and to examine the effect of traffic volume. They reported that higher traffic volumes are positively associated with severity, especially in high-density areas where both accident frequency and injury severity rise markedly. Sze et al. [15] used a Logit model to classify severity into minor, severe, and fatal categories, analyzing the effects of speed limits, road type, weather, and traffic volume. They found that fatal accidents are far more likely on roads with speed limits above 80 km/h, particularly under poor visibility and adverse weather, where accident severity is consistently higher. These early statistical and logistic regression methods provided a theoretical foundation for understanding accident mechanisms and predicting severity. However, they assume linear relationships, limiting their ability to capture complex nonlinear interactions between factors [16,17,18]. In addition, they require strict assumptions, and violations can lead to inaccurate predictions [19]. Moreover, large datasets greatly increase computational workload.

In recent years, advances in artificial intelligence have made machine learning the primary tool for predicting injury severity, as these models capture complex nonlinear relationships between variables [20,21]. Compared with traditional regression methods, machine learning can handle more features and detect hidden patterns in large datasets, thereby improving accuracy [22]. Random forests are among the most widely used models, valued for their strong feature selection and high predictive accuracy [23,24]. Yang et al. [25] used random forests to classify severity into property damage, minor injury, and fatality, predicting accidents on mountainous highways. They reported that vehicle type, driver violations, and road structure were key factors. Neural networks, especially deep neural networks (DNNs) and convolutional neural networks (CNNs), have also been widely applied [26,27,28]. Habibzadeh et al. [29] developed a rural road prediction model using neural networks with inverted residuals and an attention mechanism. Their model outperformed regression methods by 53.9% in accuracy. Gradient boosting models, including XGBoost [30], LightGBM [31], and CatBoost [32], have become important tools for handling traffic accident data in recent years; they improve accuracy by combining multiple weak learners. Reviews highlight their superior performance in accident modeling and improved interpretability [33]. Studies applying these models to severity analysis show that XGBoost outperforms random forests and decision trees, especially when handling complex interactions among road design, driver behavior, and environmental conditions, and suggest that it is a more suitable model for severity prediction [34,35,36,37,38,39]. Although widely applied and accurate, XGBoost remains a “black box,” limiting interpretability. To address this, recent studies incorporate SHAP (SHapley Additive exPlanations) to interpret model predictions [40,41]. SHAP reveals the contribution of each feature, helping researchers understand decision processes and optimize safety strategies.

The above analysis shows that while XGBoost has advantages in predicting accident severity, important limitations remain. Most existing studies focus on the general population, with few dedicated to elderly pedestrians. Their unique physiological, psychological, and behavioral characteristics lead to markedly different risks compared with other groups [42]. As a result, traditional models often fail to capture these risks in complex traffic environments and lack attribution analysis to support targeted safety policies. In addition, current learning models mainly provide qualitative analysis and lack an interpretable evaluation of key features, limiting researchers’ ability to understand how these factors affect injury severity.

To address these limitations, this study proposes an XGBoost-based model to predict the severity of elderly pedestrian accidents and applies SHAP values to analyze the contribution of key factors. We first used 2351 elderly pedestrian accident reports collected nationwide by the Shaanxi Chang’an University Traffic Accident Evidence Identification Center between 2023 and 2024. Outcomes were classified into three levels: minor injury, severe injury, and fatality. Thirty-one variables from five dimensions—pedestrians, drivers, vehicles, roads, and environment—were selected as predictors. We then built the prediction model using XGBoost (version 3.0.5, Python package, available at https://xgboost.ai). Finally, SHAP was applied to rank the importance of factors and reveal their impact mechanisms, providing a basis for improving prediction methods and informing safety policy.

2. Data and Methods

2.1. Data Sources and Preprocessing

The data were obtained from accident reports collected by the Shaanxi Chang’an University Traffic Accident Evidence Identification Center between 2023 and 2024. These reports contained details on accident time and location, road conditions, traffic signals, weather, injury severity, and driver behavior. Following the World Health Organization’s definition of elderly age, we selected cases involving pedestrians aged 60 and above. After processing abnormal and missing entries, the cleaned dataset comprised 4681 complete reports.

Injury severity was classified into three categories: minor, severe, and fatal. This classification served as the target variable for prediction. To balance the dataset, we filtered the 4681 cleaned reports to obtain approximately equal numbers of minor, severe, and fatal cases, which reduced bias toward any single category during training. The final balanced dataset included 2351 reports. From these reports, 31 variables were selected as predictors from five dimensions: pedestrians, drivers, vehicles, roads, and environment (Table 1). All features were extracted from the available reports. Because of limitations in the quantity and quality of textual records, factors that were rarely reported or not explicitly documented may not have been captured. For example, behaviors such as pedestrians running red lights or driver distraction were not systematically recorded and could not be included. Nevertheless, this limitation does not compromise the validity of our approach. If such factors are documented in future reports, they can be incorporated into the feature set and analyzed within the same framework.
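The class-balancing step described above can be illustrated with a minimal sketch. The source does not state how the reports were filtered, so random undersampling to the size of the smallest class is an assumption here, and the function name, the `severity` key, and the toy records are all hypothetical.

```python
import random
from collections import defaultdict

def undersample_to_balance(records, label_key="severity", seed=42):
    """Randomly downsample every class to the size of the smallest class.

    Assumption: random undersampling; the paper only says the reports
    were filtered to balance the three severity levels.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec in records:
        by_class[rec[label_key]].append(rec)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], n_min))
    rng.shuffle(balanced)
    return balanced

# Hypothetical toy records; the real accident reports are not reproduced here.
toy = (
    [{"severity": "minor"} for _ in range(50)]
    + [{"severity": "severe"} for _ in range(30)]
    + [{"severity": "fatal"} for _ in range(20)]
)
balanced = undersample_to_balance(toy)
counts = {}
for rec in balanced:
    counts[rec["severity"]] = counts.get(rec["severity"], 0) + 1
print(counts)
```

Undersampling discards data from the larger classes; it is shown only because it matches the description of filtering toward equal class sizes.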

2.2. Model Construction and Evaluation Metrics

In predicting injury severity, three tree-based ensemble models are commonly used: Random Forest (RF), LightGBM, and XGBoost. RF aggregates bagged decision trees, offering robustness to noise, minimal preprocessing, and competitive performance with simple hyperparameter tuning. LightGBM is a histogram-based, leaf-wise gradient boosting framework that trains quickly on large, high-dimensional, and sparse datasets. However, its leaf-wise growth requires careful regularization on medium or smaller datasets. XGBoost implements second-order gradient boosting with built-in regularization (e.g., gamma, min_child_weight, reg_alpha, reg_lambda). It also supports efficient parallelization, handles missing values natively, and is fully compatible with SHAP TreeExplainer for consistent global and local attribution.

In this study, the dataset is medium-sized (2351 cases) and contains heterogeneous, mainly categorical or ordinal variables across five dimensions. Given these characteristics, we required a model that delivers stable accuracy on medium-scale tabular data, controls overfitting through explicit regularization, and supports reliable interpretability. XGBoost best satisfied these requirements, showing stable performance with less tuning sensitivity in pilot runs, while remaining fully compatible with SHAP for transparent attribution. Accordingly, we selected XGBoost as the primary model.

XGBoost is an efficient ensemble method based on gradient-boosted decision trees. It builds successive decision trees as weak learners, iteratively optimizing the loss function to model complex features and nonlinear relationships. Compared with traditional machine learning, XGBoost offers strong generalization, efficient parallel computation, and built-in regularization to prevent overfitting. These strengths make it well suited to traffic safety prediction, where accident data are high-dimensional and involve complex, nonlinear interactions. Its core principle is to use the negative gradient of the loss function as the target for each new tree. Each iteration adds a tree that approximates the residual errors of the previous model, thereby reducing prediction error and improving performance.
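The residual-fitting idea in the preceding paragraph can be sketched in a few lines. This is a deliberately simplified illustration, not the XGBoost algorithm itself: it assumes squared-error loss (so the negative gradient equals the residual), a single 1-D feature, depth-1 trees, and no regularization, whereas XGBoost also uses second-order information and the penalty term introduced below. All names and data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree (one threshold) to the residuals."""
    best = None
    # Exclude the largest value so the right branch is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Additive model: each round fits a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    trees, losses = [], []
    for _ in range(n_rounds):
        # Negative gradient of 0.5 * (y - p)^2 with respect to p.
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return base, trees, losses

# Hypothetical toy data with a step-like trend.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]
base, trees, losses = boost(xs, ys)
print(round(losses[0], 4), round(losses[-1], 4))
```

The training loss shrinks from round to round because each stump is fitted to what the current ensemble still gets wrong.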

For a given traffic accident dataset $D = \{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^{m}$ represents the feature vector of the $i$-th sample and $y_i \in \{1, 2, 3\}$ denotes the injury severity level of the elderly pedestrian in the $i$-th sample (corresponding to minor injury, severe injury, and fatality, respectively), the objective function of the XGBoost model at the $t$-th iteration can be expressed as:

$$L^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t), \tag{1}$$

where $L^{(t)}$ represents the objective function value of the model after the $t$-th iteration, reflecting the balance between the model’s prediction error and model complexity; $l(\cdot)$ is the loss function, used to measure the discrepancy between the true accident severity label $y_i$ and the predicted value $\hat{y}_i^{(t-1)} + f_t(x_i)$; $\hat{y}_i^{(t-1)}$ represents the prediction of the model after iteration $t-1$ for the $i$-th sample; $f_t(x_i)$ denotes the correction predicted for the $i$-th sample by the new decision tree constructed at the $t$-th iteration; and $\Omega(f_t)$ is the regularization term, which penalizes the complexity of the decision tree to prevent overfitting. The regularization term of the XGBoost model is defined as:

$$\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^{2}, \tag{2}$$

where $\gamma$ represents the penalty coefficient for the number of leaf nodes in each decision tree, controlling the complexity of the tree model; $T$ denotes the total number of leaf nodes in the decision tree; $\lambda$ is the L2 regularization coefficient for the leaf node weights; $w$ is the vector of leaf node weights in the tree; and $\lVert w \rVert$ represents the L2 norm of the weight vector. The loss function $l$ takes the form of the multiclass cross-entropy, which is defined as:

$$l(y_i, \hat{y}_i) = -\sum_{k=1}^{K} I(y_i = k) \log(p_{i,k}), \tag{3}$$

where $I(y_i = k)$ is the indicator function, taking the value 1 when the accident sample label $y_i$ equals category $k$, and 0 otherwise; $p_{i,k}$ represents the probability predicted by the model that the $i$-th accident sample belongs to category $k$; and $K$ is the total number of categories for injury severity prediction (in this paper, $K = 3$).
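As a small numerical illustration of Equations (1) to (3), the sketch below evaluates the cross-entropy loss and the regularization term for a hypothetical tree. The softmax mapping from raw scores to class probabilities is an assumption (the text defines only $p_{i,k}$, not how it is obtained), and all scores, labels, leaf weights, and coefficient values are made up.

```python
import math

def softmax(scores):
    """Map raw per-class scores to probabilities (assumed link function)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y_true, probs):
    """Eq. (3): only the term with k == y_true survives the indicator."""
    return -math.log(probs[y_true - 1])  # labels are 1-based: 1, 2, 3

def omega(leaf_weights, gamma, lam):
    """Eq. (2): gamma * T + 0.5 * lambda * ||w||^2 for a single tree."""
    n_leaves = len(leaf_weights)
    return gamma * n_leaves + 0.5 * lam * sum(w * w for w in leaf_weights)

# Hypothetical raw scores for two samples and three severity classes.
raw_scores = [[2.0, 0.5, -1.0], [0.1, 0.2, 1.5]]
labels = [1, 3]                  # minor injury, fatality
leaf_weights = [0.4, -0.2, 0.1]  # hypothetical tree with T = 3 leaves

data_loss = sum(cross_entropy(y, softmax(s)) for y, s in zip(labels, raw_scores))
penalty = omega(leaf_weights, gamma=1.0, lam=1.0)
objective = data_loss + penalty  # structure of Eq. (1)
print(round(data_loss, 4), round(penalty, 4), round(objective, 4))
```

Raising `gamma` or `lam` increases the penalty for the same tree, which is how the objective trades prediction error against tree complexity.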

XGBoost parameters are generally grouped into three categories: general, booster, and learning task parameters. Key parameters include: the number of estimators (n_estimators), which controls predictive power and generalization; maximum tree depth (max_depth) and minimum child weight (min_child_weight), which regulate model complexity and reduce overfitting; learning rate (learning_rate), which governs update step size; subsample ratio (subsample) and feature subsample ratio (colsample_bytree), which enhance generalization and reduce redundancy; and random seed (random_state), which ensures reproducibility.
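For orientation, the parameters named above could be collected in a configuration dictionary like the one below. The parameter names come from the text; every value is a hypothetical placeholder and does not reproduce the tuned settings reported in Table 2.

```python
# Hypothetical values for illustration only; see Table 2 for the tuned settings.
xgb_params = {
    "n_estimators": 300,      # number of boosted trees
    "max_depth": 6,           # maximum depth of each tree
    "min_child_weight": 1,    # minimum sum of instance weight in a child
    "learning_rate": 0.1,     # shrinkage applied to each tree's contribution
    "subsample": 0.8,         # fraction of rows sampled per tree
    "colsample_bytree": 0.8,  # fraction of features sampled per tree
    "random_state": 42,       # seed for reproducibility
}

# With the xgboost package installed, such a dictionary would typically be
# passed to the scikit-learn style wrapper, e.g.:
#   from xgboost import XGBClassifier
#   model = XGBClassifier(**xgb_params)
for name, value in xgb_params.items():
    print(f"{name} = {value}")
```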

To optimize these parameters, we applied a grid search, which predefines candidate parameter combinations, trains a model for each combination, and evaluates its performance to identify the best setting. To ensure stability, we combined grid search with 5-fold cross-validation: the data were randomly split into five subsets, with four used for training and one for validation. This process was repeated five times so that each subset served once as the validation set, and the average performance was used as the final criterion, reducing the effect of random partitioning. The optimized hyperparameters for XGBoost obtained in this way are listed in Table 2.
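The combination of grid search and 5-fold cross-validation can be sketched without any machine learning library. In the sketch below, `toy_score` is a hypothetical stand-in for "train XGBoost on the training folds and score it on the validation fold", and the candidate grid is invented; neither reflects the search space actually used.

```python
import itertools
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists; each fold validates once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val

def grid_search(param_grid, n_samples, score_fn, k=5):
    """Return the parameter combination with the best mean fold score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        scores = [score_fn(params, train, val)
                  for train, val in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

def toy_score(params, train_idx, val_idx):
    # Hypothetical placeholder for model training and validation scoring.
    return -abs(params["max_depth"] - 4) - abs(params["learning_rate"] - 0.1)

# Hypothetical candidate grid.
grid = {"max_depth": [3, 4, 6], "learning_rate": [0.05, 0.1, 0.3]}
best_params, best_score = grid_search(grid, n_samples=100, score_fn=toy_score)
print(best_params, best_score)
```

Averaging over the five validation folds is what makes the selected setting less sensitive to any single random partition.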

To comprehensively and objectively evaluate the performance of the XGBoost model in predicting the injury severity of elderly pedestrians in road traffic accidents, accuracy, precision, recall, and the combined evaluation metric F1-score were selected as performance measures.

Accuracy is used to reflect the overall correctness of the model’s predictions, and is defined as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{4}$$

where TP represents the number of samples where both the predicted and actual labels are positive, TN represents the number of samples where both the predicted and actual labels are negative, FP represents the number of samples where the actual label is negative but the predicted label is positive, and FN represents the number of samples where the actual label is positive but the predicted label is negative. In the three-class setting, these counts are obtained by treating one severity level as the positive class and the remaining levels as negative.

Precision is the proportion of samples predicted as positive that are truly positive, and is defined as:

$$\text{Precision} = \frac{TP}{TP + FP}, \tag{5}$$

Recall is the proportion of truly positive samples that the model correctly identifies, and is defined as:

$$\text{Recall} = \frac{TP}{TP + FN}, \tag{6}$$

The F1-score combines and balances precision and recall, providing a better reflection of the model’s recognition performance, and is therefore used as the core evaluation metric, defined as:

$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \tag{7}$$
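Equations (4) to (7) can be evaluated directly from a confusion matrix. The sketch below computes per-class precision, recall, and F1 in a one-vs-rest manner and then takes the unweighted (macro) average; the averaging scheme is an assumption, since the text does not state which one was used, and the 3×3 matrix is a made-up example rather than the matrix in Figure 1.

```python
def classification_metrics(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    # Multiclass accuracy: correctly classified samples over all samples.
    accuracy = sum(cm[i][i] for i in range(k)) / total
    precisions, recalls, f1s = [], [], []
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(k)) - tp  # predicted c, actually other
        fn = sum(cm[c]) - tp                       # actually c, predicted other
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / k,  # macro average (assumption)
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
    }

# Hypothetical confusion matrix (rows: true minor/severe/fatal).
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
metrics = classification_metrics(toy_cm)
print({name: round(value, 4) for name, value in metrics.items()})
```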

2.3. SHAP Attribution Analysis Method

SHAP is a game-theoretic approach for interpreting machine learning models, quantifying the contribution of each feature to prediction outcomes. It applies Shapley values to fairly distribute feature contributions, thereby explaining the decision process of complex models such as XGBoost. In classification tasks, SHAP values show how each feature contributes to the probability of predicting a specific class [43].

Specifically, SHAP values compute the weighted average marginal contribution of each feature across all possible subsets of the other features. For a given feature $x_j$, its SHAP value $\phi_j$ represents the impact of that feature on the prediction outcome. The calculation formula for SHAP values is:

$$\phi_j = \sum_{S \subseteq N \setminus \{j\}} \frac{|S|! \, (|N| - |S| - 1)!}{|N|!} \left[ f(S \cup \{j\}) - f(S) \right], \tag{8}$$

where $f(S)$ represents the model’s prediction value when using the feature set $S$, $S \cup \{j\}$ represents the case when feature $x_j$ is added to the feature set $S$, $N$ is the full feature set, and $|S|$ and $|N|$ represent the sizes of the sets $S$ and $N$, respectively.
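Equation (8) can be evaluated exactly when the number of features is small. The sketch below enumerates every subset for a hypothetical three-feature value function; it is a brute-force illustration of the formula, not the tree-specific algorithm that SHAP uses for XGBoost, and the feature names and contribution values are invented.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Brute-force Eq. (8): weighted marginal contributions over all subsets."""
    n = len(features)
    phi = {}
    for j in features:
        others = [f for f in features if f != j]
        total = 0.0
        for size in range(n):  # |S| ranges from 0 to |N| - 1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {j}) - value_fn(s))
        phi[j] = total
    return phi

def toy_value(s):
    # Hypothetical model output given the set of "known" features.
    v = 0.0
    if "speed" in s:
        v += 0.30
    if "injury_location" in s:
        v += 0.20
    if "speed" in s and "driver_awareness" in s:
        v += 0.10  # interaction, shared equally between the two features
    return v

features = ["speed", "injury_location", "driver_awareness"]
phi = shapley_values(features, toy_value)
print({name: round(value, 4) for name, value in phi.items()})
```

The values sum to the difference between the prediction with all features and the prediction with none, which is the fair-allocation property the text refers to.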

In multiclass problems, SHAP values assign contribution scores to each class. A positive value indicates that a feature increases the probability of a given class, while a negative value indicates a decrease. The absolute value reflects the magnitude of a feature’s influence, with larger values indicating greater impact. Thus, SHAP quantifies the influence of features on predictions and reveals their effects across different classes (e.g., minor injury, severe injury, and fatality).

3. Results

3.1. Model Evaluation Results

In total, 80% of the data was used as the training set, and 20% as the test set, to systematically evaluate and analyze the model’s classification prediction ability. The evaluation metric results obtained are shown in Table 3.
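A minimal sketch of an 80/20 split follows. The text specifies only the proportions; splitting within each severity class (stratification) and the fixed seed are assumptions added here so that the test set keeps the class balance, and the labels are toy data.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling the test share within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx

# Hypothetical labels: 1 = minor injury, 2 = severe injury, 3 = fatality.
labels = [1] * 100 + [2] * 100 + [3] * 100
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```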

As shown in Table 3, the XGBoost model predicts the injury severity of elderly pedestrians in traffic accidents well: all evaluation metrics (accuracy, precision, recall, and F1 score) exceed 80%. The SHAP attribution analysis method can therefore be used to explain the model and analyze the underlying mechanisms of accident severity.

To further illustrate the classification accuracy of the model across different severity levels, a confusion matrix was constructed (Figure 1). The diagonal cells show that 147 minor injury cases, 138 severe injury cases, and 135 fatality cases were correctly classified, demonstrating the strong predictive ability of the model. Off-diagonal cells indicate some degree of misclassification, such as 11 minor injury cases being misclassified as severe injuries, 14 severe injury cases being misclassified as fatalities, and 8 fatalities being predicted as severe injuries. Overall, the results confirm that the model achieves high classification accuracy and provides a reliable basis for further interpretability analysis.

3.2. Global Prediction Results

Figure 2a presents the mean absolute SHAP values for each feature, summed across all categories and ranked from top to bottom. This provides a global explanation of feature importance in the model’s predictions. The horizontal length of each bar indicates absolute importance, with longer bars reflecting greater overall contribution. Figure 2a shows that collision speed, main injury location, driver awareness, pedestrian age, weight, and visibility blind spots have the largest impact on accident severity. In contrast, driver condition, vehicle type, road structure, and driver health status show lower importance. Furthermore, the key factors differ across minor, severe, and fatal injuries.
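The global ranking described above (mean absolute SHAP value per feature, summed across classes) can be sketched as follows. The nested-list layout `shap_values[class][sample][feature]`, the feature names, and the numbers are hypothetical; they only illustrate the aggregation, not the values behind Figure 2a.

```python
def global_importance(shap_values, feature_names):
    """Sum over classes of the mean absolute SHAP value of each feature.

    shap_values[c][i][j]: SHAP value for class c, sample i, feature j
    (hypothetical layout).
    """
    n_features = len(feature_names)
    totals = [0.0] * n_features
    for class_values in shap_values:
        n_samples = len(class_values)
        for j in range(n_features):
            totals[j] += sum(abs(row[j]) for row in class_values) / n_samples
    return sorted(zip(feature_names, totals), key=lambda t: t[1], reverse=True)

feature_names = ["collision_speed", "injury_location", "driver_awareness"]
# Hypothetical SHAP values: 3 classes x 2 samples x 3 features.
shap_values = [
    [[0.4, -0.1, 0.05], [-0.2, 0.3, -0.05]],  # minor injury
    [[0.1, 0.1, 0.0], [-0.3, -0.1, 0.1]],     # severe injury
    [[-0.5, 0.2, 0.1], [0.1, -0.4, 0.1]],     # fatality
]
ranking = global_importance(shap_values, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, so the ranking reflects magnitude of influence rather than direction.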

Figure 2b–d present SHAP summary plots for minor injury, severe injury, and fatality, respectively. Each point represents a sample, the X-axis shows the SHAP value, and the Y-axis ranks the features. Red denotes positive contributions, and blue denotes negative contributions.

In Figure 2b, collision speed strongly affects minor injuries, with both positive and negative contributions. Main injury location, driver awareness, and visibility blind spots show smaller but notable effects. In Figure 2c, traffic volume, visibility, pedestrian age, and collision speed have strong effects on severe injuries, while gender and collision direction also show notable influence. In Figure 2d, main injury location and collision speed are the dominant factors for fatalities, with the former showing especially strong influence. Driver awareness also contributes notably to fatalities.

3.3. Local Prediction Results

The top three factors with the greatest impact on elderly pedestrian injury severity (collision speed at impact, main injury location, and whether the driver noticed the elderly pedestrian) were selected for local analysis, resulting in the SHAP dependence plots shown in Figure 3, Figure 4 and Figure 5. In these plots, the x-axis shows feature values, and the y-axis shows the corresponding SHAP values for the predicted category. Positive SHAP values indicate that a feature increases the probability of the predicted category, while negative values indicate a decrease. Each point represents an accident case, and its distribution shows how changes in feature values affect predictions for a specific severity outcome.

Figure 3a–c present SHAP dependence plots of collision speed for minor injury, severe injury, and fatality. In Figure 3a, speeds below 10 km/h, 20–40 km/h, 50–60 km/h, and above 80 km/h reduce the probability of minor injury. By contrast, speeds of 10–20 km/h, 40–50 km/h, and 60–70 km/h increase the probability of minor injury. In Figure 3b, speeds below 20 km/h and above 60 km/h reduce the likelihood of severe injury. Speeds of 20–40 km/h increase the likelihood of severe injury, while speeds of 40–60 km/h show mixed effects, with positive influence predominating. In Figure 3c, speeds of 10–30 km/h, 40–50 km/h, and 60–70 km/h reduce fatality risk, whereas other ranges increase it.

Figure 4a–c present SHAP dependence plots of main injury location for minor injury, severe injury, and fatality. In Figure 4a, head, neck, or torso injuries reduce the probability of minor injury, while limb injuries increase it. In Figure 4b, head injuries reduce severe injury probability, whereas neck or torso injuries increase it. Limb injuries show mixed effects. In Figure 4c, head injuries strongly increase fatality risk, while other locations reduce it.

Figure 5a–c present SHAP dependence plots for driver awareness of the pedestrian, corresponding to minor injury, severe injury, and fatality. In Figure 5a, driver awareness before the collision mainly contributes positively to minor injuries, while late or absent awareness has negative effects. In Figure 5b, awareness before the collision reduces the likelihood of severe injuries, whereas late or absent awareness increases it. In Figure 5c, awareness before the collision shows mixed effects on fatalities, but late or absent awareness consistently increases fatal outcomes.

4. Discussion

4.1. Global Results Analysis

From the global results, collision speed is the most influential factor for injury severity among elderly pedestrians, underscoring its decisive role in prediction. By the law of kinetic energy, a vehicle’s kinetic energy increases with the square of speed; higher speeds therefore produce greater impact forces and more severe injuries. Prior studies confirm that higher speeds increase accident probability and markedly raise the risk of pedestrian fatalities and injuries [44,45,46]. Elderly pedestrians are especially vulnerable at high speeds because declining physical functions and reduced resilience of bones, muscles, and skin make them prone to fractures, organ damage, and concussions. Studies by Droździel and Törnros et al. [47,48] show that at high speeds, extremely short reaction times limit the ability of drivers and elderly pedestrians to take evasive action. By contrast, lower speeds allow more reaction time and greater chances of mitigating injury. Thus, vehicle speed remains a key determinant of injury severity in elderly pedestrians. Speed management is particularly critical in areas with high elderly pedestrian activity. Measures such as lower speed limits, traffic signals, and active safety systems [49] can reduce vehicle speed and injury severity. Local governments should enforce strict speed management in high-risk zones such as residential areas, hospitals, and community centers. Practical steps include lowering statutory limits, installing traffic calming devices (e.g., speed bumps, raised crosswalks), and deploying automated enforcement. Transportation departments should also integrate active safety technologies, such as intelligent signals and vehicle-based pedestrian detection, into urban safety planning to reduce high-speed crashes involving elderly pedestrians.
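The quadratic dependence of kinetic energy on speed invoked above can be made concrete with a short worked example. The two speeds are illustrative values chosen here, and the vehicle mass is assumed to be the same in both cases.

```latex
E_k = \tfrac{1}{2} m v^{2}
\quad\Longrightarrow\quad
\frac{E_k(v_2)}{E_k(v_1)} = \left(\frac{v_2}{v_1}\right)^{2},
\qquad
\frac{E_k(60\ \text{km/h})}{E_k(30\ \text{km/h})} = \left(\frac{60}{30}\right)^{2} = 4 .
```

Doubling the impact speed therefore quadruples the kinetic energy carried by the vehicle.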

The main injury location is as critical as collision speed in determining injury severity among elderly pedestrians. Head, neck, and torso injuries are strongly associated with severe injury or fatality, consistent with findings by Febres et al. [50]. In the elderly, reduced bone density and declining physical functions make head injuries especially fatal. Even at low speeds, head impacts can cause fatal injuries such as brain hemorrhages or concussions. Torso and neck injuries also contribute substantially to severity, likely because of their central physiological role. In elderly individuals, tissue fragility further hampers recovery. By contrast, limb injuries pose less threat to life. For traffic safety design, limb protection is important, but greater emphasis should be placed on the head and torso through improved pedestrian facilities, promotion of helmet use, and safer road design. Examples include better design of overpasses and underpasses, wider refuge islands, and improved nighttime lighting to reduce high-impact exposure. In addition, campaigns promoting protective behaviors (e.g., helmet use for elderly cyclists and mobility scooter riders) can further reduce severe head and torso injuries.

Additionally, driver awareness of elderly pedestrians is another critical factor. Studies by Matsui, Zhang, and others [51,52,53] show that failure to notice pedestrians increases both accident probability and injury severity. Hazard perception is therefore central to accident prevention and injury mitigation. By contrast, pedestrian awareness of vehicles has little effect on injury severity and is absent from the result plots. This suggests that declining physical functions and slower reactions prevent elderly pedestrians from avoiding accidents, regardless of vehicle awareness, limiting their ability to reduce injury severity. Other factors such as age, weight, and visibility blind spots also affect severity. In contrast, driver condition, vehicle type, road structure, and driver health have minimal influence.

4.2. Local Results Analysis

Figure 3 shows a generally linear relationship between vehicle speed and injury severity in elderly pedestrians: higher speeds lead to more severe injuries. This aligns with previous analyses. An unusual pattern was also observed: speeds below 10 km/h were sometimes associated with fatalities. Theoretically, such low speeds should not cause serious harm, yet fatalities still occurred. To investigate this anomaly, we reviewed the original accident reports for these cases. Most of these fatalities occurred when large vehicles were starting or turning, situations with substantial front and side blind spots. Elderly pedestrians often lack awareness of these blind spots, making them unable to recognize the risks when large vehicles move off or turn. Wilmut and Purcell [54] further note that declining perception, slower reactions, limited safety knowledge, and over-reliance on traffic rules hinder timely hazard recognition in the elderly. Improving elderly pedestrian safety requires both infrastructure upgrades and enhanced safety education to raise self-protection awareness. Municipal traffic authorities should work with community organizations to provide training on blind spot recognition, safe crossings, and hazard awareness. Signage warning of heavy-vehicle blind spots should be installed near intersections with high elderly pedestrian activity. These interventions, combined with infrastructure upgrades, would help both drivers and elderly pedestrians avoid low-speed but high-risk crashes.

The SHAP dependence plots in Figure 4a–c reinforce the analysis of main injury location. Head, neck, and torso injuries are strongly linked to severe outcomes or fatalities, especially in the elderly. Reduced bone density and physiological fragility mean that even minor impacts to the head or torso can be fatal, for example through brain trauma or organ damage. By contrast, limb injuries are usually less severe, leading mainly to minor trauma or functional impairment, and rarely cause death directly. However, in elderly pedestrians, limb injuries often involve complications such as long rehabilitation after fractures or joint dysfunction. Although not fatal, these complications can severely reduce quality of life, hinder daily activities, and cause long-term health problems. Thus, the impact of limb injuries on minor and severe outcomes is complex, showing both positive and negative effects depending on injury type.

Figure 5a–c show that driver awareness before a collision affects the severity of elderly pedestrian injuries. In Figure 5a, timely awareness and evasive action reduce injury severity. In Figure 5b, however, severe injuries may still occur despite prior awareness. This typically results from speeding, which reduces the time and distance available for the driver to react and brake. Thus, awareness alone may not prevent or mitigate the accident. Nonetheless, driver awareness remains essential. Asadamraji et al. [55] report that under normal driving conditions, timely awareness and evasive action significantly reduce harm to elderly pedestrians. Conversely, late or absent awareness usually leads to more severe outcomes. This highlights the need to strengthen pedestrian recognition and hazard perception, particularly for drivers in areas with many elderly pedestrians. Policy should reinforce driver training and continuous hazard-perception assessment, especially for professional drivers of large vehicles. Local governments could mandate refresher courses or simulation-based hazard-recognition tests. Adoption of driver-assistance technologies (e.g., blind-spot monitoring, pedestrian detection) in commercial fleets should also be encouraged or mandated to mitigate the identified risks.
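The point that awareness alone may not be enough at high speed can be illustrated with the standard textbook stopping-distance relation (reaction distance plus braking distance). This is a rough sketch and not part of the study's model: the reaction time of 1.5 s and the friction coefficient of 0.7 are assumed illustrative values, and the function name is hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2


def stopping_distance_m(speed_kmh: float,
                        reaction_time_s: float = 1.5,  # assumed value
                        friction: float = 0.7) -> float:  # assumed value
    """Distance travelled from hazard detection to standstill, in metres.

    Reaction distance grows linearly with speed; braking distance grows
    with the square of speed (v^2 / (2 * mu * g)).
    """
    v = speed_kmh / 3.6  # convert km/h to m/s
    reaction_distance = v * reaction_time_s
    braking_distance = v ** 2 / (2 * friction * G)
    return reaction_distance + braking_distance


if __name__ == "__main__":
    for speed in (30, 60):
        print(f"{speed} km/h: {stopping_distance_m(speed):.1f} m")
```

Under these assumptions, doubling the speed more than doubles the distance needed to stop, so a driver who notices a pedestrian at the same distance has far less margin at the higher speed.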

5. Conclusions

This study predicts the severity of injuries in elderly pedestrian traffic accidents using an XGBoost model and performs an attribution analysis of the key factors influencing injury severity. Based on 2351 traffic accident reports from the Shaanxi Chang’an University Traffic Accident Evidence Identification Center, a predictive model was developed and the factors influencing accident severity were explored. The conclusions of the study are as follows:

(1) This study developed a predictive model for the severity of injuries in elderly pedestrian traffic accidents based on the XGBoost algorithm. By predicting three types of accident severity—minor injury, severe injury, and fatality—the model achieved strong performance after training, with an average accuracy of 86%, precision of 83%, recall of 87%, and an F1 score of 85%. The model demonstrates excellent predictive capability and can effectively distinguish between different levels of injury severity.

(2) Collision speed at impact, main injury location, and whether the driver noticed the elderly pedestrian are key factors influencing the severity of injuries in elderly pedestrians. However, the dominant factors differ across severity levels. Model results and SHAP analysis show that lower speeds are mainly associated with minor injuries, whereas higher speeds lead to severe injuries or fatalities. Notably, when large vehicles start or turn, substantial blind spots and limited safety awareness among elderly pedestrians can lead to fatalities even at speeds below 10 km/h. Injuries to the head, neck, or torso generally result in severe or fatal outcomes, whereas limb injuries tend to cause less harm. Driver awareness also significantly influences injury severity. Failure to notice pedestrians is typically associated with more severe injuries.

(3) The impact of the same factor varies across accidents of different severity levels. For example, the driver’s awareness of the pedestrian has a positive effect on minor injuries, but its effect on severe injuries or fatalities is less significant. These findings can further refine traffic safety policies for elderly pedestrians, focusing on minimizing the probability of severe injuries or fatalities while using the least amount of manpower and resources. Specifically, stricter speed management (e.g., speed limits, bumps, automated enforcement) should be prioritized in areas with high elderly pedestrian activity, such as residential neighborhoods and hospital zones. Because head and torso injuries are strongly linked to fatalities, infrastructure should emphasize safer crossings, better lighting, and promotion of helmets for elderly cyclists. Policies should also improve driver hazard-perception training and expand adoption of driver-assistance technologies such as blind-spot monitoring and pedestrian detection. Less emphasis is needed on weaker factors such as traffic volume and road structure. Targeted policies can balance implementation costs with accident severity, effectively reducing injuries among elderly pedestrians and improving their safety.

Although the XGBoost-based model and SHAP analysis achieved good predictive performance, this study still has some limitations:

(1) The conclusions of this study rely primarily on textual traffic accident reports provided by the Shaanxi Chang’an University Traffic Accident Evidence Identification Center. If certain factors that may strongly influence the injury severity of elderly pedestrians are not included in the reports or are represented by only a very limited number of cases, their effects cannot be reliably estimated. Nevertheless, this limitation does not compromise the validity of the proposed methodology. Future studies can build upon this framework, and with more comprehensive accident data, these factors can be incorporated to yield more accurate and generalizable conclusions.

(2) Traffic accident data also include video, audio, and other modalities, but this study used only textual reports. Future research could expand the data sources by incorporating multimodal data, such as video and audio, to enrich the data and thereby improve the accuracy and robustness of the model.

Author Contributions

Conceptualization, H.W. and G.L.; methodology, H.W. and G.L.; software, H.W.; validation, H.W. and G.L.; formal analysis, H.W.; investigation, H.W.; resources, G.L.; data curation, H.W.; writing—original draft preparation, H.W.; writing—review and editing, G.L.; visualization, H.W.; supervision, G.L.; project administration, H.W.; funding acquisition, G.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Program in Shaanxi Province (grant number 2024GX-YBXM-131) and the Ordos City Key Research and Development Program (grant number YF20250245). The APC was funded jointly by the above-mentioned funding agencies.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to thank the Shaanxi Chang’an University Traffic Accident Evidence Identification Center for providing the traffic accident reports used in this study. Their valuable contribution has greatly supported the research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Azami-Aghdash, S.; Aghaei, M.H.; Sadeghi-Bazarghani, H. Epidemiology of road traffic injuries among elderly people; a systematic review and meta-analysis. Bull. Emerg. Trauma 2018, 6, 279.
2. Furtado, B.M.A.S.M.; Lima, A.C.B.d.; Ferreira, R.C.G. Road traffic accidents involving elderly people: An integrative review. Rev. Bras. Geriatr. E Gerontol. 2019, 22, e190053.
3. World Health Organization. World Health Statistics 2025: Monitoring Health for the SDGs, Sustainable Development Goals; World Health Organization: Geneva, Switzerland, 2025.
4. Fang, T.; Xu, F.; Zou, Z. Causal Factors in Elderly Pedestrian Traffic Injuries Based on Association Analysis. Appl. Sci. 2025, 15, 1170.
5. Nikolaou, P.; Dimitriou, L. Evaluation of road safety policies performance across Europe: Results from benchmark analysis for a decade. Transp. Res. Part A Policy Pract. 2018, 116, 232–246.
6. Levi, S.; De Leonardis, D.; Antin, J.; Angel, L. Identifying Countermeasure Strategies to Increase Safety of Older Pedestrians; National Highway Traffic Safety Administration: Washington, DC, USA, 2013.
7. Sánta, E.; Szűcs, P.; Patocskai, G.; Lakatos, I. Prevalence and Characteristics of Traffic Accidents Endangering Vulnerable Pedestrians in Hungary. Eng. Proc. 2024, 79, 94.
8. Hilbe, J.M. Negative Binomial Regression; Cambridge University Press: Cambridge, UK, 2011.
9. Famoye, F. On the bivariate negative binomial regression model. J. Appl. Stat. 2010, 37, 969–981.
10. Yang, S.; Berdine, G. The negative binomial regression. Southwest Respir. Crit. Care Chron. 2015, 3, 50–54.
11. Hensher, D.A.; Greene, W.H. The mixed logit model: The state of practice. Transportation 2003, 30, 133–176.
12. Demaris, A. Logit modeling: Practical Applications; Sage: London, UK, 1992; Volume 86.
13. Hausman, J.; McFadden, D. Specification tests for the multinomial logit model. Econom. J. Econom. Soc. 1984, 52, 1219–1240.
14. Naghawi, H. Negative binomial regression model for road crash severity prediction. Mod. Appl. Sci. 2018, 12, 38.
15. Sze, N.-N.; Wong, S. Diagnostic analysis of the logistic model for pedestrian injury severity in traffic crashes. Accid. Anal. Prev. 2007, 39, 1267–1278.
16. Xie, Y. Values and limitations of statistical models. Res. Soc. Stratif. Mobil. 2011, 29, 343–349.
17. Sarrias, M. Individual-specific posterior distributions from Mixed Logit models: Properties, limitations and diagnostic checks. J. Choice Model. 2020, 36, 100224.
18. Hossain, M.Z. A review on some alternative specifications of the logit model. J. Bus. Econ. Res. (JBER) 2009, 7, 15–24.
19. Rosenthal, H. The limitations of log-linear analysis. Contemp. Sociol. 1980, 9, 207–212.
20. Obasi, I.C.; Benson, C. Evaluating the effectiveness of machine learning techniques in forecasting the severity of traffic accidents. Heliyon 2023, 9, e18812.
21. AlMamlook, R.E.; Kwayu, K.M.; Alkasisbeh, M.R.; Frefer, A.A. Comparison of machine learning algorithms for predicting traffic accident severity. In Proceedings of the 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), Amman, Jordan, 9–11 April 2019; pp. 272–276.
22. Infante, P.; Jacinto, G.; Afonso, A.; Rego, L.; Nogueira, V.; Quaresma, P.; Saias, J.; Santos, D.; Nogueira, P.; Silva, M. Comparison of statistical and machine-learning models on road traffic accident severity classification. Computers 2022, 11, 80.
23. Yan, M.; Shen, Y. Traffic accident severity prediction based on random forest. Sustainability 2022, 14, 1729.
24. Wang, J.; Ma, S.; Jiao, P.; Ji, L.; Sun, X.; Lu, H. Analyzing the risk factors of traffic accident severity using a combination of random forest and association rules. Appl. Sci. 2023, 13, 8559.
25. Yang, J.; Han, S.; Chen, Y. Prediction of traffic accident severity based on random forest. J. Adv. Transp. 2023, 2023, 7641472.
26. Pérez-Sala, L.; Curado, M.; Tortosa, L.; Vicent, J.F. Deep learning model of convolutional neural networks powered by a genetic algorithm for prevention of traffic accidents severity. Chaos Solitons Fractals 2023, 169, 113245.
27. Shaik, M.E.; Islam, M.M.; Hossain, Q.S. A review on neural network techniques for the prediction of road traffic accident severity. Asian Transp. Stud. 2021, 7, 100040.
28. Rahim, M.A.; Hassan, H.M. A deep learning based traffic crash severity prediction framework. Accid. Anal. Prev. 2021, 154, 106090.
29. Habibzadeh, M.; Ayar, P.; Mirabimoghaddam, M.H.; Ameri, M.; Sadat Haghighi, S.M. Analysis of the severity of accidents on rural roads using statistical and artificial neural network methods. J. Adv. Transp. 2023, 2023, 8089395.
30. Yang, Y.; Wang, K.; Yuan, Z.; Liu, D. Predicting freeway traffic crash severity using XGBoost-Bayesian network model with consideration of features interaction. J. Adv. Transp. 2022, 2022, 4257865.
31. Li, K.; Xu, H.; Liu, X. Analysis and visualization of accidents severity based on LightGBM-TPE. Chaos Solitons Fractals 2022, 157, 111987.
32. Zahid, M.; Habib, M.F.; Ijaz, M.; Ameer, I.; Ullah, I.; Ahmed, T.; He, Z. Factors affecting injury severity in motorcycle crashes: Different age groups analysis using Catboost and SHAP techniques. Traffic Inj. Prev. 2024, 25, 472–481.
33. Xue, Z.; Yao, T. Enhancing occluded pedestrian re-identification with the MotionBlur data augmentation module. Mechatron. Intell. Transp. Syst 2024, 3, 73–84.
34. Guo, M.; Yuan, Z.; Janson, B.; Peng, Y.; Yang, Y.; Wang, W. Older pedestrian traffic crashes severity analysis based on an emerging machine learning XGBoost. Sustainability 2021, 13, 926.
35. Wu, S.; Yuan, Q.; Yan, Z.; Xu, Q. Analyzing accident injury severity via an extreme gradient boosting (XGBoost) model. J. Adv. Transp. 2021, 2021, 3771640.
36. Ma, J.; Ding, Y.; Cheng, J.C.; Tan, Y.; Gan, V.J.; Zhang, J. Analyzing the leading causes of traffic fatalities using XGBoost and grid-based analysis: A city management perspective. IEEE Access 2019, 7, 148059–148072.
37. Jamal, A.; Zahid, M.; Tauhidur Rahman, M.; Al-Ahmadi, H.M.; Almoshaogeh, M.; Farooq, D.; Ahmad, M. Injury severity prediction of traffic crashes with ensemble machine learning techniques: A comparative study. Int. J. Inj. Control Saf. Promot. 2021, 28, 408–427.
38. Chen, H.; Chen, H.; Liu, Z.; Sun, X.; Zhou, R. Analysis of factors affecting the severity of automated vehicle crashes using XGBoost model combining POI data. J. Adv. Transp. 2020, 2020, 8881545.
39. Jiang, F.; Ma, J. A comprehensive study of macro factors related to traffic fatality rates by XGBoost-based model and GIS techniques. Accid. Anal. Prev. 2021, 163, 106431.
40. Parsa, A.B.; Movahedi, A.; Taghipour, H.; Derrible, S.; Mohammadian, A.K. Toward safer highways, application of XGBoost and SHAP for real-time accident detection and feature analysis. Accid. Anal. Prev. 2020, 136, 105405.
41. Laphrom, W.; Se, C.; Champahom, T.; Jomnonkwao, S.; Wipulanusatd, W.; Satiennam, T.; Ratanavaraha, V. XGBoost-SHAP and unobserved heterogeneity modelling of temporal multivehicle truck-involved crash severity patterns. Civ. Eng. J. 2024, 10, 1890–1908.
42. Ristić, B.; Bogdanović, V.; Stević, Ž. Urban evaluation of pedestrian crossings based on Start-Up Time using the MEREC-MARCOS Model. J. Urban Dev. Manag. 2024, 3, 34–42.
43. Li, Z. Extracting spatial effects from machine learning model using local interpretation method: An example of SHAP and XGBoost. Comput. Environ. Urban Syst. 2022, 96, 101845.
44. Uddin, M.; Ahmed, F. Pedestrian injury severity analysis in motor vehicle crashes in Ohio. Safety 2018, 4, 20.
45. Islam, M. An exploratory analysis of the effects of speed limits on pedestrian injury severities in vehicle-pedestrian crashes. J. Transp. Health 2023, 28, 101561.
46. Hussain, Q.; Feng, H.; Grzebieta, R.; Brijs, T.; Olivier, J. The relationship between impact speed and the probability of pedestrian fatality during a vehicle-pedestrian crash: A systematic review and meta-analysis. Accid. Anal. Prev. 2019, 129, 241–249.
47. Droździel, P.; Tarkowski, S.; Rybicka, I.; Wrona, R. Drivers’ reaction time research in the conditions in the real traffic. Open Eng. 2020, 10, 35–47.
48. Törnros, J. Effect of driving speed on reaction time during motorway driving. Accid. Anal. Prev. 1995, 27, 435–442.
49. Fisa, R.; Musukuma, M.; Sampa, M.; Musonda, P.; Young, T. Effects of interventions for preventing road traffic crashes: An overview of systematic reviews. BMC Public Health 2022, 22, 513.
50. Febres, J.D.; Mariscal, M.Á.; Herrera, S.; García-Herrero, S. Pedestrians’ injury severity in traffic accidents in Spain: A pedestrian actions approach. Sustainability 2021, 13, 6439.
51. Matsui, Y.; Oikawa, S. Effect of A-Pillar Blind Spots on a Driver’s Pedestrian Visibility during Vehicle Turns at an Intersection. Stapp Car Crash J. 2024, 68, 14–30.
52. Zhang, G.; Yau, K.K.; Zhang, X. Analyzing fault and severity in pedestrian–motor vehicle accidents in China. Accid. Anal. Prev. 2014, 73, 141–150.
53. Haleem, K.; Alluri, P.; Gan, A. Analyzing pedestrian crash injury severity at signalized and non-signalized locations. Accid. Anal. Prev. 2015, 81, 14–23.
54. Wilmut, K.; Purcell, C. Why are older adults more at risk as pedestrians? A systematic review. Hum. Factors 2022, 64, 1269–1291.
55. Asadamraji, M.; Saffarzadeh, M.; Ross, V.; Borujerdian, A.; Ferdosi, T.; Sheikholeslami, S. A novel driver hazard perception sensitivity model based on drivers’ characteristics: A simulator study. Traffic Inj. Prev. 2019, 20, 492–497.

Figure 1. Confusion matrix of injury severity classification using the XGBoost model.

Figure 2. Global Prediction Results: (a) Feature Importance Ranking for Injury Severity; (b) Minor Injury SHAP Values; (c) Severe Injury SHAP Values; (d) Fatality SHAP Values.

Figure 3. Collision Instant Speed SHAP Dependence Plot: (a) Minor Injury SHAP Dependence Plot; (b) Severe Injury SHAP Dependence Plot; (c) Fatality SHAP Dependence Plot.

Figure 4. Main Injury Location SHAP Dependence Plot: (a) Minor Injury SHAP Dependence Plot; (b) Severe Injury SHAP Dependence Plot; (c) Fatality SHAP Dependence Plot.

Figure 5. SHAP Dependence Plot for Whether the Driver Was Aware of the Elderly Pedestrian: (a) Minor Injury SHAP Dependence Plot; (b) Severe Injury SHAP Dependence Plot; (c) Fatality SHAP Dependence Plot.

Table 1. Classification of Accident Variables.

Variable Category	Variable Name	Variable Classification
Elderly Pedestrian Factors	Age	0: 60–64 years; 1: 65–69 years; 2: 70–74 years; 3: 75–79 years; 4: 80 years and above
	Height	0: Less than 150 cm; 1: 150–154 cm; 2: 155–159 cm; 3: 160–164 cm; 4: 165–169 cm; 5: 170 cm and above
	Weight	0: Less than 50 kg; 1: 50–59 kg; 2: 60–69 kg; 3: 70–79 kg; 4: 80 kg and above
	Gender	0: Male; 1: Female
	Health Status	0: Healthy; 1: Diseased (e.g., hypertension, diabetes)
	Main Injury Location	0: Head; 1: Neck; 2: Trunk; 3: Limbs
	Awareness of Vehicle	0: Unaware of the vehicle; 1: Aware of the vehicle; 2: Unknown if aware of the vehicle (including cases of death or when the pedestrian cannot provide information)
Driver Factors	Age	0: 18–25 years; 1: 26–35 years; 2: 36–45 years; 3: 46–55 years; 4: 56–65 years; 5: 65 years and above
	Gender	0: Male; 1: Female
	Driving Experience	0: 1–5 years; 1: 6–10 years; 2: 11–20 years; 3: Over 20 years
	Driver Status	0: Fatigued driving; 1: Drunk driving; 2: Unwell driving; 3: Normal driving
	Health Status	0: Healthy; 1: Unhealthy
	Awareness of Pedestrian	0: Aware before collision; 1: Aware after collision; 2: Unaware; 3: Unknown if aware (including cases of death or when the driver cannot provide information)
Vehicle Factors	Vehicle Type	0: Small sedan; 1: Small passenger vehicle; 2: Large bus and truck; 3: Other
	Collision Instant Speed	0: Below 10 km/h; 1: 10–20 km/h; 2: 20–30 km/h; 3: 30–40 km/h; 4: 40–50 km/h; 5: 50–60 km/h; 6: 60–70 km/h; 7: 70–80 km/h; 8: 80 km/h and above
	Collision Direction	0: Head-on collision; 1: Side collision; 2: Rear-end collision
Road Factors	Road Type	0: Expressway; 1: National road; 2: Ordinary road section; 3: Provincial road; 4: County road; 5: Rural road
	Road Condition	0: Dry; 1: Damp; 2: Pooled water; 3: Icy; 4: Muddy; 5: Other
	Traffic Control	0: No traffic control; 1: Traffic signal lights; 2: Zebra crossing; 3: Other warning signs
	Road Surface Material	0: Asphalt road; 1: Cement road; 2: Gravel road; 3: Other
	Signs and Markings	0: Clear; 1: Fuzzy; 2: Missing
	Road Structure	0: Straight road; 1: Curved road; 2: Intersection; 3: Crossroad; 4: Ramp
	Traffic Volume on the Accident Section	0: Very heavy traffic; 1: Heavy traffic; 2: Light traffic with sufficient distance between vehicles; 3: Light traffic
Environmental Factors	Weather	0: Sunny; 1: Rainy; 2: Snowy; 3: Hazy; 4: Hail; 5: Overcast; 6: Other
	Lighting Conditions	0: Daytime; 1: Nighttime with street lighting; 2: Nighttime without street lighting; 3: Dawn; 4: Dusk
	Workday	0: Yes; 1: No
	Morning and Evening Rush Hours	0: Yes; 1: No
	Holidays	0: Yes; 1: No
	Visibility	0: Below 50 m; 1: 50–100 m; 2: 100–200 m; 3: Above 200 m
	Blind Spot	0: Yes; 1: No
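As an illustration of how a continuous measurement might be mapped onto the ordinal codes in Table 1, the sketch below bins a collision speed into the nine Collision Instant Speed categories. The function name is hypothetical, and the handling of boundary values is an assumption: Table 1 lists bins such as 10–20 and 20–30 km/h without saying which bin a speed of exactly 20 km/h belongs to, so the lower bound is treated as inclusive here.

```python
def speed_code(speed_kmh: float) -> int:
    """Map a collision speed in km/h to the Table 1 category code (0 to 8).

    Assumption: each bin includes its lower bound and excludes its upper
    bound, e.g. exactly 20 km/h falls in code 2 (20-30 km/h).
    """
    if speed_kmh < 0:
        raise ValueError("speed must be non-negative")
    # Bins are 10 km/h wide; everything at or above 80 km/h shares code 8.
    return min(int(speed_kmh // 10), 8)


if __name__ == "__main__":
    for speed in (5, 20, 79.9, 120):
        print(speed, "->", speed_code(speed))
```

The same pattern could be applied to the other binned variables in Table 1, such as age or visibility, with their own bin edges.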

Table 2. XGBoost Model Hyperparameters.

Parameter	Description	Setting Value
n_estimators	Number of weak classifiers	200
max_depth	Maximum tree depth	6
learning_rate	Learning rate	0.08
subsample	Row sampling ratio	0.8
colsample_bytree	Feature sampling ratio	0.8
min_child_weight	Minimum leaf node weight	3
gamma	Minimum loss reduction for node split	0.1
reg_alpha	L1 regularization weight	0.1
reg_lambda	L2 regularization weight	1
objective	Classification objective function	multi:softmax
random_state	Random seed	42
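The settings in Table 2 correspond to keyword arguments of the scikit-learn style `XGBClassifier` in the `xgboost` Python package. The sketch below shows one way they might be passed; whether the authors used this interface is an assumption, and the names `XGB_PARAMS` and `build_classifier` are hypothetical.

```python
# Hyperparameter values copied from Table 2.
XGB_PARAMS = {
    "n_estimators": 200,       # number of boosted trees
    "max_depth": 6,            # maximum tree depth
    "learning_rate": 0.08,
    "subsample": 0.8,          # row sampling ratio per tree
    "colsample_bytree": 0.8,   # feature sampling ratio per tree
    "min_child_weight": 3,
    "gamma": 0.1,              # minimum loss reduction to split a node
    "reg_alpha": 0.1,          # L1 regularization weight
    "reg_lambda": 1,           # L2 regularization weight
    "objective": "multi:softmax",
    "random_state": 42,
}


def build_classifier(params=None):
    """Return an unfitted classifier configured with the Table 2 settings.

    The import is done lazily so that the parameter dictionary can be
    inspected even when xgboost is not installed.
    """
    from xgboost import XGBClassifier
    return XGBClassifier(**(params or XGB_PARAMS))
```

A typical use would be `model = build_classifier()` followed by `model.fit(X_train, y_train)`, where `X_train` holds the 31 encoded features and `y_train` the severity labels encoded as 0, 1, and 2; both variable names are placeholders.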

Table 3. Model Evaluation Results.

Accident Category	Accuracy	Precision	Recall	F1 Score
Minor Injury	0.86	0.83	0.89	0.86
Severe Injury	0.88	0.81	0.88	0.84
Fatality	0.84	0.84	0.84	0.84
Macro Average	0.86	0.83	0.87	0.85
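The Macro Average row of Table 3 can be checked against the per-class rows, since a macro average is the unweighted mean over classes and the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes both from the values in Table 3; the helper names are hypothetical, and because the inputs are already rounded to two decimals the check only holds to that precision.

```python
from statistics import mean

# Per-class metrics copied from Table 3.
PER_CLASS = {
    "Minor Injury":  {"accuracy": 0.86, "precision": 0.83, "recall": 0.89, "f1": 0.86},
    "Severe Injury": {"accuracy": 0.88, "precision": 0.81, "recall": 0.88, "f1": 0.84},
    "Fatality":      {"accuracy": 0.84, "precision": 0.84, "recall": 0.84, "f1": 0.84},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_average(metric: str) -> float:
    """Unweighted mean of one metric over the three severity classes."""
    return mean(row[metric] for row in PER_CLASS.values())


if __name__ == "__main__":
    for name, row in PER_CLASS.items():
        print(name, "F1 recomputed:", round(f1_score(row["precision"], row["recall"]), 2))
    for metric in ("accuracy", "precision", "recall", "f1"):
        print("macro", metric, round(macro_average(metric), 2))
```

Rounded to two decimals, the recomputed values agree with the reported per-class F1 scores and with the Macro Average row.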

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
