Article

Optimizing Traffic Accident Severity Prediction with a Stacking Ensemble Framework

1
Laboratory of Computer Science Signals Automation and Cognitivism, Department of Computer Science, Faculty of Sciences Dhar El Mahraz, Sidi Mohamed Ben Abdellah University, Fez 30000, Morocco
2
Department of Computer Science and Information Systems, University of Limerick, V94 T9PX Limerick, Ireland
3
Laboratory of Mathematics and Applications to Engineering Sciences, Department of Mathematics and Computer Science, Heigh Normal School, Sidi Mohamed Ben Abdellah University, Fez 30000, Morocco
*
Author to whom correspondence should be addressed.
World Electr. Veh. J. 2025, 16(10), 561; https://doi.org/10.3390/wevj16100561
Submission received: 14 January 2025 / Revised: 22 March 2025 / Accepted: 28 March 2025 / Published: 1 October 2025

Abstract

Road traffic crashes (RTCs) have emerged as a major global cause of fatalities, with the number of accident-related deaths rising rapidly each day. To mitigate this issue, it is essential to develop early prediction methods that help drivers and riders understand accident statistics relevant to their region. These methods should consider key factors such as speed limits, compliance with traffic signs and signals, pedestrian crossings, right-of-way rules, weather conditions, driver negligence, fatigue, and the impact of excessive speed on RTC occurrences. Raising awareness of these factors enables individuals to exercise greater caution, thereby contributing to accident prevention. A promising approach to improving road traffic accident severity classification is the stacking ensemble method, which leverages multiple machine learning models. This technique addresses challenges such as imbalanced datasets and high-dimensional features by combining predictions from various base models into a meta-model, ultimately enhancing classification accuracy. The ensemble approach exploits the diverse strengths of different models, capturing multiple aspects of the data to improve predictive performance. The effectiveness of stacking depends on the careful selection of base models with complementary strengths, ensuring robust and reliable predictions. Additionally, advanced feature engineering and selection techniques can further optimize the model’s performance. Within the field of artificial intelligence, various machine learning (ML) techniques have been explored to support decision making in tackling RTC-related issues. These methods aim to generate precise reports and insights. However, the stacking method has demonstrated significantly superior performance compared to existing approaches, making it a valuable tool for improving road safety.

1. Introduction

Road traffic crashes (RTCs) represent a significant global health crisis, ranking as the fifth-leading cause of death worldwide, following cardiovascular diseases, cancers, respiratory diseases, and digestive diseases. Annually, approximately 1.3 million lives are lost in RTCs, with an additional 20 to 50 million individuals sustaining injuries, often leading to long-term disabilities. These incidents also impose substantial economic burdens on individuals, families, and nations.
The severity of this issue is underscored by statistics from the Moroccan Ministry of Equipment, Transport, Logistics, and Water (METLW) for 2018, which reported 96,133 accidents (+6.82%), including 3066 fatal accidents (−0.62%), resulting in 3485 deaths (−0.40%), 8725 serious injuries (−4.90%), and 128,249 minor injuries (+7.65%). Globally, RTCs account for nearly one in ten deaths, with varying degrees of severity [1,2,3,4,5,6,7,8,9]. Factors contributing to casualty severity are complex and multifaceted, encompassing road conditions [10], as well as driver-related attributes such as age and gender, and environmental factors like traffic congestion [11,12,13,14,15]. Consequently, researchers worldwide are actively developing prediction models and prevention strategies, with artificial intelligence (AI) playing a crucial role. This study focuses on predicting traffic accident severity using a three-class classification system: fatal, serious, and slight [16,17,18,19,20,21]. We utilize the TRAFFIC ACCIDENTS_2019_LEEDS (TAL19) dataset, comprising 7628 instances with 18 attributes each, sourced from the Leeds City Council ML directory. The dataset includes 88 fatal, 1336 serious, and 6204 slight accidents. Model performance is evaluated using a comprehensive suite of metrics, including the confusion matrix, false positive rate, true positive rate, accuracy, sensitivity, precision, recall, F1-score, root mean square error (RMSE), mean square error (MSE), mean absolute error (MAE), coefficient of determination, balanced accuracy, and receiver operating characteristic (ROC) analysis [22,23,24,25,26,27,28,29]. This research investigates the effectiveness of the stacking ensemble method, which combines the predictive power of multiple base models to enhance the classification of road traffic accident severity. 
Stacking has demonstrated superior performance compared to individual models like logistic regression, K-Nearest Neighbors, random forest, and support vector machines (SVMs), motivating its further exploration. This research demonstrates the potential of the stacking method for improving road traffic accident severity classification, offering a valuable tool for enhancing decision making and resource allocation in road safety initiatives. The structure of this article is as follows: Section 2 presents related work and the recent literature. Section 3 describes the dataset. Section 4 details the performance evaluation metrics. Section 5 presents the experimental results. Section 6 provides a discussion of the findings, and Section 7 concludes the paper.

2. Related Work

This section reviews the literature on traffic accident severity classification. Several studies have applied various learning models and compared them with recent approaches in the field. In 2022, Md. Ebrahim Shaik et al. [3] provided an overview and summary of numerous neural network models for road traffic crashes (RTCs). They discussed future research directions and perspectives, including the use of the radial basis function (RBF), multilayer perceptron (MLP), and single-layer perceptron (SLP). Rezapour et al. [4] implemented multiple neural network models in 2020 to predict the intensity and frequency of RTCs. They utilized SLP, MLP, and recurrent neural network (RNN) models and evaluated their results. In 2016, Sameen et al. [5], citing a then-recent study, reported an estimate that traffic accidents would become the fifth-leading cause of death worldwide by 2030. Chakraborty et al. [6] employed artificial neural network (ANN) models in 2019 to predict RTC severity. Their approach demonstrated higher accuracy and precision compared to conventional methods. Lee et al. and Ebrahim et al. conducted studies in 2020 and 2018, respectively [7,8], focusing on understanding and evaluating the importance of traffic accident severity and devising strategies to mitigate it. Behbahani et al. [9] developed an unsupervised learning method in 2018 using a radial basis function neural network (RBFNN), incorporating the least squares strategy and the recursive least squares (RLS) method. In 2017, Taamneh et al. [10] employed clustering-based classification of RTC severity, utilizing an ANN model and hierarchical clustering. Their approach aimed to improve decision making for severity mitigation compared to conventional methods. Behbahani et al. [11] proposed a novel approach in 2018, dividing the dataset into three clusters using the k-means method. They then applied MLP to predict RTC severity and improve accuracy. Sheng Dong et al. [12] analyzed and predicted RTCs in 2022 using boosting-based ensemble learning models interpreted with SHapley Additive exPlanations (SHAP).
The motivation for enhancing road traffic accident severity classification using the stacking method stems from the critical need for accurate accident severity prediction, which plays a vital role in emergency response, resource allocation, and policy planning. While several methods have been proposed in the literature, existing limitations must be addressed. For instance, traditional models like logistic regression may struggle to capture complex nonlinear relationships in the data, while K-Nearest Neighbors may be sensitive to the choice of k and computationally expensive for large datasets. Random forest, although powerful, may not generalize well to unseen data, and SVMs can be highly sensitive to hyperparameter tuning. Against this background, the stacking method offers a way to address the limitations of traditional models while exploiting the benefits of combining multiple models for road traffic accident severity classification. By systematically evaluating and comparing the stacking method against traditional models such as logistic regression, K-Nearest Neighbors, random forest, and SVMs, this research aims to demonstrate its superior performance and provide a robust framework for accurate accident severity prediction [28,29,30,31,32,33,34].
Table 1 summarizes the related works from the literature on traffic accident severity. The reviewed studies employ various machine learning and deep learning techniques, including neural networks (ANN, RNN, CNN), ensemble methods, and activation functions like RBF.
Figure 1 illustrates a detailed block diagram of the stacking method for traffic accident severity classification. It begins with data collection and preprocessing, followed by data distribution analysis, which reveals class imbalances (slight, serious, fatal). To address this, resampling techniques are applied to balance the dataset before training. The methodology incorporates multiple machine learning models, including random forest, logistic regression, SVM, and K-Nearest Neighbors, combined through a stacking method to enhance classification performance. The best-performing model is then selected for further refinement. Feature analysis using explainable machine learning identifies key factors influencing accident severity, and the models are re-trained using these selected features to improve accuracy and interpretability. This comprehensive approach ensures better prediction performance by handling data imbalance, optimizing model selection, and enhancing explainability.

3. Details of the Dataset

3.1. Dataset Description

The dataset used in this study is the Traffic Accidents 2019 Leeds (TAL19) dataset, which serves as the primary source of information for road traffic accident severity classification. This section will discuss the details of the dataset, including its characteristics, variables, and data preprocessing steps. The TAL19 dataset comprises a comprehensive collection of road traffic accidents that occurred in Leeds during the year 2019. It contains a diverse range of accidents, including different severity levels and various contributing factors. The dataset consists of both numerical and categorical variables, providing valuable information for accident severity classification. The dataset includes several key variables, such as the following:
Accident ID: a unique identifier for each accident in the dataset.
Accident Date and Time: the date and time at which the accident occurred.
Accident Severity: the severity level of the accident, categorized as slight, serious, or fatal.
Location: the geographical location of the accident, represented by coordinates or street addresses.
Weather Conditions: describes the prevailing weather conditions at the time of the accident.
Road Surface Conditions: specifies the condition of the road surface during the accident, such as dry, wet, icy, etc.
Contributing Factors: indicates the factors contributing to the accident, including driver behavior, road conditions, and vehicle-related issues.
Before applying machine learning models and the stacking method, the TAL19 dataset requires preprocessing steps to ensure data quality and compatibility. This may involve handling missing values, encoding categorical variables, normalizing numerical features, and addressing class imbalance if present. The dataset can then be split into training and testing sets for model development and evaluation.
The TRAFFIC ACCIDENTS_2019_LEEDS (TAL19) dataset used for this empirical analysis was downloaded from the Leeds City Council machine learning repository. It contains 7628 instances with 18 attributes each: 88 fatal, 1336 serious, and 6204 slight. Table 2 gives a brief description of the features. All the instances in the dataset with the absence of TAL19 were taken as the slight class, and the instances with the existence of TAL19 were taken as the serious class for investigation purposes. Figures 3–5 show a strong degree of correlation among all 18 attributes for the slight as well as the serious or fatal class. Among all the attributes, ‘Lighting_Condition’ and ‘Weather_Conditions’ have the highest degree of correlation (0.65) between them for the TAL19 absence class.
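The class counts above imply a heavy imbalance, which the following minimal sketch makes concrete. It computes the class shares and the majority-to-minority ratio from the reported counts, then illustrates one common resampling option, random oversampling with replacement. The function name `random_oversample` and the choice of oversampling are assumptions for illustration; the text does not specify which resampling technique was used.

```python
import random
from collections import Counter

# Class counts reported for the TAL19 dataset.
class_counts = {"fatal": 88, "serious": 1336, "slight": 6204}
total = sum(class_counts.values())  # 7628 instances
shares = {c: n / total for c, n in class_counts.items()}
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())


def random_oversample(labels, seed=0):
    """Hypothetical helper: return row indices in which every class is
    resampled with replacement up to the size of the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    target = max(len(indices) for indices in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)  # keep every original row
        balanced.extend(rng.choices(indices, k=target - len(indices)))
    return balanced


# Build a label list with the reported counts and balance it.
labels = [c for c, n in class_counts.items() for _ in range(n)]
balanced = random_oversample(labels)
balanced_counts = Counter(labels[i] for i in balanced)

for name in class_counts:
    print(f"{name}: share={shares[name]:.4f}, after oversampling={balanced_counts[name]}")
print(f"majority/minority ratio: {imbalance_ratio:.1f}")
```

In practice, resampling of this kind is usually applied to the training split only, so that the test set keeps the original class distribution.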

3.2. Machine Learning Techniques

3.2.1. Logistic Regression

Logistic regression is a classification algorithm used to model the relationship between input features and the probability of an event occurring. It is based on the logistic function and is widely used for binary classification tasks. However, in the context of the Traffic Accidents 2019 Leeds dataset, the stacking method has been found to outperform logistic regression in terms of classification accuracy and predictive power.
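As a brief illustration of the logistic function mentioned above, the sketch below implements it together with its multi-class generalization (softmax), which is one standard way to extend logistic regression to the three severity classes. The example scores are hypothetical and are not fitted to the dataset.

```python
import math


def sigmoid(z):
    """Logistic function: maps a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Multi-class generalization: maps one score per class to probabilities."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


# Hypothetical linear scores for the classes (fatal, serious, slight).
class_names = ["fatal", "serious", "slight"]
probabilities = softmax([-1.0, 0.5, 2.0])
predicted = class_names[probabilities.index(max(probabilities))]
print(predicted, [round(p, 3) for p in probabilities])
```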

3.2.2. K-Nearest Neighbors

KNN is a non-parametric classification algorithm that assigns a class label to an instance based on its proximity to the K-Nearest Neighbors in the training dataset. KNN relies on the assumption that instances with similar features tend to belong to the same class. However, the stacking method has demonstrated superior performance compared to KNN in the classification of road traffic accident severity in the Traffic Accidents 2019 Leeds dataset.
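The neighbor-voting rule described above can be sketched in a few lines. This is a minimal illustration on a made-up two-feature toy set, not the configuration used in the study; the function name `knn_predict`, the Euclidean distance, and k = 3 are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Assign the majority label among the k training rows closest to `query`."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two numeric features per accident record.
toy_X = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]]
toy_y = ["slight", "slight", "slight", "serious", "serious", "serious"]

print(knn_predict(toy_X, toy_y, [0.5, 0.5]))
print(knn_predict(toy_X, toy_y, [5.5, 5.5]))
```

Because every prediction scans the whole training set, the cost grows with dataset size, which is the computational drawback noted in the related work.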

3.2.3. Random Forest

Random forest is an ensemble learning method that constructs multiple decision trees and combines their predictions through voting or averaging. It is known for its robustness against overfitting and its ability to handle high-dimensional datasets. Despite its strengths, the stacking method has been shown to outperform random forest in accurately classifying the severity of road traffic accidents in the Traffic Accidents 2019 Leeds dataset.

3.2.4. Support Vector Machines

SVM is a supervised learning algorithm that aims to find an optimal hyperplane that separates instances of different classes. It works well for linearly separable data and can be extended to nonlinear classification using kernel functions. However, the stacking method has been found to outperform SVM in the classification of accident severity, suggesting that the combination of multiple models in the stacking ensemble leads to better results. The significance of the stacking method lies in its ability to leverage the strengths of multiple base models and create a meta-model that improves classification performance. By combining the predictions of different models, the stacking method can capture diverse patterns and nuances present in the Traffic Accidents 2019 Leeds dataset, leading to enhanced accuracy and predictive power for accident severity classification.

3.2.5. Stacking Method

The stacking method is an ensemble learning technique that aims to improve classification performance by combining the predictions of multiple base models. It involves training several individual models on the same dataset and then utilizing a meta-model, often referred to as a “stacker” or a “blender”, to learn how to best combine the predictions of these base models. The stacking method is particularly relevant in tackling road traffic accidents (RTCs) because accident severity classification is a complex task that often requires capturing various factors and patterns. By leveraging the strengths of different models and combining their predictions, the stacking method can enhance the overall predictive power and accuracy in RTC severity classification. The process of implementing the stacking method involves the following steps:
Dataset Split: The available data are typically divided into a training set and a holdout set. The training set is used to train the base models and create a new dataset for the stacker, while the holdout set is used to evaluate the performance of the final stacked model.
Base Model Training: Several different base models are trained using the training set. These models can be diverse in nature, such as logistic regression, random forest, support vector machines, or neural networks. Each model learns to make predictions based on the input features and the known accident severity labels.
Stacker Training: The predictions from the base models are then used as input features to train the stacker model. This meta-model learns to combine the predictions of the base models and generate a final prediction for accident severity classification. The stacker can be any classification algorithm, such as logistic regression or random forest.
Prediction: Once the stacker is trained, it can be used to make predictions on new, unseen instances. The stacker takes the predictions of the base models as input and generates the final prediction for accident severity.
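The four steps above can be sketched in plain Python. This is a deliberately small illustration under stated assumptions, not the authors' implementation: the two base models (nearest centroid and 1-nearest-neighbor), the lookup-table meta-model, the fold scheme, and the toy data are all hypothetical stand-ins. The key point it shows is that the stacker is trained on out-of-fold base predictions, so it never sees predictions made on rows the base models were fitted on.

```python
from collections import Counter, defaultdict


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fit_centroid(X, y):
    """Base model 1: predict the class whose feature mean is closest."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    centroids = {c: [sum(col) / len(rows) for col in zip(*rows)]
                 for c, rows in groups.items()}
    return lambda row: min(centroids, key=lambda c: sq_dist(row, centroids[c]))


def fit_1nn(X, y):
    """Base model 2: predict the label of the single nearest training row."""
    pairs = list(zip(X, y))
    return lambda row: min(pairs, key=lambda pair: sq_dist(row, pair[0]))[1]


def fit_vote_table(Z, y):
    """Meta-model (stacker): for each combination of base predictions, return
    the most frequent true label seen in training, else the global majority."""
    table = defaultdict(Counter)
    for z, label in zip(Z, y):
        table[tuple(z)][label] += 1
    default = Counter(y).most_common(1)[0][0]
    return lambda z: (table[tuple(z)].most_common(1)[0][0]
                      if tuple(z) in table else default)


def fit_stacking(X, y, base_fitters, k=3):
    # Steps 1-2: collect out-of-fold base predictions for every training row.
    Z = [None] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        held_out = [i for i in range(len(X)) if i % k == fold]
        models = [fit([X[i] for i in train], [y[i] for i in train])
                  for fit in base_fitters]
        for i in held_out:
            Z[i] = [model(X[i]) for model in models]
    # Step 3: train the stacker on the base predictions.
    meta = fit_vote_table(Z, y)
    # Step 4: refit base models on all training data for final prediction.
    final_models = [fit(X, y) for fit in base_fitters]
    return lambda row: meta([model(row) for model in final_models])


# Hypothetical, well-separated toy data with two numeric features.
toy_X = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6], [10, 0], [11, 0], [10, 1]]
toy_y = ["slight"] * 3 + ["serious"] * 3 + ["fatal"] * 3
predict = fit_stacking(toy_X, toy_y, [fit_centroid, fit_1nn])
print(predict([0.5, 0.5]), predict([5.5, 5.5]), predict([9.5, 0.5]))
```

Libraries such as scikit-learn package the same idea as `StackingClassifier`; the text does not state which implementation was used in the experiments.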
The stacking method is relevant in tackling RTCs because it leverages the diversity and complementary strengths of multiple models. Different base models may excel in capturing specific patterns or relationships in the data, and by combining their predictions, the stacking method can provide more accurate and robust predictions for accident severity classification.
By employing the stacking method, researchers and practitioners can benefit from improved accuracy in predicting accident severity, leading to better decision making in emergency response, resource allocation, and policy planning. The stacking method’s ability to harness the collective intelligence of multiple models makes it a valuable tool in tackling the complexities of RTC severity classification.

4. Metrics of Evaluation for the Performance of Different Learning Models

Several metrics can be utilized to assess the classification accuracy and predictive power of different learning models for the TAL19 dataset [24,25]. The most common ones are summarized in Table 3 and Equations (1)–(8).
Precision, which reflects the quality of the positive predictions, and the remaining metrics are calculated as
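$$\mathrm{Precision} = \frac{\mathrm{True\ positive}}{\mathrm{True\ positive} + \mathrm{False\ positive}}$$
$$\mathrm{Recall} = \frac{\mathrm{True\ positive}}{\mathrm{True\ positive} + \mathrm{False\ negative}}$$
$$\mathrm{F\text{-}Score} = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
$$\mathrm{Accuracy} = \frac{\mathrm{True\ positive} + \mathrm{True\ negative}}{\mathrm{True\ positive} + \mathrm{True\ negative} + \mathrm{False\ positive} + \mathrm{False\ negative}}$$
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{c \in C} \sum_{i \in I_{test}^{c}} \left(reco_{c,i} - r_{c,i}\right)^{2}}{\sum_{c \in C} \left|I_{test}^{c}\right|}}$$
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left(Y_{i} - \hat{Y}_{i}\right)^{2}$$
$$\mathrm{MAE} = \frac{\sum_{c \in C} \sum_{i \in I_{test}^{c}} \left|reco_{c,i} - r_{c,i}\right|}{\sum_{c \in C} \left|I_{test}^{c}\right|}$$
$$\mathrm{BA} = C \times \left(\frac{\mathrm{True\ Positive}}{P} + \frac{\mathrm{True\ Negative}}{N}\right)$$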
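The metrics of this section can be computed directly from confusion-matrix counts and from paired true and predicted values, as in the minimal sketch below. The counts and label vectors are hypothetical, the balanced accuracy uses the usual two-class factor of 1/2, and treating severity as an ordinal code (0 = slight, 1 = serious, 2 = fatal) for the error metrics is an assumption for illustration.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Metrics for one class treated as positive (one-vs-rest counts).
    Zero denominators are not handled in this sketch."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        # P = tp + fn actual positives, N = tn + fp actual negatives.
        "balanced_accuracy": 0.5 * (tp / (tp + fn) + tn / (tn + fp)),
    }


def error_metrics(y_true, y_pred):
    """MSE, RMSE, and MAE over paired numeric values."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}


# Hypothetical counts for one class versus the rest.
scores = classification_metrics(tp=8, fp=2, fn=4, tn=86)
# Hypothetical ordinal severity codes: 0 = slight, 1 = serious, 2 = fatal.
errors = error_metrics([0, 1, 2, 2], [0, 1, 1, 2])
print(scores)
print(errors)
```

With imbalanced classes such as these, accuracy can stay high even when the rare class is poorly recalled, which is why balanced accuracy and per-class recall are reported alongside it.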

5. Experiments

Table 4 and Figure 2 show a strong correlation among all 13 attributes and both the fatal class and the others class. Notably, the attributes ‘Lighting Conditions’ and ‘Weather Conditions’ exhibit the highest degree of correlation within the fatal class. The upper bars in Figure 2 and the positive values in Table 4 provide evidence of this high correlation among most attributes, with a few exceptions such as ‘vehicle number’, ‘northing’, and ‘accident date’ that show relatively low values.
Furthermore, in the case of the serious class, it is apparent from Figure 3 and Table 4 that a significant correlation exists among all 11 attributes and both the serious class and the others class within the TAL19 dataset. Among these attributes, ‘Lighting Conditions’ and ‘time’ exhibit the highest degree of correlation. The visual representation in Figure 3, including the upper bars and positive values in Table 4, indicates a notable correlation among the responsible attributes, with only a few attributes displaying relatively small values.
Lastly, in the case of the slight class, it is evident from Figure 4 and Table 5 that a strong correlation exists among all nine attributes and both the slight class and the others class within the TAL19 dataset. Among these attributes, ‘Road Surface’ and ‘Water Condition’ demonstrate the highest degree of correlation. Conversely, certain attributes such as ‘vehicle number’, ‘time’, and ‘grid ref easting’ exhibit a lower degree of correlation, although the extent of this decrease in correlation is minimal.
The experiments were conducted to predict the severity of road traffic crashes (RTCs) in the transport and logistics sector using the ACCIDENTS_2019_LEEDS dataset. Various classifiers were employed, with the dataset randomly split into 80% for training and 20% for testing. The sampling methods utilized included tenfold cross-validation, stratified shuffle split, and random sampling.
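To make the 80/20 split concrete, the sketch below performs a stratified split, which keeps the class proportions roughly equal in both parts. It uses the class counts reported for the dataset; the helper name `stratified_split`, the seed, and the rounding rule are assumptions, and the original experiments may have used a library routine instead.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Hypothetical helper: split row indices so that each class contributes
    about `test_fraction` of its rows to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


# Label list built from the reported class counts.
labels = ["fatal"] * 88 + ["serious"] * 1336 + ["slight"] * 6204
train_idx, test_idx = stratified_split(labels)
test_counts = Counter(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), dict(test_counts))
```

A purely random split could, by chance, leave very few of the 88 fatal cases in the test set, so stratification matters most for the rarest class.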
Based on the results presented in Tables 5–10 and Figures 5–10, the proposed classifiers exhibited favorable performance. The F1-score, which combines precision and recall, was used as a measure of model ability.
Figures 7–9 and Tables 7–12 provide insight into the precision, recall, F1-score, and report rate of the improved classifiers. The mean absolute error (MAE) and root mean square error (RMSE) are illustrated in Figure 10. The performance of the stacking model is reported in Table 13.

6. Discussion

Road traffic crashes are a major global concern, requiring advanced prediction methods for effective mitigation. The results in Table 13 show that traditional models like Bayesian regression, SVM, and linear regression have relatively high error rates, indicating their limited ability to capture complex patterns in the data. Random forest (RF) improves performance due to its ensemble nature, reducing errors compared to individual models. The K-Nearest Neighbors (KNN) model further lowers the error, but its performance is slightly inferior to stacking. The stacking model achieves the best overall results (MSE: 0.133066, RMSE: 0.364783), outperforming all other models. Table 14 summarizes previous studies of road accident prediction based on deep learning techniques such as RNN and CNN. Some research has shown that the accuracy of machine learning and deep learning approaches depends strongly on their hyperparameters, so the models' hyperparameters need to be tuned before results are reported.
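Since RMSE is by definition the square root of MSE, the two values reported for the stacking model can be checked against each other with a one-line calculation. The sketch below only uses the two figures quoted above; the variable names are illustrative.

```python
import math

reported_mse = 0.133066   # stacking model, as reported
reported_rmse = 0.364783  # stacking model, as reported

recomputed_rmse = math.sqrt(reported_mse)
difference = abs(recomputed_rmse - reported_rmse)
print(f"recomputed RMSE = {recomputed_rmse:.6f}, difference = {difference:.1e}")
```

The two figures agree to within rounding of the sixth decimal place.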

7. Conclusions

Road traffic crashes pose a significant risk to the transport and logistics sector and contribute to a considerable number of accidents. Predicting these crashes at an early stage can not only help reduce the cost of fatalities but also save lives. Therefore, the development of a robust prediction system is of utmost importance. In this study, we set up experiments to evaluate seven different supervised machine learning techniques in terms of precision, accuracy, recall/sensitivity/true positive rate, specificity, negative predictive value, false positive rate (FPR), F1-score, ROC curve, and rate of misclassification. The goal was to identify the most suitable learning model for predicting road traffic crashes. The experimental results demonstrated the effectiveness of the proposed stacking method, which achieved the highest accuracy among the evaluated models. The findings highlight the advantages of the stacking method in predicting road traffic crashes and emphasize its potential for practical implementation. Future research will focus on developing improved classification models by incorporating ensemble learning and collaborative learning approaches, along with metaheuristic optimization techniques, with the aim of enhancing the overall performance and accuracy of the prediction system. In conclusion, this study provides insights into the performance of different learning models for the prediction of road traffic crashes. The proposed stacking method demonstrates superior performance compared to conventional approaches, paving the way for the development of more effective and efficient prediction systems in the future.

Author Contributions

Conceptualization, I.E.M. and M.A.M.; methodology, J.R. and H.T.; software, I.E.M.; validation, M.A.M. and N.S.N.; formal analysis, N.S.N.; investigation, H.T.; resources, H.T.; data curation, M.E.M.; writing—original draft preparation, I.E.M.; writing—review and editing, N.S.N.; visualization, N.S.N.; supervision, M.A.M.; project administration, H.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data supporting the reported results are not publicly available due to ethical restrictions but can be requested from the corresponding author with appropriate approvals.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. WHO. Global Status Report on Road Safety 2021; GRSF, Global-Road-Safety-Facility-GRSF-Annual-Report-2021. 2021. Available online: https://www.who.int/teams/social-determinants-of-health/safety-and-mobility/decade-of-action-for-road-safety-2021-2030 (accessed on 13 January 2025).
  2. Ministry of Equipment, Transport, Logistics and Water. 2017. Available online: https://www.equipement.gov.ma/Transport-routier/Chiffres-cles/Pages/Securite-Routiere-en-chiffres.aspx (accessed on 21 January 2017).
  3. Shaik, M.E.; Islam, M.M.; Hossain, Q.S. A review on neural network techniques for the prediction of road traffic accident severity. Asian Transp. Stud. 2021, 7, 100040.
  4. Rezapour, M.; Nazneen, S.; Ksaibati, K. Application of deep learning techniques in predicting motorcycle crash severity. Eng. Rep. 2020, 2, e12175.
  5. Sameen, M.I.; Pradhan, B.; Shafri, H.Z.M.; Hamid, H.B. Applications of deep learning in severity prediction of traffic accidents. In Proceedings of the Global Civil Engineering Conference, Kuala Lumpur, Malaysia, 25–28 July 2017; Springer: Singapore, 2019; pp. 793–808.
  6. Chakraborty, A.; Mukherjee, D.; Mitra, S. Development of pedestrian crash prediction model for a developing country using artificial neural network. Int. J. Inj. Control. Saf. Promot. 2019, 26, 283–293.
  7. Lee, J.; Yoon, T.; Kwon, S.; Lee, J. Model evaluation for forecasting traffic accident severity in rainy seasons using machine learning algorithms: Seoul city study. Appl. Sci. 2019, 10, 129.
  8. Ebrahim, S.; Hossain, Q.S. An Artificial Neural Network Model for Road Accident Prediction: A Case Study of Khulna Metropolitan City. In Proceedings of the 4th International Conference on Civil Engineering for Sustainable Development ICCESD-2018, Khulna, Bangladesh, 9–11 February 2018.
  9. Taamneh, M.; Taamneh, S.; Alkheder, S. Clustering-based classification of road traffic accidents using hierarchical clustering and artificial neural networks. Int. J. Inj. Control. Saf. Promot. 2017, 24, 388–395.
  10. Dong, C.; Shao, C.; Li, J.; Xiong, Z. An improved deep learning model for traffic crash prediction. J. Adv. Transport. 2018, 2018, 3869106.
  11. Behbahani, H.; Amiri, M.A.; Imaninasab, R.; Alizamir, M. Forecasting accident frequency of an urban road network: A comparison of four artificial neural network techniques. J. Forecast. 2018, 37, 767–780.
  12. Dong, S. Predicting and Analyzing Road Traffic Injury Severity Using Boosting-Based Ensemble Learning Models with SHAPley Additive exPlanations. Int. J. Environ. Res. Public Health 2022, 19, 2925.
  13. Ng, A.Y.; Jordan, M.I. On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. In Advances in Neural Information Processing Systems; The MIT Press: Cambridge, MA, USA, 2002; pp. 841–848.
  14. Collins, M.; Schapire, R.E.; Singer, Y. Logistic regression, AdaBoost and Bregman distances. Mach. Learn. 2002, 48, 253–285.
  15. Guo, G.; Wang, H.; Bell, D.; Bi, Y.; Greer, K. KNN model-based approach in classification. In Proceedings of the on the Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE—OTM Confederated International Conferences, CoopIS, DOA, and ODBASE 2003, Sicily, Italy, 3–7 November 2003; Springer: Berlin/Heidelberg, Germany, 2003; pp. 986–996.
  16. Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, IEEE, Montreal, QC, Canada, 14–16 August 1995; Volume 1, pp. 278–282.
  17. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  18. Boser, B.E.; Guyon, I.M.; Vapnik, V.N. A training algorithm for optimal margin classifiers. In Proceedings of the 5th Annual Workshop on Computational Learning Theory, Pittsburgh, PA, USA, 27–29 July 1992; ACM: New York, NY, USA, 1992; pp. 144–152.
  19. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
  20. Chang, C.-C.; Lin, C.-J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 27.
  21. Friedman, N.; Geiger, D.; Goldszmidt, M. Bayesian network classifiers. Mach. Learn. 1997, 29, 131–163.
  22. Rish, I. An empirical study of the naive Bayes classifier. In Proceedings of the IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, Seattle, WA, USA, 4 August 2001; Volume 3, pp. 41–46.
  23. Oyoo, J.O.; Wekesa, J.S.; Ogada, K.O. Predicting Road Traffic Collisions Using a Two-Layer Ensemble Machine Learning Algorithm. Appl. Syst. Innov. 2024, 7, 25.
  24. Han, J.; Pei, J.; Kamber, M. Data Mining: Concepts and Techniques; Elsevier: Amsterdam, The Netherlands, 2011.
  25. Bhavsar, H.; Ganatra, A. A comparative study of training algorithms for supervised machine learning. Int. J. Soft. Comput. Eng. 2012, 2, 2231–2307.
  26. Brodersen, K.H.; Ong, C.S.; Stephan, K.E.; Buhmann, J.M. The balanced accuracy and its posterior distribution. In Proceedings of the 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 3121–3124.
  27. Kelleher, J.D.; Mac Namee, B.; D’arcy, A. Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies; The MIT Press: Cambridge, MA, USA, 2015.
  28. Infante, P.; Jacinto, G.; Afonso, A.; Rego, L.; Nogueira, V.; Quaresma, P.; Saias, J.; Santos, D.; Nogueira, P.; Silva, M.; et al. Comparison of Statistical and Machine-Learning Models on Road Traffic Accident Severity Classification. Computers 2022, 11, 80.
  29. Zhang, Y.; Sung, Y. Hybrid Traffic Accident Classification Models. Mathematics 2023, 11, 1050.
  30. Islam, M.K.; Reza, I.; Gazder, U.; Akter, R.; Arifuzzaman, M.; Rahman, M.M. Predicting Road Crash Severity Using Classifier Models and Crash Hotspots. Appl. Sci. 2022, 12, 11354.
  31. Ahmed, S.; Hossain, M.A.; Ray, S.K.; Bhuiyan, M.M.I.; Sabuj, S.R. A study on road accident prediction and contributing factors using explainable machine learning models: Analysis and performance. Transp. Res. Interdiscip. Perspect. 2023, 19, 100814.
  32. El Mallahi, I.; Riffi, J.; Tairi, H.; Ez-Zahout, A.; Mahraz, M.A. A Distributed Big Data Analytics Models for Traffic Accidents Classification and Recognition based SparkMlLib Cores. J. Autom. Mob. Robot. Intell. Syst. 2023, 16, 62–71.
  33. El Mallahi, I.; Dlia, A.; Riffi, J.; Mahraz, M.A.; Tairi, H. Prediction of Traffic Accidents using Random Forest Model. In Proceedings of the 2022 International Conference on Intelligent Systems and Computer Vision (ISCV), Fez, Morocco, 18–20 May 2022; pp. 1–7.
  34. El Mallahi, I.; Riffi, J.; Tairi, H.; Mahraz, M.A. Efficient Vehicle Detection and Classification Algorithm Using Faster R-CNN Models. J. Autom. Mob. Robot. Intell. Syst. 2024, 18, 86–93.
Figure 1. Detailed block diagram of the stacking method for the proposed methodology.
Figure 2. Histogram of linear correlation for the fatal class.
Figure 3. Histogram of linear correlation for the serious class.
Figure 4. Histogram of linear correlation for the slight class.
Figure 5. ROC curves of three different stacking models, one for each of the three classes.
Figure 6. ROC curves for the six different classifiers trained to predict the Serious class vs. the rest.
Figure 7. The accuracy of all classifiers.
Figure 8. Precision, recall, and F-score evaluation for all classifiers.
Figure 9. Performance of the stacking model.
Figure 10. MSE, MAE, and RMSE for all classifiers.
Table 1. Summary of the methodologies, results, and limitations of related works.
| SL | A Summary of Related Works | Techniques and Methods Applied | Problem and Approach Addressed | Obtained Result | Limits of Related Works and Perspective |
| --- | --- | --- | --- | --- | --- |
| 1 | Md. Ebrahim Shaik et al. [3] | SLP, MLP, RBF, SLP | Prediction of RTC severity | Decision aids in RTC prediction | Needs experimental validations |
| 2 | Rezapour et al. [4] | SLP, MLP, RNN | Prediction of intensity and frequency of motorbike accidents | Decision aids in motorbike conduct | Needs to compare their confusion matrices and ROC |
| 3 | Sameen et al. [5] | Deep learning approach (RNN and CNN) | Prediction of road traffic accidents for road safety assessment | Estimation of RTC and predicts it to be the fifth-leading cause of death worldwide in 2030 | Classification accuracy requires improvements |
| 4 | Chakraborty et al. [6] | ANN, MLP | Predict traffic accident RTC and the severity of transport | Prediction of injuries and deaths | Data normalization, standardization, and transformation |
| 5 | Lee et al. [7] | Machine learning algorithms | Anticipation of RTC severity in rainy seasons | Decision aids in RTC severity | Anticipation of RTC severity in spring, fall, and winter seasons |
| 6 | Taamneh et al. [8] | ANN | Severity prediction of traffic accidents | Prediction accuracy | Needs to compare their confusion matrices and ROC |
| 7 | Behbahani et al. [9] | Implement an activation function RBF, and compare it with FINN and RBFNN using a radial function | Anticipation of the frequency of RTCs | Prediction frequency | Needs to compare their confusion matrices |
| 8 | Dong et al. [10] | Deep learning, FFNN | Traffic crash prediction | Prediction accuracy | Needs to compare their confusion matrices and ROC |
| 9 | Behbahani et al. [11] | MLP | Prediction of traffic accident severity | Prediction accuracy | Needs to compare their confusion matrices and ROC |
| 10 | Sheng Dong et al. [12] | Ensemble learning based on boosting models with SHapley Additive exPlanations (SHAP) | Analyzing and predicting RTC severity | Prediction accuracy | Needs optimization model |
Table 2. Encoding RTC methods and features used.
| SN | Attributes | Abbreviation |
| --- | --- | --- |
| 1 | Reference Number | Reference_Number |
| 2 | Grid Ref: Easting | Easting |
| 3 | Grid Ref: Northing | Northing |
| 4 | Number of Vehicles | Number_of_Vehicles |
| 5 | Accident Date | Accident_Date |
| 6 | Time (24 h) | Time |
| 7 | 1st Road Class | 1_Road Class |
| 8 | 1st Road Class & No | 1st_Road_Class_No |
| 9 | Road Surface | Road_Surface |
| 10 | Local Authority | Local_Authority |
| 11 | Type of Vehicle | Type_of_Vehicle |
| 12 | Road Surface | Road_Surface |
| 13 | Lighting Conditions | Lighting_Conditions |
| 14 | Weather Conditions | Weather_Conditions |
| 15 | Age of Casualty | Age |
| 16 | Type of Vehicle | Type_Vehicle |
| 17 | Sex of Casualty | Sex |
| 18 | Casualty Severity: 1, Fatal; 2, Serious; 3, Slight | Casualty_Severity Class |
Table 3. Confusion matrix.
| | Predicted Positive | Predicted Negative |
| --- | --- | --- |
| Actual Positive | True Positive (TP): The actual class is positive, and the model correctly predicts it as positive. | False Negative (FN): The actual class is positive, but the model incorrectly predicts it as negative. |
| Actual Negative | False Positive (FP): The actual class is negative, but the model incorrectly predicts it as positive. | True Negative (TN): The actual class is negative, and the model correctly predicts it as negative. |
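The four cells defined in Table 3 are the inputs to the precision, recall, and F1-score values reported in the per-classifier tables. The following is a minimal sketch of those standard definitions for one class treated as positive (one-vs-rest); the `BinaryCounts` container and the example counts are hypothetical and are not taken from the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryCounts:
    """Cells of a 2x2 confusion matrix for one class treated as positive."""
    tp: int  # actual positive, predicted positive
    fn: int  # actual positive, predicted negative
    fp: int  # actual negative, predicted positive
    tn: int  # actual negative, predicted negative


def precision(c: BinaryCounts) -> float:
    """Share of predicted positives that are truly positive."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: BinaryCounts) -> float:
    """Share of actual positives that the model recovers."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def f1_score(c: BinaryCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def accuracy(c: BinaryCounts) -> float:
    """Share of all samples that are classified correctly."""
    total = c.tp + c.fn + c.fp + c.tn
    return (c.tp + c.tn) / total if total else 0.0


# Hypothetical counts for illustration only (not from the TAL19 dataset).
example = BinaryCounts(tp=30, fn=10, fp=20, tn=940)
print(f"precision={precision(example):.2f}")  # 30 / 50
print(f"recall={recall(example):.2f}")        # 30 / 40
print(f"f1={f1_score(example):.2f}")
print(f"accuracy={accuracy(example):.2f}")    # 970 / 1000
```

With a rare positive class, as in this hypothetical example, accuracy can stay high while precision and recall are much lower, which is why the per-class metrics are reported alongside accuracy.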
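Table 4. Report on linear correlation for the fatal class.
| | Reference Number | Grid Ref: Easting | Grid Ref: Northing | Number of Vehicles | Accident Date | Time (24 h) | 1st Road Class | 1st Road Class & No | Road Surface | Lighting Conditions | Weather Conditions | Vehicle Number | Type of Vehicle | Casualty Class | Sex of Casualty | Age of Casualty |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Reference Number | 1 | 0.01 | 0.06 | 0.12 | 0.07 | 0.03 | 0.10 | 0.12 | 0.38 | 0.29 | 0.48 | 0.13 | 0.34 | 0.30 | 0.25 | 0.08 |
| Grid Ref: Easting | 0.01 | 1 | 0.13 | 0.17 | 0.35 | 0.03 | 0.18 | 0.00 | 0.06 | 0.00 | 0.35 | 0.04 | 0.25 | 0.38 | 0.24 | 0.39 |
| Grid Ref: Northing | 0.06 | 0.13 | 1 | 0.28 | 0.25 | 0.17 | 0.20 | 0.04 | 0.11 | 0.07 | 0.23 | 0.20 | 0.07 | 0.08 | 0.11 | 0.51 |
| Number of Vehicles | 0.12 | 0.17 | 0.28 | 1 | 0.11 | 0.30 | 0.19 | 0.50 | 0.37 | 0.44 | 0.24 | 0.10 | 0.06 | 0.30 | 0.17 | 0.22 |
| Accident Date | 0.07 | 0.35 | 0.25 | 0.11 | 1 | 0.02 | 0.29 | 0.01 | 0.20 | 0.05 | 0.45 | 0.33 | 0.40 | 0.28 | 0.14 | 0.24 |
| Time (24 h) | 0.03 | 0.03 | 0.17 | 0.30 | 0.02 | 1 | 0.41 | 0.18 | 0.05 | 0.27 | 0.11 | 0.10 | 0.07 | 0.30 | 0.16 | 0.25 |
| 1st Road Class | 0.10 | 0.18 | 0.20 | 0.19 | 0.29 | 0.41 | 1 | 0.46 | 0.16 | 0.34 | 0.15 | 0.02 | 0.10 | 0.12 | 0.10 | 0.13 |
| 1st Road Class & No | 0.12 | 0.00 | 0.04 | 0.50 | 0.01 | 0.18 | 0.46 | 1 | 0.23 | 0.32 | 0.35 | 0.22 | 0.21 | 0.39 | 0.06 | 0.12 |
| Road Surface | 0.38 | 0.06 | 0.11 | 0.37 | 0.20 | 0.05 | 0.16 | 0.23 | 1 | 0.03 | 0.49 | 0.36 | 0.23 | 0.22 | 0.11 | 0.08 |
| Lighting Conditions | 0.29 | 0.00 | 0.07 | 0.44 | 0.05 | 0.27 | 0.34 | 0.32 | 0.03 | 1 | 0.55 | 0.06 | 0.22 | 0.07 | 0.21 | 0.22 |
| Weather Conditions | 0.48 | 0.35 | 0.23 | 0.24 | 0.45 | 0.11 | 0.15 | 0.35 | 0.49 | 0.55 | 1 | 0.18 | 0.41 | 0.44 | 0.07 | 0.29 |
| Vehicle Number | 0.13 | 0.04 | 0.20 | 0.10 | 0.33 | 0.10 | 0.02 | 0.22 | 0.36 | 0.06 | 0.18 | 1 | 0.42 | 0.41 | 0.08 | 0.07 |
| Type of Vehicle | 0.34 | 0.25 | 0.07 | 0.06 | 0.40 | 0.07 | 0.10 | 0.21 | 0.23 | 0.22 | 0.41 | 0.42 | 1 | 0.76 | 0.33 | 0.47 |
| Casualty Class | 0.30 | 0.38 | 0.08 | 0.30 | 0.28 | 0.30 | 0.12 | 0.39 | 0.22 | 0.07 | 0.44 | 0.41 | 0.76 | 1 | 0.14 | 0.35 |
| Sex of Casualty | 0.25 | 0.24 | 0.11 | 0.17 | 0.14 | 0.16 | 0.10 | 0.06 | 0.11 | 0.21 | 0.07 | 0.08 | 0.33 | 0.14 | 1 | 0.70 |
| Age of Casualty | 0.08 | 0.39 | 0.51 | 0.22 | 0.24 | 0.25 | 0.13 | 0.12 | 0.08 | 0.22 | 0.29 | 0.07 | 0.47 | 0.35 | 0.70 | 1 |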
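A symmetric matrix with a unit diagonal, like the one in Table 4, is what a pairwise Pearson correlation over numerically encoded attributes produces. The sketch below illustrates that computation with the standard library only. It assumes the reported values are absolute correlations, since no negative entries appear in the table; the paper does not state this. The sample records and the helper name `abs_correlation_matrix` are hypothetical.

```python
import math
from typing import Dict, Mapping, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient between two equal-length columns."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("columns must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; 0.0 is a convention chosen here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def abs_correlation_matrix(
    columns: Mapping[str, Sequence[float]]
) -> Dict[str, Dict[str, float]]:
    """Pairwise |r| for every pair of columns, with 1.0 on the diagonal."""
    names = list(columns)
    return {
        a: {b: 1.0 if a == b else abs(pearson(columns[a], columns[b])) for b in names}
        for a in names
    }


# Hypothetical, numerically encoded records (not taken from the dataset).
sample = {
    "Number_of_Vehicles": [1, 2, 2, 3, 1, 2],
    "Age": [23, 45, 31, 60, 19, 38],
    "Sex": [1, 2, 1, 2, 1, 1],
}
matrix = abs_correlation_matrix(sample)
for name, row in matrix.items():
    print(name, [round(v, 2) for v in row.values()])
```

Computing one such matrix per severity class, on the subset of records belonging to that class, would yield one report per class in the manner of Tables 4 to 6.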
Table 5. Report on linear correlation for the serious class.
| | Grid Ref: Easting | Grid Ref: Northing | Number of Vehicles | Time (24 h) | 1st Road Class | Road Surface | Lighting Conditions | Weather Conditions | Vehicle Number | Type of Vehicle | Casualty Severity | Sex of Casualty | Age of Casualty |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Grid Ref: Easting | 1.00 | 0.13 | 0.17 | 0.03 | 0.18 | 0.06 | 0.00 | 0.35 | 0.04 | 0.25 | 0.38 | 0.24 | 0.39 |
| Grid Ref: Northing | 0.13 | 1.00 | 0.28 | 0.17 | 0.20 | 0.11 | 0.07 | 0.23 | 0.20 | 0.07 | 0.08 | 0.11 | 0.51 |
| Number of Vehicles | 0.17 | 0.28 | 1.00 | 0.30 | 0.19 | 0.37 | 0.44 | 0.24 | 0.10 | 0.06 | 0.30 | 0.17 | 0.22 |
| Time (24 h) | 0.03 | 0.17 | 0.30 | 1.00 | 0.41 | 0.05 | 0.27 | 0.11 | 0.10 | 0.07 | 0.30 | 0.16 | 0.25 |
| 1st Road Class | 0.18 | 0.20 | 0.19 | 0.41 | 1.00 | 0.16 | 0.34 | 0.15 | 0.02 | 0.10 | 0.12 | 0.10 | 0.13 |
| Road Surface | 0.06 | 0.11 | 0.37 | 0.05 | 0.16 | 1.00 | 0.03 | 0.49 | 0.36 | 0.23 | 0.22 | 0.11 | 0.08 |
| Lighting Conditions | 0.00 | 0.07 | 0.44 | 0.27 | 0.34 | 0.03 | 1.00 | 0.55 | 0.06 | 0.22 | 0.07 | 0.21 | 0.22 |
| Weather Conditions | 0.35 | 0.23 | 0.24 | 0.11 | 0.15 | 0.49 | 0.55 | 1.00 | 0.18 | 0.41 | 0.44 | 0.07 | 0.29 |
| Vehicle Number | 0.04 | 0.20 | 0.10 | 0.10 | 0.02 | 0.36 | 0.06 | 0.18 | 1.00 | 0.42 | 0.41 | 0.08 | 0.07 |
| Type of Vehicle | 0.25 | 0.07 | 0.06 | 0.07 | 0.10 | 0.23 | 0.22 | 0.41 | 0.42 | 1.00 | 0.76 | 0.33 | 0.47 |
| Casualty Severity | 0.38 | 0.08 | 0.30 | 0.30 | 0.12 | 0.22 | 0.07 | 0.44 | 0.41 | 0.76 | 1.00 | 0.14 | 0.35 |
| Sex of Casualty | 0.24 | 0.11 | 0.17 | 0.16 | 0.10 | 0.11 | 0.21 | 0.07 | 0.08 | 0.33 | 0.14 | 1.00 | 0.70 |
| Age of Casualty | 0.39 | 0.51 | 0.22 | 0.25 | 0.13 | 0.08 | 0.22 | 0.29 | 0.07 | 0.47 | 0.35 | 0.70 | 1.00 |
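Table 6. Report on linear correlation for the slight class.
| | Reference Number | Grid Ref: Easting | Grid Ref: Northing | Number of Vehicles | Accident Date | Time (24 h) | 1st Road Class | 1st Road Class & No | Road Surface | Lighting Conditions | Weather Conditions | Vehicle Number | Type of Vehicle | Casualty Class | Sex of Casualty | Age of Casualty |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Reference Number | 1 | 0.01 | 0.05 | 0.02 | 0.06 | 0.02 | 0.07 | 0.03 | 0.21 | 0.10 | 0.07 | 0.00 | 0.03 | 0.02 | 0.07 | 0.01 |
| Grid Ref: Easting | 0.01 | 1 | 0.00 | 0.14 | 0.00 | 0.01 | 0.15 | 0.09 | 0.07 | 0.04 | 0.07 | 0.09 | 0.09 | 0.02 | 0.03 | 0.05 |
| Grid Ref: Northing | 0.05 | 0.00 | 1 | 0.07 | 0.02 | 0.01 | 0.22 | 0.13 | 0.03 | 0.04 | 0.03 | 0.06 | 0.06 | 0.03 | 0.01 | 0.02 |
| Number of Vehicles | 0.02 | 0.14 | 0.07 | 1 | 0.01 | 0.01 | 0.31 | 0.21 | 0.02 | 0.07 | 0.05 | 0.57 | 0.01 | 0.43 | 0.01 | 0.11 |
| Accident Date | 0.06 | 0.00 | 0.02 | 0.01 | 1 | 0.01 | 0.00 | 0.02 | 0.02 | 0.00 | 0.08 | 0.01 | 0.04 | 0.01 | 0.04 | 0.01 |
| Time (24 h) | 0.02 | 0.01 | 0.01 | 0.01 | 0.01 | 1 | 0.06 | 0.05 | 0.04 | 0.19 | 0.03 | 0.01 | 0.05 | 0.02 | 0.00 | 0.04 |
| 1st Road Class | 0.07 | 0.15 | 0.22 | 0.31 | 0.00 | 0.06 | 1 | 0.60 | 0.04 | 0.03 | 0.05 | 0.16 | 0.14 | 0.22 | 0.02 | 0.11 |
| 1st Road Class & No | 0.03 | 0.09 | 0.13 | 0.21 | 0.02 | 0.05 | 0.60 | 1 | 0.06 | 0.04 | 0.09 | 0.12 | 0.04 | 0.13 | 0.03 | 0.09 |
| Road Surface | 0.21 | 0.07 | 0.03 | 0.02 | 0.02 | 0.04 | 0.04 | 0.06 | 1 | 0.14 | 0.42 | 0.09 | 0.04 | 0.01 | 0.00 | 0.01 |
| Lighting Conditions | 0.10 | 0.04 | 0.04 | 0.07 | 0.00 | 0.19 | 0.03 | 0.04 | 0.14 | 1 | 0.08 | 0.09 | 0.02 | 0.05 | 0.04 | 0.04 |
| Weather Conditions | 0.07 | 0.07 | 0.03 | 0.05 | 0.08 | 0.03 | 0.05 | 0.09 | 0.42 | 0.08 | 1 | 0.04 | 0.02 | 0.00 | 0.06 | 0.00 |
| Vehicle Number | 0.00 | 0.09 | 0.06 | 0.57 | 0.01 | 0.01 | 0.16 | 0.12 | 0.09 | 0.09 | 0.04 | 1 | 0.10 | 0.35 | 0.03 | 0.09 |
| Type of Vehicle | 0.03 | 0.09 | 0.06 | 0.01 | 0.04 | 0.05 | 0.14 | 0.04 | 0.04 | 0.02 | 0.02 | 0.10 | 1 | 0.27 | 0.18 | 0.10 |
| Casualty Class | 0.02 | 0.02 | 0.03 | 0.43 | 0.01 | 0.02 | 0.22 | 0.13 | 0.01 | 0.05 | 0.00 | 0.35 | 0.27 | 1 | 0.13 | 0.20 |
| Sex of Casualty | 0.07 | 0.03 | 0.01 | 0.01 | 0.04 | 0.00 | 0.02 | 0.03 | 0.00 | 0.04 | 0.06 | 0.03 | 0.18 | 0.13 | 1 | 0.02 |
| Age of Casualty | 0.01 | 0.05 | 0.02 | 0.11 | 0.01 | 0.04 | 0.11 | 0.09 | 0.01 | 0.04 | 0.00 | 0.09 | 0.10 | 0.20 | 0.02 | 1 |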
Table 7. Performance of the stacking model.
| | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| Fatal | 61% | 18% | 29% | 32 |
| Serious | 27% | 35% | 41% | 442 |
| Slight | 22% | 47% | 32% | 17,773 |
| Accuracy | | | 81% | 2247 |
| Macro avg | 49% | 34% | 32% | 2247 |
| Weighted avg | 76% | 79% | 71% | 2247 |
Table 8. Performance of the Logistic Regression model.
| | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| Fatal | 66% | 10% | 29% | 32 |
| Serious | 16% | 17% | 41% | 442 |
| Slight | 20% | 83% | 30% | 17,773 |
| Accuracy | | | 81% | 2247 |
| Macro avg | 70% | 49% | 54% | 2247 |
| Weighted avg | 79% | 81% | 79% | 2247 |
Table 9. Performance of the KNN model.
| | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| Fatal | 75% | 53% | 55% | 32 |
| Serious | 15% | 37% | 14% | 442 |
| Slight | 10% | 20% | 31% | 17,773 |
| Accuracy | | | 88% | 2247 |
| Macro avg | 89% | 67% | 74% | 2247 |
| Weighted avg | 89% | 89% | 87% | 2247 |
Table 10. Performance of the Random Forest model.
| | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| Fatal | 76% | 51% | 44% | 32 |
| Serious | 15% | 40% | 35% | 442 |
| Slight | 9% | 9% | 31% | 17,773 |
| Accuracy | | | 87% | 2247 |
| Macro avg | 59% | 68% | 73% | 2247 |
| Weighted avg | 81% | 79% | 70% | 2247 |
Table 11. Performance of the SVM model.
| | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| Fatal | 56% | 48% | 34% | 32 |
| Serious | 15% | 12% | 16% | 442 |
| Slight | 29% | 40% | 50% | 17,773 |
| Accuracy | | | 87% | 2247 |
| Macro avg | 59% | 68% | 73% | 2247 |
| Weighted avg | 81% | 79% | 70% | 2247 |
Table 12. Performance of the Bayesian model.
| | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| Fatal | 98% | 88% | 54% | 32 |
| Serious | 91% | 89% | 63% | 442 |
| Slight | 87% | 98% | 93% | 17,773 |
| Accuracy | | | 88% | 2247 |
| Macro avg | 92% | 61% | 77% | 2247 |
| Weighted avg | 88% | 88% | 86% | 2247 |
Table 13. Performance of stacking model report.
| | MSE | MAE | RMSE |
| --- | --- | --- | --- |
| Bayes | 0.279039 | 0.252336 | 0.528241 |
| SVM | 0.253227 | 0.224744 | 0.503216 |
| LR | 0.249221 | 0.220739 | 0.464589 |
| RF | 0.215843 | 0.197152 | 0.464586 |
| Stacking | 0.133066 | 0.125946 | 0.364783 |
| KNN | 0.141077 | 0.125946 | 0.375602 |
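The three error measures in Table 13 are related by their standard definitions, with RMSE being the square root of MSE. The sketch below illustrates them on integer-encoded severity labels (1 = Fatal, 2 = Serious, 3 = Slight, following Table 2). Treating the class codes as numbers is an assumption made here for illustration, and the label vectors are hypothetical rather than the study's predictions.

```python
import math
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, i.e. the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical encoded labels: 1 = Fatal, 2 = Serious, 3 = Slight.
y_true = [3, 3, 2, 3, 1, 3, 2, 3]
y_pred = [3, 3, 3, 3, 2, 3, 2, 3]

print(mse(y_true, y_pred))   # 0.25 (two of eight predictions are off by one class)
print(mae(y_true, y_pred))   # 0.25
print(rmse(y_true, y_pred))  # 0.5

# Consistency check against the Stacking row of Table 13: sqrt(MSE) vs. reported RMSE.
print(abs(math.sqrt(0.133066) - 0.364783) < 1e-5)  # True
```

The same square-root check can be applied to each row of the table as a quick sanity test of the reported values.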
Table 14. Summary of previous studies that use RNN for traffic accident prediction.
| Authors and Study Area | Input/Independent Variable | Output/Dependent Variable | Data Partitioning | Performances (%) | Severity Level |
| --- | --- | --- | --- | --- | --- |
| Md. Ebrahim Shaik (2021) [3] | Accident time, zone and location, collision type | Injury severity | Training = 80%, Validation = 20% | Accuracy of the RNN model was 71.77%, whereas the MLP and BLR models achieved 65.48% and 58.30%, respectively | Summarizes the different models |
| Rezapour, M., Nazneen, S., Ksaibati, K. (2020) [4] | 2430 motorcycle crashes in a mountainous area in the United States over a 10-year period | Injury severity | Training = 80%, Validation = 20% | AUC ranges in value from 0 (100% wrong) vs. 1 (100% right) | Prediction of motorcycle crashes |
| Sameen et al. (2019), Malaysia [5] | Accident time, zone and location, collision type, surface and lighting condition, accident reporting | Injury severities | 10-fold cross-validation | SD: RNN = 1.24, CNN = 0.53, FFNN = 2.21; Accuracy: RNN = 73.76, CNN = 70.30, FFNN = 68.79 | PDO = 238 (last section), 209 (main route); Evident injury = 58 (last section), 155 (main route); Disabling injury = 82 (last section), 666 (main route) |
| Abhishek Chakraborty (2019) [6] | Pedestrian–vehicular interaction concerning ‘pedestrian-vehicular volume ratio’ and lack of ‘accessibility of pedestrian cross-walk’ | Accidents, traffic/mortality | Training = 80%, Testing = 20% | | Accidents, traffic/mortality |
| Lee et al., 2019 [7] | Road geometry data, precipitation data, and traffic accident data over nine years corresponding to the Naebu Expressway, which is located in Seoul, Republic of Korea | Severity of traffic accidents in Seoul City | Training = 75%, Testing = 25% | Accuracy of 1.6878, followed by curve length (CL) at 1.1213 | Vehicle type (VT) showing a decrease of −1.2282, accident time (AT) at −2.9598, and super-elevation (Se) having the most negative impact at −3.8938. |
| Our Study | Traffic Accidents 2019 Leeds (TAL19) dataset | Slight, serious, fatal | Training = 80%, Testing = 20% | Precision = 98%, Recall = 60%, Score = 70% | Slight, serious, fatal, MSE = 0.133066, MAE = 0.125946, RMSE = 0.364783 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

El Mallahi, I.; Riffi, J.; Tairi, H.; Nikolov, N.S.; El Mallahi, M.; Mahraz, M.A. Optimizing Traffic Accident Severity Prediction with a Stacking Ensemble Framework. World Electr. Veh. J. 2025, 16, 561. https://doi.org/10.3390/wevj16100561
