Analysis of the Severity of Road Accidents Using Combined Data Mining Techniques
Abstract
1. Introduction
2. Literature Review
3. Materials and Methods
3.1. Association Rules
3.2. Classification and Regression Tree
3.3. Methodology
3.3.1. Data Collection
3.3.2. Data Characterization
3.3.3. Application of Association Rules
3.3.4. Application of CART
3.3.5. Discussion of Results
3.3.6. Conclusions
4. Results
4.1. Data Collection
4.2. Data Characterization
4.3. Application of Association Rules
4.4. Application of Classification and Regression Trees (CARTs)
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References



| Category | Factor | Abbreviation |
|---|---|---|
| Road factors | City proximity | City_prox |
| Road factors | Place in which crash occurred | Kilometer |
| Road factors | Traffic level | Traffic_L |
| Time factors | Month of year | Month |
| Time factors | Day of week | Day |
| Time factors | Time of day | Time |
| Vehicle factors | Vehicle type involved | Vec_type |
| Vehicle factors | Number of vehicles involved | Vec_num |
| Other factors | Type of crash | Crash_type |
| Other factors | Crash severity | Severity |

| Factor group | Factor | Abbreviation | Category code | Category number | Definition |
|---|---|---|---|---|---|
| Road factors | City proximity | City_prox | N/F | | Near/Far |
| Road factors | Place in which crash occurred | Kilometer | DIS1/DIS2/DIS3/DIS4/DIS5/DIS6/DIS7/DIS8/DIS9/DIS10 | 1/2/3/4/5/6/7/8/9/10 | KM 0–10/10–20/20–40/40–60/60–80/80–100/100–120/120–140/140–160/160–180/180–200 |
| Road factors | Traffic level | Traffic_L | TL1/TL2/TL3 | 1/2/3 | Low/Medium/High |
| Time factors | Month of year | Month | MON1/MON2/MON3/MON4/MON5/MON6/MON7/MON8/MON9 | 1/2/3/4/5/6/7/8/9 | January/February/March/April, May, June/July/August/September, October/November/December |
| Time factors | Day of week | Day | DAY1/DAY2/DAY3/DAY4/DAY5/DAY6/DAY7 | 1/2/3/4/5/6/7 | Sunday/Monday/Tuesday/Wednesday/Thursday/Friday/Saturday |
| Time factors | Time of day | Time | TIM1/TIM2/TIM3/TIM4/TIM5 | 1/2/3/4/5 | 07:00–11:00/11:00–14:00/14:00–18:00/18:00–24:00/00:00–07:00 |
| Vehicle factors | Vehicle type involved | Vec_type | VEC1/VEC2/VEC3/VEC4/VEC5 | 1/2/3/4/5 | BUS/Truck/Private car/BUS-TRUCK/Others |
| Vehicle factors | Number of vehicles involved | Vec_num | NUM1/NUM2/NUM3/NUM4/NUM5 | 1/2/3/4/5 | 1/2/3/4/Multiple |
| Other factors | Type of crash | Crash_type | TYP1/TYP2/TYP3/TYP4/TYP5/TYP6/TYP7 | 1/2/3/4/5/6/7 | Run off the road/Head-on collision/Rear-end collision/Sideswipe collision/Turnover/Hit object or pedestrian/Others |
| Other factors | Crash severity | IPA | IPA1/IPA2/IPA3 | 1/2/3 | IPA0–1/IPA2–4/IPA ≥ 5 |
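
The category codes in the table above can be applied mechanically when preparing crash records for rule mining. The following is a minimal Python sketch, not the authors' code. The names `month_code`, `time_code`, and `to_items` are hypothetical; the item prefixes `MonthBin_` and `TimeBin_` follow the labels that appear in the association-rule results; and treating each time interval as closed on the left and open on the right is an assumption, since the table does not say which bin a boundary hour such as 11:00 belongs to.

```python
from typing import Set

# Month-of-year bins from the table: April-June share MON4 and
# September-October share MON7; every other month has its own code.
MONTH_TO_CODE = {
    1: "MON1", 2: "MON2", 3: "MON3",
    4: "MON4", 5: "MON4", 6: "MON4",
    7: "MON5", 8: "MON6",
    9: "MON7", 10: "MON7",
    11: "MON8", 12: "MON9",
}


def month_code(month: int) -> str:
    """Map a calendar month (1-12) to its MON code."""
    if month not in MONTH_TO_CODE:
        raise ValueError(f"month must be 1-12, got {month}")
    return MONTH_TO_CODE[month]


def time_code(hour: int) -> str:
    """Map an hour of day (0-23) to its TIM code.

    Assumption: intervals are half-open, [start, end).
    """
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be 0-23, got {hour}")
    if hour < 7:
        return "TIM5"  # 00:00-07:00
    if hour < 11:
        return "TIM1"  # 07:00-11:00
    if hour < 14:
        return "TIM2"  # 11:00-14:00
    if hour < 18:
        return "TIM3"  # 14:00-18:00
    return "TIM4"      # 18:00-24:00


def to_items(month: int, hour: int) -> Set[str]:
    """Build item labels for one record (hypothetical helper)."""
    return {
        f"MonthBin_{month_code(month)}",
        f"TimeBin_{time_code(hour)}",
    }


print(sorted(to_items(month=5, hour=19)))
# -> ['MonthBin_MON4', 'TimeBin_TIM4']
```

A lookup table is used for months because the bins are irregular, whereas the time bins are contiguous and reduce to a chain of threshold checks.
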

Top Association Rules for Crash Type/Severity

| Rule index | Antecedents | Consequents | Support | Confidence | Lift |
|---|---|---|---|---|---|
| 87 | (CrashType_TYP6) | (SeverityBin_IPA1) | 0.130346 | 1.000000 | 1.316354 |
| 218 | (Proximity_F, CrashType_TYP6) | (SeverityBin_IPA1) | 0.063136 | 1.000000 | 1.316354 |
| 287 | (Proximity_N, CrashType_TYP6) | (SeverityBin_IPA1) | 0.067210 | 1.000000 | 1.316354 |
| 439 | (VehType_VEC1, CrashType_TYP6) | (SeverityBin_IPA1) | 0.089613 | 1.000000 | 1.316354 |
| 488 | (NumVeh_NUM1, CrashType_TYP6) | (SeverityBin_IPA1) | 0.118126 | 1.000000 | 1.316354 |
| 684 | (NumVeh_NUM1, Proximity_F, CrashType_TYP6) | (SeverityBin_IPA1) | 0.057026 | 1.000000 | 1.316354 |
| 742 | (Proximity_N, NumVeh_NUM1, CrashType_TYP6) | (SeverityBin_IPA1) | 0.061100 | 1.000000 | 1.316354 |
| 886 | (VehType_VEC1, NumVeh_NUM1, CrashType_TYP6) | (SeverityBin_IPA1) | 0.085540 | 1.000000 | 1.316354 |
| 724 | (Proximity_N, NumVeh_NUM1, VehType_VEC2) | (SeverityBin_IPA1) | 0.059063 | 0.966667 | 1.272475 |
| 349 | (Weekday_DAY4, NumVeh_NUM1) | (SeverityBin_IPA1) | 0.057026 | 0.965517 | 1.270962 |
| 820 | (VehType_VEC2, TimeBin_TIM4, NumVeh_NUM2) | (CrashType_TYP7) | 0.050916 | 0.961538 | 2.000489 |
| 890 | (VehType_VEC1, CrashType_TYP6) | (SeverityBin_IPA1, NumVeh_NUM1) | 0.085540 | 0.954545 | 3.023754 |
| 371 | (TimeBin_TIM3, CrashType_TYP7) | (SeverityBin_IPA1) | 0.067210 | 0.942857 | 1.241134 |
| 855 | (NumVeh_NUM1, TimeBin_TIM5, VehType_VEC2) | (SeverityBin_IPA1) | 0.063136 | 0.939394 | 1.236575 |
| 510 | (Proximity_F, VehType_VEC2, MonthBin_MON4) | (SeverityBin_IPA1) | 0.061100 | 0.937500 | 1.234082 |
| 319 | (VehType_VEC2, MonthBin_MON4) | (SeverityBin_IPA1) | 0.083503 | 0.931818 | 1.226602 |
| 279 | (Proximity_N, NumVeh_NUM1) | (SeverityBin_IPA1) | 0.138493 | 0.931507 | 1.226193 |
| 549 | (TimeBin_TIM4, Proximity_F, VehType_VEC2) | (SeverityBin_IPA1) | 0.052953 | 0.928571 | 1.222329 |
| 580 | (TimeBin_TIM5, Proximity_F, VehType_VEC2) | (SeverityBin_IPA1) | 0.052953 | 0.928571 | 1.222329 |
| 701 | (Proximity_N, NumVeh_NUM1, TimeBin_TIM5) | (SeverityBin_IPA1) | 0.050916 | 0.925926 | 1.218846 |
| 776 | (NumVeh_NUM1, VehType_VEC2, MonthBin_MON4) | (SeverityBin_IPA1) | 0.050916 | 0.925926 | 1.218846 |
| 912 | (CrashType_TYP1, VehType_VEC2) | (SeverityBin_IPA1, NumVeh_NUM1) | 0.097760 | 0.923077 | 2.924069 |
| 461 | (CrashType_TYP1, VehType_VEC2) | (SeverityBin_IPA1) | 0.097760 | 0.923077 | 1.215096 |
| 908 | (CrashType_TYP1, NumVeh_NUM1, VehType_VEC2) | (SeverityBin_IPA1) | 0.097760 | 0.923077 | 1.215096 |
| 1022 | (CrashType_TYP1, Proximity_F, VehType_VEC2) | (SeverityBin_IPA1, NumVeh_NUM1) | 0.069246 | 0.918919 | 2.910898 |
| 631 | (CrashType_TYP1, Proximity_F, VehType_VEC2) | (SeverityBin_IPA1) | 0.069246 | 0.918919 | 1.209622 |
| 1017 | (CrashType_TYP1, NumVeh_NUM1, Proximity_F, VehType_VEC2) | (SeverityBin_IPA1) | 0.069246 | 0.918919 | 1.209622 |
| 402 | (TimeBin_TIM5, VehType_VEC2) | (SeverityBin_IPA1) | 0.091650 | 0.918367 | 1.208896 |
| 1047 | (VehType_VEC2, SeverityBin_IPA1, Proximity_F, NumVeh_NUM2) | (CrashType_TYP7) | 0.067210 | 0.916667 | 1.907133 |
| 1044 | (VehType_VEC2, Proximity_F, CrashType_TYP7, NumVeh_NUM2) | (SeverityBin_IPA1) | 0.067210 | 0.916667 | 1.206658 |
| 141 | (MonthBin_MON5, Proximity_F) | (SeverityBin_IPA1) | 0.065173 | 0.914286 | 1.203524 |
| 452 | (NumVeh_NUM1, VehType_VEC2) | (SeverityBin_IPA1) | 0.171079 | 0.913043 | 1.201888 |
| 746 | (Proximity_N, CrashType_TYP6) | (SeverityBin_IPA1, NumVeh_NUM1) | 0.061100 | 0.909091 | 2.879765 |
| 491 | (CrashType_TYP6) | (SeverityBin_IPA1, NumVeh_NUM1) | 0.118126 | 0.906250 | 2.870766 |
| 949 | (SeverityBin_IPA1, Proximity_F, NumVeh_NUM2, MonthBin_MON4) | (CrashType_TYP7) | 0.059063 | 0.906250 | 1.885461 |
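
The three measures reported for each rule are linked by simple ratios of counts, which gives a quick way to sanity-check the table. The sketch below is illustrative Python, not the authors' implementation. The function name `rule_metrics` is hypothetical, and the counts passed to it are not stated in the source: they are back-calculated under the assumption that the rules were mined from n = 491 records (the sum of the training and test sets, 343 + 148, reported for the CART model), and are shown only because they reproduce the tabulated values.

```python
from typing import Tuple


def rule_metrics(
    n_total: int,
    n_antecedent: int,
    n_consequent: int,
    n_both: int,
) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) for a rule A -> C.

    support    = P(A and C)
    confidence = P(C | A) = P(A and C) / P(A)
    lift       = confidence / P(C)
    """
    support = n_both / n_total
    confidence = n_both / n_antecedent
    lift = confidence / (n_consequent / n_total)
    return support, confidence, lift


# ASSUMPTION: all counts below are inferred from the published ratios
# with n = 491; they are not given in the source.
N_RECORDS = 491
N_IPA1 = 373  # records whose consequent SeverityBin_IPA1 holds

# Rule 87: (CrashType_TYP6) -> (SeverityBin_IPA1)
rule_87 = rule_metrics(N_RECORDS, n_antecedent=64,
                       n_consequent=N_IPA1, n_both=64)

# Rule 279: (Proximity_N, NumVeh_NUM1) -> (SeverityBin_IPA1)
rule_279 = rule_metrics(N_RECORDS, n_antecedent=73,
                        n_consequent=N_IPA1, n_both=68)

for name, (sup, conf, lift) in [("87", rule_87), ("279", rule_279)]:
    print(f"rule {name}: support={sup:.6f} "
          f"confidence={conf:.6f} lift={lift:.6f}")
```

Because every rule with confidence 1.0 and consequent SeverityBin_IPA1 has lift equal to 1 divided by the share of IPA1 records, the first eight rows of the table all show the same lift.
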

| Metric | Value | Notes |
|---|---|---|
| Data partition | ||
| Training set (70%) | n = 343 | IPA1: 104/IPA2: 139/IPA3: 100 |
| Test set (30%) | n = 148 | IPA1: 45/IPA2: 60/IPA3: 43 |
| Model performance (test set) | ||
| Accuracy | 41.2% | Proportion of correctly classified cases |
| Cohen’s Kappa | 0.11 | Agreement beyond chance |
| Cross-validation (10-fold) | ||
| CV Accuracy (mean) | 48.5% | Mean over 10 folds |
| CV Accuracy (SD) | ±6.0% | |
| F1-Score by class (test set) | ||
| F1—IPA1 (low severity) | 0.32 | |
| F1—IPA2 (moderate severity) | 0.41 | |
| F1—IPA3 (high severity) | 0.51 | |
| Variable importance (top) | ||
| 1st—Crash type (CRASH_TYPE) | 36.5% | |
| 2nd—Traffic level (TRAFFIC_L) | 35.3% | |
| 3rd—Number of vehicles (VEC_NUM) | 11.0% | |
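
To put the test-set accuracy in context, it helps to compare it with a majority-class baseline and to recall how Cohen's kappa is computed. The Python sketch below uses only the class counts and accuracy reported in the table for the baseline. The confusion matrix passed to the hypothetical helper `cohen_kappa` is a made-up toy example, because the source does not report the model's confusion matrix.

```python
from typing import Sequence

# Test-set class counts and accuracy as reported in the table.
test_counts = {"IPA1": 45, "IPA2": 60, "IPA3": 43}
n_test = sum(test_counts.values())
reported_accuracy = 0.412

# Majority-class baseline: always predict the most frequent class.
majority_class = max(test_counts, key=test_counts.get)
baseline_accuracy = test_counts[majority_class] / n_test

# Number of correct predictions implied by the reported accuracy.
implied_correct = round(reported_accuracy * n_test)


def cohen_kappa(confusion: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)


# TOY DATA (hypothetical, not from the paper): 3-class confusion matrix.
toy_confusion = [
    [5, 3, 2],
    [2, 6, 2],
    [1, 3, 6],
]

print(f"majority class: {majority_class}")
print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"implied correct predictions: {implied_correct} of {n_test}")
print(f"kappa on toy matrix: {cohen_kappa(toy_confusion):.2f}")
```

On the reported figures, always predicting IPA2 would be correct for 60 of 148 test cases (about 40.5%), so the reported accuracy of 41.2% sits only slightly above that baseline, which is consistent with the low kappa of 0.11.
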
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Corrales, C.; Rubio-Romero, J.C.; Pardo-Ferreira, M.d.C. Analysis of the Severity of Road Accidents Using Combined Data Mining Techniques. Sustainability 2026, 18, 6118. https://doi.org/10.3390/su18126118

