Road Traffic Accident Hotspot Detection: A GIS-Based Machine Learning Approach Using HDBSCAN and Spatial Clustering Techniques
Abstract
1. Introduction
2. Database and Methodology
2.1. Study Area
2.2. Database Generation
2.3. Purely Temporal Cluster Analysis
2.4. Spatial Pattern Analysis Using Kernel Density Estimation (KDE)
2.5. Detecting Spatial Clusters and Autocorrelations
2.5.1. Spatial Patterns Identification Through LISA and Moran’s I
2.5.2. Purely Spatial Analysis Using Scan Statistics Approach
2.5.3. HDBSCAN: A Machine Learning-Based Clustering Algorithm
2.6. Space-Time Cube Based Emerging Hotspot
3. Results and Discussion
3.1. Purely Temporal Analysis
3.2. Spatial Pattern Analysis
3.2.1. Time-Segmented Accident Density Analysis

3.2.2. Spatial Accident Density with Aerial Intersection View
3.3. Multi-Analytical Clustering Detection of RTA
3.3.1. Spatial Patterns of RTA Clusters Using Moran’s I and LISA
3.3.2. Identifying Purely Spatial Clusters of RTA with Scan Statistics
3.3.3. Density-Based Clustering of Accident Hotspots Using HDBSCAN
3.3.4. Comparison of Clustering Techniques
3.4. Space-Time Based Emerging Hotspot Analysis
4. Policy Implications
5. Limitations
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
| Time-Zone | MLI | KDE Estimate (Accidents/km²) | % of RTA Out of Total | Total % of RTA in MLs | HRZ | HRZ Code | Remarks |
|---|---|---|---|---|---|---|---|
| Morning peak hours (08:01 a.m.–11:00 a.m.) | Location 1 | 2.79–2.38 | 9.76 | 51.22 | Segment along Hill Cart Road near Darjeeling More | A1 | MIHT |
| | Location 2 | 2.11–2.31 | 9.76 | | Segment along Sevoke Road near Salugara More | A2 | BCA |
| | Location 3 | 3.52–3.85 | 12.20 | | Intersection of AH2, Noukaghat Road and Burdwan Road near Noukaghat More | A3 | MIHT |
| | Location 4 | 2.71–3.61 | 7.32 | | Segment along AH2 near Tinbatti More | A4 | MIHT |
| | Location 5 | 1.99–2.02 | 7.32 | | Intersection of Eastern Bypass and Ghogomali Road near Ashighar More | A5 | MIHT & BCA |
| | Location 6 | 1.61–1.63 | 4.88 | | Segment along IOC Road near NJP Station | A6 | HTZ |
| Mid-day off-peak hours (11:01 a.m.–04:00 p.m.) | Location 1 | 3.71–6.50 | 7.14 | 51.19 | Segment along NH10 near Champasari More | A7 | MIHT & BCA |
| | Location 2 | 4.15–4.50 | 5.95 | | A2 | A2 | BCA |
| | Location 3 | 6.56–8.05 | 8.33 | | Intersection of Hill Cart Road and Burdwan Road near Airview and Sevoke More | A8 | MIHT |
| | Location 4 | 2.87–3.65 | 4.76 | | A5 | A5 | MIHT & BCA |
| | Location 5 | 3.86–4.53 | 4.76 | | Segment along Sevoke Road near Check Post More | A9 | HTZ & BCA |
| | Location 6 | 3.19–4.12 | 5.95 | | A3 & A4 | A3 & A4 | MIHT |
| | Other location | 2.81–6.33 | 14.29 | | A1, A6, Intersection of Check Post More (A9), Venus More (A10) | A1, A6, A9, A10 | MIHT & BCA |
| Evening peak hours (04:01 p.m.–08:00 p.m.) | Location 1 | 3.07–4.03 | 8.06 | 50.00 | Segment along Champasari Road | A11 | HTZ & BCA |
| | Location 2 | 2.41–3.38 | 6.45 | | Segment along Champasari Road and NH10 near Champasari More | A12 | MIHT, HTZ & BCA |
| | Location 3 | 3.23–6.46 | 16.13 | | Segment along Hill Cart Road & AH2 near Darjeeling More up to Siliguri Junction | A13 | MIHT, HTZ & BCA |
| | Location 4 | 2.17–2.69 | 4.84 | | A8 | A8 | MIHT & BCA |
| | Location 5 | 2.54–3.36 | 8.06 | | Segment along Satyen Bose Road near Babupara | A9 | HTZ |
| | Location 6 | 2.10–2.94 | 6.45 | | A6 | A6 | HTZ |
| Lean hours (08:01 p.m.–08:00 a.m.) | Location 1 | 5.64–14.08 | 11.28 | 67.67 | A1, A7 | A1, A7 | MIHT & BCA |
| | Location 2 | 7.04–12.67 | 6.02 | | Segment along AH2 from Mallaguri up to Siliguri Junction | A14 | |
| | Location 3 | 9.26–10.20 | 5.26 | | A3 | A3 | MIHT |
| | Location 4 | 4.64–8.19 | 8.27 | | A9 | A9 | MIHT & BCA |
| | Location 5 | 9.64–14.21 | 13.53 | | Intersection of Venus More and Court More | A10 | MIHT & BCA |
| | Location 6 | 9.51–12.55 | 10.53 | | Intersection of Hill Cart Road, Burdwan Road and AH2 near Airview and Jhankar More (A15) | A8, A15 | MIHT & BCA |
| | Other location | 4.17–6.59 | 12.78 | | Segment along S.F. Road (A16), A2, A4, A5, A11 | A2, A4, A5, A11, A16 | MIHT & BCA |
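
As a quick consistency check on the table above, the per-location shares in each time zone can be summed and compared with the reported "Total % of RTA in MLs". The sketch below is illustrative only and is not part of the authors' workflow. The function name `check_totals` and the 0.05 tolerance are assumptions: the tolerance allows for each published value being rounded to two decimals.

```python
# Per-location shares (% of all RTAs) copied from the appendix table,
# paired with the reported "Total % of RTA in MLs" for each time zone.
shares = {
    "Morning peak": ([9.76, 9.76, 12.20, 7.32, 7.32, 4.88], 51.22),
    "Mid-day off-peak": ([7.14, 5.95, 8.33, 4.76, 4.76, 5.95, 14.29], 51.19),
    "Evening peak": ([8.06, 6.45, 16.13, 4.84, 8.06, 6.45], 50.00),
    "Lean hours": ([11.28, 6.02, 5.26, 8.27, 13.53, 10.53, 12.78], 67.67),
}


def check_totals(table, tolerance=0.05):
    """Return {zone: (summed, reported, within_tolerance)}.

    `tolerance` is an assumed allowance for accumulated rounding error,
    not a value taken from the article.
    """
    result = {}
    for zone, (parts, reported) in table.items():
        summed = round(sum(parts), 2)
        result[zone] = (summed, reported, abs(summed - reported) <= tolerance)
    return result


if __name__ == "__main__":
    for zone, (summed, reported, ok) in check_totals(shares).items():
        print(f"{zone}: summed={summed:.2f}, reported={reported:.2f}, ok={ok}")
```

Small differences between the summed and reported values are expected, since each share was rounded before publication.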

| Measure | LISA & Moran’s I | Scan Statistic Based Purely Spatial | HDBSCAN |
|---|---|---|---|
| Cluster Detected | Yes | Yes | Yes |
| Cluster Types | High-High (Hotspot), Low-Low (Coldspot), Outliers | Significant cluster | Density-based cluster with noise |
| Number of Clusters | | | Total clusters: 8 |
| Statistical measures | | All significant clusters: p-value <0.00 | |
| Cluster detection success rate | N/A | N/A | |
| Cluster probability rate | N/A | N/A | |
| Cluster Shape | Irregular | Circular | Irregular |
| Strength | Detects local spatial autocorrelation | Identifies statistically significant high-risk clusters | Flexible detection of irregular and varying-density clusters with probabilistic confidence |
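
To make the spatial autocorrelation measure in the first column concrete, the following minimal sketch computes global Moran's I, I = (n / W) · Σᵢⱼ wᵢⱼ (xᵢ − x̄)(xⱼ − x̄) / Σᵢ (xᵢ − x̄)², in plain Python. It is not the authors' implementation. The four-cell chain, the binary adjacency weights, and the accident counts are hypothetical values chosen only to show how the sign of the statistic behaves.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)  # W: sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    variance_term = sum(d * d for d in dev)
    return (n / w_sum) * cross / variance_term


# Hypothetical layout: four road segments in a row, each adjacent only
# to its immediate neighbours (binary contiguity weights).
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [10, 9, 1, 2]    # high counts next to high, low next to low
alternating = [10, 1, 9, 2]  # high counts next to low

if __name__ == "__main__":
    print(f"clustered:   I = {morans_i(clustered, chain_weights):.3f}")
    print(f"alternating: I = {morans_i(alternating, chain_weights):.3f}")
```

A positive value indicates that similar counts sit next to each other, while a negative value indicates a checkerboard-like pattern. LISA decomposes this global statistic into per-location contributions, which is what yields the High-High and Low-Low labels listed in the table.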
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Roy, S.; Mohammadi, A.; Roy, R. Road Traffic Accident Hotspot Detection: A GIS-Based Machine Learning Approach Using HDBSCAN and Spatial Clustering Techniques. Geographies 2026, 6, 55. https://doi.org/10.3390/geographies6020055
