Prediction of Traffic Incident Locations with a Geohash-Based Model Using Machine Learning Algorithms
Abstract
1. Introduction
2. Background Knowledge
2.1. Geohash
2.2. Machine Learning
2.2.1. Decision Trees
2.2.2. Random Forest
2.2.3. K-Nearest Neighbors
2.2.4. Support Vector Machines
2.2.5. GridSearchCV
3. Experiment
3.1. Study Area
3.2. Data Acquisition
- Time-dependent variables include year, month, day, special day, and incident time.
- Location-dependent variables consist of latitude, longitude, geohash (output), and region, among others.
- Vehicle variables encompass cars, trucks, and motorcycles, among others.
- Variables related to traffic index include minimum, maximum, and average traffic index as well as traffic density.
- Speed-related variables consist of minimum, maximum, and average speed and the number of vehicles per day.
- Road structure and condition variables include road type, number of lanes, and divided road information.
- Meteorological variables include temperature, humidity, wind speed and direction, road temperature, and rainfall amount.
- Social and demographic variables include the number of schools, hospitals, residences, cafes/restaurants, and workplaces.
- District-related variables consist of population, number of neighborhoods, agricultural area, surface area, etc.
3.3. Hyperparameter Tuning
3.4. Performance Matrix
3.5. Setup for the Experiment
4. Results
5. Conclusions
5.1. Limitations
5.2. Future Research
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Variables | Data Type | Explanation |
---|---|---|
Year | Numeric | Year of traffic incident |
Month | Categorical | Month in which the traffic incident occurred (1–12) |
Day | Categorical | Day of the month in which the traffic incident occurred (1–31) |
Special day | Categorical | Representation of public holidays, religious holidays, and important days |
Incident Day | Categorical | Day of the week when the traffic incident occurred (1–7) |
Time period | Categorical | Time range of the day when the traffic incident occurred (1: 08–14, 2: 14–20, 3: 20–02, 4: 02–08) |
District | Categorical | Districts where traffic incidents occur such as Avcilar, Bakirkoy, Bahcelievler, etc. |
District population | Numeric | Population information of the district where the traffic incident occurred |
Number of Neighborhoods | Numeric | Number of neighborhoods in the district where the traffic incident occurred |
Area measurement | Numeric | Area of the district where the traffic incident took place |
Minimum speed | Numeric | Minimum speed for the relevant geohash area on the given day and time |
Maximum speed | Numeric | Maximum speed for the relevant geohash area on the given day and time |
Average speed | Numeric | Average speed for the relevant geohash area on the given day and time |
Number of unique vehicles | Numeric | Number of different vehicles within the relevant geohash area on the given day and time |
Minimum traffic index | Numeric | Field containing minimum traffic index information on the relevant day and time |
Maximum traffic index | Numeric | Field containing maximum traffic index information on the relevant day and time |
Average traffic index | Numeric | Field containing average traffic index information on the relevant day and time |
Number of vehicles per day | Numeric | Total number of vehicles passing within the given day and the relevant geohash area |
Daily average speed | Numeric | Average speed of vehicles on the given day and within the relevant geohash area |
Traffic percentage | Numeric | Percentage of overall traffic measured at five-minute intervals on a given day |
Temperature | Numeric | Air temperature of the relevant district on the given day and time |
Road temperature | Numeric | Road temperature information of the relevant district on the given day and time |
Humidity | Numeric | Air humidity rate of the relevant district on the given day and time |
Rainfall Amount | Numeric | Rainfall amount of the relevant district on the given day and time |
Wind speed | Numeric | Wind speed of the relevant district on the given day and time |
Wind direction | Numeric | Wind direction in the relevant district on the given day and time |
Ground information | Categorical | Ground information of the relevant geohash area (dry/wet, etc.) on the given day and time |
Road type-1 | Categorical | Road type of the relevant geohash area (access road, side road, intersection, bridge, etc.) |
Road type-2 | Categorical | Road type in the relevant geohash area (main artery, highways, etc.) |
Number of lanes | Categorical | Number of lanes of the road in the relevant geohash area |
Divided road | Categorical | Divided road information of the relevant geohash area |
Speed | Numeric | Current speed limit |
Width | Numeric | Width of the road in the relevant geohash area |
One Way | Categorical | One-way information of the road in the relevant geohash area |
Car | Numeric | Number of cars registered in the relevant month and district |
Minibus | Numeric | Number of minibuses registered in the relevant month and district |
Bus | Numeric | Number of buses registered in the relevant month and district |
Van | Numeric | Number of vans registered in the relevant month and district |
Truck | Numeric | Number of trucks registered in the relevant month and district |
Motorcycle | Numeric | Number of motorcycles registered in the relevant month and district |
Special purpose vehicle | Numeric | Number of special purpose vehicles registered in the relevant month and district |
Total number of vehicles | Numeric | Total number of vehicles in the relevant month |
Bachelor’s degree rate | Numeric | Rate of people with a bachelor’s degree by year in the given district |
Illiteracy rate | Numeric | Year-based illiteracy rate of the given district |
Student rate | Numeric | Year-based student rate for the given district |
Average household size | Numeric | Year-based average household size for the given district |
Number of houses | Numeric | Monthly number of houses in the given district |
Number of private workplaces | Numeric | Monthly number of private workplaces in the given district |
Agricultural field | Numeric | Year-based agricultural area of the given district |
Number of hospitals | Numeric | Year-based number of hospitals in the given district |
Number of schools | Numeric | Year-based number of schools in the given district |
University | Numeric | Year-based number of universities in the given district |
University facility | Numeric | Number of university facilities in the given district |
Police | Numeric | Year-based number of police officers in the given district |
Fire station | Numeric | Year-based fire department area of the given district |
personSOS | Numeric | Year-based number of emergency healthcare workers in the given district |
Metrobus station | Numeric | Year-based number of metrobus stations in the given district |
Metro station | Numeric | Year-based number of metro stations in the given district |
Port | Numeric | Year-based number of ports for the given district |
Number of parking lots | Numeric | Year-based number of parking lots in the given district |
Number of banks | Numeric | Year-based number of banks in the given district |
Number of ATMs | Numeric | Year-based number of ATMs in the given district |
Number of shopping malls | Numeric | Year-based number of shopping malls in the given district |
Number of markets | Numeric | Year-based number of markets in the given district |
Number of mini markets | Numeric | Year-based number of minimarkets in the given district |
Number of super markets | Numeric | Year-based number of supermarkets in the given district |
Number of hotels | Numeric | Year-based number of hotels in the given district |
Number of stores | Numeric | Year-based number of stores in the given district |
Industrial area | Numeric | Year-based industrial area amount for the given district |
Number of bars/clubs | Numeric | Year-based number of bars/clubs in the given district |
Number of cafes | Numeric | Year-based number of cafes in the given district |
Number of museum galleries | Numeric | Year-based number of museums and galleries in the given district |
Sports facility | Numeric | Year-based number of sports facilities in the given district |
Number of theaters | Numeric | Year-based number of theater halls in the given district |
Appendix B
Name | Description |
---|---|
Latitude | Latitude of the geospatial coordinate, in degrees. Valid values are real numbers in the range [−90, +90]. |
Longitude | Longitude of the geospatial coordinate, in degrees. Valid values are real numbers in the range [−180, +180]. |
Accuracy | Requested level of accuracy. Supported values are in the range [1, 18]. If no value is specified, the default of 5 is used. |
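The three parameters above can be illustrated with a minimal sketch of the standard geohash encoding (alternating longitude and latitude bisection, five bits per base-32 character). The function name `geohash_encode` is hypothetical, and treating `accuracy` as the number of output characters is an assumption; this is not the implementation used in the study.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(latitude: float, longitude: float, accuracy: int = 5) -> str:
    """Encode a coordinate as a geohash string of `accuracy` characters."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be in [-90, +90]")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be in [-180, +180]")
    if not 1 <= accuracy <= 18:
        raise ValueError("accuracy must be in [1, 18]")

    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # bits collected for the current character
    bit_count = 0
    use_lon = True    # the first bit always splits the longitude range

    while len(chars) < accuracy:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if longitude >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if latitude >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


# Illustrative coordinate only (not taken from the study data).
print(geohash_encode(41.0, 28.9, accuracy=6))
```

Because the sketch uses double-precision floats, the last characters become unreliable at the longest supported lengths; for the six-character codes reported in this article that is not a concern.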
Geohash Length/Code | Cell Width | Cell Height |
---|---|---|
1 | 5000 km | 5000 km |
2 | 1250 km | 625 km |
3 | 156.25 km | 156.25 km |
4 | 39.06 km | 19.53 km |
5 | 4.88 km | 4.88 km |
6 | 1.22 km | 0.61 km |
7 | 152.59 m | 152.59 m |
8 | 38.15 m | 19.07 m |
9 | 4.77 m | 4.77 m |
10 | 1.19 m | 0.59 m |
11 | 149.01 mm | 149.01 mm |
12 | 37.25 mm | 18.63 mm |
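The cell sizes in this table follow from the number of bits each geohash length assigns to longitude and latitude. The sketch below reproduces them under the assumption of a 40,000 km equatorial circumference and a 20,000 km pole-to-pole distance, which is what the 5000 km entry at length 1 implies; the constant and function names are hypothetical.

```python
EQUATOR_KM = 40_000.0   # assumed east-west extent covered by 360 degrees of longitude
MERIDIAN_KM = 20_000.0  # assumed north-south extent covered by 180 degrees of latitude


def cell_size_km(length: int) -> tuple[float, float]:
    """Return (width_km, height_km) of a geohash cell with `length` characters."""
    total_bits = 5 * length
    lon_bits = (total_bits + 1) // 2  # longitude receives the extra bit when odd
    lat_bits = total_bits // 2
    return EQUATOR_KM / 2 ** lon_bits, MERIDIAN_KM / 2 ** lat_bits


for length in range(1, 13):
    width, height = cell_size_km(length)
    print(f"{length:2d}  {width:.5f} km  {height:.5f} km")
```

The six-character codes listed in the incident frequency table (for example `sxk3k1`) therefore correspond to cells of roughly 1.22 km by 0.61 km. The width is the value at the equator; cells narrow in the east-west direction at higher latitudes.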
Index | Geohash | Incident Frequency |
---|---|---|
0 | sxk3k1 | 654 |
1 | sxk3nt | 600 |
2 | sxk3ju | 561 |
3 | sxk3k2 | 542 |
4 | sxk90z | 529 |
5 | sxk3n5 | 510 |
6 | sxk90w | 463 |
7 | sxk91p | 472 |
8 | sxk3k8 | 405 |
9 | sxk3pr | 378 |
Time Intervals | Traffic Incident Time Interval | Incident Frequency |
---|---|---|
1 | 08:00–13:59 | 1927 |
2 | 14:00–19:59 | 2148 |
3 | 20:00–01:59 | 754 |
4 | 02:00–07:59 | 285 |
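The four six-hour intervals above can be derived from the hour of an incident with a small helper. This is a sketch only; the function name `time_interval` is hypothetical and the study's own preprocessing code is not shown in the article.

```python
def time_interval(hour: int) -> int:
    """Map an hour of day (0-23) to the interval code used in the table.

    1: 08:00-13:59, 2: 14:00-19:59, 3: 20:00-01:59, 4: 02:00-07:59
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    # Shift the clock so that 08:00 becomes 0, then split into 6-hour blocks.
    return ((hour - 8) % 24) // 6 + 1


for hour in (8, 13, 14, 19, 20, 1, 2, 7):
    print(hour, time_interval(hour))
```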
Model | Hyperparameter | Values |
---|---|---|
DT | max_depth | 3, 5, 7, None |
DT | min_samples_split | 2, 5, 10 |
DT | min_samples_leaf | 1, 2, 4 |
k-NN | n_neighbors | 3, 5, 7, 9, 11 |
k-NN | weights | uniform, distance |
k-NN | p | 1, 2, 3 |
RF | n_estimators | 50, 100, 200 |
RF | max_depth | None, 5, 10 |
RF | min_samples_split | 2, 5 |
RF | min_samples_leaf | 1, 2, 4 |
SVM | C | 0.1, 1, 10 |
SVM | kernel | linear, poly, rbf |
SVM | degree | 2, 3 |
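To make the size of this search concrete, the sketch below writes the grids as plain dictionaries and enumerates every combination an exhaustive grid search would evaluate. The dictionary layout mirrors the `param_grid` format accepted by scikit-learn's `GridSearchCV`, but the names `PARAM_GRIDS` and `candidates` are hypothetical, and the cross-validation settings used in the study are not assumed here.

```python
from itertools import product

# Grids copied from the hyperparameter table.
PARAM_GRIDS = {
    "DT": {
        "max_depth": [3, 5, 7, None],
        "min_samples_split": [2, 5, 10],
        "min_samples_leaf": [1, 2, 4],
    },
    "k-NN": {
        "n_neighbors": [3, 5, 7, 9, 11],
        "weights": ["uniform", "distance"],
        "p": [1, 2, 3],
    },
    "RF": {
        "n_estimators": [50, 100, 200],
        "max_depth": [None, 5, 10],
        "min_samples_split": [2, 5],
        "min_samples_leaf": [1, 2, 4],
    },
    "SVM": {
        "C": [0.1, 1, 10],
        "kernel": ["linear", "poly", "rbf"],
        "degree": [2, 3],
    },
}


def candidates(grid: dict) -> list[dict]:
    """Expand a grid into the list of hyperparameter settings to evaluate."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]


counts = {name: len(candidates(grid)) for name, grid in PARAM_GRIDS.items()}
print(counts)
```

Each count is the number of settings per model before multiplying by the number of cross-validation folds. In the SVM grid, `degree` only affects the `poly` kernel, so some of its combinations fit equivalent models.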
Actual class | Predicted Positive | Predicted Negative |
---|---|---|
Actual Positive | True Positive (TP) | False Negative (FN) |
Actual Negative | False Positive (FP) | True Negative (TN) |
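For reference, the standard definitions of the metrics derived from these four confusion-matrix counts are given below. They are the textbook forms and are stated here as an aid to reading the results, on the assumption that the article's metrics follow the usual definitions.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
F_1              &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```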
Performance Metric | DT | k-NN | RF | SVM |
---|---|---|---|---|
accuracy | 0.874 | 0.786 | 0.908 | 0.875 |
balanced accuracy | 0.867 | 0.774 | 0.905 | 0.872 |
precision micro | 0.874 | 0.786 | 0.908 | 0.875 |
precision macro | 0.878 | 0.790 | 0.915 | 0.876 |
recall micro | 0.874 | 0.786 | 0.908 | 0.875 |
recall macro | 0.867 | 0.774 | 0.905 | 0.872 |
f1 micro | 0.874 | 0.786 | 0.908 | 0.875 |
f1 macro | 0.870 | 0.775 | 0.907 | 0.873 |
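Two patterns in this table follow directly from how the averages are defined: for single-label multiclass prediction, micro-averaged precision and recall both equal accuracy, and macro-averaged recall equals balanced accuracy when every class appears among the true labels. The sketch below demonstrates this on a small made-up label set (illustrative values only, not the study data); the function name `summarize` is hypothetical.

```python
def summarize(y_true: list, y_pred: list) -> dict:
    """Compute accuracy plus micro- and macro-averaged precision and recall."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    tp = {c: sum(t == c and p == c for t, p in pairs) for c in labels}
    fp = {c: sum(t != c and p == c for t, p in pairs) for c in labels}
    fn = {c: sum(t == c and p != c for t, p in pairs) for c in labels}

    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    # Per-class scores; a class with no predictions or no true samples scores 0.
    precisions = [tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0 for c in labels]
    recalls = [tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0 for c in labels]

    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision_micro": tp_sum / (tp_sum + fp_sum),  # pooled over all classes
        "recall_micro": tp_sum / (tp_sum + fn_sum),
        "precision_macro": sum(precisions) / len(labels),  # unweighted class mean
        "recall_macro": sum(recalls) / len(labels),
    }


# Made-up labels for illustration.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "c", "c"]
scores = summarize(y_true, y_pred)
print(scores)
```

Every misclassified sample counts once as a false positive for the predicted class and once as a false negative for the true class, which is why the pooled (micro) precision and recall collapse to the share of correct predictions.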
Geohash Area | Actual/Predicted | 1st Time Zone | 2nd Time Zone | 3rd Time Zone | 4th Time Zone | Total |
---|---|---|---|---|---|---|
0 | Actual | 38 | 68 | 15 | 3 | 124 |
0 | Predicted | 38 | 62 | 14 | 3 | 117 |
1 | Actual | 50 | 52 | 32 | 6 | 140 |
1 | Predicted | 50 | 51 | 32 | 6 | 139 |
2 | Actual | 36 | 39 | 18 | 8 | 101 |
2 | Predicted | 33 | 38 | 18 | 8 | 97 |
3 | Actual | 27 | 54 | 10 | 10 | 101 |
3 | Predicted | 25 | 43 | 7 | 8 | 83 |
4 | Actual | 44 | 40 | 9 | 5 | 98 |
4 | Predicted | 43 | 37 | 9 | 4 | 93 |
5 | Actual | 33 | 46 | 24 | 2 | 105 |
5 | Predicted | 29 | 42 | 22 | 1 | 94 |
6 | Actual | 47 | 33 | 17 | 5 | 102 |
6 | Predicted | 42 | 28 | 14 | 4 | 88 |
7 | Actual | 33 | 47 | 14 | 9 | 103 |
7 | Predicted | 25 | 40 | 11 | 6 | 82 |
8 | Actual | 21 | 36 | 8 | 2 | 67 |
8 | Predicted | 19 | 33 | 8 | 2 | 62 |
9 | Actual | 30 | 32 | 19 | 1 | 82 |
9 | Predicted | 28 | 29 | 17 | 0 | 74 |
Total Incidents | Actual | 359 | 447 | 166 | 51 | 1023 |
Total Predictions | Predicted | 332 | 403 | 152 | 42 | 929 |
Accuracy Rate (%) | | 92.48 | 90.16 | 91.57 | 82.35 | 90.81 |
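The accuracy rates in the last row can be reproduced from the two totals rows as the ratio of predicted to actual incidents per time zone column. The sketch below assumes that the "Predicted" rows count correctly predicted incidents, which is consistent with the reported percentages; the variable names are hypothetical.

```python
# Totals rows copied from the table, keyed by time zone column.
actual_totals = {"1st": 359, "2nd": 447, "3rd": 166, "4th": 51}
predicted_totals = {"1st": 332, "2nd": 403, "3rd": 152, "4th": 42}


def accuracy_rate(predicted: int, actual: int) -> float:
    """Share of actual incidents that were predicted, as a percentage."""
    return round(100 * predicted / actual, 2)


rates = {
    zone: accuracy_rate(predicted_totals[zone], actual_totals[zone])
    for zone in actual_totals
}
overall = accuracy_rate(sum(predicted_totals.values()), sum(actual_totals.values()))

print(rates)
print(overall)
```

The fourth column has the lowest rate and also the smallest sample (51 incidents), so each missed incident there shifts the percentage by about two points.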
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ulu, M.; Kilic, E.; Türkan, Y.S. Prediction of Traffic Incident Locations with a Geohash-Based Model Using Machine Learning Algorithms. Appl. Sci. 2024, 14, 725. https://doi.org/10.3390/app14020725