Container Truck High-Risk Events Prediction and Its Influencing Factors Analyses Based on Trajectory Data
Abstract
1. Introduction
- (1) Following the construction process of the HighD dataset, this paper uses drones and YOLOv8 to build a naturalistic trajectory dataset of container truck traffic flow, containing 24,038 trajectories covering about 40 h.
- (2) To address the low level of vehicle intelligence and the imperfect road infrastructure in container truck traffic flow, a methodological framework spanning data collection to real-time safety analysis is proposed.
- (3) Multiple classification models were used for conflict prediction. The best-performing model was XGBoost, with an accuracy of 0.86 and an AUC of 0.933. SHAP was used to interpret the results of the XGBoost model and to analyze how the selected feature variables affect conflicts.
- (4) Recent research on conflict prediction was compared with this paper, and the differences between container truck traffic flow and traditional traffic flow were summarized in terms of the factors influencing conflicts.
2. Literature Review
3. Methodology
3.1. Data Collection Methods
3.1.1. Video Data Capture
3.1.2. Object Detection
3.1.3. Trajectory Extraction
3.2. Machine Learning Methods
3.3. Model Evaluation
3.4. Shapley Additive Explanation
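
As a minimal sketch of the idea behind Shapley additive explanations, the following computes exact Shapley values by brute force for a toy scoring function. The function `toy_risk`, its coefficients, and the three feature names are illustrative assumptions, not results from this paper. SHAP libraries use much faster algorithms for tree models such as XGBoost, so this enumeration only illustrates the definition.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, features):
    """Exact Shapley values of a set function `value` over `features`."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_risk(present):
    """Hypothetical risk score for the set of features that are 'switched on'."""
    score = 0.0
    if "density" in present:
        score += 0.3
    if "lane_change" in present:
        score += 0.2
    if "speed" in present:
        score -= 0.1
    if "density" in present and "lane_change" in present:
        score += 0.1  # made-up interaction term, shared equally by both features
    return score


features = ["density", "lane_change", "speed"]
phi = shapley_values(toy_risk, features)
for name in features:
    print(f"{name:12s}{phi[name]:+.2f}")
```

The attributions sum to the difference between the score with all features and the score with none (0.5 in this toy case), which is the additivity property that makes a positive or negative attribution readable as a direction of effect.
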
4. Container Truck Dataset
4.1. Dataset Introduction
4.2. Data Processing
4.2.1. Identifying Key Vehicle Interactions
4.2.2. Conflict Event Identification
4.2.3. Feature Extraction
5. Results
5.1. Results of the Model
5.2. Discussion
5.2.1. Interpretation of Model Results
5.2.2. Comparison with Existing Studies
6. Conclusions
- (1) The XGBoost model performed best on all datasets, and the model trained after SMOTE processing performed best overall, with a prediction accuracy of 0.86 and an AUC of 0.933.
- (2) In the container truck traffic flow scenario, traffic density and lane change ratio are the two main factors contributing to conflicts, and both have a positive effect: the greater the traffic density and the larger the share of lane-changing vehicles, the more likely conflicts are to occur. Average speed and car ratio have a negative effect: the higher the speed and the higher the car ratio, the less likely conflicts are to occur. The remaining lane-based features have only a small effect on the occurrence of conflicts.
- (3) In container truck traffic flow, traffic density has the greatest impact on conflicts. By contrast, recent studies of general traffic flow show that, whether features are extracted from a micro or a macro perspective, the features with the greatest impact on conflicts are speed-related. Truck traffic flow also involves more lane changes than general traffic flow, so the impact of lane changes on conflicts in container truck traffic flow is second only to that of traffic density.
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Xu, C.; Ozbay, K.; Liu, H.; Xie, K.; Yang, D. Exploring the impact of truck traffic on road segment-based severe crash proportion using extensive weigh-in-motion data. Saf. Sci. 2023, 166, 106261.
- Li, L.; Lyu, H.; Wang, T.; Cheng, R. STdi4DMPC: Distributed Model Predictive Control for Connected and Automated Truck Platoon with Mixed Traffic Flow Based on Spatiotemporal Trajectory Prediction. IEEE Trans. Veh. Technol. 2024, 73, 14563–14579.
- Nabavi Niaki, M.S.; Fu, T.; Saunier, N.; Miranda-Moreno, L.F.; Amador, L.; Bruneau, J.-F. Road Lighting Effects on Bicycle and Pedestrian Accident Frequency: Case Study in Montreal, Quebec, Canada. Transp. Res. Rec. 2016, 2555, 86–94.
- Fu, C.; Sayed, T. Identification of adequate sample size for conflict-based crash risk evaluation: An investigation using Bayesian hierarchical extreme value theory models. Anal. Methods Accid. Res. 2023, 39, 100281.
- Zheng, L.; Sayed, T.; Mannering, F. Modeling traffic conflicts for use in road safety analysis: A review of analytic methods and future directions. Anal. Methods Accid. Res. 2021, 29, 100142.
- Ji, Q.; Lyu, H.; Yang, H.; Wei, Q.; Cheng, R. Bifurcation control of solid angle car-following model through a time-delay feedback method. J. Zhejiang Univ.-Sci. A 2023, 24, 828–840.
- Abdel-Aty, M.; Wang, Z.; Zheng, O.; Abdelraouf, A. Advances and applications of computer vision techniques in vehicle trajectory generation and surrogate traffic safety indicators. Accid. Anal. Prev. 2023, 191, 107191.
- Li, J.; Cheng, R. A real-time adaptive signal control method for multi-intersections in mixed connected vehicle environments. J. Zhejiang Univ.-Sci. A Appl. Phys. Eng. 2025, 1, 189.
- Li, D.; Fu, C.; Sayed, T.; Wang, W. An integrated approach of machine learning and Bayesian spatial Poisson model for large-scale real-time traffic conflict prediction. Accid. Anal. Prev. 2023, 192, 107286.
- Sohail, A.; Cheema, M.A.; Ali, M.E.; Toosi, A.N.; Rakha, H.A. Data-driven approaches for road safety: A comprehensive systematic literature review. Saf. Sci. 2023, 158, 105949.
- Xie, Y.; Pongsakornsathien, N.; Gardi, A.; Sabatini, R. Explanation of Machine-Learning Solutions in Air-Traffic Management. Aerospace 2021, 8, 224.
- Peng, Y.; Liu, D.; Wu, S.; Yang, X.; Wang, Y.; Zou, Y. Enhancing Mixed Traffic Flow with Platoon Control and Lane Management for Connected and Autonomous Vehicles. Sensors 2025, 25, 644.
- Kovvali, V.G.; Alexiadis, V.; Zhang, L. Video-Based Vehicle Trajectory Data Collection. Available online: https://trid.trb.org/View/801154 (accessed on 25 January 2017).
- Krajewski, R.; Bock, J.; Kloeker, L.; Eckstein, L. The highD Dataset: A Drone Dataset of Naturalistic Vehicle Trajectories on German Highways for Validation of Highly Automated Driving Systems. In Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA, 4–7 November 2018; pp. 2118–2125.
- Barmpounakis, E.; Geroliminis, N. On the new era of urban traffic monitoring with massive drone data: The pNEUMA large-scale field experiment. Transp. Res. Part C Emerg. Technol. 2020, 111, 50–71.
- Zheng, O.; Abdel-Aty, M.; Yue, L.; Abdelraouf, A.; Wang, Z.; Mahmoud, N. CitySim: A Drone-Based Vehicle Trajectory Dataset for Safety-Oriented Research and Digital Twins. Transp. Res. Rec. J. Transp. Res. Board 2023, 2678, 606–621.
- Yu, R.; Han, L.; Zhang, H. Trajectory data based freeway high-risk events prediction and its influencing factors analyses. Accid. Anal. Prev. 2021, 154, 106085.
- Hu, Y.; Li, Y.; Huang, H.; Lee, J.; Yuan, C.; Zou, G. A high-resolution trajectory data driven method for real-time evaluation of traffic safety. Accid. Anal. Prev. 2022, 165, 106503.
- Davis, G.A.; Hourdos, J.; Xiong, H.; Chatterjee, I. Outline for a causal model of traffic conflicts and crashes. Accid. Anal. Prev. 2011, 43, 1907–1919.
- Tarko, A.P. Estimating the expected number of crashes with traffic conflicts and the Lomax Distribution—A theoretical and numerical exploration. Accid. Anal. Prev. 2018, 113, 63–73.
- Zheng, L.; Ismail, K.; Meng, X. Traffic conflict techniques for road safety analysis: Open questions and some insights. Can. J. Civ. Eng. 2014, 41, 633–641.
- Zheng, L.; Sayed, T. A bivariate Bayesian hierarchical extreme value model for traffic conflict-based crash estimation. Anal. Methods Accid. Res. 2020, 25, 100111.
- Orsini, F.; Gecchele, G.; Rossi, R.; Gastaldi, M. A conflict-based approach for real-time road safety analysis: Comparative evaluation with crash-based models. Accid. Anal. Prev. 2021, 161, 106382.
- Kamel, A.; Sayed, T.; Fu, C. Real-time safety analysis using autonomous vehicle data: A Bayesian hierarchical extreme value model. Transp. B Transp. Dyn. 2022, 11, 826–846.
- Kilicarslan, M.; Zheng, J.Y. Predict Vehicle Collision by TTC From Motion Using a Single Video Camera. IEEE Trans. Intell. Transp. Syst. 2019, 20, 522–533.
- Li, Y.; Wu, D.; Lee, J.; Yang, M.; Shi, Y. Analysis of the transition condition of rear-end collisions using time-to-collision index and vehicle trajectory data. Accid. Anal. Prev. 2020, 144, 105676.
- Bella, F.; Russo, R. A Collision Warning System for rear-end collision: A driving simulator study. Procedia Soc. Behav. Sci. 2011, 20, 676–686.
- Meng, Q.; Qu, X. Estimation of rear-end vehicle crash frequencies in urban road tunnels. Accid. Anal. Prev. 2012, 48, 254–263.
- Qu, X.; Kuang, Y.; Oh, E.; Jin, S. Safety Evaluation for Expressways: A Comparative Study for Macroscopic and Microscopic Indicators. Traffic Inj. Prev. 2014, 15, 89–93.
- Mohammadian, S.; Haque, M.M.; Zheng, Z.; Bhaskar, A. Integrating safety into the fundamental relations of freeway traffic flows: A conflict-based safety assessment framework. Anal. Methods Accid. Res. 2021, 32, 100187.
- Sun, J.; Sun, J. Real-time crash prediction on urban expressways: Identification of key variables and a hybrid support vector machine model. IET Intell. Transp. Syst. 2016, 10, 331–337.
- Yuan, C.; Li, Y.; Huang, H.; Wang, S.; Sun, Z.; Li, Y. Using traffic flow characteristics to predict real-time conflict risk: A novel method for trajectory data analysis. Anal. Methods Accid. Res. 2022, 35, 100217.
- Katrakazas, C.; Quddus, M.; Chen, W.H. A Simulation Study of Predicting Real-Time Conflict-Prone Traffic Conditions. IEEE Trans. Intell. Transp. Syst. 2018, 19, 3196–3207.
- Orsini, F.; Gecchele, G.; Gastaldi, M.; Rossi, R. Real-time conflict prediction: A comparative study of machine learning classifiers. Transp. Res. Procedia 2021, 52, 292–299.
- Li, P.; Abdel-Aty, M.; Cai, Q.; Yuan, C. The application of novel connected vehicles emulated data on real-time crash potential prediction for arterials. Accid. Anal. Prev. 2020, 144, 105658.
- Li, P.; Abdel-Aty, M.; Yuan, J. Real-time crash risk prediction on arterials based on LSTM-CNN. Accid. Anal. Prev. 2020, 135, 105371.
- Yao, R.; Zeng, W.; Chen, Y.; He, Z. A deep learning framework for modelling left-turning vehicle behaviour considering diagonal-crossing motorcycle conflicts at mixed-flow intersections. Transp. Res. Part C Emerg. Technol. 2021, 132, 103415.
- Islam, Z.; Abdel-Aty, M. Traffic conflict prediction using connected vehicle data. Anal. Methods Accid. Res. 2023, 39, 100275.
- Gregurić, M.; Vrbanić, F.; Ivanjko, E. Towards the spatial analysis of motorway safety in the connected environment by using explainable deep learning. Knowl.-Based Syst. 2023, 269, 110523.
- Madushani, J.P.S.S.; Sandamal, R.M.K.; Meddage, D.P.P.; Pasindu, H.R.; Gomes, P.I.A. Evaluating expressway traffic crash severity by using logistic regression and explainable & supervised machine learning classifiers. Transp. Eng. 2023, 13, 100190.
- Ribeiro, M.T.; Singh, S.; Guestrin, C. Why Should I Trust You? In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144.
- Lundberg, S. A unified approach to interpreting model predictions. arXiv 2017, arXiv:1705.07874.
- Safavian, S.R.; Landgrebe, D. A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 1991, 21, 660–674.
- Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995.
- Laval, J.A. Hysteresis in traffic flow revisited: An improved measurement method. Transp. Res. Part B Methodol. 2011, 45, 385–391.
- Chen, T.; Guestrin, C. XGBoost. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
- Bradley, A.P. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit. 1997, 30, 1145–1159.
- Charly, A.; Mathew, T.V. Estimation of traffic conflicts using precise lateral position and width of vehicles for safety assessment. Accid. Anal. Prev. 2019, 132, 105264.
- Jiang, C.; Yin, S.; Yao, Z.; He, J.; Jiang, R.; Jiang, Y. Safety evaluation of mixed traffic flow with truck platoons equipped with (cooperative) adaptive cruise control, stochastic human-driven cars and trucks on port freeways. Phys. A Stat. Mech. Appl. 2024, 643, 129802.
Raw Data | Description | Unit |
---|---|---|
Frame | Frame time. | _ |
ID | Vehicle ID. | _ |
cls_Name | Vehicle type: one of five truck types or a car. | _ |
(X_left_top,Y_left_top) | The relative coordinates of the upper left corner of the vehicle. | _ |
(X_right_bottom,Y_right_bottom) | The relative coordinates of the lower right corner of the vehicle. | _ |
(X_center,Y_center) | Relative coordinates of the vehicle center. | _ |
Length | Vehicle length. | m |
Width | Vehicle width. | m |
X_speed | The vehicle’s speed in the X direction. | m/s |
Y_speed | The vehicle’s speed in the Y direction. | m/s |
X_acceleration | The acceleration of the vehicle in the X direction. | m/s² |
Y_acceleration | The acceleration of the vehicle in the Y direction. | m/s² |
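
As a minimal sketch of how one row of this table might be held in code, the following uses a hypothetical `TrackPoint` record whose field names mirror the table. The class, its methods, and the sample values are assumptions for illustration and do not come from the dataset. It assumes the stored center is the midpoint of the two bounding-box corners, and it derives a scalar speed from the X and Y components.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class TrackPoint:
    """One detection of one vehicle in one frame (hypothetical layout)."""
    frame: int
    vehicle_id: int
    cls_name: str
    x_left_top: float
    y_left_top: float
    x_right_bottom: float
    y_right_bottom: float
    x_speed: float  # m/s
    y_speed: float  # m/s

    def center(self) -> tuple[float, float]:
        # Midpoint of the two bounding-box corners.
        return ((self.x_left_top + self.x_right_bottom) / 2.0,
                (self.y_left_top + self.y_right_bottom) / 2.0)

    def speed(self) -> float:
        # Magnitude of the velocity vector, in m/s.
        return hypot(self.x_speed, self.y_speed)


# Illustrative values only, not taken from the dataset.
point = TrackPoint(frame=0, vehicle_id=1, cls_name="bigtruck",
                   x_left_top=10.0, y_left_top=4.0,
                   x_right_bottom=26.0, y_right_bottom=6.5,
                   x_speed=3.0, y_speed=4.0)
print(point.center())  # (18.0, 5.25)
print(point.speed())   # 5.0
```
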
Dataset | FPS | Duration (Minutes) | Road Type |
---|---|---|---|
NGSIM | 10 | 75 | Highway |
HighD | 25 | 990 | Highway |
Interaction | 10–30 | 998 | Intersection, expressway |
CITR and DUT | 29.97 | 18.7 | Intersection |
SkyEye | 50 | 180 | Intersection |
InD | 25 | 600 | Intersection |
RounD | 25 | 360 | Roundabout |
pNEUMA | 25 | 3540 | Freeway |
High SIM | 30 | 120 | Highway |
MAGIC | 25 | 180 | Highway |
CitySim | 30 | 1200+ | Highway, intersections, on/off ramps, weaving sections |
This study | 30 | 2270 | Road sections near intersections |
Vehicle Type | Number of Tracks | Proportion |
---|---|---|
car | 4323 | 0.18 |
bigtruck | 13,218 | 0.55 |
littletruck | 1215 | 0.051 |
truck | 346 | 0.014 |
no_container | 2270 | 0.094 |
half | 2666 | 0.111 |
Total | 24,038 |
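
The proportions in this table can be re-derived from the track counts. The short check below uses only the counts listed above; the variable names are illustrative.

```python
# Track counts per vehicle type, copied from the table above.
track_counts = {
    "car": 4323,
    "bigtruck": 13218,
    "littletruck": 1215,
    "truck": 346,
    "no_container": 2270,
    "half": 2666,
}

total = sum(track_counts.values())
proportions = {name: count / total for name, count in track_counts.items()}

print(total)  # 24038
for name, share in proportions.items():
    print(f"{name:13s}{share:.3f}")
```

The counts sum to the stated total of 24,038 tracks, and the five non-car classes together account for roughly 82% of them.
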
Variable | Unit | Description | Mean | Min | Max | Std |
---|---|---|---|---|---|---|
Traffic_flow | veh/min | The number of vehicles passing a given point on the road within 1 min. | 14.69 | 1 | 46 | 5.80 |
Traffic_density | veh/km | Average traffic density within 1 min. | 32.17 | 7.14 | 75.38 | 12.85 |
Space_mean_speed | m/s | The average space mean speed within 1 min. | 3.80 | 0.05 | 14.93 | 2.27 |
Lane_change_ratio | _ | The proportion of the traffic flow that changes lanes within 1 min. | 0.07 | 0 | 0.63 | 0.09 |
Car_ratio | _ | The proportion of cars in the traffic flow within 1 min. | 0.18 | 0 | 1 | 0.17 |
Turn_left_ratio | _ | The proportion of the traffic flow that turns left within 1 min. | 0.16 | 0 | 1 | 0.23 |
Straight_ratio | _ | The proportion of the traffic flow that goes straight within 1 min. | 0.65 | 0 | 1 | 0.30 |
Turn_right_ratio | _ | The proportion of the traffic flow that turns right within 1 min. | 0.19 | 0 | 1 | 0.25 |
Conflict data | | | | | | |
Conflict | _ | Binary variable: 1 represents conflict and 0 represents no conflict. | 0.45 | 0 | 1 | 0.50 |
Conflict frequency | _ | The number of distinct conflicts that occurred within 1 min. | 0.74 | 0 | 8 | 1.06 |
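
The following is a minimal sketch of how some of these 1 min features could be aggregated from per-vehicle records. The record layout `VehiclePass`, the function `window_features`, the per-frame vehicle counts, and the segment length are all assumptions made for illustration; the source does not give the extraction code. Density is taken here as the mean number of vehicles on the segment divided by the segment length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VehiclePass:
    """One vehicle counted in a 1 min window (hypothetical record layout)."""
    cls_name: str
    changed_lane: bool


def window_features(passes, vehicles_per_frame, segment_km):
    """Aggregate one 1 min window into macro traffic features."""
    flow = len(passes)  # vehicles counted in the window, veh/min
    mean_present = (sum(vehicles_per_frame) / len(vehicles_per_frame)
                    if vehicles_per_frame else 0.0)
    n_cars = sum(p.cls_name == "car" for p in passes)
    n_changes = sum(p.changed_lane for p in passes)
    return {
        "Traffic_flow": flow,
        "Traffic_density": mean_present / segment_km,  # veh/km
        "Lane_change_ratio": n_changes / flow if flow else 0.0,
        "Car_ratio": n_cars / flow if flow else 0.0,
    }


# Illustrative window: 2 cars and 8 trucks, one of which changed lanes.
passes = ([VehiclePass("car", False)] * 2
          + [VehiclePass("bigtruck", False)] * 7
          + [VehiclePass("bigtruck", True)])
features = window_features(passes, vehicles_per_frame=[3, 4, 5], segment_km=0.125)
print(features)
# {'Traffic_flow': 10, 'Traffic_density': 32.0, 'Lane_change_ratio': 0.1, 'Car_ratio': 0.2}
```
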
Samples | A: Original | B: SMOTE | C: Original (4:1) | D: SMOTE |
---|---|---|---|---|
y = 0 | 1146 | 1146 | 1146 | 1146 |
y = 1 | 925 | 1146 | 286 | 1146 |
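
To illustrate how SMOTE can raise the minority class from 286 to 1146 samples (dataset C to dataset D), here is a simplified pure-Python sketch of the interpolation idea. It is not the authors' implementation. The function `smote_like`, the choice `k=5`, and the random toy points are assumptions. Production work would normally use an established library implementation.

```python
import random


def smote_like(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Simplified illustration of the SMOTE idea, not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, by squared distance.
        neighbours = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Toy 2-D minority class with the size of y = 1 in dataset C
# (random values, not taken from the dataset).
toy_rng = random.Random(1)
minority = [(toy_rng.random(), toy_rng.random()) for _ in range(286)]
n_majority = 1146

new_points = smote_like(minority, n_new=n_majority - len(minority))
balanced_minority = minority + new_points
print(len(new_points), len(balanced_minority))  # 860 1146
```

Because each synthetic point lies on a segment between two existing minority samples, oversampling adds no information from outside the observed minority region. The same call with 925 minority samples would add 221 points to reach dataset B.
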
Dataset | Model | n_Estimators | Max_Depth | Learning_Rate | Subsample | Colsample_Bytree |
---|---|---|---|---|---|---|
A | XGBoost | 400 | 4 | 0.1 | 1.0 | 0.7 |
B | XGBoost | 300 | 3 | 0.3 | 1.0 | 0.9 |
C | XGBoost | 500 | 3 | 0.01 | 1.0 | 0.8 |
D | XGBoost | 400 | 6 | 0.1 | 1.0 | 0.7 |
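
These settings can be written as a plain configuration mapping. The keys below are the table's column names in lower case, which match keyword arguments accepted by `xgboost.XGBClassifier`. The dictionary name `XGB_PARAMS` is an assumption, and the commented line shows only one possible way of using it.

```python
# Hyperparameters per dataset, copied from the table above.
XGB_PARAMS = {
    "A": {"n_estimators": 400, "max_depth": 4, "learning_rate": 0.1,
          "subsample": 1.0, "colsample_bytree": 0.7},
    "B": {"n_estimators": 300, "max_depth": 3, "learning_rate": 0.3,
          "subsample": 1.0, "colsample_bytree": 0.9},
    "C": {"n_estimators": 500, "max_depth": 3, "learning_rate": 0.01,
          "subsample": 1.0, "colsample_bytree": 0.8},
    "D": {"n_estimators": 400, "max_depth": 6, "learning_rate": 0.1,
          "subsample": 1.0, "colsample_bytree": 0.7},
}

# Possible usage, if the xgboost package is installed:
#   from xgboost import XGBClassifier
#   model = XGBClassifier(**XGB_PARAMS["D"])

for dataset, params in XGB_PARAMS.items():
    print(dataset, params)
```
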
Dataset | Model | ACC | FPR | FNR |
---|---|---|---|---|
A | LR | 0.71 | 0.19 | 0.42 |
A | SVM | 0.72 | 0.20 | 0.40 |
A | RF | 0.73 | 0.23 | 0.31 |
A | XGBoost | 0.77 | 0.21 | 0.25 |
B | LR | 0.72 | 0.25 | 0.31 |
B | SVM | 0.73 | 0.33 | 0.22 |
B | RF | 0.73 | 0.27 | 0.27 |
B | XGBoost | 0.76 | 0.24 | 0.24 |
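
Assuming the standard definitions of accuracy (ACC), false positive rate (FPR), and false negative rate (FNR), the three columns follow from a confusion matrix as sketched below. The source does not report the confusion matrices, so the counts are hypothetical. They were chosen only so that the resulting rates match the XGBoost row for dataset A.

```python
def rates(tp, fp, tn, fn):
    """ACC, FPR and FNR from confusion-matrix counts (standard definitions)."""
    total = tp + fp + tn + fn
    return {
        "ACC": (tp + tn) / total,   # share of all samples classified correctly
        "FPR": fp / (fp + tn),      # non-conflicts flagged as conflicts
        "FNR": fn / (fn + tp),      # conflicts that were missed
    }


# Hypothetical counts, not reported in the source.
example = rates(tp=75, fp=21, tn=79, fn=25)
print({name: round(value, 2) for name, value in example.items()})
# {'ACC': 0.77, 'FPR': 0.21, 'FNR': 0.25}
```
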
Dataset | Model | ACC | FPR | FNR |
---|---|---|---|---|
C | LR | 0.81 | 0.04 | 0.77 |
C | SVM | 0.79 | 0.02 | 0.92 |
C | RF | 0.80 | 0.06 | 0.75 |
C | XGBoost | 0.80 | 0.04 | 0.80 |
D | LR | 0.72 | 0.26 | 0.30 |
D | SVM | 0.79 | 0.24 | 0.17 |
D | RF | 0.84 | 0.17 | 0.14 |
D | XGBoost | 0.86 | 0.14 | 0.13 |
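
Accuracy can be written as a prevalence-weighted mix of the two error rates: ACC = (1 − π)(1 − FPR) + π(1 − FNR), where π is the share of conflict samples. The sketch below applies this identity to the XGBoost rows. It assumes that the evaluation split keeps the class ratio of the sample table (286:1146 for dataset C and 1:1 for dataset D), which the source does not state. The function name is illustrative.

```python
def accuracy_from_rates(fpr, fnr, positive_share):
    """Accuracy implied by FPR, FNR and the share of positive samples."""
    return (1 - positive_share) * (1 - fpr) + positive_share * (1 - fnr)


# Dataset C: imbalanced 4:1, XGBoost row (FPR 0.04, FNR 0.80).
share_c = 286 / (1146 + 286)
acc_c = accuracy_from_rates(0.04, 0.80, share_c)

# Dataset D: balanced by SMOTE, XGBoost row (FPR 0.14, FNR 0.13).
acc_d = accuracy_from_rates(0.14, 0.13, 0.5)

print(f"C: {acc_c:.3f}  D: {acc_d:.3f}")
```

Under this assumption, the implied accuracies are about 0.808 for dataset C and 0.865 for dataset D, close to the reported 0.80 and 0.86 given that the table values are rounded. For dataset C, the high accuracy coexists with an FNR of 0.80 because the non-conflict class dominates the weighted sum.
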
Authors | Trajectory Data | Variable Selection Perspective |
---|---|---|
[23] | Data collected by detectors and radar on Italian highways | Macro perspective |
[32] | HighD trajectory dataset | Macro perspective |
[18] | HighD trajectory dataset | Extract variables based on lanes from a macro perspective |
[38] | Connected vehicle dataset provided by Wejo | Microscopic perspective |
This study | Container truck traffic flow dataset | Extract variables based on lanes from a macro perspective |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhu, Z.; Meng, Y.; Cheng, R. Container Truck High-Risk Events Prediction and Its Influencing Factors Analyses Based on Trajectory Data. Systems 2025, 13, 326. https://doi.org/10.3390/systems13050326