Research on Pedestrian Detection and DeepSort Tracking in Front of Intelligent Vehicle Based on Deep Learning
Abstract
1. Introduction
2. Improved YOLO Network Design Based on Deep Learning
2.1. Analysis of Overall Pedestrian Detection Process Based on Deep Learning
- (1) Model building
- (2) Model parameter setting
- (3) Data preprocessing
- (4) Model training
- (5) Model test
- (6) Model evaluation
2.2. Model Improvement Based on CBAM Module
3. DeepSort Pedestrian Tracking Based on Improved Network
3.1. DeepSort Pedestrian Tracking Process
- (1) Using the YOLO detection model, the pedestrian's detection (bounding box) at the previous time step is obtained. A Kalman filter then predicts the pedestrian's state at the next time step, which gives the prior estimate for the pedestrian.
- (2) The YOLO detection model extracts and saves the features of each target pedestrian in the image at the current time step, which gives the detections (pedestrian position information) at the current time step.
- (3) The Hungarian algorithm matches detections to tracks using the appearance feature and the Mahalanobis distance. If a match succeeds, the track enters the Kalman update and the tracking result is output. Otherwise, the unmatched detection boxes and tracks are passed on to IOU matching. During cascade matching, a time update parameter (time_since_update) is set for each track, and the tracks are sorted by this value. The track with the smallest value is matched against the detections first, so that the most recently updated targets have the highest priority; this preserves tracking continuity and reduces the number of identity (ID) switches.
- (4) In IOU matching, the proportion of overlap between a detection box and a track is used as their similarity and determines whether the two share the same ID at this time step. If a match succeeds, the track enters the Kalman update and the tracking result is output. Otherwise, matching is attempted for the unmatched track over 10 consecutive frames. If a detection box is matched within those 10 frames, the tracking result is output; if there is still no match, the target is treated as a non-tracking target and its trajectory is deleted.
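The bookkeeping in steps (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's implementation: the `Track` class, the `(x1, y1, x2, y2)` box format, and the helper names `iou`, `cascade_order`, and `age_tracks` are hypothetical, and only the 10-frame limit and the `time_since_update` ordering come from the text above.

```python
from dataclasses import dataclass

MAX_AGE = 10  # frames a track may stay unmatched before deletion (step 4)


@dataclass
class Track:
    track_id: int
    box: tuple  # (x1, y1, x2, y2); hypothetical box format
    time_since_update: int = 0  # frames since the last successful match


def iou(a, b):
    """Proportion of overlap (intersection over union) of two boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cascade_order(tracks):
    """Order tracks so the most recently updated ones are matched first."""
    return sorted(tracks, key=lambda t: t.time_since_update)


def age_tracks(tracks, matched_ids):
    """Reset matched tracks, age unmatched ones, drop those past MAX_AGE."""
    survivors = []
    for t in tracks:
        if t.track_id in matched_ids:
            t.time_since_update = 0
        else:
            t.time_since_update += 1
        if t.time_since_update <= MAX_AGE:
            survivors.append(t)
    return survivors
```

In a full tracker, `cascade_order` would feed the appearance and Mahalanobis matching, `iou` would score the leftovers, and `age_tracks` would run once per frame after both matching stages.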
3.2. Kalman Filter Estimation of Pedestrian Target State
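As a rough illustration of the predict and update cycle used for state estimation, the sketch below tracks a single box coordinate with a constant-velocity model. It is a simplification under stated assumptions: the state is reduced to `[position, velocity]` (the DeepSort state covers the box centre, aspect ratio, height, and their velocities), the measurement is the position only, and the noise values `q` and `r` are placeholders rather than values from the paper.

```python
def kf_predict(x, P, dt=1.0, q=1e-2):
    """Prior estimate: propagate state x = [position, velocity] and 2x2 covariance P."""
    pos, vel = x
    x_pred = [pos + dt * vel, vel]
    # P_pred = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I (assumed)
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q
    return x_pred, [[p00, p01], [p10, p11]]


def kf_update(x, P, z, r=1.0):
    """Posterior estimate from a scalar position measurement z (H = [1, 0])."""
    y = z - x[0]                        # innovation
    s = P[0][0] + r                     # innovation variance H P H^T + R
    k0, k1 = P[0][0] / s, P[1][0] / s   # Kalman gain K = P H^T / s
    x_new = [x[0] + k0 * y, x[1] + k1 * y]
    # P_new = (I - K H) P
    P_new = [
        [(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
        [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]],
    ]
    return x_new, P_new
```

A matched detection supplies `z` for `kf_update`; a track with no match in the current frame is only propagated with `kf_predict`, so its uncertainty grows until it is matched again or deleted.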
3.3. Correlation Matching of Pedestrian Targets
- (1) Mahalanobis distance matching
- (2) Appearance feature matching
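The two measures can be illustrated with a short Python sketch. Several details here are assumptions and not taken from the paper: the covariance is treated as diagonal to avoid a matrix inverse, the appearance distance is the smallest cosine distance to a track's stored features, the gate is the 95% chi-square quantile for four degrees of freedom, and the weight `lam` and all function names are hypothetical.

```python
import math

CHI2_GATE_4DOF = 9.4877  # 95% chi-square quantile, 4 degrees of freedom


def mahalanobis_sq(measurement, mean, variances):
    """Squared Mahalanobis distance, assuming a diagonal covariance matrix."""
    return sum((z - m) ** 2 / v for z, m, v in zip(measurement, mean, variances))


def cosine_distance(a, b):
    """1 - cosine similarity between two appearance feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def association_cost(measurement, mean, variances, det_feature, track_features, lam=0.0):
    """Gated cost for one detection-track pair; math.inf marks an inadmissible pair."""
    d_motion = mahalanobis_sq(measurement, mean, variances)
    if d_motion > CHI2_GATE_4DOF:
        return math.inf  # motion is implausible, so the pair is excluded
    d_appearance = min(cosine_distance(det_feature, f) for f in track_features)
    return lam * d_motion + (1.0 - lam) * d_appearance
```

Evaluating `association_cost` for every detection and track pair gives the cost matrix on which the Hungarian algorithm is run.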
4. Pedestrian Detection and Tracking Evaluation Indicators
4.1. Pedestrian Detection Evaluation Indicators
4.2. Pedestrian Tracking Evaluation Indicators
5. Pedestrian Detection and Tracking Verification
5.1. Pedestrian Detection Verification and Result Analysis
5.2. Pedestrian Tracking Verification and Result Analysis
6. Conclusions
- (1) The channel attention module (CAM) and spatial attention module (SAM) were introduced and appended to the end of the YOLO backbone network Darknet-53, which improved the model's ability to represent important feature information.
- (2) Based on the improved YOLO network, a DeepSort pedestrian tracking method was designed. A Kalman filter was used to estimate the pedestrian state. The Mahalanobis distance and appearance features were used to calculate the similarity between each detection box and the predicted pedestrian trajectory, and the Hungarian algorithm produced the optimal matching of pedestrian targets. According to official statistics, the test accuracy of the YOLO model was only 55.3%. The improved YOLOv3 pedestrian detection model and the DeepSort pedestrian tracking method were compared and verified in the same experimental environment. Before improvement, the training accuracy and testing accuracy of the YOLO model were 80.64% and 69.45%, respectively; after improvement, they were 82.21% and 72.02%. The improved model therefore gained 1.57 percentage points in training accuracy and 2.57 percentage points in testing accuracy. The verification results showed that the improved pedestrian detection model handled occlusion better and correctly detected pedestrians in images where the original model had missed or misdetected them, which resolved the tracking failures caused by occlusion before the improvement.
- (3) The network model and tracking method before and after the improvement were compared and verified. The improved network model effectively reduces the missed detections and false detections caused by target occlusion and reduces the tracking failures caused by occlusion. The main results were as follows: the tracking accuracy (MOTA) and tracking precision (MOTP) both increased, the trajectory integrity (MT, mostly tracked) increased, the number of track interruptions (FM) fell from 986 to 241, and the number of identity switches (IDSW) fell from 611 to 120, which indicates a better ability to deal with occlusion.
- (4) This paper focuses on missed detection, misdetection, and tracking failure caused by small pedestrian targets, multiple targets, and partially occluded pedestrians in a dim driving environment, and it addresses the multi-target tracking failures caused by occlusion. The results provide a useful reference for the theoretical research and development of human–vehicle collision avoidance systems, such as ADAS, intelligent vehicle systems, and AEB systems.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Network Type | Filter Size | Output Size |
---|---|---|
Convolution layer | 3 × 3/1 | 32 × 128 × 64 |
Convolution layer | 3 × 3/1 | 32 × 128 × 64 |
Maximum pooling layer | 3 × 3/2 | 32 × 64 × 32 |
Residual block | 3 × 3/1 | 32 × 64 × 32 |
Residual block | 3 × 3/1 | 32 × 64 × 32 |
Residual block | 3 × 3/2 | 64 × 32 × 16 |
Residual block | 3 × 3/1 | 64 × 32 × 16 |
Residual block | 3 × 3/2 | 128 × 16 × 8 |
Residual block | 3 × 3/1 | 128 × 16 × 8 |
Fully connected layer | | 128 |
L2 normalized layer | | 128 |
Model | MOTA | MOTP | MT | ML | FM | IDSW | FPS |
---|---|---|---|---|---|---|---|
YOLO+DeepSort | 49 | 60.11 | 24 | 22.6 | 986 | 611 | 36 |
Improved YOLO+DeepSort | 53.8 | 62.12 | 28.2 | 21.8 | 241 | 120 | 33 |
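
The relative changes implied by the table above can be worked out directly. The short Python sketch below uses only the table values; the dictionary layout and the `relative_change` helper are illustrative, not part of the paper.

```python
# Values copied from the tracking comparison table above.
baseline = {"MOTA": 49.0, "MOTP": 60.11, "FM": 986, "IDSW": 611, "FPS": 36}
improved = {"MOTA": 53.8, "MOTP": 62.12, "FM": 241, "IDSW": 120, "FPS": 33}


def relative_change(before, after):
    """Percentage change from the baseline value to the improved value."""
    return (after - before) / before * 100.0


changes = {
    name: round(relative_change(baseline[name], improved[name]), 1)
    for name in baseline
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

On these figures, FM falls by about 75.6% and IDSW by about 80.4%, while the frame rate drops by about 8.3% (from 36 to 33 FPS).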
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chen, X.; Jia, Y.; Tong, X.; Li, Z. Research on Pedestrian Detection and DeepSort Tracking in Front of Intelligent Vehicle Based on Deep Learning. Sustainability 2022, 14, 9281. https://doi.org/10.3390/su14159281