Vehicle Re-Identification Method Based on Efficient Self-Attention CNN-Transformer and Multi-Task Learning Optimization
Abstract
1. Introduction
2. Related Work
3. CNN-Transformer Based Vehicle Re-Identification Method
3.1. Overall Structure
3.2. Backbone Network
3.3. Two-Branch Feature Representation Network
3.4. Loss Function
4. Vehicle Re-Identification Method Based on Multi-Task Learning Optimization
4.1. Multi-Branch Feature Extraction Network Based on Multi-Task Learning
4.2. Group Convolution Strategy
5. Experimentation and Analysis
5.1. Experimental Setup
5.2. Ablation Experiments
5.3. Experiment Results and Analysis
Type | Method | VeRi-776 mAP | VeRi-776 R1 | VehicleID R1 | VehicleID R5 |
---|---|---|---|---|---|
CNN | PGAN [18] | 79.3 | 96.5 | 77.8 | 92.1 |
CNN | SAN [19] | 72.5 | 93.3 | 79.7 | 94.3 |
CNN | FIDI [20] | 77.6 | 95.7 | 78.5 | 91.9 |
CNN | PVEN [21] | 79.5 | 95.6 | 84.7 | 97.0 |
CNN | SAVER [22] | 79.6 | 96.4 | 79.9 | 95.2 |
CNN | CFVMNet [23] | 77.1 | 95.3 | 81.4 | 94.1 |
CNN | CAL [5] | 74.3 | 95.4 | 82.5 | 94.7 |
CNN | CLIP-ReID [11] | 80.3 | 96.8 | 85.2 | 97.1 |
Transformer | TransReID [10] | 80.6 | 96.9 | 83.6 | 87.1 |
Transformer | DCAL [24] | 80.2 | 96.9 | — | — |
Transformer | CLIP-ReID | 83.3 | 97.4 | 85.3 | 97.6 |
The proposed | IBNT-Net | 83.0 | 97.6 | 86.9 | 97.7 |
The proposed | IBNT-DL4G | 83.2 | 97.4 | — | — |
The proposed | IBNT-DL4B | 84.9 | 97.7 | 87.3 | 97.8 |
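The mAP, R1, and R5 columns above are retrieval metrics. As a minimal sketch of how such scores are commonly computed in re-identification work, the Python below takes, for each query, a ranked list of booleans marking whether each gallery image shares the query's identity. The function names and the toy data are hypothetical, and dataset-specific protocol details (for example, how same-camera matches are filtered) are not stated in this section and are not modelled here.

```python
from typing import Dict, Sequence


def average_precision(matches: Sequence[bool]) -> float:
    """AP for one query: mean of precision values at each correct match."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    # A query with no correct match in the gallery contributes 0 here
    # (an assumption; some protocols skip such queries instead).
    return precision_sum / hits if hits else 0.0


def rank_k_hit(matches: Sequence[bool], k: int) -> bool:
    """True when at least one correct match appears in the top-k results."""
    return any(matches[:k])


def evaluate(all_matches: Sequence[Sequence[bool]], ks=(1, 5)) -> Dict[str, float]:
    """Average AP and Rank-k hit rate over all queries (fractions, not percent)."""
    n = len(all_matches)
    result = {"mAP": sum(average_precision(m) for m in all_matches) / n}
    for k in ks:
        result[f"R{k}"] = sum(rank_k_hit(m, k) for m in all_matches) / n
    return result


# Hypothetical toy rankings for two queries, not data from the paper.
toy = [
    [True, False, True, False, False],
    [False, False, True, False, False],
]
scores = evaluate(toy)
print(scores)
```

Multiplying the returned fractions by 100 gives values on the same percentage scale as the table.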
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Charbonnier, S.; Pitton, A.C.; Vassilev, A. Vehicle re-identification with a single magnetic sensor. In Proceedings of the 2012 IEEE International Instrumentation and Measurement Technology Conference, Graz, Austria, 13–16 May 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 380–385.
- Zapletal, D.; Herout, A. Vehicle re-identification for automatic video traffic surveillance. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 25–31.
- Kim, J.H.; Oh, J.H. A land vehicle tracking algorithm using stand-alone GPS. Control. Eng. Pract. 2000, 8, 1189–1196.
- Liu, H.; Tian, Y.; Yang, Y.; Pang, L.; Huang, T. Deep relative distance learning: Tell the difference between similar vehicles. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2167–2175.
- Rao, Y.; Chen, G.; Lu, J.; Zhou, J. Counterfactual attention learning for fine-grained visual categorization and re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 1025–1034.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Ghosh, A.; Shanmugalingam, K.; Lin, W.Y. Relation preserving triplet mining for stabilising the triplet loss in re-identification systems. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 4840–4849.
- Gu, J.; Wang, K.; Luo, H.; Chen, C.; Jiang, W.; Fang, Y.; Zhang, S.; You, Y.; Zhao, J. MSINet: Twins contrastive search of multi-scale interaction for object ReID. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 19243–19253.
- Li, H.; Chen, J.; Zheng, A.; Wu, Y.; Luo, Y. Day-Night Cross-domain Vehicle Re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 12626–12635.
- He, S.; Luo, H.; Wang, P.; Wang, F.; Li, H.; Jiang, W. TransReID: Transformer-based object re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 15013–15022.
- Li, S.; Sun, L.; Li, Q. CLIP-ReID: Exploiting vision-language model for image re-identification without concrete text labels. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 1405–1413.
- Srinivas, A.; Lin, T.Y.; Parmar, N.; Shlens, J.; Abbeel, P.; Vaswani, A. Bottleneck transformers for visual recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 16519–16529.
- Pan, X.; Luo, P.; Shi, J.; Tang, X. Two at once: Enhancing learning and generalization capacities via IBN-Net. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 464–479.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2818–2826.
- Hermans, A.; Beyer, L.; Leibe, B. In defense of the triplet loss for person re-identification. arXiv 2017, arXiv:1703.07737.
- Liu, X.; Liu, W.; Mei, T.; Ma, H. PROVID: Progressive and multimodal vehicle re-identification for large-scale urban surveillance. IEEE Trans. Multimed. 2017, 20, 645–658.
- Kingma, D.P.; Ba, J.L. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), Conference Track Proceedings, San Diego, CA, USA, 7–9 May 2015; p. 1.
- Zhang, X.; Zhang, R.; Cao, J.; Gong, D.; You, M.; Shen, C. Part-guided attention learning for vehicle instance retrieval. IEEE Trans. Intell. Transp. Syst. 2020, 23, 3048–3060.
- Qian, J.; Jiang, W.; Luo, H.; Yu, H. Stripe-based and attribute-aware network: A two-branch deep model for vehicle re-identification. Meas. Sci. Technol. 2020, 31, 095401.
- Yan, C.; Pang, G.; Bai, X.; Liu, C.; Ning, X.; Gu, L.; Zhou, J. Beyond triplet loss: Person re-identification with fine-grained difference-aware pairwise loss. IEEE Trans. Multimed. 2021, 24, 1665–1677.
- Meng, D.; Li, L.; Liu, X.; Li, Y.; Yang, S.; Zha, Z.J.; Gao, X.; Wang, S.; Huang, Q. Parsing-based view-aware embedding network for vehicle re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 7103–7112.
- Khorramshahi, P.; Peri, N.; Chen, J.C.; Chellappa, R. The devil is in the details: Self-supervised attention for vehicle re-identification. In Computer Vision–ECCV 2020, Proceedings of the 16th European Conference, Glasgow, UK, 23–28 August 2020, Part XIV; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 369–386.
- Sun, Z.; Nie, X.; Xi, X.; Yin, Y. CFVMNet: A multi-branch network for vehicle re-identification based on common field of view. In Proceedings of the 28th ACM International Conference on Multimedia, Virtual, 12–16 October 2020; pp. 3523–3531.
- Zhu, H.; Ke, W.; Li, D.; Liu, J.; Tian, L.; Shan, Y. Dual cross-attention learning for fine-grained visual categorization and object re-identification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 4692–4702.
- Huang, X.; Xu, Z.; Wu, H.; Wang, J.; Xia, Q.; Xia, Y.; Li, J.; Gao, K.; Wen, C.; Wang, C. L4DR: LiDAR-4DRadar fusion for weather-robust 3D object detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025; Volume 39, pp. 3806–3814.
- Zhang, X.; Wang, L.; Chen, J.; Fang, C.; Yang, G.; Wang, Y.; Yang, L.; Song, Z.; Liu, L.; Zhang, X.; et al. Dual Radar: A multi-modal dataset with dual 4D radar for autonomous driving. Sci. Data 2025, 12, 439.
Method | mAP [%] | Rank-1 [%] |
---|---|---|
ResNet50 | 79.66 | 96.25 |
ResNet50-IBN | 79.81 | 96.78 |
BoT50-IBN | 80.10 | 96.72 |
ResNet50-IBN + BoT (IBNT-Net) | 82.46 | 97.38 |
Method | mAP [%] | Rank-1 [%] | Params [M] | MACs [G] |
---|---|---|---|---|
MHSA | 82.46 | 97.38 | 36.12 | 8.03 |
MHSA + reduction | 82.53 | 97.38 | — | — |
MHSA + (FC + residual) | 82.70 | 97.44 | — | — |
MHSA + reduction + (FC + residual) | 83.03 | 97.56 | 35.44 | 7.85 |
Reduction Factor | mAP [%] | Rank-1 [%] |
---|---|---|
8 | 82.44 | 97.62 |
16 | 83.03 | 97.56 |
32 | 82.70 | 97.56 |
Method | MTL | mAP [%] | Rank-1 [%] |
---|---|---|---|
ResNet50-IBN | | 79.81 | 96.78 |
ResNet50-IBN | √ | 82.85 | 97.32 |
BoT50-IBN | | 80.10 | 96.72 |
BoT50-IBN | √ | 82.61 | 97.08 |
IBNT-Net | | 83.03 | 97.56 |
IBNT-Net | √ | 84.85 | 97.68 |
Method | mAP [%] | Rank-1 [%] | Params [M] | MACs [G] |
---|---|---|---|---|
IBNT-Net | 83.03 | 97.56 | 35.44 | 7.85 |
IBNT-DL4B | 84.85 | 97.68 | 55.97 | 11.37 |
IBNT-DL2G | 84.26 | 97.14 | 22.01 | 7.50 |
IBNT-DL4G | 83.18 | 97.44 | 12.89 | 5.31 |
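The accuracy and cost trade-off in the table above can be made concrete with a little arithmetic. The sketch below copies the four rows from the table and reports, for a chosen variant, the mAP difference in percentage points and the relative change in parameters and MACs against a baseline row. The dictionary layout and the helper names `relative_change` and `compare` are illustrative assumptions, not part of the authors' code.

```python
from typing import Dict, Tuple

# (mAP [%], Rank-1 [%], Params [M], MACs [G]) copied from the table above.
ROWS: Dict[str, Tuple[float, float, float, float]] = {
    "IBNT-Net": (83.03, 97.56, 35.44, 7.85),
    "IBNT-DL4B": (84.85, 97.68, 55.97, 11.37),
    "IBNT-DL2G": (84.26, 97.14, 22.01, 7.50),
    "IBNT-DL4G": (83.18, 97.44, 12.89, 5.31),
}


def relative_change(value: float, baseline: float) -> float:
    """Signed change of value relative to baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def compare(name: str, baseline: str = "IBNT-DL4B") -> Dict[str, float]:
    """Compare one variant against a baseline row of the table."""
    m_ap, _, params, macs = ROWS[name]
    base_map, _, base_params, base_macs = ROWS[baseline]
    return {
        "mAP_delta_points": round(m_ap - base_map, 2),
        "params_change_pct": round(relative_change(params, base_params), 1),
        "macs_change_pct": round(relative_change(macs, base_macs), 1),
    }


summary = compare("IBNT-DL4G")
print(summary)
```

On these figures, IBNT-DL4G has about 77% fewer parameters and about 53% fewer MACs than IBNT-DL4B, at 1.67 points lower mAP.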
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, Y.; Li, R.; Shao, Y. Vehicle Re-Identification Method Based on Efficient Self-Attention CNN-Transformer and Multi-Task Learning Optimization. Sensors 2025, 25, 2977. https://doi.org/10.3390/s25102977