Margin CosReid Network for Pedestrian Re-Identification
Abstract
1. Introduction
- The proposed feature extraction module, GBNeck, is appended to the ResNet50 backbone to mitigate the overfitting and slow convergence of conventional CNNs, yielding a model that generalizes better and learns more discriminative features.
- The proposed margin cosine softmax loss (MCSL) introduces a margin parameter at the decision boundary that simultaneously maximizes inter-class differences and minimizes intra-class differences, making the learned features more robust to external interference.
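The idea behind a margin-based cosine softmax can be illustrated with a small, dependency-free sketch. It assumes a CosFace-style additive margin: the feature and the class weights are L2-normalized, the margin `m` is subtracted from the true-class cosine only, and all logits are multiplied by a scale `s`. The function name, the defaults `s=30.0` and `m=0.35`, and the toy vectors are illustrative assumptions, not the formulation or settings used for MCSL in this paper.

```python
import math


def _unit(v):
    """Scale a vector to unit L2 norm (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def margin_cosine_softmax_loss(feature, class_weights, label, s=30.0, m=0.35):
    """Cross-entropy over scaled cosine logits with a margin on the true class.

    s (scale) and m (margin) are illustrative defaults, not values from the paper.
    """
    f = _unit(feature)
    # Cosine similarity between the feature and every class weight vector.
    cosines = [sum(a * b for a, b in zip(f, _unit(w))) for w in class_weights]
    # Subtract the margin from the true-class cosine only, then scale all logits.
    logits = [s * (c - m) if j == label else s * c for j, c in enumerate(cosines)]
    # Numerically stable log-sum-exp (shift by the largest logit).
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


# Toy example: two classes in 2-D, sample belongs to class 0.
weights = [[1.0, 0.0], [0.0, 1.0]]
feature = [2.0, 1.0]
loss_plain = margin_cosine_softmax_loss(feature, weights, label=0, m=0.0)
loss_margin = margin_cosine_softmax_loss(feature, weights, label=0, m=0.35)
print(loss_margin > loss_plain)
```

With the margin enabled, the same sample incurs a larger loss, so training has to push its cosine to the correct class further above the others. That is the mechanism by which a margin tightens classes and separates them.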
2. Related Work
2.1. Feature-Based Learning Re-Identification Methods
2.2. Metric-Based Learning Re-Identification Methods
3. Margin CosReid Network
3.1. Proposed GBNeck
3.2. Margin Cosine Softmax Loss (MCSL)
3.2.1. Softmax Loss
3.2.2. Cosine Softmax Loss
3.2.3. Proposed Margin Cosine Softmax Loss
3.3. Loss Comparison
4. Experiment
4.1. Datasets
4.2. Experimental Results and Analysis
4.2.1. GBNeck
4.2.2. The Proposed MCSL
4.2.3. Visualization Result
4.3. Comparison with State-of-the-Art Methods
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Zheng, L.; Shen, L.; Tian, L.; Wang, S.; Wang, J.; Tian, Q. Scalable person re-identification: A benchmark. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1116–1124.
2. Jin, X.; Lan, C.; Zeng, W.; Chen, Z.; Zhang, L. Style normalization and restitution for generalizable person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 3140–3149.
3. Song, W.; Zheng, J.; Wu, Y.; Chen, C.; Liu, F. Discriminative feature extraction for video person re-identification via multi-task network. Appl. Intell. 2021, 51, 788–803.
4. Zhao, R.; Ouyang, W.L.; Wang, X.G. Person re-identification by salience matching. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, NSW, Australia, 1–8 December 2013; pp. 2528–2535.
5. Zhao, R.; Ouyang, W.L.; Wang, X.G. Unsupervised salience learning for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 3586–3593.
6. Liao, S.; Hu, Y.; Zhu, X.Y.; Li, S.Z. Person re-identification by Local Maximal Occurrence representation and metric learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 2197–2206.
7. Chen, L.S.; Liu, C.H.; Chiu, H.J. A neural network based approach for sentiment classification in the blogosphere. J. Inf. 2011, 5, 313–322.
8. Kalarani, P.; Brunda, S.S. An efficient approach for ensemble of SVM and ANN for sentiment classification. In Proceedings of the IEEE International Conference on Advances in Computer Applications, Coimbatore, India, 24 October 2016; pp. 99–103.
9. Wang, H.; Sun, S.; Zhou, L.; Guo, L.; Li, C. Local feature-aware siamese matching model for vehicle re-identification. Appl. Sci. 2020, 10, 2474.
10. Fan, X.; Jiang, W.; Luo, H.; Mao, W.; Yu, H. Instance hard triplet loss for in-video person re-identification. Appl. Sci. 2020, 10, 2198.
11. Hermans, A.; Beyer, L.; Leibe, B. In defense of the triplet loss for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1–10.
12. Chen, W.; Chen, X.; Zhang, J.; Huang, K. Beyond triplet loss: A deep quadruplet network for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1320–1329.
13. Hadsell, R.; Chopra, S.; LeCun, Y. Dimensionality reduction by learning an invariant mapping. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New York, NY, USA, 17–22 June 2006; pp. 1735–1742.
14. Jin, P.X.; Yang, B. Active learning with re-sampling for support vector machine in person re-identification. In Proceedings of the International Conference on Machine Learning and Cybernetics, Tianjin, China, 14–17 July 2013; pp. 597–602.
15. Zhang, Y.; Li, B.; Lu, H. Sample-specific SVM learning for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1278–1287.
16. Tang, Y.C. Deep learning using linear support vector machines. arXiv 2013, arXiv:1306.0239.
17. Ahmed, E.; Jones, M.; Marks, T.K. An improved deep learning architecture for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3908–3916.
18. Bishop, C.M. Pattern recognition and machine learning. J. Electron. Imaging 2006, 16, 140–155.
19. Zheng, Z.; Zheng, L.; Yang, Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 3774–3782.
20. Wei, L.; Wei, Z.; Jin, Z.; Yu, Z.; Huang, J.; Cai, D.; He, X.; Hua, X. SIF: Self-inspirited feature learning for person re-identification. IEEE Trans. Image Process. 2020, 29, 4942–4951.
21. Zheng, L.; Zhang, H.; Sun, S.; Chandraker, M.; Yang, Y.; Tian, Q. Person re-identification in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3346–3355.
22. Lin, Y.; Zheng, L.; Zheng, Z.; Wu, Y.U.; Hu, Z.; Yan, C.; Yang, Y. Improving person re-identification by attribute and identity learning. Pattern Recognit. 2019, 95, 3346–3355.
23. Sun, Y.; Xu, Q.; Li, Y.; Zhang, C.; Li, Y.; Wang, S.; Sun, J. Perceive where to focus: Learning visibility-aware part-level features for partial person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 393–402.
24. Yang, W.; Huang, H.; Zhang, Z.; Chen, X.; Huang, K.; Zhang, S. Towards rich feature discovery with class activation maps augmentation for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1389–1398.
25. Chen, T.; Ding, S.; Xie, J.; Yuan, Y.; Chen, W.; Yang, Y.; Ren, Z.; Wang, Z. ABD-net: Attentive but diverse person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 8350–8360.
26. Qian, X.; Fu, Y.; Xiang, T.; Wang, W.; Qiu, J.; Wu, Y.; Jiang, Y.; Xue, X. Pose-normalized image generation for person re-identification. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 650–667.
27. Wei, L.; Zhang, S.; Gao, W.; Tian, Q. Person transfer GAN to bridge domain gap for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 79–88.
28. Sun, X.; Zheng, L. Dissecting person re-identification from the viewpoint of viewpoint. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 608–617.
29. Liu, J.; Sun, C.; Xu, X.; Xu, B.M.; Yu, S.Y. A spatial and temporal features mixture model with body parts for video-based person re-identification. Appl. Intell. 2019, 49, 3436–3446.
30. Kostinger, M.; Hirzer, M.; Wohlhart, P.; Roth, P.M.; Bischof, H. Large scale metric learning from equivalence constraints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 2288–2295.
31. Cheng, D.; Gong, Y.; Zhou, S.; Wang, J.; Zheng, N. Person re-identification by multi-channel parts-based CNN with improved triplet loss function. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1335–1344.
32. Fan, X.; Jiang, W.; Luo, H.; Fei, M. SphereReID: Deep hypersphere manifold embedding for person re-identification. J. Vis. Commun. Image Represent. 2019, 60, 51–58.
33. Deng, J.; Guo, J.; Xue, N.; Zafeiriou, S. ArcFace: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4685–4694.
34. Wang, X.; Shrivastava, A.; Gupta, A. A-fast-RCNN: Hard positive generation via adversary for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3039–3048.
35. Guo, A.J.X.; Zhu, F. Spectral-spatial feature extraction and classification by ANN supervised with center loss in hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2018, 57, 99.
36. Wen, Y.; Zhang, K.; Li, Z.; Qiao, Y. A discriminative feature learning approach for deep face recognition. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 8–16 October 2016; pp. 499–515.
37. Elsayed, G.F.; Krishnan, D.; Mobahi, H. Large margin deep networks for classification. In Proceedings of the International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 2–8 December 2018; pp. 850–860.
38. Zhang, C.; Nguyen, T.; Sah, S.; Ptucha, R.; Loui, A.; Salvaggio, C. Batch-normalized recurrent highway networks. In Proceedings of the IEEE International Conference on Image Processing, Beijing, China, 17–20 September 2017; pp. 640–644.
39. Laurent, C.; Pereyra, G.; Brakel, P.; Zhang, Y.; Bengio, Y. Batch normalized recurrent neural networks. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Shanghai, China, 20–25 March 2016; pp. 2657–2661.
40. Hinton, G.; Srivastava, N.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Improving neural networks by preventing co-adaptation of feature detectors. Comput. Sci. 2012, 3, 212–223.
41. Jie, S.; He, X.; Qing, L.; Yu, Y.; Xu, S.; Peng, Y. A new discriminative feature learning for person re-identification using additive angular margin softmax loss. In Proceedings of the UK/China Emerging Technologies, Glasgow, UK, 21–22 August 2019; pp. 1–4.
42. Liu, W.; Wen, Y.; Yu, Z.; Li, M.; Raj, B.; Song, L. SphereFace: Deep hypersphere embedding for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6738–6746.
43. Liu, W.G.; Wen, Y.; Yu, Z.; Yang, M. Large-margin softmax loss for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; p. 7.
44. Wang, H.; Wang, Y.; Zhou, Z.; Ji, X.; Gong, D.; Zhou, J.; Li, Z.; Liu, W. CosFace: Large margin cosine loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5265–5274.
45. Vieira, T. Exp-Normalize Trick. 2014. Available online: https://timvieira.github.io/blog/post/2014/02/11/exp-normalize-trick/ (accessed on 16 February 2020).
46. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–16 December 2012; pp. 1097–1105.
47. Sun, Y.; Zheng, L.; Yang, Y.; Tian, Q.; Wang, S. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 480–496.
48. Kingma, D.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference for Learning Representations, San Diego, CA, USA, 7–9 May 2015.
49. Luo, H.; Jiang, W.; Gu, Y.; Liu, F.; Liao, X.; Lai, S.; Gu, J. A strong baseline and batch normalization neck for deep person re-identification. IEEE Trans. Multimed. 2020, 22, 2597–2609.
50. Sarfraz, M.S.; Schumann, A.; Eberle, A.; Stiefelhagen, R. A pose-sensitive embedding for person re-identification with expanded cross neighborhood re-ranking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 420–429.
51. Zheng, L.; Huang, Y.; Lu, H.; Yang, Y. Pose-invariant embedding for deep person re-identification. IEEE Trans. Image Process. 2019, 28, 4500–4509.
52. Sun, Y.; Zheng, L.; Deng, W.; Wang, S. SVDNet for pedestrian retrieval. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 3820–3828.
53. Su, C.; Li, J.; Zhang, S.; Xing, J.; Gao, W.; Tian, Q. Pose-driven deep convolutional model for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 3980–3989.
54. Zhang, Y.; Xiang, T.; Hospedales, T.M.; Lu, H.C. Deep mutual learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4320–4328.
55. Zhong, Z.; Zheng, L.; Zheng, Z.; Li, S.; Yang, Y. Camera style adaptation for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5157–5166.
56. Ristani, E.; Tomasi, C. Features for multi-target multi-camera tracking and re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 6036–6046.
57. Shen, Y.; Xiao, T.; Li, H.; Yi, S.; Wang, X. End-to-end deep kronecker-product matching for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 6886–6895.
58. Qi, L.; Huo, J.; Wang, L.; Shi, Y.; Gao, Y. A mask based deep ranking neural network for person retrieval. In Proceedings of the 2019 IEEE International Conference on Multimedia and Expo, Shanghai, China, 8–12 July 2019; pp. 496–501.
59. Zheng, Z.; Zheng, L.; Yang, Y. Pedestrian alignment network for large-scale person re-identification. IEEE Trans. Circuits Syst. Video Technol. 2019, 29, 3037–3045.
60. Li, W.; Zhu, X.; Gong, S. Harmonious attention network for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2285–2294.
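Network | Market-1501 Softmax rank-1 | Market-1501 Softmax mAP | Market-1501 CSL rank-1 | Market-1501 CSL mAP | Market-1501 MCSL rank-1 | Market-1501 MCSL mAP | DukeMTMC-reID Softmax rank-1 | DukeMTMC-reID Softmax mAP | DukeMTMC-reID CSL rank-1 | DukeMTMC-reID CSL mAP | DukeMTMC-reID MCSL rank-1 | DukeMTMC-reID MCSL mAP |
---|---|---|---|---|---|---|---|---|---|---|---|---|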
Net-A | 62.9 | 34.6 | 74.3 | 48.3 | 77.3 | 51.6 | 56.8 | 31.6 | 61.1 | 35.6 | 65.1 | 37.8 |
Net-B | 75.1 | 51.0 | 85.9 | 66.7 | 86.8 | 68.4 | 69.7 | 43.5 | 76.0 | 51.6 | 77.9 | 59.8 |
Net-C | 78.4 | 54.4 | 87.3 | 67.3 | 88.9 | 70.5 | 76.7 | 59.1 | 78.3 | 57.5 | 78.6 | 60.0 |
GBNeck | 81.7 | 57.9 | 87.4 | 69.2 | 91.0 | 75.9 | 77.8 | 59.9 | 79.3 | 61.0 | 81.1 | 64.6 |
Configuration | Market-1501 rank-1 | Market-1501 mAP | DukeMTMC-reID rank-1 | DukeMTMC-reID mAP |
---|---|---|---|---|
w/ downsampling layer | 91.0 | 75.9 | 81.1 | 64.6 |
w/o downsampling layer | 86.3 | 68.3 | 76.4 | 56.7 |
Method (Market-1501) | Rank-1 | Rank-5 | Rank-10 | mAP |
---|---|---|---|---|
PIE [51] | 78.7 | 90.3 | 93.6 | 53.9 |
SVDNet [52] | 82.3 | 92.3 | 95.2 | 62.1 |
LSRO [19] | 83.4 | - | - | 66.1 |
PDC [53] | 84.1 | 92.7 | 94.9 | 63.4 |
APR [22] | 84.3 | 93.2 | 95.2 | 64.7 |
DML [54] | 87.7 | - | - | 68.8 |
PSE [50] | 87.7 | 94.5 | 96.8 | 69.0 |
Camstyle [55] | 88.1 | - | - | 68.7 |
PNGAN [26] | 89.4 | - | - | 72.6 |
AWTL [56] | 89.5 | - | - | 75.7 |
Deep KPM [57] | 90.1 | - | - | 75.3 |
MaskReID [58] | 90.4 | - | - | 75.4 |
MCSL+single-query | 91.0 | 96.6 | 98.0 | 75.9 |
MCSL+RK | 92.6 | 95.8 | 97.1 | 87.9 |
MCSL+multi-query | 93.8 | 97.9 | 98.8 | 83.0 |
MCSL+RK+multi-query | 94.2 | 98.1 | 98.9 | 88.5 |
Method (DukeMTMC-reID) | Rank-1 | Rank-5 | Rank-10 | mAP |
---|---|---|---|---|
LSRO [19] | 67.7 | - | - | 47.1 |
APR [22] | 70.7 | - | - | 51.9 |
PAN [59] | 71.6 | 83.9 | 90.6 | 51.5 |
PNGAN [26] | 73.6 | - | - | 53.2 |
SVDNet [52] | 76.7 | 86.4 | 89.9 | 56.8 |
MaskReID [58] | 78.8 | - | - | 61.9 |
PSE [50] | 79.8 | 89.7 | 92.2 | 62.0 |
AWTL [56] | 79.8 | - | - | 63.4 |
Deep KPM [57] | 80.3 | - | - | 63.2 |
HA-CNN [60] | 80.5 | - | - | 63.8 |
MCSL | 81.1 | 90.6 | 93.3 | 64.6 |
MCSL+RK | 82.9 | 91.5 | 94.2 | 73.4 |
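The rank-k and mAP columns in the tables above are standard retrieval metrics. The following is a minimal sketch of how they can be computed from ranked gallery results in a single-query setting. The function names and toy data are illustrative assumptions, and the sketch omits the usual re-identification protocol step of discarding gallery images taken by the same camera as the query.

```python
def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches[i] is True when the i-th ranked gallery image
    shares the query identity.
    """
    hits = 0
    precision_sum = 0.0
    for position, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            # Precision measured at each position where a true match appears.
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


def rank_k_and_map(all_ranked_matches, k=1):
    """Return (rank-k accuracy, mAP) over a list of per-query match flags."""
    n = len(all_ranked_matches)
    # A query counts as a rank-k hit if any of its top-k results is a match.
    rank_k = sum(any(m[:k]) for m in all_ranked_matches) / n
    mean_ap = sum(average_precision(m) for m in all_ranked_matches) / n
    return rank_k, mean_ap


# Toy data: two queries, three gallery images each, already sorted by similarity.
queries = [
    [True, False, True],
    [False, True, False],
]
rank1, mean_ap = rank_k_and_map(queries, k=1)
print(rank1, round(mean_ap, 4))
```

Rank-1 depends only on whether the top result is correct, while mAP rewards placing every true match early in the ranking. This is why re-ranking (RK) can raise mAP considerably more than rank-1.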
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yun, X.; Ge, M.; Sun, Y.; Dong, K.; Hou, X. Margin CosReid Network for Pedestrian Re-Identification. Appl. Sci. 2021, 11, 1775. https://doi.org/10.3390/app11041775