Unsupervised Person Re-Identification via Deep Attribute Learning
Abstract
1. Introduction
2. Related Work
2.1. Person Re-Identification
2.2. Attribute-Based Learning
3. Model Description
3.1. Problem Formulation
3.2. Overview
- (1) Pre-training the attribute CNN model on the source dataset: We develop a part-based CNN architecture with two components: one learns image and body attribute features at a global level, and the other learns upper-body and lower-body image and attribute features at a local level. The architecture is trained on the attribute-labeled source dataset in a supervised manner to jointly learn an attribute-semantic and identity-discriminative feature representation.
- (2) Self-training the attribute CNN model on the target dataset: The second stage uses the same CNN architecture as the first and initializes it with the parameters of the pre-trained attribute CNN model. We then extract the global and local image and attribute features from the unlabeled images in the target dataset and employ an unsupervised clustering approach to assign them pseudo labels. The attribute CNN model is fine-tuned on the unlabeled images with their pseudo labels. During self-training, we iteratively extract features with the newly trained CNN model, assign pseudo labels, and fine-tune the CNN model until it becomes stable.
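The self-training stage in (2) can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the text only specifies "an unsupervised clustering approach", so plain k-means with a deterministic farthest-first initialisation is swapped in here, and "stable" is interpreted as the pseudo-label partition no longer changing between rounds. The names `kmeans_pseudo_labels`, `same_partition`, `self_train`, `extract`, and `fine_tune` are hypothetical.

```python
import numpy as np


def kmeans_pseudo_labels(features, k, n_iter=20):
    """Assumed clustering step: plain k-means, one cluster index per sample."""
    features = np.asarray(features, dtype=float)
    # Deterministic farthest-first initialisation: start from sample 0, then
    # repeatedly add the sample farthest from all centres chosen so far.
    centers = [features[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = features[labels == c]
            if len(members) > 0:  # keep the old centre if a cluster is empty
                centers[c] = members.mean(axis=0)
    return labels


def same_partition(a, b):
    """True if two labelings group the samples identically, ignoring cluster numbering."""
    a, b = np.asarray(a), np.asarray(b)
    return np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :])


def self_train(model, images, extract, fine_tune, k, max_rounds=10):
    """Alternate feature extraction, pseudo-labelling, and fine-tuning.

    `extract(model, images)` and `fine_tune(model, images, labels)` are
    hypothetical callables standing in for the attribute CNN.
    """
    prev_labels = None
    for _ in range(max_rounds):
        feats = extract(model, images)
        labels = kmeans_pseudo_labels(feats, k)
        if prev_labels is not None and same_partition(prev_labels, labels):
            break  # pseudo labels stopped changing: treat the model as stable
        model = fine_tune(model, images, labels)
        prev_labels = labels
    return model, labels


# Toy usage: the "model" is a projection matrix and fine-tuning is a no-op.
toy_images = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                       [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
toy_model, toy_labels = self_train(
    model=np.eye(2),
    images=toy_images,
    extract=lambda m, x: x @ m,
    fine_tune=lambda m, x, y: m,
    k=2,
)
```

In a real pipeline, `extract` would run the CNN forward pass and `fine_tune` would run a few training epochs on the pseudo-labelled images; comparing partitions rather than raw label values avoids spurious changes when cluster indices are merely renumbered.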
3.3. Fully Supervised Attribute CNN Pre-Training
3.4. Unsupervised Adaptive Attribute Learning
3.5. Loss Function
3.5.1. Supervised Pre-Training
3.5.2. Unsupervised Adaptation
4. Experimental Results
4.1. Implementation Details
4.2. Datasets
4.2.1. Market1501
4.2.2. DukeMTMC-ReID
4.2.3. VIPeR and PRID
4.2.4. Evaluation Metrics
4.3. Comparative Study on Unsupervised Person ReID
4.4. Results on Unsupervised Attribute Recognition
4.5. Ablation Study
- (1) Ours-woID: without the person identification loss;
- (2) Ours-woAttr: without the attribute identification loss;
- (3) Ours-woTrip: without the batch hard triplet loss;
- (4) Ours-woAdapt: without adaptation of the pre-trained model on the target dataset;
- (5) Ours-woAttrTrip: without the attribute triplet loss;
- (6) Ours-woST: without self-training iterations;
- (7) Ours-woGF: without the global branch in the deep CNN architecture;
- (8) Ours-woLF: without the part branch in the deep CNN architecture.
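Several of the ablated components are loss terms. As an illustration of one of them, the batch hard triplet loss removed in Ours-woTrip can be sketched as follows, in the hinge form described by Hermans et al. (listed in the References): for each anchor in a batch, take the farthest sample with the same identity and the nearest sample with a different identity. The function name and the margin value of 0.3 are assumptions made for this example, not values taken from the paper.

```python
import numpy as np


def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Hinge-form batch hard triplet loss over one batch.

    The margin of 0.3 is an assumed default for illustration only.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all embeddings in the batch.
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    same = labels[:, None] == labels[None, :]
    # Hardest positive: the farthest sample sharing the anchor's identity.
    hardest_pos = np.where(same, dist, -np.inf).max(axis=1)
    # Hardest negative: the nearest sample with a different identity.
    hardest_neg = np.where(~same, dist, np.inf).min(axis=1)
    return float(np.maximum(0.0, margin + hardest_pos - hardest_neg).mean())


# Identities that are already well separated incur no loss.
separated = batch_hard_triplet_loss([[0, 0], [0, 1], [5, 0], [5, 1]], [0, 0, 1, 1])
# Interleaved identities are penalised.
interleaved = batch_hard_triplet_loss([[0, 0], [4, 0], [1, 0], [5, 0]], [0, 0, 1, 1])
```

The sketch assumes every batch contains at least two identities; an anchor with no negatives contributes zero because its hardest-negative distance is infinite.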
4.6. Visual Inspection
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- [1] He, Y.; Wei, X.; Hong, X.; Shi, W.; Gong, Y. Multi-target multi-camera tracking by tracklet-to-target assignment. IEEE Trans. Image Process. 2020, 29, 5191–5205.
- [2] Xie, Z.; Ni, Z.; Yang, W.; Zhang, Y.; Chen, Y.; Zhang, Y.; Ma, X. A robust online multi-camera people tracking system with geometric consistency and state-aware re-id correction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–18 June 2024; pp. 7007–7016.
- [3] Amini-Omam, M.; Torkamani-Azar, F.; Ghorashi, S.A. Maximum Likelihood Estimation for Multiple Camera Target Tracking on Grassmann Tangent Subspace. IEEE Trans. Cybern. 2018, 48, 77–89.
- [4] Loy, C.C.; Xiang, T.; Gong, S. Time-delayed correlation analysis for multi-camera activity understanding. Int. J. Comput. Vis. 2010, 90, 106–129.
- [5] Vitello, P.; Capponi, A.; Fiandrino, C.; Giaccone, P.; Kliazovich, D.; Bouvry, P. High-precision design of pedestrian mobility for smart city simulators. In Proceedings of the IEEE International Conference on Communications, Kansas City, MO, USA, 20–24 May 2018; pp. 1–6.
- [6] Zhang, S.; Zhang, Q.; Wei, X.; Zhang, Y.; Xia, Y. Person Re-Identification With Triplet Focal Loss. IEEE Access 2018, 6, 78092–78099.
- [7] Niculescu-Mizil, A.; Patel, D.; Melvin, I. MCTR: Multi Camera Tracking Transformer. In Proceedings of the Winter Conference on Applications of Computer Vision, Tucson, AZ, USA, 28 February–4 March 2025; pp. 874–884.
- [8] Ma, A.J.; Yuen, P.C.; Li, J. Domain transfer support vector ranking for person re-identification without target camera label information. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 3567–3574.
- [9] Dikmen, M.; Akbas, E.; Huang, T.S.; Ahuja, N. Pedestrian recognition with a learned metric. In Proceedings of the Asian Conference on Computer Vision, Queenstown, New Zealand, 8–12 November 2010; pp. 501–512.
- [10] Xiong, F.; Gou, M.; Camps, O.; Sznaier, M. Person re-identification using kernel-based metric learning methods. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 1–16.
- [11] Zheng, W.S.; Gong, S.; Xiang, T. Reidentification by relative distance comparison. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 653–668.
- [12] Zheng, L.; Bie, Z.; Sun, Y.; Wang, J.; Su, C.; Wang, S.; Tian, Q. Mars: A video benchmark for large-scale person re-identification. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 868–884.
- [13] Xiao, T.; Li, H.; Ouyang, W.; Wang, X. Learning deep feature representations with domain guided dropout for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1249–1258.
- [14] Herzog, F.; Chen, J.; Teepe, T.; Gilg, J.; Hörmann, S.; Rigoll, G. Synthehicle: Multi-vehicle multi-camera tracking in virtual cities. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–7 January 2023; pp. 1–11.
- [15] Hermans, A.; Beyer, L.; Leibe, B. In defense of the triplet loss for person re-identification. arXiv 2017, arXiv:1703.07737.
- [16] Shankar, S.; Garg, V.K.; Cipolla, R. Deep-carving: Discovering visual attributes by carving deep neural nets. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3403–3412.
- [17] Chen, Q.; Huang, J.; Feris, R.; Brown, L.M.; Dong, J.; Yan, S. Deep domain adaptation for describing people based on fine-grained clothing attributes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 5315–5324.
- [18] Layne, R.; Hospedales, T.M.; Gong, S. Person re-identification by attributes. In Proceedings of the British Machine Vision Conference, Surrey, UK, 3–7 September 2012; Volume 2, p. 8.
- [19] Lin, Y.; Zheng, L.; Zheng, Z.; Wu, Y.; Hu, Z.; Yan, C.; Yang, Y. Improving person re-identification by attribute and identity learning. Pattern Recognit. 2019, 95, 151–161.
- [20] Matsukawa, T.; Suzuki, E. Person re-identification using cnn features learned from combination of attributes. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancún, Mexico, 4–8 December 2016; IEEE: Cancun, Mexico, 2016; pp. 2428–2433.
- [21] Zhang, S.; He, Y.; Wei, J.; Mei, S.; Wan, S.; Chen, K. Person re-identification with joint verification and identification of identity-attribute labels. IEEE Access 2019, 7, 126116–126126.
- [22] Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105.
- [23] He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- [24] Sun, Y.; Chen, Y.; Wang, X.; Tang, X. Deep learning face representation by joint identification-verification. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; pp. 1988–1996.
- [25] Sun, Y.; Wang, X.; Tang, X. Deeply learned face representations are sparse, selective, and robust. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 2892–2900.
- [26] Sun, Y.; Liang, D.; Wang, X.; Tang, X. Deepid3: Face recognition with very deep neural networks. arXiv 2015, arXiv:1502.00873.
- [27] Ma, B.; Su, Y.; Jurie, F. Bicov: A novel image representation for person re-identification and face verification. In Proceedings of the British Machine Vision Conference, Surrey, UK, 3–7 September 2012; pp. 1–11.
- [28] Liu, C.; Gong, S.; Loy, C.C.; Lin, X. Person re-identification: What features are important? In Proceedings of the European Conference on Computer Vision, Florence, Italy, 7–13 October 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 391–401.
- [29] Klare, B.F.; Jain, A.K. Heterogeneous face recognition using kernel prototype similarities. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1410–1422.
- [30] An, L.; Kafai, M.; Yang, S.; Bhanu, B. Reference-based person re-identification. In Proceedings of the International Conference on Advanced Video and Signal Based Surveillance, Krakow, Poland, 27–30 August 2013; pp. 244–249.
- [31] Cheng, D.; Gong, Y.; Zhou, S.; Wang, J.; Zheng, N. Person re-identification by multi-channel parts-based cnn with improved triplet loss function. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1335–1344.
- [32] Li, W.; Zhu, X.; Gong, S. Person re-identification by deep joint learning of multi-loss classification. arXiv 2017, arXiv:1705.04724.
- [33] Li, D.; Chen, X.; Zhang, Z.; Huang, K. Learning deep context-aware features over body and latent parts for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 384–393.
- [34] Li, W.; Zhu, X.; Gong, S. Harmonious attention network for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 2285–2294.
- [35] Zhao, H.; Tian, M.; Sun, S.; Shao, J.; Yan, J.; Yi, S.; Wang, X.; Tang, X. Spindle net: Person re-identification with human body region guided feature decomposition and fusion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1077–1085.
- [36] Deng, W.; Zheng, L.; Ye, Q.; Kang, G.; Yang, Y.; Jiao, J. Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 994–1003.
- [37] Fan, H.; Zheng, L.; Yan, C.; Yang, Y. Unsupervised person re-identification: Clustering and fine-tuning. ACM Trans. Multimed. Comput. Commun. Appl. 2018, 14, 83.
- [38] Liu, Z.; Wang, D.; Lu, H. Stepwise metric promotion for unsupervised video person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2429–2438.
- [39] Song, L.; Wang, C.; Zhang, L.; Du, B.; Zhang, Q.; Huang, C.; Wang, X. Unsupervised domain adaptive re-identification: Theory and practice. Pattern Recognit. 2020, 102, 107173.
- [40] Shi, Z.; Hospedales, T.M.; Xiang, T. Transferring a semantic representation for person re-identification and search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 4184–4193.
- [41] Layne, R.; Hospedales, T.M.; Gong, S. Towards person identification and re-identification with attributes. In Proceedings of the European Conference on Computer Vision, Florence, Italy, 7–13 October 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 402–412.
- [42] Layne, R.; Hospedales, T.M.; Gong, S. Attributes-based re-identification. In Person Re-Identification; Springer: Berlin/Heidelberg, Germany, 2014; pp. 93–117.
- [43] Su, C.; Yang, F.; Zhang, S.; Tian, Q.; Davis, L.S.; Gao, W. Multi-task learning with low rank attribute embedding for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3739–3747.
- [44] Su, C.; Yang, F.; Zhang, S.; Tian, Q.; Davis, L.S.; Gao, W. Multi-task learning with low rank attribute embedding for multi-camera person re-identification. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 1167–1181.
- [45] Schumann, A.; Stiefelhagen, R. Person re-identification by deep learning attribute-complementary information. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 20–28.
- [46] Wang, J.; Zhu, X.; Gong, S.; Li, W. Transferable joint attribute-identity deep learning for unsupervised person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 2275–2284.
- [47] Wang, G.; Yuan, Y.; Chen, X.; Li, J.; Zhou, X. Learning discriminative features with multiple granularities for person re-identification. In Proceedings of the ACM International Conference on Multimedia, Seoul, Republic of Korea, 22–26 October 2018; pp. 274–282.
- [48] Fu, Y.; Wei, Y.; Wang, G.; Zhou, Y.; Shi, H.; Huang, T.S. Self-similarity grouping: A simple unsupervised cross domain adaptation approach for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6112–6121.
- [49] Zheng, L.; Shen, L.; Tian, L.; Wang, S.; Wang, J.; Tian, Q. Scalable person re-identification: A benchmark. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1116–1124.
- [50] Ristani, E.; Solera, F.; Zou, R.; Cucchiara, R.; Tomasi, C. Performance measures and a data set for multi-target, multi-camera tracking. In Computer Vision – ECCV 2016 Workshops Amsterdam, The Netherlands, October 8–10 and 15–16, 2016, Proceedings, Part II; Springer: Berlin/Heidelberg, Germany, 2016; pp. 17–35.
- [51] Gray, D.; Brennan, S.; Tao, H. Evaluating appearance models for recognition, reacquisition, and tracking. In Proceedings of the IEEE International Workshop on Performance Evaluation for Tracking and Surveillance (PETS), Rio de Janeiro, Brazil, 14–18 December 2007; Volume 3, pp. 1–7.
- [52] Hirzer, M.; Beleznai, C.; Roth, P.M.; Bischof, H. Person re-identification by descriptive and discriminative classification. In Image Analysis 17th Scandinavian Conference, SCIA 2011, Ystad, Sweden, May 2011. Proceedings; Springer: Berlin/Heidelberg, Germany, 2011; pp. 91–102.
- [53] PyTorch. Available online: https://pytorch.org/ (accessed on 5 August 2025).
- [54] Deng, Y.; Luo, P.; Loy, C.C.; Tang, X. Pedestrian attribute recognition at far distance. In Proceedings of the ACM International Conference on Multimedia, Orlando, FL, USA, 3–7 November 2014; pp. 789–792.
- [55] Peng, P.; Tian, Y.; Xiang, T.; Wang, Y.; Pontil, M.; Huang, T. Joint semantic and latent attribute modelling for cross-class transfer learning. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 1625–1638.
- [56] Zheng, L.; Yang, Y.; Hauptmann, A.G. Person re-identification: Past, present and future. arXiv 2016, arXiv:1610.02984.
- [57] Zheng, Z.; Zheng, L.; Yang, Y. Unlabeled samples generated by gan improve the person re-identification baseline in vitro. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 3754–3762.
- [58] Liao, S.; Hu, Y.; Zhu, X.; Li, S.Z. Person re-identification by local maximal occurrence representation and metric learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 2197–2206.
- [59] Peng, P.; Xiang, T.; Wang, Y.; Pontil, M.; Gong, S.; Huang, T.; Tian, Y. Unsupervised cross-dataset transfer learning for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1306–1315.
- [60] Yu, H.X.; Zheng, W.S.; Wu, A.; Guo, X.; Gong, S.; Lai, J.H. Unsupervised Person Re-identification by Soft Multilabel Learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 2148–2157.
- [61] Zhong, Z.; Zheng, L.; Luo, Z.; Li, S.; Yang, Y. Invariance matters: Exemplar memory for domain adaptive person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 598–607.
- [62] Su, C.; Zhang, S.; Xing, J.; Gao, W.; Tian, Q. Deep attributes driven multi-camera person re-identification. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 475–491.
- [63] Zhu, X.; Morerio, P.; Murino, V. Unsupervised Domain-Adaptive Person Re-Identification Based on Attributes. In Proceedings of the IEEE International Conference on Image Processing, Taipei, Taiwan, 22–25 September 2019; pp. 4110–4114.
- [64] Lin, S.; Li, H.; Li, C.T.; Kot, A.C. Multi-task mid-level feature alignment network for unsupervised cross-dataset person re-identification. In Proceedings of the British Machine Vision Conference, Newcastle, UK, 3–6 September 2018.
- [65] Zhong, Z.; Zheng, L.; Li, S.; Yang, Y. Generalizing a person retrieval model hetero-and homogeneously. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 172–188.
- [66] Qi, L.; Wang, L.; Huo, J.; Zhou, L.; Shi, Y.; Gao, Y. A novel unsupervised camera-aware domain adaptation framework for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8080–8089.
- [67] Li, Y.J.; Yang, F.E.; Liu, Y.C.; Yeh, Y.Y.; Du, X.; Frank Wang, Y.C. Adaptation and re-identification network: An unsupervised deep transfer learning approach to person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 172–178.
- [68] Yu, H.X.; Wu, A.; Zheng, W.S. Cross-view asymmetric metric learning for unsupervised person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 994–1002.
- [69] Farenzena, M.; Bazzani, L.; Perina, A.; Murino, V.; Cristani, M. Person re-identification by symmetry-driven accumulation of local features. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; IEEE: San Francisco, CA, USA, 2010; pp. 2360–2367.
- [70] Zhao, R.; Ouyang, W.; Wang, X. Unsupervised salience learning for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 3586–3593.
- [71] Wang, H.; Gong, S.; Xiang, T. Unsupervised learning of generative topic saliency for person re-identification. In Proceedings of the British Machine Vision Conference, Nottingham, UK, 1–5 September 2014.
- [72] Lisanti, G.; Masi, I.; Bagdanov, A.D.; Del Bimbo, A. Person re-identification by iterative re-weighted sparse ranking. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 37, 1629–1642.
- [73] Kodirov, E.; Xiang, T.; Gong, S. Dictionary learning with iterative laplacian regularisation for unsupervised person re-identification. In Proceedings of the 26th British Machine Vision Conference, Swansea, UK, 7–10 September 2015; Volume 3, p. 8.
- [74] Fernando, B.; Habrard, A.; Sebban, M.; Tuytelaars, T. Unsupervised visual domain adaptation using subspace alignment. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 2960–2967.
- [75] Ma, A.J.; Li, J.; Yuen, P.C.; Li, P. Cross-domain person reidentification using domain adaptation ranking svms. IEEE Trans. Image Process. 2015, 24, 1599–1613.
- [76] Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; Marchand, M.; Lempitsky, V. Domain-adversarial training of neural networks. J. Mach. Learn. Res. 2016, 17, 1–35.
Global Part | |
---|---|
Upper-body Part | Lower-body Part |
colors of upper-body clothing, wearing hat, carrying handbag | length of lower-body, age, gender, hair length, sleeve length, carrying bag, carrying backpack, type of lower-body clothing, colors of lower-body clothing, shoe type, color of shoes |
Infra Details | |
---|---|
GPU | NVIDIA Titan × 2 |
Training Hyperparameters (Supervised pre-training) | |
Input images | 384 × 128 pixels |
Batch size | 32 |
Training epoch | 320 |
Initial learning rate | |
Iteration | 30 |
Optimizer | SGD |
(for in Supervised Pre-training) | 2 |
(for in Supervised Pre-training) | 0.1 |
Training Hyperparameters (Unsupervised domain adaptation) | |
Input images | 384 × 128 pixels |
Batch size | 32 |
Training epoch | 70 |
Initial Learning rate | from to |
Iteration | 30 |
Optimizer | SGD |
(for in Unsupervised Adaptation) | 0.1 |
(for in Unsupervised Adaptation) | 0.2 |
Model Weights | |
ResNet-50 | pre-trained with ImageNet |
Datasets | VIPeR | PRID | Market-1501 | DukeMTMC-ReID |
---|---|---|---|---|
#identities | 632 | 934 | 1501 | 1812 |
#images | 1264 | 1134 | 32,643 | 36,411 |
#cameras | 2 | 2 | 6 | 8 |
#training IDs | 316 | 100 | 750 | 702 |
#test IDs | 316 | 100 | 751 | 702 |
#probe images | 316 | 100 | 3368 | 2228 |
#gallery images | 316 | 649 | 19,732 | 17,661 |
#attributes | 105 | 105 | 27 | 23 |
Methods (Market-1501→DukeMTMC-ReID) | Source Label | mAP (%) | Top-1 (%) | Top-5 (%) | Top-10 (%) |
---|---|---|---|---|---|
LOMO [58] | ID | 4.8 | 12.3 | 21.3 | 26.6 |
BoW [49] | ID | 8.3 | 17.1 | 28.8 | 34.9 |
UDML [59] | ID | 7.3 | 18.5 | 31.4 | 37.6 |
SPGAN [36] | ID | 26.4 | 46.9 | 62.6 | 68.5 |
HHL [65] | ID | 27.2 | 46.9 | 61.0 | 66.7 |
PUL [37] | ID | 16.4 | 30.4 | 44.5 | 50.7 |
UCDA-CCE [66] | ID | 36.7 | 55.4 | - | - |
Theory [39] | ID | 49.0 | 68.4 | 80.1 | 83.5 |
ARN [67] | ID | 33.4 | 60.2 | 73.9 | 79.5 |
MAR [60] | ID | 48.0 | 67.1 | 79.8 | - |
ENC [61] | ID | 40.4 | 63.3 | 75.8 | 80.4 |
TJ-AIDL [46] | ID + Attr | 23.0 | 44.3 | 59.6 | 65.0 |
MMFA [64] | ID + Attr | 24.7 | 45.3 | 59.8 | 66.3 |
Present | ID + Attr | 54.2 | 73.1 | 81.3 | 83.8 |
Methods (DukeMTMC-ReID→Market-1501) | Source Label | mAP (%) | Top-1 (%) | Top-5 (%) | Top-10 (%) |
---|---|---|---|---|---|
LOMO [58] | ID | 8.0 | 27.2 | 41.6 | 49.1 |
BoW [49] | ID | 14.8 | 35.8 | 52.4 | 60.3 |
UDML [59] | ID | 12.4 | 34.5 | 52.6 | 59.6 |
CAMEL [68] | ID | 26.3 | 54.5 | - | - |
SPGAN [36] | ID | 26.9 | 58.1 | 76.0 | 82.7 |
HHL [65] | ID | 31.4 | 62.2 | 78.8 | 84.0 |
PUL [37] | ID | 20.1 | 44.7 | 59.1 | 65.6 |
UCDA-CCE [66] | ID | 34.5 | 64.3 | - | - |
Theory [39] | ID | 53.7 | 75.8 | 89.5 | 93.2 |
ARN [67] | ID | 39.4 | 70.3 | 80.4 | 86.3 |
MAR [60] | ID | 40.0 | 67.7 | 81.9 | - |
ENC [61] | ID | 43.0 | 75.1 | 87.6 | 91.6 |
TJ-AIDL [46] | ID + Attr | 26.5 | 58.2 | 74.8 | 81.1 |
SSDAL [62] | ID + Attr | 19.6 | 39.4 | - | - |
MMFA [64] | ID + Attr | 27.4 | 56.7 | 75.0 | 81.8 |
Present | ID + Attr | 55.8 | 78.2 | 88.2 | 90.8 |
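The mAP and Top-k columns in these tables are ranking-based retrieval metrics. The sketch below shows one common way to compute them from a query-gallery distance matrix; it is an illustration under simplifying assumptions, not the evaluation code used for the reported numbers. In particular, it omits the same-camera filtering that benchmark protocols such as Market-1501's typically apply, and the function name `evaluate_ranking` is hypothetical.

```python
import numpy as np


def evaluate_ranking(dist, query_ids, gallery_ids, topk=(1, 5, 10)):
    """Return (mAP, {k: Top-k accuracy}) for a query-by-gallery distance matrix.

    Simplification: no same-camera or junk-image filtering is applied.
    """
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    hits = np.zeros(len(topk))
    average_precisions = []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])  # gallery indices, nearest first
        matches = gallery_ids[order] == query_ids[q]
        if not matches.any():
            continue  # skip queries with no true match in the gallery
        # Top-k: is the first correct match within the first k results?
        first_hit = int(np.argmax(matches))
        hits += np.array([first_hit < k for k in topk], dtype=float)
        # Average precision: mean of the precision at each correct match.
        hit_ranks = np.flatnonzero(matches) + 1
        precision_at_hits = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        average_precisions.append(precision_at_hits.mean())
    if not average_precisions:
        raise ValueError("no query has a matching gallery identity")
    n_valid = len(average_precisions)
    cmc = {k: float(h / n_valid) for k, h in zip(topk, hits)}
    return float(np.mean(average_precisions)), cmc


# Two queries of identity 1 against a gallery with identities [1, 2, 1].
toy_map, toy_cmc = evaluate_ranking(
    dist=[[0.1, 0.9, 0.5], [0.8, 0.2, 0.3]],
    query_ids=[1, 1],
    gallery_ids=[1, 2, 1],
    topk=(1, 2),
)
```

Top-1 only rewards the first correct match, whereas mAP accounts for the rank of every correct match, which is why the two columns can order methods differently.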
Methods | Source Label | VIPeR Top-1 (%) | VIPeR Top-5 (%) | VIPeR Top-10 (%) | PRID Top-1 (%) | PRID Top-5 (%) | PRID Top-10 (%) |
---|---|---|---|---|---|---|---|
SDALF [69] | ID | 19.9 | 38.9 | 49.4 | 16.3 | 29.6 | 38.0 |
eSDC [70] | ID | 26.7 | 50.7 | 62.4 | - | - | - |
GTS [71] | ID | 25.1 | 50.0 | 62.5 | - | - | - |
ISR [72] | ID | 27.0 | 49.8 | 61.2 | 17.0 | 34.4 | 42.0 |
DLLR [73] | ID | 29.6 | 54.8 | 64.8 | 21.1 | 43.7 | 55.8 |
kLFDA_N [10] | ID + Attr | 15.9 | 42.4 | 50.0 | 9.1 | 27.3 | 35.0 |
SADA [74] + kLFDA [10] | ID + Attr | 15.2 | 41.4 | 49.8 | 8.7 | 26.4 | 34.8 |
AdaRSVM [75] | ID + Attr | 10.9 | 23.7 | 33.1 | 4.9 | 13.1 | 18.4 |
Adversarial [76] | ID + Attr | 22.8 | 38.6 | 50.3 | - | - | - |
JSLAM [55] | ID + Attr | 34.6 | 60.1 | 69.5 | 25.6 | 47.9 | 58.5 |
SSDAL [62] | ID + Attr | 37.9 | - | - | 20.1 | - | - |
TJ-AIDL [46] | ID + Attr | 38.5 | - | - | 34.8 | - | - |
Present | ID + Attr | 40.2 | 62.2 | 71.3 | 35.5 | 48.1 | 60.6 |
Methods | VIPeR (%) | GRID (%) |
---|---|---|
SSDAL-Stage1 [62] | 57.2 | 60.7 |
SSDAL-Stage1 and 3 [62] | 57.1 | 61.1 |
SSDAL-Stage1 and 2 [62] | 56.9 | 60.6 |
SSDAL [62] | 58.6 | 62.7 |
Present | 63.2 | 65.1 |
Methods | mAP (%) | Top-1 (%) | Top-5 (%) | Top-10 (%) |
---|---|---|---|---|
Supervised Person ReID on Market-1501 | ||||
Ours-woID | 62.8 | 82.3 | 91.7 | 94.7 |
Ours-woAttr | 77.8 | 91.5 | 96.7 | 98.0 |
Ours-woTrip | 71.4 | 88.7 | 95.7 | 97.2 |
Ours-woGF | 75.6 | 85.4 | 93.8 | 96.1 |
Ours-woLF | 74.7 | 85.1 | 92.7 | 95.8 |
Present | 81.5 | 92.7 | 96.8 | 98.1 |
Unsupervised Person ReID on DukeMTMC-ReID | ||||
Ours-woAdapt | 18.4 | 27.9 | 38.0 | 44.0 |
Ours-woST | 28.0 | 42.7 | 55.9 | 61.2 |
Ours-woID | 47.7 | 67.0 | 78.1 | 82.2 |
Ours-woTrip | 35.0 | 51.8 | 65.0 | 69.6 |
Ours-woAttrTrip | 51.9 | 71.4 | 79.6 | 82.8 |
Ours-woGF | 53.2 | 69.5 | 76.6 | 80.4 |
Ours-woLF | 51.7 | 62.2 | 75.1 | 75.8 |
Present | 54.2 | 73.1 | 81.3 | 83.8 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, S.; Xu, Y.; Zhang, X.; Cheng, B.; Wang, K. Unsupervised Person Re-Identification via Deep Attribute Learning. Future Internet 2025, 17, 371. https://doi.org/10.3390/fi17080371