Search Results (117)

Search Parameters:
Keywords = person re-identification (ReID)

22 pages, 7733 KiB  
Article
Parsing-Guided Differential Enhancement Graph Learning for Visible-Infrared Person Re-Identification
by Xingpeng Li, Huabing Liu, Chen Xue, Nuo Wang and Enwen Hu
Electronics 2025, 14(15), 3118; https://doi.org/10.3390/electronics14153118 - 5 Aug 2025
Abstract
Visible-Infrared Person Re-Identification (VI-ReID) is crucial in applications such as surveillance and security. However, challenges arising from intra-class variations and cross-modal differences are often exacerbated by inaccurate infrared parsing and insufficient structural modeling. To address these issues, we propose Parsing-guided Differential Enhancement Graph Learning (PDEGL), a novel framework that learns discriminative representations through a dual-branch architecture synergizing global feature refinement with part-based structural graph analysis. In particular, we introduce a Differential Infrared Part Enhancement (DIPE) module to correct infrared parsing errors and a Parsing Structural Graph (PSG) module to model high-order topological relationships between body parts for structural consistency matching. Furthermore, we design a Position-sensitive Spatial-Channel Attention (PSCA) module to enhance global feature discriminability. Extensive evaluations on the SYSU-MM01, RegDB, and LLCM datasets demonstrate that PDEGL achieves competitive performance.
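
The abstract does not detail the PSCA module; as a rough, hypothetical sketch of what a combined spatial-channel attention block could look like in PyTorch (the class name, layer choices, and hyperparameters below are assumptions, not the authors' design):

```python
# Hypothetical sketch of a combined spatial-channel attention block, in the
# spirit of PSCA as described in the abstract. Not the authors' code.
import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite channels (SE-style).
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: 7x7 conv over channel-pooled maps (CBAM-style).
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_mlp(x)                # reweight channels
        avg_map = x.mean(dim=1, keepdim=True)      # (B, 1, H, W)
        max_map = x.amax(dim=1, keepdim=True)      # (B, 1, H, W)
        attn = self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
        return x * attn                            # reweight positions

feats = torch.randn(4, 256, 24, 12)   # a typical ReID feature map shape
print(SpatialChannelAttention(256)(feats).shape)  # torch.Size([4, 256, 24, 12])
```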

20 pages, 2285 KiB  
Article
WormNet: A Multi-View Network for Silkworm Re-Identification
by Hongkang Shi, Minghui Zhu, Linbo Li, Yong Ma, Jianmei Wu, Jianfei Zhang and Junfeng Gao
Animals 2025, 15(14), 2011; https://doi.org/10.3390/ani15142011 - 8 Jul 2025
Viewed by 227
Abstract
Re-identification (ReID) has been widely applied in person and vehicle recognition tasks. This study extends its application to a novel domain: insect (silkworm) recognition. However, unlike person or vehicle ReID, silkworm ReID presents unique challenges, such as high similarity between individuals, arbitrary poses, and significant background noise. To address these challenges, we propose a multi-view network for silkworm ReID, called WormNet, which is built upon an innovative extraction-purification-extraction-interaction strategy. Specifically, we introduce a multi-order feature extraction module that captures a wide range of fine-grained features by utilizing convolutional kernels of varying sizes and parallel cardinality, effectively mitigating the issues of high individual similarity and diverse poses. Next, a feature mask module (FMM) is employed to purify the features in the spatial domain, thereby reducing the impact of background interference. To further enhance the representation capabilities of the network, we propose a channel interaction module (CIM), which combines an efficient channel attention network with global response normalization (GRN) in parallel to recalibrate features, enabling the network to learn crucial information at both local and global scales. Additionally, we introduce a new silkworm ReID dataset for network training and evaluation. The experimental results demonstrate that WormNet achieves an mAP of 54.8% and a rank-1 accuracy of 91.4% on the dataset, surpassing both state-of-the-art and related networks. This study offers a valuable reference for ReID in insects and other organisms.
(This article belongs to the Section Animal System and Management)
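
For intuition only: a minimal sketch of a "multi-order" extraction module built from parallel convolutions with varying kernel sizes and cardinality (grouped convolutions), which is one plausible reading of the abstract. All names and sizes are assumptions, not the paper's architecture:

```python
# Hypothetical sketch: parallel branches with different receptive fields and
# grouped convolutions (cardinality), concatenated along channels.
import torch
import torch.nn as nn

class MultiOrderExtractor(nn.Module):
    def __init__(self, in_ch: int, branch_ch: int = 64, groups: int = 8):
        super().__init__()
        # One branch per kernel size; groups gives parallel cardinality.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, branch_ch, kernel_size=k, padding=k // 2,
                      groups=groups, bias=False)
            for k in (1, 3, 5, 7)
        ])
        self.bn = nn.BatchNorm2d(branch_ch * 4)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Concatenate fine-to-coarse responses along the channel axis.
        return self.act(self.bn(torch.cat([b(x) for b in self.branches], dim=1)))

x = torch.randn(2, 64, 56, 56)
print(MultiOrderExtractor(64)(x).shape)  # torch.Size([2, 256, 56, 56])
```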

20 pages, 2423 KiB  
Article
Symmetry-Guided Prototype Alignment and Entropy Consistency for Multi-Source Pedestrian ReID in Power Grids: A Domain Adaptation Framework
by Jia He, Lei Zhang, Xiaofeng Zhang, Tong Xu, Kejun Wang, Pengsheng Li and Xia Liu
Symmetry 2025, 17(5), 672; https://doi.org/10.3390/sym17050672 - 28 Apr 2025
Viewed by 420
Abstract
This study proposes a multi-source unsupervised domain adaptation framework for person re-identification (ReID), addressing cross-domain feature discrepancies and label scarcity in electric power field operations. Inspired by symmetry principles in feature space optimization, the framework integrates (1) a Reverse Attention-based Feature Fusion (RAFF) module that aligns cross-domain features through symmetry-guided prototype interactions, enforcing bidirectional style-invariant representations, and (2) a Self-Correcting Pseudo-Label Loss (SCPL) that dynamically adjusts confidence thresholds using entropy symmetry constraints to balance source-target knowledge transfer. Experiments demonstrate 92.1% rank-1 accuracy on power industry benchmarks, outperforming DDAG and MTL by 9.5%, with validation confirming robustness in operational deployments. The symmetric design principles significantly enhance model adaptability to the symmetry breaking caused by heterogeneous power grid environments.
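
The abstract gives no formula for SCPL; the sketch below shows one plausible way an entropy-driven dynamic confidence threshold for pseudo-labels could be wired up. It is a guess at the general mechanism, not the paper's loss:

```python
# Hypothetical sketch of pseudo-labelling with an entropy-driven confidence
# threshold, loosely inspired by the SCPL idea in the abstract.
import torch
import torch.nn.functional as F

def pseudo_label_loss(logits: torch.Tensor, base_tau: float = 0.8):
    """Cross-entropy on confident pseudo-labels; the threshold is relaxed
    when batch prediction entropy is high (early, uncertain training)."""
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)  # per sample
    max_entropy = torch.log(torch.tensor(float(logits.size(1))))
    # Dynamic threshold: higher average entropy -> more permissive tau.
    tau = base_tau * (1.0 - 0.5 * (entropy.mean() / max_entropy))
    conf, pseudo = probs.max(dim=1)
    mask = conf >= tau
    if mask.sum() == 0:
        return logits.new_zeros(())     # nothing confident enough this batch
    return F.cross_entropy(logits[mask], pseudo[mask])

loss = pseudo_label_loss(torch.randn(32, 751))  # 751 = Market-1501 identities
print(float(loss))
```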

21 pages, 2285 KiB  
Article
Unsupervised Aerial-Ground Re-Identification from Pedestrian to Group for UAV-Based Surveillance
by Ling Mei, Yiwei Cheng, Hongxu Chen, Lvxiang Jia and Yaowen Yu
Drones 2025, 9(4), 244; https://doi.org/10.3390/drones9040244 - 25 Mar 2025
Cited by 1 | Viewed by 642
Abstract
Person re-identification (ReID) plays a crucial role in advancing UAV-based surveillance applications, enabling robust tracking and event analysis. However, existing methods in UAV scenarios primarily focus on individual pedestrians, requiring cumbersome annotation efforts and lacking seamless integration with ground-based surveillance systems. These limitations hinder the broader development of UAV-based monitoring. To address these challenges, this paper proposes an Unsupervised Aerial-Ground Re-identification from Pedestrian to Group (UAGRPG) framework. Specifically, we introduce a neighbor-aware collaborative learning (NCL) and gradual graph matching (GGC) strategy to uncover the implicit associations between cross-modality groups in an unsupervised manner. Furthermore, we develop a collaborative cross-modality association learning (CCAL) module to bridge feature disparities and achieve soft alignment across modalities. To quantify the optimal group similarity between aerial and ground domains, we design a minimum pedestrian distance transformation strategy. Additionally, we introduce a new AG-GReID dataset, and extensive experiments demonstrate that our approach achieves state-of-the-art performance on both pedestrian and group re-identification tasks in aerial-ground scenarios, validating its effectiveness in integrating ground and UAV-based surveillance.
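
As a small illustration of computing group similarity from member-level distances (the paper's "minimum pedestrian distance transformation" is not specified; this sketch uses optimal one-to-one matching over cosine distances as a stand-in):

```python
# Hypothetical sketch of a group-to-group distance from member matching.
# Assumption: each group is a set of L2-normalised member feature vectors.
import numpy as np
from scipy.optimize import linear_sum_assignment

def group_distance(ga: np.ndarray, gb: np.ndarray) -> float:
    """ga: (m, d), gb: (n, d). Mean distance of the optimal one-to-one
    member assignment; unmatched members of the larger group are ignored."""
    cost = 1.0 - ga @ gb.T                     # cosine distance matrix (m, n)
    rows, cols = linear_sum_assignment(cost)   # Hungarian matching
    return float(cost[rows, cols].mean())

rng = np.random.default_rng(0)
a = rng.normal(size=(3, 128)); a /= np.linalg.norm(a, axis=1, keepdims=True)
b = rng.normal(size=(5, 128)); b /= np.linalg.norm(b, axis=1, keepdims=True)
print(group_distance(a, b))
```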

26 pages, 8705 KiB  
Article
Person Re-Identification with Attribute-Guided, Robust-to-Low-Resolution Drone Footage Considering Fog/Edge Computing
by Bongjun Kim, Sunkyu Kim, Seokwon Park and Junho Jeong
Sensors 2025, 25(6), 1819; https://doi.org/10.3390/s25061819 - 14 Mar 2025
Viewed by 736
Abstract
In aerial surveillance using drones, person re-identification (ReID) is crucial for public safety. However, low resolution in drone footage often leads to a significant drop in ReID performance. To investigate this issue, rather than relying solely on real-world datasets, we employed a synthetic dataset that systematically captures variations in drone altitude and distance. We also utilized an eXplainable Artificial Intelligence (XAI) framework to analyze how low resolution affects ReID. Based on our findings, we propose a method that improves ReID accuracy by filtering out attributes that are not robust in low-resolution environments and retaining only those features that remain reliable. Experiments on the Market1501 dataset show a 6.59 percentage-point improvement in accuracy at a 16% resolution scale. We further discuss the effectiveness of our approach in drone-based aerial surveillance systems under Fog/Edge Computing paradigms.
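
A minimal sketch of the filtering idea, assuming some multi-label attribute classifier (`attribute_model` below is a hypothetical stand-in): predict attributes at full and degraded resolution and keep only those that stay stable:

```python
# Hypothetical sketch: keep only attributes whose predictions survive
# aggressive downscaling. Not the paper's method; thresholds are made up.
import torch
import torch.nn.functional as F

def robust_attribute_mask(attribute_model, images: torch.Tensor,
                          scale: float = 0.16, tol: float = 0.1) -> torch.Tensor:
    """Boolean mask over attributes whose mean absolute probability shift
    at the given resolution scale stays within `tol`."""
    with torch.no_grad():
        full = torch.sigmoid(attribute_model(images))
        low = F.interpolate(images, scale_factor=scale, mode='bilinear',
                            align_corners=False)
        low = F.interpolate(low, size=images.shape[-2:], mode='bilinear',
                            align_corners=False)   # back to model input size
        degraded = torch.sigmoid(attribute_model(low))
    return (full - degraded).abs().mean(dim=0) <= tol   # per-attribute mask

# Toy stand-in classifier just to show the call pattern.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.LazyLinear(8))
print(robust_attribute_mask(model, torch.randn(4, 3, 64, 32)))
```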

19 pages, 2976 KiB  
Article
BiFFN: Bi-Frequency Guided Feature Fusion Network for Visible–Infrared Person Re-Identification
by Xingyu Cao, Pengxin Ding, Jie Li and Mei Chen
Sensors 2025, 25(5), 1298; https://doi.org/10.3390/s25051298 - 20 Feb 2025
Viewed by 750
Abstract
Visible–infrared person re-identification (VI-ReID) aims to minimize the modality gaps of pedestrian images across different modalities. Existing methods primarily focus on extracting cross-modality features from the spatial domain, which often limits the comprehensive extraction of useful information. To address this limitation, we propose a novel bi-frequency feature fusion network (BiFFN) that effectively extracts and fuses features from the high- and low-frequency domains as well as the spatial domain to reduce modality gaps. Compared with conventional approaches that either focus on single-frequency components or employ simple multi-branch fusion strategies, our method addresses the modality discrepancy through systematic frequency-space co-learning. The network introduces a frequency-spatial enhancement (FSE) module to enhance feature representation across both domains. Additionally, the deep frequency mining (DFM) module optimizes cross-modality information utilization by leveraging the distinct characteristics of high- and low-frequency features. The cross-frequency fusion (CFF) module further aligns low-frequency features and fuses them with high-frequency features to generate middle features that incorporate critical information from each modality. To refine the distribution of identity features in the common space, we develop a unified modality center (UMC) loss, which promotes a more balanced inter-modality distribution while preserving discriminative identity information. Extensive experiments demonstrate that the proposed BiFFN achieves state-of-the-art performance in VI-ReID. Specifically, our method achieves a Rank-1 accuracy of 77.5% and an mAP of 75.9% on the SYSU-MM01 dataset under the all-search mode, and a Rank-1 accuracy of 58.5% and an mAP of 63.7% on the LLCM dataset under the IR-VIS mode. These improvements verify that our model, by integrating feature fusion and incorporating frequency domains, significantly reduces modality gaps and outperforms previous methods.
(This article belongs to the Section Optical Sensors)
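
For readers unfamiliar with frequency-domain feature splitting, a generic low/high-frequency decomposition with a radial Fourier mask might look like the following (a textbook construction, not BiFFN's FSE/DFM/CFF modules):

```python
# Generic sketch: split a feature map (or image) into low- and high-frequency
# components with a centred low-pass disk in the Fourier domain.
import torch

def frequency_split(x: torch.Tensor, cutoff: float = 0.25):
    """x: (B, C, H, W). Returns (low, high) with low + high == x."""
    B, C, H, W = x.shape
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    fy = torch.linspace(-0.5, 0.5, H).view(H, 1)
    fx = torch.linspace(-0.5, 0.5, W).view(1, W)
    radius = (fy ** 2 + fx ** 2).sqrt()
    low_mask = (radius <= cutoff).to(x.dtype)          # low-pass disk
    low = torch.fft.ifft2(
        torch.fft.ifftshift(spec * low_mask, dim=(-2, -1))).real
    return low, x - low                                # high = residual

img = torch.randn(1, 3, 64, 32)
low, high = frequency_split(img)
print(torch.allclose(low + high, img))  # True
```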

22 pages, 5145 KiB  
Article
Identity Hides in Darkness: Learning Feature Discovery Transformer for Nighttime Person Re-Identification
by Xin Yuan, Ying He and Guozhu Hao
Sensors 2025, 25(3), 862; https://doi.org/10.3390/s25030862 - 31 Jan 2025
Viewed by 819
Abstract
Person re-identification (Re-ID) aims to retrieve all images of a specific person captured by non-overlapping cameras and scenarios. Despite the significant success achieved by daytime person Re-ID methods, they perform poorly under low-light conditions due to degraded imaging quality. Some works therefore attempt to synthesize low-light images to explore the challenges of nighttime Re-ID, overlooking the fact that synthetic images may not realistically reflect the challenges of person Re-ID at night. Other works follow an "enhancement-then-match" approach, but it remains hard to capture discriminative identity features because enhancement also enlarges irrelevant noise for identifying pedestrians. To this end, we propose a novel nighttime person Re-ID method, termed Feature Discovery Transformer (FDT), which explicitly captures the pedestrian identity information hidden in darkness at night. More specifically, the proposed FDT model contains two novel modules: the Frequency-wise Reconstruction Module (FRM) and the Attribute Guide Module (AGM). In particular, to reduce noise disturbance and discover pedestrian identity details, the FRM utilizes the Discrete Haar Wavelet Transform to acquire the high- and low-frequency components for learning person features. Furthermore, to avoid high-frequency components being over-smoothed by low-frequency ones, we propose a novel Normalized Contrastive Loss (NCL) to help the model obtain the identity details in high-frequency components for extracting discriminative person features. Then, to further decrease the negative bias caused by appearance-irrelevant features and enhance the pedestrian identity features, the AGM improves the robustness of the learned features by integrating auxiliary information, i.e., camera ID and viewpoint. Extensive experimental results demonstrate that our proposed FDT model achieves state-of-the-art performance on two realistic nighttime person Re-ID benchmarks, i.e., the Night600 and RGBNT201rgb datasets.
(This article belongs to the Section Sensing and Imaging)
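
The Discrete Haar Wavelet Transform the FRM relies on is standard; a single-level 2D decomposition can be written directly (a generic implementation, not the paper's code):

```python
# Single-level 2D Haar wavelet decomposition via strided slicing.
# LL is the low-frequency band; LH/HL/HH carry high-frequency detail.
import torch

def haar_dwt2(x: torch.Tensor):
    """x: (B, C, H, W) with even H and W. Returns (LL, LH, HL, HH)."""
    a = x[..., 0::2, 0::2]; b = x[..., 0::2, 1::2]
    c = x[..., 1::2, 0::2]; d = x[..., 1::2, 1::2]
    ll = (a + b + c + d) / 2      # approximation (low frequency)
    lh = (a - b + c - d) / 2      # horizontal detail
    hl = (a + b - c - d) / 2      # vertical detail
    hh = (a - b - c + d) / 2      # diagonal detail
    return ll, lh, hl, hh

x = torch.randn(2, 3, 256, 128)
print([t.shape for t in haar_dwt2(x)])  # four (2, 3, 128, 64) tensors
```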

15 pages, 775 KiB  
Article
Robust Fine-Grained Learning for Cloth-Changing Person Re-Identification
by Qingze Yin, Guodong Ding, Tongpo Zhang and Yumei Gong
Mathematics 2025, 13(3), 429; https://doi.org/10.3390/math13030429 - 27 Jan 2025
Viewed by 1261
Abstract
Cloth-changing Person Re-Identification (CC-ReID) poses a significant challenge in tracking pedestrians across cameras while accounting for changes in clothing appearance. Despite recent progress in CC-ReID, existing methods predominantly focus on learning the unique biological features of pedestrians, often overlooking constraints that promote the learning of cloth-agnostic features. Addressing this limitation, we propose a Robust Fine-grained Learning Network (RFLNet) to effectively learn robust cloth-agnostic features by leveraging fine-grained semantic constraints. Specifically, we introduce a four-body-part attention module to enhance the learning of detailed pedestrian semantic features. To further strengthen the model's robustness to clothing variations, we employ a random erasing algorithm, encouraging the network to concentrate on cloth-irrelevant attributes. Additionally, we design a fine-grained semantic loss to guide the model in learning identity-related, detailed semantic features, thereby improving its focus on cloth-agnostic regions. Comprehensive experiments on widely used CC-ReID benchmarks demonstrate the effectiveness of RFLNet. Our method achieves state-of-the-art performance, including a 0.7% increase in mAP on PRCC and a 1.6% improvement in rank-1 accuracy on DeepChange.
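
Random erasing itself is a standard augmentation with an off-the-shelf implementation in torchvision; a typical ReID-style pipeline might use it as follows (hyperparameters are common defaults, not necessarily the paper's values):

```python
# Illustrative augmentation pipeline using torchvision's RandomErasing.
from torchvision import transforms

train_tf = transforms.Compose([
    transforms.Resize((256, 128)),            # common ReID input size
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
    # Erase a random rectangle so the model cannot rely on any one
    # clothing region; applied after ToTensor since it expects a tensor.
    transforms.RandomErasing(p=0.5, scale=(0.02, 0.33),
                             ratio=(0.3, 3.3), value=0),
])
```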

15 pages, 2715 KiB  
Article
Cross-Domain Person Re-Identification Based on Multi-Branch Pose-Guided Occlusion Generation
by Pengnan Liu, Yanchen Wang, Yunlong Li, Deqiang Cheng and Feixiang Xu
Sensors 2025, 25(2), 473; https://doi.org/10.3390/s25020473 - 15 Jan 2025
Viewed by 1189
Abstract
To address the feature-matching failures caused by occlusion and the limitations of fixed model parameters in cross-domain person re-identification, a method based on multi-branch pose-guided occlusion generation is proposed. This method effectively improves the accuracy of person matching and enables identity matching even when pedestrian features are misaligned. Firstly, a novel pose-guided occlusion generation module is designed to enhance the model's ability to extract discriminative features from non-occluded areas. Occlusion data are generated to simulate occluded person images, which improves the model's learning ability and addresses the issue of misidentifying occlusion samples. Secondly, a multi-branch feature fusion structure is constructed. By fusing different feature information from the global and occlusion branches, the diversity of features is enriched, which improves the model's generalization. Finally, a dynamic convolution kernel is constructed to calculate the similarity between images, achieving effective point-to-point matching and resolving the problem of fixed model parameters. Experimental results indicate that, compared to current mainstream algorithms, this method shows significant advantages in first-hit rate (Rank-1), mean average precision (mAP), and generalization performance. On MSMT17→DukeMTMC-reID, and on MSMT17→Market1501 after re-ranking (Rerank) and time-lift (TLift), the mAP and Rank-1 reached 80.5% and 84.3%, and 81.9% and 93.1%, respectively. Additionally, the algorithm achieved 51.6% and 41.3% on DukeMTMC-reID→Occluded-Duke, demonstrating good recognition performance on the occlusion dataset.
(This article belongs to the Section Sensing and Imaging)
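
As a toy illustration of occlusion generation (the paper guides placement with pose keypoints; here the occluder location is random for brevity, so this is not the proposed module):

```python
# Hypothetical sketch: paste a noise patch over a horizontal band of the
# person image to simulate an occluder during training.
import torch

def occlude(img: torch.Tensor, frac: float = 0.3) -> torch.Tensor:
    """img: (C, H, W). Covers a random band of ~frac of the height with
    noise; a pose-guided variant would pick the band from keypoints."""
    _, H, W = img.shape
    h = int(H * frac)
    top = int(torch.randint(0, H - h + 1, (1,)))
    out = img.clone()
    out[:, top:top + h, :] = torch.rand(img.shape[0], h, W)
    return out

aug = occlude(torch.rand(3, 256, 128))
print(aug.shape)  # torch.Size([3, 256, 128])
```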

19 pages, 9593 KiB  
Article
An Investigation of the Domain Gap in CLIP-Based Person Re-Identification
by Andrea Asperti, Leonardo Naldi and Salvatore Fiorilla
Sensors 2025, 25(2), 363; https://doi.org/10.3390/s25020363 - 9 Jan 2025
Cited by 1 | Viewed by 1882
Abstract
Person re-identification (re-id) is a critical computer vision task aimed at identifying individuals across multiple non-overlapping cameras, with wide-ranging applications in intelligent surveillance systems. Despite recent advances, the domain gap—performance degradation when models encounter unseen datasets—remains a critical challenge. CLIP-based models, leveraging multimodal pre-training, offer potential for mitigating this issue by aligning visual and textual representations. In this study, we provide a comprehensive quantitative analysis of the domain gap in CLIP-based re-id systems across standard benchmarks, including Market-1501, DukeMTMC-reID, MSMT17, and Airport, simulating real-world deployment conditions. We systematically measure the performance of these models in terms of mean average precision (mAP) and Rank-1 accuracy, offering insights into the challenges faced during dataset transitions. Our analysis highlights the specific advantages introduced by CLIP's visual–textual alignment and evaluates its contribution relative to strong image encoder baselines. Additionally, we evaluate the impact of extending training sets with non-domain-specific data and incorporating random erasing augmentation, achieving an average improvement of +4.3% in mAP and +4.0% in Rank-1 accuracy. Our findings underscore the importance of standardized benchmarks and systematic evaluations for enhancing reproducibility and guiding future research. This work contributes to a deeper understanding of the domain gap in re-id, while highlighting pathways for improving model robustness and generalization in diverse, real-world scenarios.
(This article belongs to the Section Sensing and Imaging)
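
A minimal sketch of the evaluation setting, using OpenAI's `clip` package to encode query and gallery images and rank by cosine similarity; dataset loading and the mAP/Rank-1 computation are elided, and this is not the study's evaluation code:

```python
# Hypothetical sketch of CLIP-based cross-dataset retrieval ranking.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

@torch.no_grad()
def encode(images: torch.Tensor) -> torch.Tensor:
    """images: preprocessed batch (B, 3, 224, 224). L2-normalised features."""
    feats = model.encode_image(images.to(device)).float()
    return feats / feats.norm(dim=1, keepdim=True)

def rank_gallery(query: torch.Tensor, gallery: torch.Tensor) -> torch.Tensor:
    """Gallery indices sorted by descending cosine similarity, one row per
    query; mAP and Rank-1 are then computed from this ordering."""
    return (encode(query) @ encode(gallery).T).argsort(dim=1, descending=True)
```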

19 pages, 8290 KiB  
Article
Multi-Scale Contrastive Learning with Hierarchical Knowledge Synergy for Visible-Infrared Person Re-Identification
by Yongheng Qian and Su-Kit Tang
Sensors 2025, 25(1), 192; https://doi.org/10.3390/s25010192 - 1 Jan 2025
Cited by 1 | Viewed by 1334
Abstract
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality retrieval task that matches a person across different spectral camera views. Most existing works focus on learning shared feature representations from the final embedding space of advanced networks to alleviate modality differences between visible and infrared images. However, exclusively relying on high-level semantic information from the network's final layers can restrict shared feature representations and overlook the benefits of low-level details. Different from these methods, we propose a multi-scale contrastive learning network (MCLNet) with hierarchical knowledge synergy for VI-ReID. MCLNet is a novel two-stream contrastive deep supervision framework designed to supervise low-level details and high-level semantic representations simultaneously. MCLNet utilizes supervised contrastive learning (SCL) at each intermediate layer to strengthen visual representations and enhance cross-modality feature learning. Furthermore, a hierarchical knowledge synergy (HKS) strategy for pairwise knowledge matching promotes explicit information interaction across multi-scale features and improves information consistency. Extensive experiments on three benchmarks demonstrate the effectiveness of MCLNet.
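
Supervised contrastive learning at an intermediate layer typically uses the SupCon loss; a compact reference implementation is sketched below (generic SupCon, not MCLNet's exact formulation):

```python
# Generic supervised contrastive (SupCon) loss over L2-normalised features.
import torch
import torch.nn.functional as F

def supcon_loss(features: torch.Tensor, labels: torch.Tensor,
                temperature: float = 0.1) -> torch.Tensor:
    """features: (B, d) L2-normalised; labels: (B,) identity labels."""
    sim = features @ features.T / temperature
    # Exclude self-similarity on the diagonal.
    logits_mask = ~torch.eye(len(labels), dtype=torch.bool, device=sim.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & logits_mask
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(~logits_mask, float('-inf')), dim=1, keepdim=True)
    # Average log-likelihood over each anchor's positives (if it has any).
    pos_counts = pos_mask.sum(dim=1)
    anchors = pos_counts > 0
    loss = -(log_prob * pos_mask).sum(dim=1)[anchors] / pos_counts[anchors]
    return loss.mean()

f = F.normalize(torch.randn(16, 128), dim=1)
y = torch.randint(0, 4, (16,))
print(float(supcon_loss(f, y)))
```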

14 pages, 2622 KiB  
Article
Cross-View Multi-Scale Re-Identification Network in the Perspective of Ground Rotorcraft Unmanned Aerial Vehicle
by Wenji Yin, Yueping Peng, Hexiang Hao, Baixuan Han, Zecong Ye and Wenchao Liu
Mathematics 2024, 12(23), 3739; https://doi.org/10.3390/math12233739 - 27 Nov 2024
Viewed by 935
Abstract
Traditional re-identification (Re-ID) schemes often rely on multiple cameras from the same perspective to search for targets. However, collaboration between fixed cameras and unmanned aerial vehicles (UAVs) is gradually becoming a new trend in the surveillance field. Given the significant perspective differences between fixed cameras and UAV cameras, the Re-ID task faces unprecedented challenges. In the single-perspective setting, although significant advancements have been made in person Re-ID models, their performance markedly deteriorates when confronted with drastic viewpoint changes, such as transitions from aerial to ground-level perspectives. This degradation is primarily attributed to the stark variations between viewpoints and the significant differences in subject posture and background across perspectives. Existing methods focusing on learning local features have proven suboptimal in cross-perspective Re-ID tasks: the top-down viewpoint of drones introduces perspective distortion, while the ground-level perspective observes richer and more detailed texture information, leading to notable discrepancies in local features. To address this issue, the present study introduces a Multi-scale Across View Model (MAVM) that extracts features at various scales to generate a richer and more robust feature representation. Furthermore, we incorporate a Cross-View Alignment Module (AVAM) that fine-tunes the attention weights, optimizing the model's response to critical areas such as the silhouette, attire textures, and other key features. This enhancement ensures high recognition accuracy even under changes in subject posture and lighting conditions. Extensive experiments conducted on the public AG-ReID dataset demonstrate the superiority of our proposed method, which significantly outperforms existing state-of-the-art techniques.
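
As a simple illustration of multi-scale feature aggregation (one generic reading of "extracts features at various scales"; not the MAVM architecture):

```python
# Generic sketch: pool the same backbone feature map at several grid sizes
# and concatenate into one descriptor.
import torch
import torch.nn.functional as F

def multi_scale_descriptor(feat_map: torch.Tensor,
                           scales=(1, 2, 4)) -> torch.Tensor:
    """feat_map: (B, C, H, W). Returns a (B, C * sum(s*s)) descriptor from
    average pooling at multiple grid resolutions."""
    parts = [F.adaptive_avg_pool2d(feat_map, s).flatten(1) for s in scales]
    return torch.cat(parts, dim=1)

fm = torch.randn(2, 512, 24, 12)
print(multi_scale_descriptor(fm).shape)  # (2, 512 * 21) = torch.Size([2, 10752])
```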

18 pages, 1672 KiB  
Article
Pedestrian Re-Identification Based on Fine-Grained Feature Learning and Fusion
by Anming Chen and Weiqiang Liu
Sensors 2024, 24(23), 7536; https://doi.org/10.3390/s24237536 - 26 Nov 2024
Cited by 1 | Viewed by 1046
Abstract
Video-based pedestrian re-identification (Re-ID) is used to re-identify the same person across different camera views. One of the key problems is learning an effective representation of the pedestrian from video. However, it is difficult to learn an effective representation from a single feature modality due to complicating factors in video such as background, occlusion, and blurred scenes. Therefore, there have been studies on fusing multimodal features for video-based pedestrian Re-ID. However, most of these works fuse features at the global level, which does not effectively reflect fine-grained and complementary information, so the improvement in performance is limited. To obtain a more effective representation, we propose to learn fine-grained features from different modalities of the video and then align and fuse them at the fine-grained level to capture rich semantic information. The resulting multimodal token-learning and alignment model (MTLA) re-identifies pedestrians across camera videos. MTLA consists of three modules: a multimodal feature encoder, token-based cross-modal alignment, and correlation-aware fusion. Firstly, the multimodal feature encoder extracts multimodal features from the visual appearance and gait information views, from which fine-grained tokens are learned and denoised. Then, the token-based cross-modal alignment module aligns the multimodal features at the token level to capture fine-grained semantic information. Finally, the correlation-aware fusion module fuses the multimodal token features by learning the inter- and intra-modal correlation, in which the features refine each other and a unified representation is obtained for pedestrian Re-ID. To evaluate the performance of fine-grained feature alignment and fusion, we conduct extensive experiments on three benchmark datasets. Compared with state-of-the-art approaches, all the evaluation metrics of mAP and Rank-K are improved by more than 0.4 percentage points.
(This article belongs to the Special Issue Sensor-Based Behavioral Biometrics)
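
For reference, the mAP and Rank-K figures quoted throughout these listings follow the standard retrieval definitions:

```latex
% Rank-K: fraction of queries with at least one correct match in the top K.
\mathrm{Rank}\text{-}K = \frac{1}{|Q|} \sum_{q \in Q}
  \mathbb{1}\!\left[\,\exists\, k \le K : g^{q}_{(k)} \in G^{+}_{q}\,\right]
% mAP: mean over queries of average precision along the ranked gallery list.
\mathrm{mAP} = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{|G^{+}_{q}|}
  \sum_{k=1}^{|G|} P_q(k)\, \mathbb{1}\!\left[\, g^{q}_{(k)} \in G^{+}_{q} \,\right]
```

Here Q is the query set, G the gallery, G+_q the gallery items sharing query q's identity, g^q_(k) the k-th ranked gallery item for q, and P_q(k) the precision among the top k results.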

23 pages, 4473 KiB  
Article
A Study of Occluded Person Re-Identification for Shared Feature Fusion with Pose-Guided and Unsupervised Semantic Segmentation
by Junsuo Qu, Zhenguo Zhang, Yanghai Zhang and Chensong He
Electronics 2024, 13(22), 4523; https://doi.org/10.3390/electronics13224523 - 18 Nov 2024
Viewed by 1470
Abstract
The human body is often occluded by a variety of obstacles in monitoring systems, so occluded person re-identification remains a long-standing challenge. Recent methods based on pose guidance or external semantic clues have improved feature representations and related performance; however, problems remain, such as weak model representation and unreliable semantic clues. To solve these problems, we propose a feature extraction network, named shared feature fusion with pose-guided and unsupervised semantic segmentation (SFPUS), which extracts more discriminative features and reduces occlusion noise in pedestrian matching. Firstly, the multibranch joint feature extraction module (MFE) is used to extract feature sets containing pose information and high-order semantic information. This module not only provides robust extraction capabilities but can also precisely segment occlusions from the body. Secondly, in order to obtain multiscale discriminant features, the multiscale correlation feature matching fusion module (MCF) matches the two feature sets, and the Pose–Semantic Fusion Loss is designed to calculate the similarity of the feature sets between different modes and fuse them into a single feature set. Thirdly, to handle image occlusion, we use unsupervised cascade clustering to better prevent occlusion interference. Finally, the proposed method is compared with various existing methods on the Occluded-Duke, Occluded-ReID, Market-1501, and DukeMTMC-reID datasets. Rank-1 accuracy reached 65.7%, 80.8%, 94.8%, and 89.6%, respectively, and mAP reached 58.8%, 72.5%, 91.8%, and 80.1%. The experimental results demonstrate that our proposed SFPUS performs favorably compared with state-of-the-art methods.
(This article belongs to the Special Issue Advances in Computer Vision and Deep Learning and Its Applications)
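
Clustering-based pseudo-labelling is the usual mechanism behind unsupervised steps like the cascade clustering mentioned above; a minimal stand-in using DBSCAN (not the paper's cascade variant) is sketched here:

```python
# Hypothetical sketch: assign pseudo-identities by density clustering;
# outliers (label -1) can be excluded from training as unreliable samples.
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_pseudo_labels(features: np.ndarray, eps: float = 0.5,
                          min_samples: int = 4) -> np.ndarray:
    """features: (N, d) L2-normalised. Returns cluster labels; -1 marks
    outliers, e.g. heavily occluded samples."""
    return DBSCAN(eps=eps, min_samples=min_samples,
                  metric='cosine').fit_predict(features)

rng = np.random.default_rng(0)
f = rng.normal(size=(100, 64)).astype(np.float32)
f /= np.linalg.norm(f, axis=1, keepdims=True)
print(np.unique(cluster_pseudo_labels(f)))
```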

22 pages, 7282 KiB  
Article
QuEst: Adversarial Attack Intensity Estimation via Query Response Analysis
by Eun Gi Lee, Chi Hyeok Min and Seok Bong Yoo
Mathematics 2024, 12(22), 3508; https://doi.org/10.3390/math12223508 - 9 Nov 2024
Viewed by 875
Abstract
Deep learning has dramatically advanced computer vision tasks, including person re-identification (re-ID), substantially improving the matching of individuals across diverse camera views. However, person re-ID systems remain vulnerable to adversarial attacks that introduce imperceptible perturbations, leading to misidentification and undermining system reliability. This paper addresses the challenge of robust person re-ID in the presence of adversarial examples by estimating attack intensity to enable effective detection and adaptive purification. The proposed approach leverages the observation that adversarial examples in retrieval tasks disrupt the relevance and internal consistency of retrieval results, degrading re-ID accuracy. By analyzing the query response data, it estimates the attack intensity and dynamically adjusts the purification strength, addressing the limitations of fixed purification methods. It also preserves the model's performance on clean data by avoiding unnecessary manipulation, improving the robustness and reliability of the system under adversarial examples. The experimental results demonstrate that the proposed method effectively detects adversarial examples and estimates attack intensity through query response analysis, and it enhances purification performance when integrated with adversarial purification techniques in person re-ID systems.
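
One concrete way to measure the "internal consistency of retrieval results" the abstract mentions is the mean pairwise similarity of the top-k neighbourhood; the sketch below is an illustrative proxy, not the QuEst estimator:

```python
# Hypothetical sketch: adversarial queries tend to produce less coherent
# top-k neighbourhoods, so low consistency suggests a stronger attack.
import torch
import torch.nn.functional as F

def retrieval_consistency(query_feat: torch.Tensor,
                          gallery_feats: torch.Tensor, k: int = 10) -> float:
    """query_feat: (1, d), gallery_feats: (N, d), both L2-normalised.
    Mean pairwise cosine similarity among the top-k retrieved features;
    lower values indicate a less internally consistent response."""
    sims = query_feat @ gallery_feats.T                  # (1, N)
    topk = sims.topk(k, dim=1).indices.squeeze(0)
    nn_feats = gallery_feats[topk]                       # (k, d)
    pair = nn_feats @ nn_feats.T
    off_diag = pair[~torch.eye(k, dtype=torch.bool)]     # drop self-similarity
    return float(off_diag.mean())

g = F.normalize(torch.randn(100, 128), dim=1)
q = F.normalize(torch.randn(1, 128), dim=1)
print(retrieval_consistency(q, g))
```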
