Search Results (18)

Search Parameters:
Keywords = visible-infrared person re-identification

20 pages, 33907 KB  
Article
GLCN: Graph-Aware Locality-Enhanced Cross-Modality Re-ID Network
by Junjie Cao, Yuhang Yu, Rong Rong and Xing Xie
J. Imaging 2026, 12(1), 42; https://doi.org/10.3390/jimaging12010042 - 13 Jan 2026
Viewed by 242
Abstract
Cross-modality person re-identification faces challenges such as illumination discrepancies, local occlusions, and inconsistent modality structures, leading to misalignment and sensitivity issues. We propose GLCN, a framework that addresses these problems by enhancing representation learning through locality enhancement, cross-modality structural alignment, and intra-modality compactness. Key components include the Locality-Preserved Cross-branch Fusion (LPCF) module, which combines Local–Positional–Channel Gating (LPCG) for local region and positional sensitivity; Cross-branch Context Interpolated Attention (CCIA) for stable cross-branch consistency; and Graph-Enhanced Center Geometry Alignment (GE-CGA), which aligns class-center similarity structures across modalities to preserve category-level relationships. We also introduce Intra-Modal Prototype Discrepancy Mining Loss (IPDM-Loss) to reduce intra-class variance and improve inter-class separation, thereby creating more compact identity structures in both RGB and IR spaces. Extensive experiments on SYSU-MM01, RegDB, and other benchmarks demonstrate the effectiveness of our approach.
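The abstract describes GE-CGA only at a high level. As a rough, hypothetical sketch of what aligning class-center similarity structures across modalities could look like in PyTorch (not the authors' implementation; the names and the MSE alignment choice are assumptions):

```python
import torch
import torch.nn.functional as F

def center_geometry_alignment(feat_rgb, feat_ir, labels_rgb, labels_ir):
    """Hypothetical sketch, not the authors' GE-CGA implementation.

    feat_rgb, feat_ir: (N, d) batch features from each modality.
    labels_rgb, labels_ir: (N,) identity labels; the batch is assumed to
    contain the same identities in both modalities.
    """
    ids = torch.unique(labels_rgb)
    # Per-identity class centers in each modality.
    c_rgb = torch.stack([feat_rgb[labels_rgb == i].mean(0) for i in ids])
    c_ir = torch.stack([feat_ir[labels_ir == i].mean(0) for i in ids])
    # Cosine-similarity structure among class centers, per modality.
    s_rgb = F.normalize(c_rgb, dim=1) @ F.normalize(c_rgb, dim=1).t()
    s_ir = F.normalize(c_ir, dim=1) @ F.normalize(c_ir, dim=1).t()
    # Penalize disagreement between the two similarity structures so that
    # category-level relationships are preserved across modalities.
    return F.mse_loss(s_rgb, s_ir)
```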

22 pages, 7733 KB  
Article
Parsing-Guided Differential Enhancement Graph Learning for Visible-Infrared Person Re-Identification
by Xingpeng Li, Huabing Liu, Chen Xue, Nuo Wang and Enwen Hu
Electronics 2025, 14(15), 3118; https://doi.org/10.3390/electronics14153118 - 5 Aug 2025
Viewed by 1040
Abstract
Visible-Infrared Person Re-Identification (VI-ReID) is of crucial importance in applications such as surveillance and security. However, the challenges posed by intra-class variations and cross-modal differences are often exacerbated by inaccurate infrared parsing and insufficient structural modeling. To address these issues, we propose Parsing-guided Differential Enhancement Graph Learning (PDEGL), a novel framework that learns discriminative representations through a dual-branch architecture synergizing global feature refinement with part-based structural graph analysis. In particular, we introduce a Differential Infrared Part Enhancement (DIPE) module to correct infrared parsing errors and a Parsing Structural Graph (PSG) module to model high-order topological relationships between body parts for structural consistency matching. Furthermore, we design a Position-sensitive Spatial-Channel Attention (PSCA) module to enhance global feature discriminability. Extensive evaluations on the SYSU-MM01, RegDB, and LLCM datasets demonstrate that our PDEGL method achieves competitive performance.
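The PSG module is described only as modeling high-order topological relationships between body parts. One generic way such a part graph can be realized is a graph-convolution step over per-part features with a fixed body-topology adjacency; the sketch below illustrates that idea under assumed shapes and is not PDEGL's actual module:

```python
import torch
import torch.nn as nn

class PartGraphLayer(nn.Module):
    """Illustrative graph-convolution step over body-part features
    (a guess at the spirit of a parsing structural graph, not PDEGL's PSG)."""

    def __init__(self, dim, adjacency):
        super().__init__()
        # adjacency: (P, P) binary body-part connectivity (e.g., head-torso-limbs).
        a = adjacency + torch.eye(adjacency.size(0))            # add self-loops
        d_inv_sqrt = a.sum(dim=1).pow(-0.5)
        self.register_buffer("a_norm", d_inv_sqrt[:, None] * a * d_inv_sqrt[None, :])
        self.proj = nn.Linear(dim, dim)

    def forward(self, part_feats):                              # (B, P, dim)
        # Propagate information along the body topology, then apply a nonlinearity.
        return torch.relu(self.a_norm @ self.proj(part_feats))
```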

19 pages, 2976 KB  
Article
BiFFN: Bi-Frequency Guided Feature Fusion Network for Visible–Infrared Person Re-Identification
by Xingyu Cao, Pengxin Ding, Jie Li and Mei Chen
Sensors 2025, 25(5), 1298; https://doi.org/10.3390/s25051298 - 20 Feb 2025
Cited by 3 | Viewed by 1575
Abstract
Visible–infrared person re-identification (VI-ReID) aims to minimize the modality gap between pedestrian images captured in different modalities. Existing methods primarily focus on extracting cross-modality features from the spatial domain, which often limits the comprehensive extraction of useful information. To address this limitation, we propose a novel bi-frequency feature fusion network (BiFFN) that effectively extracts and fuses high- and low-frequency features together with spatial-domain features to reduce the modality gap. Unlike conventional approaches that either focus on single-frequency components or employ simple multi-branch fusion strategies, BiFFN addresses the modality discrepancy through systematic frequency–space co-learning. The network introduces a frequency-spatial enhancement (FSE) module to enhance feature representation across both domains. Additionally, the deep frequency mining (DFM) module optimizes cross-modality information utilization by leveraging the distinct characteristics of high- and low-frequency features. The cross-frequency fusion (CFF) module further aligns low-frequency features and fuses them with high-frequency features to generate middle features that incorporate critical information from each modality. To refine the distribution of identity features in the common space, we develop a unified modality center (UMC) loss, which promotes a more balanced inter-modality distribution while preserving discriminative identity information. Extensive experiments demonstrate that the proposed BiFFN achieves state-of-the-art performance in VI-ReID. Specifically, our method achieves a Rank-1 accuracy of 77.5% and an mAP of 75.9% on the SYSU-MM01 dataset under the all-search mode, and a Rank-1 accuracy of 58.5% and an mAP of 63.7% on the LLCM dataset under the IR-VIS mode. These improvements verify that our model, by integrating feature fusion and frequency-domain information, significantly reduces the modality gap and outperforms previous methods.
(This article belongs to the Section Optical Sensors)
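The abstract does not give implementation details for the frequency-domain modules. As a hedged illustration of the underlying idea of separating a feature map into low- and high-frequency components before fusion, the following sketch uses a simple FFT low-pass mask (the cutoff, shapes, and function name are assumptions, not BiFFN's FSE/DFM/CFF design):

```python
import torch

def split_frequencies(x, cutoff=0.25):
    """Toy low/high-frequency decomposition of a feature map x: (B, C, H, W).
    Only an illustration of the idea, not BiFFN's actual modules."""
    _, _, H, W = x.shape
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    # Centered low-pass mask: keep frequencies within `cutoff` of the spectrum center.
    yy, xx = torch.meshgrid(
        torch.arange(H, device=x.device), torch.arange(W, device=x.device), indexing="ij"
    )
    radius = ((yy - H / 2) ** 2 + (xx - W / 2) ** 2).float().sqrt()
    mask = (radius <= cutoff * min(H, W)).float()
    low = torch.fft.ifft2(torch.fft.ifftshift(spec * mask, dim=(-2, -1))).real
    high = x - low        # the residual carries the high-frequency content
    return low, high
```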

19 pages, 8290 KB  
Article
Multi-Scale Contrastive Learning with Hierarchical Knowledge Synergy for Visible-Infrared Person Re-Identification
by Yongheng Qian and Su-Kit Tang
Sensors 2025, 25(1), 192; https://doi.org/10.3390/s25010192 - 1 Jan 2025
Cited by 3 | Viewed by 2193
Abstract
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality retrieval task to match a person across different spectral camera views. Most existing works focus on learning shared feature representations from the final embedding space of advanced networks to alleviate modality differences between visible and infrared images. However, exclusively relying on high-level semantic information from the network’s final layers can restrict shared feature representations and overlook the benefits of low-level details. Different from these methods, we propose a multi-scale contrastive learning network (MCLNet) with hierarchical knowledge synergy for VI-ReID. MCLNet is a novel two-stream contrastive deep supervision framework designed to learn low-level details and high-level semantic representations simultaneously. MCLNet utilizes supervised contrastive learning (SCL) at each intermediate layer to strengthen visual representations and enhance cross-modality feature learning. Furthermore, a hierarchical knowledge synergy (HKS) strategy for pairwise knowledge matching promotes explicit information interaction across multi-scale features and improves information consistency. Extensive experiments on three benchmarks demonstrate the effectiveness of MCLNet.
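Supervised contrastive learning (SCL) is a standard, published loss; the sketch below shows its usual batch formulation, which could be attached to pooled intermediate features in the spirit of the deep-supervision scheme the abstract describes. It is a generic illustration, not MCLNet's released code:

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Standard SupCon-style loss over one batch of pooled features.

    features: (N, d) embeddings, labels: (N,) identity labels.
    """
    z = F.normalize(features, dim=1)
    sim = z @ z.t() / temperature                                  # (N, N) similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask   # same-identity pairs
    sim = sim.masked_fill(self_mask, float("-inf"))                # drop self-comparisons
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)                # diagonal is never a positive
    # Average log-probability of positives for each anchor that has positives.
    loss = -(log_prob * pos_mask.float()).sum(1) / pos_mask.sum(1).clamp(min=1)
    valid = pos_mask.sum(1) > 0
    return loss[valid].mean()
```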

14 pages, 5380 KB  
Article
Cross-Modality Person Re-Identification Method with Joint-Modality Generation and Feature Enhancement
by Yihan Bi, Rong Wang, Qianli Zhou, Zhaolong Zeng, Ronghui Lin and Mingjie Wang
Entropy 2024, 26(8), 681; https://doi.org/10.3390/e26080681 - 13 Aug 2024
Viewed by 1667
Abstract
In order to minimize the disparity between the visible and infrared modalities and enhance pedestrian feature representation, a cross-modality person re-identification method is proposed that integrates modality generation and feature enhancement. Specifically, a lightweight network is used for dimension reduction and augmentation of visible images, and intermediate modalities are generated to bridge the gap between visible and infrared images. The Convolutional Block Attention Module is embedded into the ResNet50 backbone network to selectively emphasize key features sequentially along both the channel and spatial dimensions. Additionally, the Gradient Centralization algorithm is introduced into the Stochastic Gradient Descent optimizer to accelerate convergence and improve the generalization capability of the network. Experimental results on the SYSU-MM01 and RegDB datasets demonstrate that our improved network achieves significant performance gains, with increases in Rank-1 accuracy of 7.12% and 6.34% and improvements in mAP of 4.00% and 6.05%, respectively.
(This article belongs to the Section Multidisciplinary Applications)
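Gradient Centralization is itself a simple, published operation: subtract the per-output-channel mean from each multi-dimensional weight gradient before the optimizer step. A minimal way to combine it with PyTorch's SGD is sketched below; this is a generic GC+SGD sketch, not the authors' training code:

```python
import torch

class SGDWithGC(torch.optim.SGD):
    """SGD that applies Gradient Centralization to multi-dimensional weight gradients."""

    @torch.no_grad()
    def step(self, closure=None):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is not None and p.grad.dim() > 1:
                    # Subtract the mean over all dimensions except the output one.
                    dims = tuple(range(1, p.grad.dim()))
                    p.grad -= p.grad.mean(dim=dims, keepdim=True)
        return super().step(closure)

# Illustrative usage:
# optimizer = SGDWithGC(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
```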

15 pages, 17295 KB  
Article
Progressive Discriminative Feature Learning for Visible-Infrared Person Re-Identification
by Feng Zhou, Zhuxuan Cheng, Haitao Yang, Yifeng Song and Shengpeng Fu
Electronics 2024, 13(14), 2825; https://doi.org/10.3390/electronics13142825 - 18 Jul 2024
Cited by 2 | Viewed by 1919
Abstract
The visible-infrared person re-identification (VI-ReID) task aims to retrieve the same pedestrian between visible and infrared images. VI-ReID is a challenging task due to the huge modality discrepancy and complex intra-modality variations. Existing works mainly complete modality alignment at a single stage. However, aligning modalities at different stages has positive effects on the intra-class and inter-class distances of cross-modality features, effects that are often ignored. Moreover, discriminative features carrying identity information may be corrupted in the process of modality alignment, further degrading re-identification performance. In this paper, we propose a progressive discriminative feature learning (PDFL) network that adopts different alignment strategies at different stages to alleviate the discrepancy and learn discriminative features progressively. Specifically, we first design an adaptive cross fusion module (ACFM) to learn identity-relevant features via modality alignment with channel-level attention. To better preserve identity information, we propose a dual-attention-guided instance normalization module (DINM), which guides instance normalization to align the two modalities into a unified feature space through channel and spatial information embedding. Finally, we generate multiple part features of a person to mine subtle differences. Multi-loss optimization is imposed during training for more effective learning supervision. Extensive experiments on the public SYSU-MM01 and RegDB datasets validate that our proposed method performs favorably against most state-of-the-art methods.
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)
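The abstract describes DINM only in broad terms. As a loose, hypothetical sketch of attention-guided instance normalization, the module below blends instance-normalized (modality-aligned) features with the original (identity-rich) features using channel and spatial gates; the gating scheme and all names are assumptions, not PDFL's definition:

```python
import torch
import torch.nn as nn

class AttentionGuidedIN(nn.Module):
    """Loose sketch of attention-guided instance normalization; the gating scheme
    and all names are assumptions, not PDFL's DINM definition."""

    def __init__(self, channels):
        super().__init__()
        self.inorm = nn.InstanceNorm2d(channels, affine=True)
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid()
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid()
        )

    def forward(self, x):
        # Channel gate (B, C, 1, 1) times spatial gate (B, 1, H, W) -> joint gate (B, C, H, W).
        gate = self.channel_gate(x) * self.spatial_gate(x)
        # Blend modality-aligned (instance-normalized) and identity-rich (original) features.
        return gate * self.inorm(x) + (1 - gate) * x
```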

21 pages, 1046 KB  
Article
Transformer-Based Feature Compensation Network for Aerial Photography Person and Ground Object Recognition
by Guoqing Zhang, Chen Zheng and Zhonglin Ye
Remote Sens. 2024, 16(2), 268; https://doi.org/10.3390/rs16020268 - 10 Jan 2024
Cited by 1 | Viewed by 1642
Abstract
Visible-infrared person re-identification (VI-ReID) aims at matching pedestrian images with the same identity between different modalities. Existing methods overlook the loss of detailed information and the difficulty of capturing global features during feature extraction. To solve these issues, we propose a Transformer-based Feature Compensation Network (TFCNet). Firstly, we design a Hierarchical Feature Aggregation (HFA) module, which recursively aggregates hierarchical features to help the model preserve detailed information. Secondly, we design a Global Feature Compensation (GFC) module, which exploits the Transformer's ability to capture long-range dependencies in sequences to extract global features. Extensive results show that the rank-1/mAP of our method on the SYSU-MM01 and RegDB datasets reaches 60.87%/58.87% and 91.02%/75.06%, respectively, which is better than most existing methods. Meanwhile, to demonstrate our method's transferability, we also conduct related experiments on two aerial photography datasets.
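The HFA module is said to recursively aggregate hierarchical features. A generic sketch of that flavor of recursive multi-stage aggregation (with assumed ResNet-style channel sizes, not TFCNet's actual design) is:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecursiveAggregation(nn.Module):
    """Generic recursive aggregation of multi-stage backbone features
    (assumed ResNet-style channel sizes; not TFCNet's actual HFA)."""

    def __init__(self, channels=(256, 512, 1024, 2048)):
        super().__init__()
        self.proj = nn.ModuleList(
            nn.Conv2d(c_in, c_out, kernel_size=1)
            for c_in, c_out in zip(channels[:-1], channels[1:])
        )

    def forward(self, stages):          # list of (B, C_i, H_i, W_i), shallow -> deep
        agg = stages[0]
        for proj, deeper in zip(self.proj, stages[1:]):
            # Carry shallow detail forward: project channels, match resolution, merge.
            agg = deeper + F.interpolate(
                proj(agg), size=deeper.shape[-2:], mode="bilinear", align_corners=False
            )
        return agg
```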

16 pages, 7134 KB  
Article
Dual-Stage Attribute Embedding and Modality Consistency Learning-Based Visible–Infrared Person Re-Identification
by Zhuxuan Cheng, Huijie Fan, Qiang Wang, Shiben Liu and Yandong Tang
Electronics 2023, 12(24), 4892; https://doi.org/10.3390/electronics12244892 - 5 Dec 2023
Cited by 2 | Viewed by 1809
Abstract
Visible–infrared person re-identification (VI-ReID) is an emerging technology for realizing all-weather smart surveillance systems. To address the problems that pedestrian-discriminative information is difficult to obtain and easy to lose, and that the modality difference in the VI-ReID task is wide, in this paper we propose a dual-stage attribute embedding and modality consistency learning-based VI-ReID method. First, the attribute information embedding module introduces the fine-grained pedestrian information in the attribute label into the transformer backbone, enabling the backbone to extract identity-discriminative pedestrian features. After obtaining the pedestrian features, the attribute embedding enhancement module is utilized to realize the second-stage attribute information embedding, which reduces the adverse effect of losing person-discriminative information as the network deepens. Finally, the modality consistency learning loss is designed to constrain the network to mine the consistency information between the two modalities in order to reduce the impact of the modality difference on the recognition results. The results show that our method reaches 74.57% mAP on the SYSU-MM01 dataset in All Search mode and 87.02% mAP on the RegDB dataset in IR-to-VIS mode, performance improvements of 6.00% and 2.56%, respectively, demonstrating that our proposed method outperforms existing state-of-the-art methods.
(This article belongs to the Special Issue Lifelong Machine Learning-Based Efficient Robotic Object Perception)

16 pages, 3488 KB  
Article
Graph Sampling-Based Multi-Stream Enhancement Network for Visible-Infrared Person Re-Identification
by Jinhua Jiang, Junjie Xiao, Renlin Wang, Tiansong Li, Wenfeng Zhang, Ruisheng Ran and Sen Xiang
Sensors 2023, 23(18), 7948; https://doi.org/10.3390/s23187948 - 18 Sep 2023
Cited by 1 | Viewed by 1867
Abstract
With the increasing demand for person re-identification (Re-ID), the need for all-day retrieval has become an inevitable trend. Nevertheless, single-modal Re-ID is no longer sufficient to meet this requirement, making multi-modal data crucial in Re-ID. Consequently, the Visible-Infrared Person Re-Identification (VI Re-ID) task has been proposed, which aims to match pairs of person images from the visible and infrared modalities. The significant discrepancy between the two modalities poses a major challenge. Existing VI Re-ID methods focus on cross-modal feature learning and modal transformation to alleviate the discrepancy but overlook the impact of person contour information. Contours exhibit modality invariance, which is vital for learning effective identity representations and cross-modal matching. In addition, due to the low intra-modal diversity in the visible modality, it is difficult to distinguish the boundaries between some hard samples. To address these issues, we propose the Graph Sampling-based Multi-stream Enhancement Network (GSMEN). Firstly, the Contour Expansion Module (CEM) incorporates the contour information of a person into the original samples, further reducing the modality discrepancy and leading to improved matching stability between image pairs of different modalities. Additionally, to better distinguish cross-modal hard sample pairs during training, an innovative Cross-modality Graph Sampler (CGS) is designed for sample selection before training. The CGS calculates the feature distance between samples from different modalities and groups similar samples into the same batch, effectively exploring the boundary relationships between hard classes in the cross-modal setting. Experiments conducted on the SYSU-MM01 and RegDB datasets demonstrate the superiority of our proposed method. Specifically, in the VIS→IR task, the experimental results on the RegDB dataset reach 93.69% for Rank-1 and 92.56% for mAP.
(This article belongs to the Special Issue Multi-Modal Data Sensing and Processing)
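The abstract states that the CGS computes cross-modality feature distances and groups similar samples into the same batch. A simplified, hypothetical sketch of that sampling idea over per-identity feature centers (not GSMEN's actual sampler) could look like:

```python
import torch

def cross_modality_hard_batches(centers_rgb, centers_ir, ids, classes_per_batch=8):
    """Simplified, hypothetical sketch of cross-modality graph-style sampling.

    centers_rgb, centers_ir: (K, d) per-identity feature centers for each modality.
    ids: length-K list of identity labels. Returns one identity group per anchor,
    so that identities whose RGB and IR centers are close end up in the same batch.
    """
    dist = torch.cdist(centers_rgb, centers_ir)     # (K, K) cross-modality center distances
    batches = []
    for anchor in range(len(ids)):
        # The anchor plus its closest (hardest) cross-modality identities form one group.
        nearest = dist[anchor].topk(classes_per_batch, largest=False).indices
        batches.append([ids[j] for j in nearest.tolist()])
    return batches
```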

19 pages, 5812 KB  
Article
Cross-Modality Person Re-Identification Algorithm Based on Two-Branch Network
by Jianfeng Song, Jin Yang, Chenyang Zhang and Kun Xie
Electronics 2023, 12(14), 3193; https://doi.org/10.3390/electronics12143193 - 24 Jul 2023
Cited by 3 | Viewed by 3291
Abstract
Person re-identification (ReID) is the technique of identifying the same person across different camera views. Most existing models focus on single-modality ReID involving only visible images. However, the visible modality is unsuitable for low-light environments and nighttime, when crime is frequent. Infrared (IR) cameras can still capture enough information from the scene in such dark environments, and most surveillance systems are equipped with dual-mode cameras that automatically switch between visible and infrared modalities according to lighting conditions. This gives rise to the problem of visible-infrared cross-modality person re-identification (VI-ReID). To improve identification accuracy, we first use infrared image colorization to convert infrared images into color images, reducing the differences between modalities, and then propose a VI-ReID algorithm based on a Two-Branch Network with Double Constraints (VI-TBNDC), which consists of two main components: a two-branch network for feature extraction and a double-constrained identity loss for feature learning. The two-branch network extracts features from the two modalities separately, and the double-constrained identity loss ensures that the learned representations are discriminative enough to distinguish different people across the two modalities. Extensive experimental analysis verifies the effectiveness of the proposed method, which achieves good recognition accuracy on the standard visible-infrared person re-identification dataset SYSU-MM01.

16 pages, 4585 KB  
Article
Joint Modal Alignment and Feature Enhancement for Visible-Infrared Person Re-Identification
by Ronghui Lin, Rong Wang, Wenjing Zhang, Ao Wu and Yihan Bi
Sensors 2023, 23(11), 4988; https://doi.org/10.3390/s23114988 - 23 May 2023
Cited by 2 | Viewed by 2479
Abstract
Visible-infrared person re-identification aims to solve the matching problem between cross-camera and cross-modal person images. Existing methods strive to perform better cross-modal alignment, but often neglect the critical importance of feature enhancement for achieving better performance. Therefore, we propose an effective method that combines both modal alignment and feature enhancement. Specifically, we introduce Visible-Infrared Modal Data Augmentation (VIMDA) for visible images to improve modal alignment. A Margin MMD-ID Loss is also used to further enhance modal alignment and optimize model convergence. Then, we propose a Multi-Grain Feature Extraction (MGFE) structure for feature enhancement to further improve recognition performance. Extensive experiments have been carried out on SYSU-MM01 and RegDB. The results indicate that our method outperforms current state-of-the-art methods for visible-infrared person re-identification, and ablation experiments verify the effectiveness of each proposed component.
(This article belongs to the Section Optical Sensors)
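The abstract does not specify the MGFE structure. Multi-granularity horizontal-stripe pooling is a common way to obtain multi-grain part features in ReID, so the following generic sketch is offered purely to illustrate the idea of multi-grain features; it is an assumption, not the paper's design:

```python
import torch
import torch.nn.functional as F

def multi_grain_pooling(feat_map, granularities=(1, 2, 3)):
    """Generic multi-granularity horizontal-stripe pooling.

    feat_map: (B, C, H, W) backbone output. Returns a list of (B, C) part features,
    one per stripe at each granularity.
    """
    parts = []
    for g in granularities:
        # Split the feature map into g horizontal stripes and pool each stripe.
        for stripe in feat_map.chunk(g, dim=2):
            parts.append(F.adaptive_avg_pool2d(stripe, 1).flatten(1))
    return parts
```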

29 pages, 4869 KB  
Review
Person Re-Identification with RGB–D and RGB–IR Sensors: A Comprehensive Survey
by Md Kamal Uddin, Amran Bhuiyan, Fateha Khanam Bappee, Md Matiqul Islam and Mahmudul Hasan
Sensors 2023, 23(3), 1504; https://doi.org/10.3390/s23031504 - 29 Jan 2023
Cited by 15 | Viewed by 6738
Abstract
Learning appearance embeddings is of great importance for a variety of computer-vision applications, which has prompted a surge in person re-identification (Re-ID) papers. The aim of these papers has been to identify an individual over a set of non-overlapping cameras. Despite recent advances in RGB–RGB Re-ID approaches with deep-learning architectures, these approaches fail to work consistently well at low resolution and in dark conditions. The introduction of different sensors (i.e., RGB–D and infrared (IR)) enables the capture of appearance even in dark conditions. Recently, much research has been dedicated to addressing the issue of learning appearance embeddings in dark conditions using different advanced camera sensors. In this paper, we give a comprehensive overview of existing Re-ID approaches that utilize the additional information from different sensor-based methods to address the constraints faced by RGB camera-based person Re-ID systems. Although there are a number of survey papers that consider either the RGB–RGB or visible–IR scenarios, there are none that consider both RGB–D and RGB–IR. In this paper, we present a detailed taxonomy of the existing approaches along with the existing RGB–D and RGB–IR person Re-ID datasets. Then, we summarize the performance of state-of-the-art methods on several representative RGB–D and RGB–IR datasets. Finally, future directions and current issues are considered for improving the different sensor-based person Re-ID systems.
(This article belongs to the Section Sensors Development)

25 pages, 1407 KB  
Review
Survey of Cross-Modal Person Re-Identification from a Mathematical Perspective
by Minghui Liu, Yafei Zhang and Huafeng Li
Mathematics 2023, 11(3), 654; https://doi.org/10.3390/math11030654 - 28 Jan 2023
Cited by 8 | Viewed by 4721
Abstract
Person re-identification (Re-ID) aims to retrieve a particular pedestrian’s identity from a surveillance system consisting of non-overlapping cameras. In recent years, researchers have begun to focus on open-world person Re-ID tasks based on non-ideal situations. One of the most representative of these is cross-modal person Re-ID, which aims to match probe data with target data from different modalities. According to the modalities of the probe and target data, we divide cross-modal person Re-ID into visible–infrared, visible–depth, visible–sketch, and visible–text person Re-ID. In cross-modal person Re-ID, the most challenging problem is the modal gap. According to the different methods of narrowing the modal gap, we classify the existing works into picture-based style conversion methods, feature-based modality-invariant embedding mapping methods, and modality-unrelated auxiliary information mining methods. In addition, by generalizing the aforementioned works, we find that although deep-learning-based models perform well, the black-box-like learning process makes these models less interpretable and generalizable. Therefore, we attempt to interpret different cross-modal person Re-ID models from a mathematical perspective. Through this work, we aim to compensate for the lack of mathematical interpretation of models in previous person Re-ID reviews and hope that our work will bring new inspiration to researchers.

17 pages, 1463 KB  
Article
Margin-Based Modal Adaptive Learning for Visible-Infrared Person Re-Identification
by Qianqian Zhao, Hanxiao Wu and Jianqing Zhu
Sensors 2023, 23(3), 1426; https://doi.org/10.3390/s23031426 - 27 Jan 2023
Cited by 4 | Viewed by 3200
Abstract
Visible-infrared person re-identification (VIPR) has great potential for the intelligent transportation systems of smart cities, but it is challenging due to the huge modal discrepancy between visible and infrared images. Although visible and infrared data can be viewed as two domains, VIPR is not identical to domain adaptation, which would massively eliminate modal discrepancies. Because VIPR has complete identity information in both the visible and infrared modalities, once domain adaptation is overemphasized, the discriminative appearance information in the visible and infrared domains would be drained. For that reason, we propose a novel margin-based modal adaptive learning (MMAL) method for VIPR in this paper. On each domain, we apply triplet and label-smoothing cross-entropy loss functions to learn appearance-discriminative features. Between the two domains, we design a simple yet effective marginal maximum mean discrepancy (M3D) loss function to avoid an excessive suppression of modal discrepancies and thus protect the features’ discriminative ability on each domain. As a result, our MMAL method can learn modal-invariant yet appearance-discriminative features for improving VIPR. The experimental results show that our MMAL method achieves state-of-the-art VIPR performance; e.g., on the RegDB dataset in the visible-to-infrared retrieval mode, the rank-1 accuracy is 93.24% and the mean average precision is 83.77%.
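The abstract describes M3D as an MMD-style loss hinged at a margin so that modal discrepancies are not over-suppressed. A minimal sketch of that idea, using the linear-kernel simplification of MMD (the paper's exact kernel and margin are not given here, so both are assumptions):

```python
import torch

def marginal_mmd_loss(feat_v, feat_i, margin=0.1):
    """Margin-hinged MMD between modality feature distributions (linear-kernel
    simplification of MMD; the paper's exact M3D form may differ).

    feat_v, feat_i: (N, d) visible / infrared features. The hinge stops penalizing
    once the discrepancy is below `margin`, so modal gaps are not over-suppressed.
    """
    mmd = (feat_v.mean(dim=0) - feat_i.mean(dim=0)).pow(2).sum()
    return torch.clamp(mmd - margin, min=0.0)
```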

11 pages, 416 KB  
Article
Minimizing Maximum Feature Space Deviation for Visible-Infrared Person Re-Identification
by Zhixiong Wu and Tingxi Wen
Appl. Sci. 2022, 12(17), 8792; https://doi.org/10.3390/app12178792 - 1 Sep 2022
Cited by 2 | Viewed by 1876
Abstract
Visible-infrared person re-identification (VIPR) has great potential for intelligent video surveillance systems at night, but it is challenging due to the huge modal gap between the visible and infrared modalities. To this end, this paper proposes a minimizing maximum feature space deviation (MMFSD) method for VIPR. First, we calculate the visible and infrared feature centers of each identity. Second, we define feature space deviations based on these centers to measure the modal gap between the visible and infrared modalities. Third, we minimize the maximum feature space deviation to significantly reduce this modal gap. Experimental results show the superiority of the proposed method; e.g., on the RegDB dataset, the rank-1 accuracy reaches 92.19%.
(This article belongs to the Special Issue Recent Applications of Computer Vision for Automation and Robotics)
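The abstract spells out three steps: per-identity modality centers, center-based deviations, and minimization of the maximum deviation. A short sketch that follows those steps literally (the exact deviation definition is an assumption; here it is the Euclidean distance between an identity's two modality centers):

```python
import torch

def max_feature_space_deviation(feat_v, feat_i, labels_v, labels_i):
    """Sketch following the abstract's three steps; the deviation definition
    (distance between an identity's two modality centers) is an assumption.

    feat_v, feat_i: (N, d) features; labels_v, labels_i: (N,) identity labels.
    Minimizing the returned value shrinks the worst per-identity modality gap.
    """
    deviations = []
    for i in torch.unique(labels_v):
        c_v = feat_v[labels_v == i].mean(0)       # visible feature center of identity i
        c_i = feat_i[labels_i == i].mean(0)       # infrared feature center of identity i
        deviations.append((c_v - c_i).norm())     # feature space deviation for identity i
    return torch.stack(deviations).max()          # minimize the maximum deviation
```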
