Sign in to use this feature.

Years

Between: -

Subjects

remove_circle_outline
remove_circle_outline
remove_circle_outline
remove_circle_outline
remove_circle_outline

Journals

Article Types

Countries / Regions

Search Results (3)

Search Parameters:
Keywords = multi-modal re-identification metric

Order results
Result details
Results per page
Select all
Export citation of selected articles as:
18 pages, 1672 KiB  
Article
Pedestrian Re-Identification Based on Fine-Grained Feature Learning and Fusion
by Anming Chen and Weiqiang Liu
Sensors 2024, 24(23), 7536; https://doi.org/10.3390/s24237536 - 26 Nov 2024
Cited by 1 | Viewed by 1038
Abstract
Video-based pedestrian re-identification (Re-ID) is used to re-identify the same person across different camera views. One of the key problems is to learn an effective representation for the pedestrian from video. However, it is difficult to learn an effective representation from one single [...] Read more.
Video-based pedestrian re-identification (Re-ID) is used to re-identify the same person across different camera views. One of the key problems is to learn an effective representation for the pedestrian from video. However, it is difficult to learn an effective representation from one single modality of a feature due to complicated issues with video, such as background, occlusion, and blurred scenes. Therefore, there are some studies on fusing multimodal features for video-based pedestrian Re-ID. However, most of these works fuse features at the global level, which is not effective in reflecting fine-grained and complementary information. Therefore, the improvement in performance is limited. To obtain a more effective representation, we propose to learn fine-grained features from different modalities of the video, and then they are aligned and fused at the fine-grained level to capture rich semantic information. As a result, a multimodal token-learning and alignment model (MTLA) is proposed to re-identify pedestrians across camera videos. An MTLA consists of three modules, i.e., a multimodal feature encoder, token-based cross-modal alignment, and correlation-aware fusion. Firstly, the multimodal feature encoder is used to extract the multimodal features from the visual appearance and gait information views, and then fine-grained tokens are learned and denoised from these features. Then, the token-based cross-modal alignment module is used to align the multimodal features at the token level to capture fine-grained semantic information. Finally, the correlation-aware fusion module is used to fuse the multimodal token features by learning the inter- and intra-modal correlation, in which the features refine each other and a unified representation is obtained for pedestrian Re-ID. To evaluate the performance of fine-grained features alignment and fusion, we conduct extensive experiments on three benchmark datasets. Compared with the state-of-art approaches, all the evaluation metrices of mAP and Rank-K are improved by more than 0.4 percentage points. Full article
(This article belongs to the Special Issue Sensor-Based Behavioral Biometrics)
Show Figures

Figure 1

25 pages, 5475 KiB  
Article
Lightweight Multimodal Domain Generic Person Reidentification Metric for Person-Following Robots
by Muhammad Adnan Syed, Yongsheng Ou, Tao Li and Guolai Jiang
Sensors 2023, 23(2), 813; https://doi.org/10.3390/s23020813 - 10 Jan 2023
Cited by 6 | Viewed by 2446
Abstract
Recently, person-following robots have been increasingly used in many real-world applications, and they require robust and accurate person identification for tracking. Recent works proposed to use re-identification metrics for identification of the target person; however, these metrics suffer due to poor generalization, and [...] Read more.
Recently, person-following robots have been increasingly used in many real-world applications, and they require robust and accurate person identification for tracking. Recent works proposed to use re-identification metrics for identification of the target person; however, these metrics suffer due to poor generalization, and due to impostors in nonlinear multi-modal world. This work learns a domain generic person re-identification to resolve real-world challenges and to identify the target person undergoing appearance changes when moving across different indoor and outdoor environments or domains. Our generic metric takes advantage of novel attention mechanism to learn deep cross-representations to address pose, viewpoint, and illumination variations, as well as jointly tackling impostors and style variations the target person randomly undergoes in various indoor and outdoor domains; thus, our generic metric attains higher recognition accuracy of target person identification in complex multi-modal open-set world, and attains 80.73% and 64.44% Rank-1 identification in multi-modal close-set PRID and VIPeR domains, respectively. Full article
(This article belongs to the Special Issue Recognition Robotics)
Show Figures

Figure 1

18 pages, 3672 KiB  
Article
Multi-Modal Long-Term Person Re-Identification Using Physical Soft Bio-Metrics and Body Figure
by Nadeen Shoukry, Mohamed A. Abd El Ghany and Mohammed A.-M. Salem
Appl. Sci. 2022, 12(6), 2835; https://doi.org/10.3390/app12062835 - 10 Mar 2022
Cited by 4 | Viewed by 3967
Abstract
Person re-identification is the task of recognizing a subject across different non-overlapping cameras across different views and times. Most state-of-the-art datasets and proposed solutions tend to address the problem of short-term re-identification. Those models can re-identify a person as long as they are [...] Read more.
Person re-identification is the task of recognizing a subject across different non-overlapping cameras across different views and times. Most state-of-the-art datasets and proposed solutions tend to address the problem of short-term re-identification. Those models can re-identify a person as long as they are wearing the same clothes. The work presented in this paper addresses the task of long-term re-identification. Therefore, the proposed model is trained on a dataset that incorporates clothes variation. This paper proposes a multi-modal person re-identification model. The first modality includes soft bio-metrics: hair, face, neck, shoulders, and part of the chest. The second modality is the remaining body figure that mainly focuses on clothes. The proposed model is composed of two separate neural networks, one for each modality. For the first modality, a two-stream Siamese network with pre-trained FaceNet as a feature extractor for the first modality is utilized. Part-based Convolutional Baseline classifier with a feature extractor network OSNet for the second modality. Experiments confirm that the proposed model can outperform several state-of-the-art models achieving 81.4 % accuracy on Rank-1, 82.3% accuracy on Rank-5, 83.1% accuracy on Rank-10, and 83.7% accuracy on Rank-20. Full article
(This article belongs to the Special Issue Modern Computer Vision and Image Processing)
Show Figures

Figure 1

Back to TopTop