Open Access Review

Peer-Review Record

Causality and “In-the-Wild” Video-Based Person Re-Identification: A Survey

Electronics 2025, 14(13), 2669; https://doi.org/10.3390/electronics14132669

by Md Rashidunnabi^1,2,*, Kailash Hambarde^1 and Hugo Proença^1,2

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 26 May 2025 / Revised: 22 June 2025 / Accepted: 28 June 2025 / Published: 1 July 2025

(This article belongs to the Topic Applied Computer Vision and Pattern Recognition: 2nd Edition)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper reviews causal reasoning in video-based person Re-ID, addressing real-world challenges by isolating identity features from confounders like clothing and lighting through structural causal models.

What does 'In-the-Wild' in the title refer to?
The taxonomy mentioned in the abstract and in the contributions on page 2 appears only twice.
In Section 2.3, the quantitative analysis of traditional methods could be provided.
The values 32% and 8% in Figure 4 should be explained.
The functions are not labeled.
In Section 5.1, "adversarial disentanglement" needs a discussion as detailed as the one given to counterfactual interventions.
The data presented in Sections 7 and 8 lack supporting references or empirical evidence from the literature.
The citations in the reference list could be updated.
The acknowledgment section does not need to be numbered.

Author Response

We sincerely thank the reviewer for their thorough and constructive feedback, which has significantly improved the quality and clarity of our manuscript. We have carefully addressed all suggestions and made the corresponding revisions as detailed below. We have made all the changes in the revised version of the manuscript.

Comments 1: What does 'In-the-Wild' in the title refer to?

Response 1: In the revised manuscript we have clarified the meaning by adding the explanatory sentence "Here, 'in-the-wild' refers to unconstrained, real-world surveillance scenarios that exhibit large variations in illumination, viewpoint, occlusion, weather and attire" in the Introduction where the term first appears.

Comments 2: The taxonomy mentioned in the abstract and in the contributions on page 2 appears only twice.

Response 2: In the revised manuscript we have added a comprehensive new Section 4 "Taxonomy of Causal Video-based Person Re-ID Methods" that systematically categorizes methods into three families: (i) generative disentanglement, (ii) domain-invariant causal modeling, and (iii) causal transformer architectures, with detailed analysis and cross-references throughout the manuscript.

Comments 3: In Section 2.3, the quantitative analysis of traditional methods could be provided.

Response 3: Meaningful quantitative comparisons would require re-implementing all approaches under identical conditions, which would constitute original experimental work beyond the scope of this survey. Different papers use varying protocols, which makes direct comparisons potentially misleading. We therefore focus on qualitative analysis to elucidate the conceptual foundations.

Comments 4: The values 32% and 8% in Figure 4 should be explained.

Response 4: In the revised manuscript we have clarified in Figure 4's caption and Section 3.1 that these are illustrative values demonstrating conceptual differences between correlation-based and causal approaches, not experimental results.

Comments 5: The functions are not labeled.

Response 5: In the revised manuscript we have added comprehensive mathematical notation and explicit function definitions throughout the manuscript, including 12 numbered equations with complete component labels in all major figures.

Comments 6: In Section 5.1, "adversarial disentanglement" needs a discussion as detailed as the one given to counterfactual interventions.

Response 6: In the revised manuscript we have substantially expanded Section 6 with theoretical foundations, optimization objectives (Equations 13-18), three-phase training process, architectural analysis, empirical evidence, and implementation challenges to match the depth provided for counterfactual interventions.

Comments 7: The data presented in Sections 7 and 8 lack supporting references or empirical evidence from the literature.

Response 7: In the revised manuscript we have added extensive supporting references throughout Sections 7 and 8 for all technical claims, including causal method effectiveness, scalability challenges, fairness concerns, interpretability limitations, and privacy issues, with direct citations to peer-reviewed sources. Another reviewer suggested merging two sections; we have now merged them.

Comments 8: The citations in the reference list could be updated.

Response 8: We will update the bibliography by replacing arXiv preprints with published versions where available, incorporating key 2024-2025 works, and removing superseded entries to ensure current and comprehensive coverage.

Comments 9: The acknowledgment section does not need to be numbered.

Response 9: In the revised manuscript we have changed the Acknowledgements to an unnumbered section using \section*{} format in accordance with journal style guidelines.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

1. As a survey article, it does not propose new algorithmic frameworks or evaluation methods, but rather summarizes existing work. It is recommended to strengthen the authors' own perspectives in the conclusion section, such as by offering more structured and specific suggestions for future model design.

2. The description in Chapter 2 on "Traditional Approaches" is somewhat lengthy. It is advised to streamline this section and focus more on the contrast with causal modeling.

Author Response

Comments 1: As a survey article, it does not propose new algorithmic frameworks or evaluation methods, but rather summarizes existing work. It is recommended to strengthen the authors' own perspectives in the conclusion section, such as by offering more structured and specific suggestions for future model design.

Response 1: In the revised manuscript we have strengthened our perspective by adding an "Author Recommendations for Next-Generation Models" paragraph to the conclusion with three concrete design guidelines: (i) adopt modular SCM-first pipelines separating identity, domain and noise factors; (ii) couple counterfactual training with lightweight shift-equivariant backbones; and (iii) evaluate with cross-modal, open-set protocols that surface failure modes early.

Comments 2: The description in Chapter 2 on "Traditional Approaches" is somewhat lengthy. It is advised to streamline this section and focus more on the contrast with causal modeling.

Response 2: In the revised manuscript we have streamlined Section 2 by condensing technical descriptions and adding a dedicated "Critical Limitations vs. Causal Approaches" subsection that directly contrasts traditional weaknesses (spurious correlations, lack of invariance) with causal solutions (SCM-based disentanglement, counterfactual reasoning). This reduces section length by ~40% while emphasizing the paradigmatic shift from correlation-based to causal approaches.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The article is a survey on the role of causal reasoning as an
alternative to correlation-based approaches in video-based person
re-identification.

To ensure that a wider audience knows what the work is about, the
title could have "re-identification" instead of "Re-ID".

Below are comments on each paper section.

1 Introduction

The introduction begins by clearly explaining the difference between
re-identification in still images and in video. It comments on the
fragility of existing systems when applied to real situations and
advocates the use of causal methods, explaining clearly what they
are. It then clearly announces the contributions of the review.

Figure 2 should be cited in the text.

The last paragraph of the introduction announces the following
sections, except Section 6.

2. Fundamentals of Person Re-Identification

Figure 3 mentions LSTM, but it is not cited in the text.

This section goes into much more detail about some of the concepts
presented in the introduction. It details the disadvantages of
traditional methods and, again, advocates for the causal inference
methods. It goes through the characteristics of visual attributes,
evaluation metrics and common datasets well. It is a good section on
fundamentals.

3. Causal Foundations for Person Re-Identification

In Figure 7, shouldn't the label for the center be
(b) Conterfactual ...?

The foundations of causal methods are well explained and, once again,
their use is advocated instead of correlation methods.

4. State-of-the-Art Methods

Figure 8 should be cited in the text.

State-of-the-art methods are well presented, in good detail, and well
evaluated.

5. Causal Disentanglement in Video-Based Person Re-Identification

This section presents well the practical techniques used to achieve
causal disentanglement in video-based person Re-ID systems and their
results.

6. Discussion

This section discusses well the main challenges and limits of the best
techniques presented in the previous sections.

7. Challenges and Open Problems

This section has good content. Due to its nature, it could be merged
with the previous section to form a single section called "Discussion
and Challenges".

8. Future Directions

This section makes good suggestions for future directions, especially
regarding hardware usage and the use of self-supervised learning.

9. Conclusion

The conclusion followed logically from the rest of the work.

References

In several places, "Proceedings of the Proceedings of" should be
replaced by "Proceedings of".

The list of references is extensive, comprehensive, adequate and
up-to-date.

Author Response

Comments 1: The title should use the full word "re-identification" instead of the abbreviation "Re-ID" so a wider audience immediately understands the topic.

Response 1: In the revised manuscript we have updated the title to use the full word "re-identification" instead of "Re-ID" throughout the document to ensure broader accessibility and immediate understanding of the topic.

Comments 2: Figure 2 is not cited in the Introduction text.

Response 2: In the revised manuscript we have added a citation to Figure 2 in the Introduction section where the figure content is discussed, ensuring all figures are properly referenced in the text.

Comments 3: The roadmap paragraph at the end of the Introduction lists every section except Section 6.

Response 3: In the revised manuscript we have updated the roadmap paragraph at the end of the Introduction to include the missing reference to Section 6, providing a complete overview of the document structure.

Comments 4: Figure 3 mentions "LSTM," but the acronym is never introduced in the surrounding text.

Response 4: In the revised manuscript we have introduced the LSTM acronym (Long Short-Term Memory) in the text surrounding Figure 3 and expanded all acronyms in the figure caption to improve readability.

Comments 5: In Figure 7 the centre label reads "Conterfactual" instead of "Counterfactual."

Response 5: The typo "Conterfactual" in Figure 7 has been corrected to "Counterfactual."

Comments 6: Figure 8 is not cited in the text.

Response 6: In the revised manuscript we have added a citation to Figure 8 (LaTeX label fig:Memory_Attention_Disentanglement) in the "Memory and Attention Mechanisms for Causal Disentanglement" subsection where the figure content is discussed. This figure is numbered Figure 9 in the revised document.

Comments 7: Sections 6 ("Discussion") and 7 ("Challenges and Open Problems") could be merged into one section.

Response 7: In the revised manuscript we have merged the "Discussion" and "Challenges and Open Problems" sections into a single comprehensive "Discussion" section that addresses both current advancements and key challenges in the field.

Comments 8: Several references contain the redundant phrase "Proceedings of the Proceedings of."

Response 8: In the revised manuscript we have corrected the redundant "Proceedings of the Proceedings of" phrases by removing "Proceedings of the" from the booktitle fields in the bibliography entries, as the bibliography style automatically adds the appropriate prefix.

Author Response File: Author Response.pdf
