Article
Peer-Review Record

Task-Decoupled Knowledge Transfer for Cross-Modality Object Detection

Entropy 2023, 25(8), 1166; https://doi.org/10.3390/e25081166
by Chiheng Wei, Lianfa Bai, Xiaoyu Chen * and Jing Han *
Reviewer 2: Anonymous
Submission received: 19 April 2023 / Revised: 27 May 2023 / Accepted: 2 August 2023 / Published: 4 August 2023
(This article belongs to the Special Issue Machine and Deep Learning for Affective Computing)

Round 1

Reviewer 1 Report

This research article presents a novel approach to cross-modality object detection that uses the infrared modality as a supplement to, or replacement for, the visible modality. The authors investigate the impact of various task-relevant features on cross-modality object detection and propose a knowledge transfer algorithm based on a classification-localization decoupling analysis. The paper proposes a task-decoupled pre-training method to adjust the task attributes learned by the pre-trained model, and a task-relevant hyperparameter evolution method to increase the network's adaptability to attribute changes in the pre-training weights. The results show that the proposed method improves accuracy across multiple modalities and datasets and reaches state-of-the-art performance on the FLIR ADAS dataset.
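For readers skimming this record, the summary above compresses two technical components: a pre-training loss re-weighted per task to decouple classification from localization, and an evolutionary search over task-relevant hyperparameters. The following is a minimal sketch of that general shape, assuming a generic one-stage detector; the names (task_weighted_loss, w_cls, w_loc, w_obj), the default weights, and the mutation rule are illustrative assumptions, not the authors' exact formulation.

```python
# Minimal sketch, assuming a generic one-stage detector whose loss
# decomposes into classification, localization, and objectness terms.
# Names, defaults, and the mutation rule are illustrative assumptions.
import random
import torch


def task_weighted_loss(loss_cls, loss_loc, loss_obj, w):
    """Re-weight the per-task losses so pre-training can emphasize
    localization features over classification features."""
    return w["w_cls"] * loss_cls + w["w_loc"] * loss_loc + w["w_obj"] * loss_obj


def mutate(weights, sigma=0.2):
    """One step of a simple evolutionary search over the task weights:
    perturb each weight multiplicatively and clamp it to stay positive."""
    return {k: max(1e-3, v * (1.0 + random.gauss(0.0, sigma)))
            for k, v in weights.items()}


# Usage: seed a configuration, mutate once per generation, and keep a
# child only if its validation fitness (e.g. mAP) improves.
weights = {"w_cls": 0.5, "w_loc": 1.5, "w_obj": 1.0}
child = mutate(weights)
loss = task_weighted_loss(torch.tensor(0.7), torch.tensor(1.2),
                          torch.tensor(0.4), weights)
```

The design intuition, as described in the summary, is that localization cues transfer across modalities more readily than classification cues, so the pre-training weights are biased toward the regression task while the evolution loop tunes the weighting without manual search.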

Overall, this is a well-written and informative article that presents a novel approach to cross-modality object detection. The methodology is well explained, and the results are presented clearly and convincingly. The authors provide sufficient detail on their experiments and analysis, and the conclusion is well supported by the results. The research presented in this article is valuable and can have a significant impact on the field of cross-modality object detection. Therefore, I would recommend a final English proofread of the entire article before the manuscript is accepted.

Overall, the article can be followed. However, a final English proofreading by a native speaker is suggested to improve the quality of the manuscript.

Author Response

Please see the attachment. 

 

Author Response File: Author Response.docx

Reviewer 2 Report

1. What is the red box in Figure 1? What is it used for? What is the heatmap, and how is it obtained?

2. The notation in Equations 1 and 2 should be defined in detail.

3. In Equation 3, three coefficients are added to the loss function. How are these three coefficients obtained? How is it guaranteed that the loss function can learn more from the regression features? Please justify this. (A generic form of such a weighted loss is sketched after this list for context.)

4. Which part is used to decouple the classification and localization features? Equation 3? They still appear combined in the loss function.

5. What do the curves in Figure 5 mean?
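For context on question 3: composite one-stage detection losses with three coefficients commonly take the generic form below. The $\lambda$ symbols here are illustrative placeholders, not necessarily the notation of the paper's Equation 3.

$$\mathcal{L} = \lambda_{\mathrm{cls}}\,\mathcal{L}_{\mathrm{cls}} + \lambda_{\mathrm{loc}}\,\mathcal{L}_{\mathrm{loc}} + \lambda_{\mathrm{obj}}\,\mathcal{L}_{\mathrm{obj}}$$

Setting $\lambda_{\mathrm{loc}}$ larger than $\lambda_{\mathrm{cls}}$ is one standard way to steer training toward regression (localization) features, which is the property the reviewer asks the authors to justify.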

 

NA

Author Response

Please see the attachment.

 

Author Response File: Author Response.docx
