Article
Peer-Review Record

Real-Time Segmentation of Artificial Targets Using a Dual-Modal Efficient Attention Fusion Network

Remote Sens. 2023, 15(18), 4398; https://doi.org/10.3390/rs15184398
by Ying Shen, Xiancai Liu, Shuo Zhang, Yixuan Xu, Dawei Zeng, Shu Wang * and Feng Huang *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4:
Submission received: 6 July 2023 / Revised: 27 August 2023 / Accepted: 31 August 2023 / Published: 7 September 2023
(This article belongs to the Section Remote Sensing Image Processing)

Round 1

Reviewer 1 Report

The article is well organized, but it would benefit from more ablation experiments that cover each component individually and in combination, e.g., 1, 2, 3, 1&2, 1&3, 2&3, and 1&2&3.
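The combinations requested above can be enumerated mechanically. The following is a minimal sketch, assuming that components 1, 2, and 3 correspond to the CABF, CASPP, and RDB modules named elsewhere in this record; the report itself does not say which components are meant, and `ablation_configs` is a hypothetical helper name.

```python
from itertools import combinations


def ablation_configs(modules):
    """Return every non-empty combination of the given module names."""
    configs = []
    for size in range(1, len(modules) + 1):
        # combinations() keeps the input order, so singles come first,
        # then pairs, then the full set.
        configs.extend(combinations(modules, size))
    return configs


# Assumed module names (not specified by the reviewer).
MODULES = ["CABF", "CASPP", "RDB"]

for config in ablation_configs(MODULES):
    print(" & ".join(config))
```

For three components this yields seven configurations; a baseline with no module enabled would be an eighth run if the authors choose to report it.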

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Please see the comments in the uploaded file.

Comments for author File: Comments.pdf

Minor editing of English language required.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The article begins by highlighting the limitations of existing spectral-polarimetric systems in terms of image sampling resolution, which often leads to the loss of crucial target information. Furthermore, the authors acknowledge the deficiency in current segmentation algorithms, which fail to properly utilize the multimodal features of the collected data, ultimately leading to reduced accuracy and robustness.

To tackle these issues, the ESPFNet is introduced as a real-time segmentation algorithm specifically designed for spectral-polarimetric camouflage target detection. The network's architecture is based on an encoder-decoder structure that efficiently combines the feature information from both spectral feature images and polarization encoded images. The Coordination Attention Bimodal Fusion (CABF) module is introduced to harmonize positional and channel information across different stages of the encoding layer, enhancing the integration of spectral and polarization features. The Complex Atrous Spatial Pyramid Pooling (CASPP) module connects the encoder and decoder, improving the overall feature information output. Additionally, the Residual Decoding Block (RDB) in the decoder extracts fused multi-scale features, further enhancing segmentation performance.

 

In the Introduction, the authors explain the technology used in the proposed method. The Introduction is very well written.

The potential applications of this research are vast, with implications in enabling autonomous reconnaissance by UAVs in complex scenes. The proposed ESPFNet algorithm could significantly enhance the efficiency and accuracy of target detection, making it an invaluable asset for military, surveillance, and search-and-rescue operations. By leveraging spectral-polarimetric information and implementing a novel fusion network architecture, the authors have provided an innovative solution to address critical challenges in target detection and segmentation.

In conclusion, the article presents a well-structured and comprehensive study that introduces the ESPFNet algorithm for improved UAV reconnaissance capabilities in detecting camouflaged targets. Through a combination of innovative architecture, careful feature fusion, and rigorous experimentation, the proposed method demonstrates its potential to revolutionize target detection in complex scenarios.

 

Add a statistical significance test to assess whether the differences between the compared models are really significant. If your results are not normally distributed, you could use a non-parametric test such as the Sign test, the Wilcoxon signed-rank test, or the Friedman test.
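As an illustration of the simplest of these options, here is a minimal sketch of a two-sided exact Sign test on paired scores, written with the Python standard library only. The function name `sign_test` and the per-image scores are hypothetical and are not taken from the manuscript.

```python
from math import comb


def sign_test(scores_a, scores_b):
    """Two-sided exact sign test for paired scores; returns the p-value."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both score lists must have the same length")
    wins_a = sum(1 for a, b in zip(scores_a, scores_b) if a > b)
    wins_b = sum(1 for a, b in zip(scores_a, scores_b) if a < b)
    n = wins_a + wins_b  # tied pairs carry no sign and are dropped
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    # Under the null hypothesis each sign is a fair coin flip, so the
    # smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical per-image IoU scores for two models on the same images.
model_a = [0.81, 0.84, 0.79, 0.88, 0.83, 0.86]
model_b = [0.80, 0.82, 0.78, 0.85, 0.81, 0.84]
print(f"p-value: {sign_test(model_a, model_b):.5f}")
```

The Sign test uses only the direction of each paired difference. The Wilcoxon signed-rank test also uses the magnitudes and is usually more powerful, while the Friedman test is the natural choice when more than two models are compared on the same images.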

 

I have objections to the discussion section. The authors need to reorganize the results and discussion to better highlight to the reader what was done and what is relevant. The gain of the presented technique for the addressed application should be made more explicit, in the form: what do the findings allow that was not possible before? The authors should discuss the results and how they can be interpreted in the perspective of previous studies and of the working hypotheses.

Conclusions are correct.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Review of the manuscript:

Real-time Dual-Modal Segmentation of Camouflage Targets using an Efficient Spectral-Polarimetric Attention Fusion Network


1. The findings are sufficiently novel to warrant publication.


2. The conclusions are adequately supported by the data presented.

3. The article is clearly and logically written so that it can be understood by one who is not an expert in the specific field. The work provides an important contribution to its field, consistent with the scope of the journal.

The paper addresses a topical problem. This study proposes the joint use of the snapshot spectral camera and the snapshot polarization camera to acquire spectral-polarimetric images. These images are registered and used as inputs to improve the spatial resolution in detecting multiple targets in complex backgrounds while reducing the impact of illumination variations. Furthermore, the study introduces an Efficient Spectral-polarimetric Fusion Network (ESPFNet) for real-time semantic segmentation, which incorporates a highly efficient attention mechanism. The network utilizes a Coordination Attention Bi-modal Fusion (CABF) module based on position attention to capture both the spectral and the polarization information of the targets. Additionally, a Complex Atrous Spatial Pyramid Pooling (CASPP) module is proposed to enhance the feature extraction capability. The network also incorporates a Residual Decoding Block (RDB) to extract fused features and improve segmentation performance. To facilitate the research, a dataset of spectral-polarimetric images on ground camouflage targets is acquired using UAVs equipped with both the snapshot spectral camera and the snapshot polarization camera.


Comments:

Chapter 2.2, Equation (1): Please add a citation for the equation.

Page 5: Please give the equation of the Newton polynomial interpolation algorithm that was applied.

Chapter 3.1: Please describe the SPICO dataset in detail (acquisition conditions, iris, time). Please state the manufacturer of the cameras and the manufacturer's country.

Chapter 3.2.2: Please add citations for Equations (9), (10), and (11).

Page 10: Please explain the DoLP and IS images.

Page 13: Is the improvement achieved by ESPFNet (MIoU by 1.0% and 0.90%, MPA by 0.90% and 0.80%) sufficient?
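For context on the two metrics in the last comment, here is a minimal sketch of how MIoU and MPA are commonly computed from a confusion matrix. It assumes the widely used definitions (per-class intersection over union and per-class pixel accuracy, each averaged over classes); the manuscript's exact definitions are not reproduced in this record, and the function name and example matrix are hypothetical.

```python
def miou_and_mpa(confusion):
    """Return (MIoU, MPA) for a square confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i and
    whose predicted class is j.
    """
    n = len(confusion)
    ious, accs = [], []
    for c in range(n):
        tp = confusion[c][c]
        actual = sum(confusion[c])                          # true class c
        predicted = sum(confusion[r][c] for r in range(n))  # predicted c
        union = actual + predicted - tp
        if union > 0:
            ious.append(tp / union)
        if actual > 0:
            accs.append(tp / actual)
    if not ious or not accs:
        raise ValueError("confusion matrix contains no pixels")
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Hypothetical two-class example (background, target).
miou, mpa = miou_and_mpa([[3, 1], [1, 5]])
print(f"MIoU = {miou:.4f}, MPA = {mpa:.4f}")
```

Because both metrics are averages over classes, a gain of about one percentage point can reflect a larger change on a single small class, which is one reason a per-class breakdown or a significance test helps when judging whether such a gain is sufficient.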

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The comments have been handled.

Minor editing of English language required.
