A Multi-Scale Mask Convolution-Based Blind-Spot Network for Hyperspectral Anomaly Detection
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper proposes a blind-spot network based on multi-scale blind-spot convolution for hyperspectral anomaly detection (HAD). The model is compared with several state-of-the-art methods. A mask convolution module is combined with the PD operation to adapt to the detection of anomalous objects at different scales. The experimental results show that the model outperforms the comparison methods on several accuracy assessment metrics. The following revisions and recommendations are provided to the authors to enhance the quality and presentation of the manuscript.
1. Abstract: the sentence in line 22 is very long and confusing; please revise it.
2. The use of multi-scale mask convolution is prominent in HSI processing. Please use the paper with DOI 10.1109/JSTARS.2024.3352080 to discuss your results; it offers insights into this approach in HSI unmixing.
3. Attempts to enhance the exploitation of spectral-spatial information have been made with models such as S2DWMTrans (DOI 10.1109/JSTARS.2022.3232762). You can use it to benchmark your proposed method.
4. Explain the selection of the nine state-of-the-art models used for comparison.
5. Add a validation section for readers, describing the method used to assess your results.
6. The proposed blind-spot network, which includes multi-scale blind-spot convolution, a dynamic fusion module, a spatial-spectral joint module, and a preprocessing technique, is complex and may incur high computational costs. This could be assessed within the methodology or highlighted in a discussion section.
7. Some typographical errors in the manuscript need to be checked, e.g., Line 144, Line 214: rewrite properly; Line 400: needs specificity.
8. Revise line 524: “In summary, our proposed MMBSN has demonstrated remarkable performance in visualization and qualitative as well as quantitative evaluation.”
9. Revise line 568: reverse “In general” and “the preprocessing”. (References are needed in the paragraph.)
10. Add the future perspectives of the current work.
Comments on the Quality of English Language
Minor editing of English language required
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
General Comment:
In this paper, the authors propose a new blind-spot network based on multi-scale mask convolution for HAD. The network can adapt to the detection of multi-scale anomaly targets and achieves excellent performance on four data sets. However, there are many shortcomings in both the description of the method and the experiments; please see the detailed comments. I recommend a major revision before publication.
Detailed Comments:
1. The authors mention that a spatial domain-based screening approach is inspired by the local mean filtering algorithm, but it is unclear why they adopted a strategy of dividing the outer window into 9 inner windows for local spatial screening. Why not take a strategy where each pixel within the outer window is considered as the center of an inner window? Such a filtering strategy might be more accurate.
2. The descriptions of the background feature attention module (BFAM) and the dynamic learnable fusion module (DLFM) are too broad. The processes and principles of these modules should be described in detail and thoroughly explained.
3. In the DLFM, the negative part of the difference feature 𝑋𝑜𝑢𝑡 − 𝑋𝑜𝑢𝑡 will be shielded after the ReLU layer. The authors should confirm whether the absolute value of the difference feature is taken. If the absolute value is not taken, will some abnormal features not be effectively suppressed?
4. The paper mentions that it is necessary to focus on comparing three methods: PDBSNet, BockNet and MMBSN. However, the subsequent analysis provides only a brief and general overview, without a detailed comparison of the performance of the three blind-spot networks. Please add some demonstrations of this type of comparison.
5. The parameter settings are not introduced. Although a parameter analysis is performed, the final parameters used in the experiments are not stated. Please add this information.
6. The paper claims that the detection performance of masked convolutions varies at different scales, but it does not provide experimental evidence to support this claim. Additional experiments should be conducted to validate the performance of masked convolutions at different scales.
7. Fig. 16 is too blurry; please provide clearer images.
8. Please add more target detection methods to the related work section:
[1] ISNet: Shape matters for infrared small target detection, CVPR.
[2] RKformer: Runge-Kutta transformer with random-connection attention for infrared small target detection, ACM MM.
[3] Exploring feature compensation and cross-level correlation for infrared small target detection, ACM MM.
[4] Dim2Clear network for infrared small target detection, TGRS.
[5] CHFNet: Curvature half-level fusion network for single-frame infrared small target detection, Remote Sensing.
[6] Thermodynamics-inspired multi-feature network for infrared small target detection, Remote Sensing.
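As a minimal illustration of the concern raised in Comment 3, the following Python sketch contrasts a ReLU with an absolute value applied to a signed difference. It is not the authors' DLFM: the sample values and the helper names `relu` and `absolute` are hypothetical and chosen only to show the effect.

```python
# Hypothetical signed difference feature (values are illustrative only,
# not taken from the manuscript).
diff = [0.8, -0.6, 0.1, -0.05]


def relu(values):
    """Zero out negative entries, keep non-negative entries unchanged."""
    return [max(0.0, v) for v in values]


def absolute(values):
    """Keep the magnitude of every entry regardless of sign."""
    return [abs(v) for v in values]


relu_out = relu(diff)      # negative differences are shielded (set to 0.0)
abs_out = absolute(diff)   # negative differences keep their magnitude

print("ReLU:", relu_out)
print("abs: ", abs_out)
```

In this toy case the large negative entry (-0.6) vanishes after the ReLU but survives under the absolute value, which is the distinction the comment asks the authors to clarify.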
Author Response
Please see the attachment.
Author Response File: Author Response.pdf