Hyperspectral–Polarization–LiDAR Multimodal Image Fusion Method for Few-Shot Scenarios
Round 1
Reviewer 1 Report (Previous Reviewer 2)
Comments and Suggestions for Authors
The revised manuscript represents a genuine improvement over the previous version. The most significant advances are: (1) the correction of the k=5/k=3 contradiction that undermined the experimental logic; (2) the addition of a proper mathematical convergence framework; (3) substantially improved statistical reporting of classification results; and (4) the addition of a Discussion section that acknowledges key limitations.
However, the reference problem is not fully resolved. Three comparison methods — DTCWT, NSCT, and the DLT registration algorithm — are still cited with papers that demonstrably do not correspond to those methods. This is not an editorial triviality: citations to comparison methods must correctly identify the source algorithms, as reproducibility and fair evaluation depend on this. The authors were explicitly asked about this and provided a correction list, yet the corrections are incomplete.
A secondary issue concerns the standard deviation values in Table 3, which appear to contain a formatting or unit error. Values such as DTCWT: 0.33 ± 1.08×10⁻⁶ s and LatLRR: 434.58 ± 3.2×10⁻⁶ s are unrealistically precise for wall-clock measurements and are likely presented incorrectly.
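For reference, run-to-run variability in wall-clock time is normally estimated from repeated executions, and for runs lasting a sizeable fraction of a second or more the spread is typically far larger than microseconds. The following is a minimal Python sketch of such a measurement. The names `time_runs` and `dummy_fusion` are hypothetical and introduced only for illustration; they are not taken from the manuscript, and the placeholder workload is not the authors' method.

```python
import statistics
import time


def time_runs(fn, repeats=10):
    """Return (mean, sample standard deviation) of wall-clock time in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


def dummy_fusion():
    # Hypothetical stand-in for one fusion run; replace with the real method.
    sum(i * i for i in range(50_000))


mean_s, std_s = time_runs(dummy_fusion, repeats=10)
# Report mean and standard deviation in the same unit (seconds).
print(f"{mean_s:.4f} ± {std_s:.4f} s over 10 runs")
```

Reporting the number of repetitions alongside the mean and standard deviation, all in one unit, would make the Table 3 entries straightforward to verify.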
No new major inconsistencies were introduced by the revision. The conclusions do not overstate the findings, and the statistical analysis is now properly qualified (including the non-significant 1-shot result). The manuscript is close to publication quality but requires targeted corrections before final acceptance.
Author Response
Detailed revisions are illustrated in the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report (New Reviewer)
Comments and Suggestions for Authors
The manuscript presents a multimodal fusion framework combining hyperspectral, polarization, and LiDAR data for few-shot scenarios. The work is relevant; however, the following points should be addressed to improve technical clarity and rigor.
- The novelty is not clearly distinguished from recent multimodal fusion works. Please add a paragraph in the Introduction explicitly stating the unique contribution and include a comparison table with at least 3 recent methods (e.g., CNN/Transformer-based) highlighting differences in approach and performance.
- The use of L1 norm (polarization/depth) and L2 norm (spectral) needs stronger justification. Please provide theoretical reasoning and include an ablation study showing results for different norm combinations and without the multi-scale optimization.
- The method lacks reproducibility. Please provide a clear algorithm or pseudocode of the full pipeline and explicitly list all parameters (η, ε, k, δ), including how k=3 and δ=30 were selected.
- The experimental validation is limited to self-collected data. Please include results on at least one public dataset or provide cross-scene validation, and clearly define the few-shot setting (number of samples per class and train/test split).
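As an illustration of what a clearly defined few-shot setting could look like, the sketch below draws a fixed number of labelled training samples per class and assigns all remaining samples to the test set. It is only an assumed protocol, not the one used in the manuscript; the function name `few_shot_split`, the seed, and the toy labels are hypothetical.

```python
import numpy as np


def few_shot_split(labels, shots_per_class, seed=0):
    """Pick `shots_per_class` training indices per class; the rest are test."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx = []
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)
        if len(cls_idx) <= shots_per_class:
            raise ValueError(f"class {cls} has too few samples")
        # Sample without replacement so no index is chosen twice.
        train_idx.extend(rng.choice(cls_idx, size=shots_per_class, replace=False))
    train_idx = np.sort(np.array(train_idx))
    test_idx = np.setdiff1d(np.arange(len(labels)), train_idx)
    return train_idx, test_idx


labels = np.repeat([0, 1, 2], 20)  # toy labels: 3 classes x 20 samples
train_idx, test_idx = few_shot_split(labels, shots_per_class=5)
print(len(train_idx), len(test_idx))  # prints: 15 45
```

Stating the number of shots per class, the random seed, and the resulting train/test sizes in this form would allow readers to reproduce the split exactly.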
The manuscript is generally understandable, but the English language requires significant improvement. Several sentences are overly long and grammatically incorrect, which affects readability. Professional English editing is strongly recommended.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 2 Report (New Reviewer)
Comments and Suggestions for Authors
The revised manuscript has improved substantially in terms of experimental validation and methodological clarity. The added public-dataset experiments and algorithmic details strengthen the overall contribution. However, a few minor issues still require clarification.
- The final selected values of k and δ used in the experiments should be stated explicitly and consistently in the experimental section.
- The few-shot setting should be described more clearly, including the exact number of training samples per class and the train/test split used for evaluation.
- In Figures 9–11, adding enlarged local comparisons for the highlighted regions would make the visual improvements of the proposed method more convincing.
- The discussion regarding lower SSIM values compared with some baseline methods should be presented more cautiously unless supported by additional quantitative noise analysis.
The manuscript is generally readable and has improved compared with the previous version. However, several sentences still require minor grammatical correction and refinement of technical phrasing to improve clarity and readability.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
1. Although the paper provides some description of the division-of-focal-plane polarization imaging camera within the "hyperspectral polarization integrated imaging system," the description of how the 2 nm spectral resolution filter is achieved is insufficient. It is recommended to add supplementary text (covering the principles of spectral, polarization, and spatial image acquisition, imaging frame rate, etc.) or provide references related to the system.
2. Check the format of all references throughout the text.
3. It is recommended to further explain why P(I) = DOP(I)|∇AOP(I)|I was chosen. Is there a theoretical or experimental basis for this choice? Were any attempts made to compare it with other alternatives?
4. Check whether "k is set to 5" in line 432 is correct, as it appears to contradict the context.
5. Discuss the poor performance of SSIM and MI in Table 2.
6. It is recommended to supplement the detailed steps of data preprocessing to ensure the reproducibility of the method. This includes the processing of raw polarization images and the generation method of LiDAR depth images.
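To make points 3 and 6 concrete, the sketch below shows one possible way to compute such a polarization feature from the four channels of a division-of-focal-plane camera, using the standard linear Stokes relations. Several details are assumptions rather than the authors' procedure: the final factor I is taken to be the total intensity S0, the gradient is a plain finite difference that ignores the angular wrap-around of AOP, and the function name `polarization_feature` and the `eps` regularizer are hypothetical.

```python
import numpy as np


def polarization_feature(i0, i45, i90, i135, eps=1e-8):
    """Sketch of P = DOP * |grad(AOP)| * I from four polarizer-angle images."""
    # Linear Stokes parameters from the 0/45/90/135 degree channels.
    s0 = 0.5 * (i0 + i45 + i90 + i135)
    s1 = i0 - i90
    s2 = i45 - i135
    # Degree and angle of linear polarization.
    dop = np.sqrt(s1**2 + s2**2) / (s0 + eps)
    aop = 0.5 * np.arctan2(s2, s1)
    # Finite-difference gradient magnitude of AOP (wrap-around ignored).
    gy, gx = np.gradient(aop)
    grad_mag = np.hypot(gx, gy)
    # Assumption: the intensity term I is the total intensity S0.
    return dop * grad_mag * s0


rng = np.random.default_rng(0)
channels = [rng.random((8, 8)) for _ in range(4)]  # toy data, not real imagery
p_map = polarization_feature(*channels)
print(p_map.shape)
```

Documenting the preprocessing at this level of detail, including how the AOP discontinuity is handled, would address the reproducibility concern.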
Comments on the Quality of English Language
Can be improved.
Author Response
Please see the attachment.
Author Response File:
Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript addresses the topic of hyperspectral–polarimetric–LiDAR multimodal image fusion for small-sample scenarios and presents a conceptually interesting framework based on physical feature mapping and multi-scale optimization; however, the current study suffers from several substantial methodological and editorial shortcomings that significantly undermine its scientific reliability.
I identify a critical internal contradiction in the parameter selection procedure, raising doubts about whether the reported experimental results correspond to a consistent configuration, and also selective reporting of evaluation metrics, where inferior performance in structural similarity and mutual information is omitted from the main discussion despite contradicting the claimed superiority of the method. Additional concerns include multiple incorrect and duplicated references, inaccurate description of the LiDAR hardware and scanning mechanism, insufficient justification of the central “small sample” claim, and the complete absence of statistical validation of the reported results, as all quantitative metrics are presented without uncertainty estimates, repeated experiments, or variance analysis. The mathematical formulation, although conceptually plausible, lacks adequate theoretical justification and convergence discussion, and the manuscript structure omits a dedicated discussion section addressing limitations, computational cost, and generalizability. Furthermore, several inconsistencies in spatial resolution calculations, terminology, and editorial formatting indicate insufficient technical proofreading.
Taken together, these issues suggest that the work in its present form does not meet the methodological rigor and editorial standards expected for publication, and substantial revision addressing the identified conceptual, experimental, and referencing problems would be necessary before the manuscript could be reconsidered.
Author Response
Please see the attachment.
Author Response File:
Author Response.docx
