Unsupervised Wildfire Detection Using Multispectral MTG-FCI Data: A Feasibility Study
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This manuscript, titled "Unsupervised Wildfire Detection Using Multispectral MTG-FCI Data: A Feasibility Study", evaluates two unsupervised approaches (a fixed-threshold method and a lightweight U-Net autoencoder) for near-real-time wildfire detection using data from the Meteosat Third Generation Flexible Combined Imager (MTG-FCI).
- The paper reports that “383 fire events” were identified across 11 days, but an event is never clearly defined. Is it a spatially contiguous cluster of fire pixels? A temporal sequence?
- The autoencoder was trained on “fire‑free imagery,” but it remains unclear whether the training data come from the same seasonal and geographic domain (Italy, summer) as the test data.
- The validation is largely qualitative (visual agreement with dNBR, FRP, EFFIS). For a feasibility study, quantitative metrics (e.g., precision, recall, F1-score, false alarm rate) should be reported at the pixel or event level, especially given that no absolute ground truth exists.
- The reconstruction (Fig. 3b) shows black borders due to the 64×64 sliding window with stride 32. The paper does not quantify how much of the image area is lost or how reconstruction error behaves near boundaries.
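The area lost at the borders can be estimated directly from the tiling geometry. The sketch below is illustrative only: it assumes windows are placed from the top-left corner with no padding, so that pixels beyond the last full window are never reconstructed. The function names and the 200 × 300 scene size are hypothetical and are not taken from the manuscript.

```python
def covered_extent(length, window=64, stride=32):
    """Number of pixels along one axis covered by at least one full window."""
    if length < window:
        return 0  # no full window fits along this axis
    n_windows = (length - window) // stride + 1
    return (n_windows - 1) * stride + window


def lost_fraction(height, width, window=64, stride=32):
    """Fraction of the image area that no full window reaches."""
    covered = covered_extent(height, window, stride) * covered_extent(width, window, stride)
    return 1.0 - covered / (height * width)


# Hypothetical 200 x 300 pixel scene (assumed size, not the manuscript's).
print(covered_extent(200), covered_extent(300))
print(lost_fraction(200, 300))
```

For this assumed scene, 192 × 288 pixels are covered, so roughly 7.8% of the area is lost. The same calculation with the actual image dimensions would give the figure requested above.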
Author Response
Dear Reviewer, thank you very much for your valuable and constructive comments. We have thoroughly addressed every point raised in your report. For the sake of clarity, layout preservation, and mathematical formatting (including tables and equations), our comprehensive point-by-point responses and the corresponding updated text excerpts have been compiled into a dedicated PDF document. Please refer to the attached file: "Response_to_Reviewer_1.pdf".
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
I have a few comments below to clarify your manuscript.
Lines 12–24: "3–5 hours early detection" is presented as a general success. However, given that a 30-minute delay was observed in the Foggia case (Rows 463–464), slightly softening this statement to "in some cases" or "in major events" would provide more accurate expectation management.
Lines 217–226: The choice of d = 4 is convincingly justified in terms of reconstruction quality. However, adding a single performance metric such as detection sensitivity or susceptibility to the table would more directly demonstrate the impact of bottleneck size on detection quality.
Line 268 (α = 0.7): A brief explanation of how this weighting coefficient in the local score formula was determined should be added. If it is an empirical choice, this should be stated; if a sensitivity analysis was performed, it should be summarized.
Lines 492–497: It is proposed to add two items to the current limitations section: (1) the geographical and seasonal generalizability of the analysis outside of the Italian summer season, (2) the potential for urban heat islands and industrial thermal anomalies to generate systematic false positives. This addition can be kept brief as the Catania case has already been addressed.
Author Response
Dear Reviewer, thank you very much for your valuable and constructive comments. We have thoroughly addressed every point raised in your report. For the sake of clarity, layout preservation, and mathematical formatting (including tables and equations), our comprehensive point-by-point responses and the corresponding updated text excerpts have been compiled into a dedicated PDF document. Please refer to the attached file: "Response_to_Reviewer_2.pdf".
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
1. Insufficient Description of Innovation (Contribution Needs Strengthening)
Currently, BLU-Net is essentially still a U-Net autoencoder applied to anomaly detection, highly similar to existing literature (anomaly detection + remote sensing). Authors are advised to clearly state: "Compared to existing MTG/SEVIRI or GOES research, what is the technological breakthrough of this study?" Otherwise, it may be perceived as applied rather than innovative research.
2. Lack of Quantitative Comparison with Existing Methods
Currently, only "threshold vs BLU-Net" is compared; the following are missing:
Comparison with existing methods (such as MODIS/VIIRS detection algorithms, GOES AI methods)
Comparison with other deep learning models (CNN, Transformer)
Baseline comparison is recommended; otherwise, the method's advantages cannot be demonstrated.
3. Incomplete Evaluation Metrics (Lack of Standard Metrics)
Current results mainly include:
Visual comparison
Case studies
But standard evaluation metrics are lacking, such as:
Precision / Recall / F1-score
False alarm rate
Detection latency statistics
Quantitative evaluation is recommended; otherwise, the academic persuasiveness is insufficient.
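As an illustration of the metrics listed above, the following minimal sketch computes them from a predicted and a reference fire mask given as flat 0/1 sequences. It is not the authors' evaluation code. The function name is hypothetical, and the false alarm rate is assumed here to mean FP / (FP + TN); some fire-detection studies instead report FP / (FP + TP), so the definition used should be stated explicitly.

```python
def detection_metrics(pred, ref):
    """Pixel-level metrics from two equal-length sequences of 0/1 labels."""
    if len(pred) != len(ref):
        raise ValueError("pred and ref must have the same length")
    tp = sum(1 for p, r in zip(pred, ref) if p and r)
    fp = sum(1 for p, r in zip(pred, ref) if p and not r)
    fn = sum(1 for p, r in zip(pred, ref) if not p and r)
    tn = sum(1 for p, r in zip(pred, ref) if not p and not r)

    def ratio(num, den):
        # Returns 0.0 when the denominator is zero (a convention, not a standard).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    f1 = ratio(2 * precision * recall, precision + recall)
    false_alarm_rate = ratio(fp, fp + tn)  # assumed definition: FP / (FP + TN)
    return {"precision": precision, "recall": recall,
            "f1": f1, "false_alarm_rate": false_alarm_rate}


# Toy masks for illustration only; not data from the manuscript.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
ref = [1, 0, 0, 0, 1, 1, 0, 0]
print(detection_metrics(pred, ref))
```

The same function applies at the event level if each element represents one event rather than one pixel, although true negatives are then harder to define.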
4. Ground truth issue not properly handled.
The authors point out that there is no absolute ground truth, but the current approach (FRP + EFFIS + Sentinel-2) still leans towards indirect validation.
Recommendation: Establish a "pseudo ground truth" or adopt event-based evaluation
Otherwise, the validation may be questioned for lack of rigor.
5. Data range too small (only 11 days).
The study only analyzes 11 days of data, resulting in a limited sample size.
Recommendation: Expand the time range (at least a whole quarter or multiple years) or explain whether the data selection is representative.
6. BLU-Net architecture design lacks theoretical support.
Although there is an ablation study (bottleneck=4), it lacks:
Theoretical explanation;
Comparison of different models (e.g., standard U-Net vs Light U-Net);
It is recommended to provide a design rationale or control experiments.
7. Hyperparameter settings too empirical.
For example:
α = 0.7
percentile = 95 / 99.95
There is currently no explanation of how these parameters were chosen or whether they are stable.
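One simple stability check is to sweep the percentile and record how many pixels exceed the resulting threshold. The sketch below assumes a nearest-rank percentile and a strict "greater than" test, neither of which is confirmed by the manuscript, and it uses synthetic scores in place of real anomaly maps. All names are hypothetical.

```python
import math


def nearest_rank_percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def flagged_counts(scores, percentiles):
    """Number of scores strictly above the threshold at each percentile."""
    counts = {}
    for p in percentiles:
        threshold = nearest_rank_percentile(scores, p)
        counts[p] = sum(1 for s in scores if s > threshold)
    return counts


# Synthetic anomaly scores 1..1000, standing in for a real score map.
scores = list(range(1, 1001))
print(flagged_counts(scores, [95, 99, 99.95]))
```

Reporting such counts, or the corresponding detection metrics, across a range of percentiles would show whether the chosen values sit in a stable regime.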
8. Too many image results but lack of comprehensive analysis
The article contains numerous case figures (Fig. 4–9), but:
Lack of statistical aggregation, making it difficult to draw general conclusions.
Recommendation:
Add tables summarizing performance or error analysis.
9. Discussion section is too descriptive, lacking critical analysis
Discussions mostly focus on "method effectiveness," but rarely discuss:
What scenarios would cause failure?
Misjudgment of cloud/surface high temperature
Integration limitations with existing systems.
Recommendation: Add failure case analysis.
10. Paper positioning needs to be more specific (Feasibility vs. Contribution)
The paper is currently positioned as a "feasibility study," but if the journal is SCI-indexed (e.g., J. Imaging), a certain level of academic depth is still required.
Recommendation: Clearly position it as technical validation + operational potential or increase the depth of the method to meet research paper standards.
Author Response
Dear Reviewer, thank you very much for your valuable and constructive comments. We have thoroughly addressed every point raised in your report. For the sake of clarity, layout preservation, and mathematical formatting (including tables and equations), our comprehensive point-by-point responses and the corresponding updated text excerpts have been compiled into a dedicated PDF document. Please refer to the attached file: "Response_to_Reviewer_3.pdf".
Author Response File:
Author Response.pdf
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
The revised manuscript has significantly improved in terms of methodological clarity, quantitative validation, and architectural justification. In particular, the added ablation study and the introduction of Sensitivity, FPR, and F1-score metrics substantially strengthen the technical contribution of the work.
Although the validation framework has been improved, the manuscript should more explicitly discuss the limitations of using FRP and EFFIS products as indirect ground-truth references. A brief clarification of the validation strategy's uncertainty would improve scientific transparency.
The discussion section would benefit from a more critical analysis of potential failure cases, including cloud contamination, highly reflective hot surfaces, and seasonal environmental variability that may affect anomaly detection performance.
While the BLU-Net architecture is now better justified, the manuscript would still benefit from a concise comparison with a standard U-Net or another commonly used anomaly-detection baseline, even if only discussed qualitatively.
Some minor formatting and editorial issues remain in the manuscript. For example, unresolved placeholders such as "Tab. ??" should be corrected, and several long theoretical paragraphs could be streamlined for readability.
The manuscript is now much stronger than the original submission and demonstrates promising potential for near-real-time wildfire detection using MTG-FCI data. After addressing the remaining minor issues above, the manuscript could be suitable for publication.
Author Response
Dear Reviewer, thank you very much for your valuable and constructive comments on this revised version. We have thoroughly addressed every remaining point raised in your report to further enhance the scientific transparency and critical depth of our work. For the sake of clarity, layout preservation, and precise mathematical formatting, our detailed point-by-point responses—including the exact excerpts of the added text—have been compiled into a dedicated PDF document. Please refer to the attached file: 'Response_to_Reviewer_3.pdf'. We believe the manuscript is now significantly strengthened thanks to your insights.
Author Response File:
Author Response.pdf
