Active Fire Detection from Landsat-8 Imagery Using Deep Multiple Kernel Learning
Round 1
Reviewer 1 Report
This paper addresses the problem of active fire detection (AFD) and proposes an efficient deep multiple kernel learning network for multiscale AFD from Landsat 8 images. The network uses convolution layers with multiple kernel sizes and different dilation rates for multi-scale feature extraction. The feasibility of the proposed method is verified experimentally, and the optimal combination of multi-scale parameters is identified. However, the reviewer still has the following comments:
- The main novelty of this paper is the use of deep multiple kernel learning for AFD. According to the introduction, the model builds on the release of large-scale datasets for AFD. Were any deep learning models for active fire detection proposed before this one? If so, please add the corresponding references.
- The experiments in this paper only vary the scale parameters of the model. Quantitative comparisons with other classical methods need to be added.
- The resolution of some figures needs to be improved, such as Figure 2 and Figure 3.
- The figure numbering in the manuscript is wrong: there are two figures labelled Figure 3.
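For readers of this record, the multi-scale idea summarized in Reviewer 1's report (parallel kernels of different sizes combined with dilation) can be illustrated with the standard effective-kernel-size formula for dilated convolutions. This is only a sketch: the kernel sizes 3, 5, 7 and the dilation rate 2 are assumptions read off the configuration name B4K357D2 quoted in Reviewer 2's report, and the function names are hypothetical, not taken from the manuscript.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent (in pixels, along one axis) covered by a dilated kernel.

    Standard relation: k_eff = k + (k - 1) * (d - 1).
    """
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def multiscale_extents(kernel_sizes, dilation):
    """Map each kernel size to the extent it covers at the given dilation."""
    return {k: effective_kernel_size(k, dilation) for k in kernel_sizes}


# Assumed configuration: kernel sizes 3, 5, 7 with dilation rate 2.
extents = multiscale_extents((3, 5, 7), dilation=2)
print(extents)  # {3: 5, 5: 9, 7: 13}
```

Under these assumed settings, the three parallel branches would cover 5, 9 and 13 pixels per axis, which is what lets a single layer respond to fires of different apparent sizes.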
Author Response
Please see the attached file.
Author Response File: Author Response.docx
Reviewer 2 Report
This paper addresses an interesting topic, the study is well structured and the manuscript provides all necessary detail to replicate it. The effects of modifying different aspects of the CNN architecture are thoroughly discussed, and I think that the knowledge accumulated during this work would be useful for the community. However, I have two important concerns about the overall contribution of this work.
First, the authors claim that they are providing an improvement over a “reference” previous work (de Almeida et al., 2021). However, the performance of the proposed CNN architectures is not compared against the performance of de Almeida et al.’s methods. Not only that, but the authors of this article have not used the same dataset for testing. One of the main contributions of de Almeida et al. (2021) was that they trained their networks on automatically labelled data and then validated the results using a manually annotated testing dataset. The authors of this work seem to have used the automated labels only, for training, validation and testing. If that is the case, these results are not even comparable to those of de Almeida et al. (2021). The authors should better discuss the relationship of this work with the referenced previous work and, if they are using the previous work as a benchmark, conduct a meaningful comparison.
Second (and this is a shortcoming shared by de Almeida et al.’s work), the reference fire masks treated as ground truth are not actually ground truth, and this must be handled with care. What the authors call “ground truth” does not denote the actual location of the fire on the ground, but an annotation conducted on the same images by different means. While other applications of computer vision may easily get away with this discrepancy, the difference matters in remote sensing. Finding an automated way to correctly identify fire pixels in satellite imagery is definitely helpful, but its implications should not be overstated. None of these methods will be able to detect a fire if the satellite sensor does not capture it correctly (e.g., because of cloud cover, smoke, lack of spatial resolution or incorrect band selection). Similarly, the CNN may be trained on a wrong dataset if the so-called ‘manual truth’ is incorrect. De Almeida et al. did not explain how they manually annotated their ‘ground truth’, nor did they validate it against any actual ground detection. This limitation is inherited from the previous work by de Almeida et al. (2021), but it should be discussed and the applicability of these methods should be clearly stated: they allow automatically extrapolating the annotation of fire pixels in thousands of satellite images based on a reduced sample annotated manually.
See some other minor comments below:
Line 47: When referring to previous work on computer vision for fire detection, these works should be cited:
Martinez-de Dios, J.R., B.C. Arrue, A. Ollero, L. Merino, and F. Gómez-Rodríguez. 2008. “Computer Vision Techniques for Forest Fire Perception.” Image and Vision Computing 26 (4): 550–62. https://doi.org/10.1016/j.imavis.2007.07.002.
Toulouse, Tom, Lucile Rossi, Turgay Celik, and Moulay Akhloufi. 2015. “Automatic Fire Pixel Detection Using Image Processing: A Comparative Analysis of Rule-Based and Machine Learning-Based Methods.” Signal, Image and Video Processing, 1–8. https://doi.org/10.1007/s11760-015-0789-x.
Valero, M. M., O. Rios, E. Pastor, and E. Planas. 2018. “Automated Location of Active Fire Perimeters in Aerial Infrared Imaging Using Unsupervised Edge Detectors.” International Journal of Wildland Fire 27 (4): 241–56. https://doi.org/10.1071/WF17093.
Line 92: Cloud detection capabilities are not relevant in this study.
191: The indices referred to (ref. 41) are fire severity indices, not fire detection indices. They are fundamentally different. That sentence should be removed.
338: The Jaccard index is missing a reference. 51 is not the correct reference for this.
341: The abbreviations TP, FP, etc. must be defined when first used in the text.
464: Heading 4.3. Qualitative Evaluation of MultiScale-Net is duplicated.
479-480: “This result shows that the simultaneous use of kernels of different sizes accurately extracts the spectral-spatial properties associated with active fire, allowing even a single pixel of fire surrounded by background pixels to be properly segmented”. This contradicts the fact that B4K357D2 and B1K357D2 both use the same number of kernel sizes. Please explain better.
The images displayed in figures should have a scale.
The use of the word “firing” is confusing. By using it, the authors seem to mean “pixels that correspond to active fire”, but this term is not common. Please rephrase.
While the English is generally acceptable, there are some parts of the manuscript where errors make the meaning hard to understand. Please revise the writing.
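As a point of reference for the comments on lines 338 and 341 above, the usual definitions of TP, FP, FN and TN for a binary fire mask, and the Jaccard index (intersection over union) built from them, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names and the toy masks are hypothetical, and returning 1.0 when both masks are empty is an assumed convention.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN between two binary masks (nested lists of 0/1).

    TP: fire predicted, fire in reference.   FP: fire predicted, none in reference.
    FN: fire missed by the prediction.       TN: background in both masks.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def jaccard_index(pred, truth):
    """Jaccard index: TP / (TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return tp / denom if denom else 1.0


# Toy 2x3 masks (hypothetical data).
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
print(confusion_counts(pred, truth))  # (2, 1, 1, 2)
print(jaccard_index(pred, truth))     # 0.5
```

Because TN does not enter the ratio, the index is not inflated by the large number of background pixels, which matters when fire pixels are a small fraction of each scene.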
Author Response
Please see the attached file.
Author Response File: Author Response.docx
Reviewer 3 Report
The manuscript entitled “Deep Multiple Kernel Learning for Active Fire Detection from Landsat 8 Imagery” presents an interesting method for fire mapping based on freely available data and well-established modelling approaches. The manuscript examines several scenarios for the development of the classifier with interesting results, and the approach is applied to multiple scenes with significant differences in the fire/non-fire mosaic. The achieved accuracy is high, demonstrating the convincing work done by the authors.
The manuscript is generally well written, although some changes to language and expression need to be made. The introduction provides all the information the reader needs to follow the manuscript, and the state of the art on the subject is nicely presented. The methodology and the results and discussion sections are quite complicated, but I understand that the methodology itself is complicated and that the multiple scenarios employed add to this complexity. However, this complexity adds significant value to the results by demonstrating the comprehensive approach adopted. I don’t have any major concerns about this study. Perhaps the authors could simplify the presentation without losing information or jeopardizing the transferability of the approach, because I believe this would increase the chances of the approach being adopted in similar research and at an operational level.
I don’t have any major concerns about the study. I have highlighted some points in the text that the authors might wish to consider when revising the manuscript.
Comments for author File: Comments.pdf
Author Response
Please see the attached file.
Author Response File: Author Response.docx
Reviewer 4 Report
Dear Editor/Authors,
I have read the manuscript remotesensing-1558038, entitled "Deep Multiple Kernel Learning for Active Fire Detection from Landsat 8 Imagery", written by Rostami et al. and submitted for publication in Remote Sensing Journal. The paper addresses a very important topic related to applications of remote sensing images in wildfire monitoring and management. The paper is well written, and the methods used produced high-impact results. Based also on the detailed discussion and conclusions, I recommend the paper for publication. Some minor corrections/suggestions: the paper seems a bit long, so I suggest shortening the methods part, resizing Table 2, and compressing Figures 2 to 10 where possible. I also suggest using simply “methods” instead of “proposed methodology” (row 173).
Best regards
Author Response
Please see the attached file.
Author Response File: Author Response.docx