Peer-Review Record

Orthomosaicking Thermal Drone Images of Forests via Simultaneously Acquired RGB Images

Remote Sens. 2023, 15(10), 2653; https://doi.org/10.3390/rs15102653
by Rudraksh Kapil 1,2,*, Guillermo Castilla 3, Seyed Mojtaba Marvasti-Zadeh 2, Devin Goodsman 3, Nadir Erbilgin 2,† and Nilanjan Ray 1,†
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 17 April 2023 / Revised: 16 May 2023 / Accepted: 17 May 2023 / Published: 19 May 2023
(This article belongs to the Collection Feature Paper Special Issue on Forest Remote Sensing)

Round 1

Reviewer 1 Report

This manuscript studies the orthomosaicking of thermal images acquired by aerial photography. By using RGB images acquired at the same time as the thermal images, an accurate, high-precision thermal orthomosaic can be obtained in a single flight. The work is of great significance to the field of remote sensing monitoring, as it can greatly improve the efficiency of generating thermal remote sensing maps. The manuscript also makes the proposed method openly available and provides processing tools, which offers practitioners a brand-new solution with great application value. The content of the article needs almost no changes; only a few issues need to be improved:

1. The abstract of the manuscript is well-structured, reasonably concise, and effectively conveys the content of the paper. However, incorporating numerical indicators obtained in the study into the results section could strengthen the persuasiveness of the article.

2. The introduction section could benefit from a more comprehensive overview of the problems related to computer vision detection and references to relevant research (Optimization strategies of fruit detection to overcome the challenge of unstructured background in field orchard environment: a review; Precision Agriculture. Rachis detection and three-dimensional localization of cut off point for vision-based banana robot. Computers and Electronics in Agriculture.).

3. I observed that the registered sample images in this manuscript all show forest areas containing only trees. I suggest adding example images that contain large areas of land or water, or forest areas with abnormal trees, and comparing the thermal orthomosaicking results for such areas.

4. In Chapter 3.5, the sample images for crown detection all have similar lighting conditions. I suggest showing crown detection results under other lighting conditions, such as RGB images that lie entirely in shadow, or thermal images taken just after sunset, when the trees have not yet fully cooled.

Author Response

 

Point 1: This manuscript studies the orthomosaicking of thermal images acquired by aerial photography. By using RGB images acquired at the same time as the thermal images, an accurate, high-precision thermal orthomosaic can be obtained in a single flight. The work is of great significance to the field of remote sensing monitoring, as it can greatly improve the efficiency of generating thermal remote sensing maps. The manuscript also makes the proposed method openly available and provides processing tools, which offers practitioners a brand-new solution with great application value. The content of the article needs almost no changes; only a few issues need to be improved:

 

Response 1: We thank the reviewer for thoroughly reading the manuscript and providing the following recommendations.

 

Point 2: The abstract of the manuscript is well-structured, reasonably concise, and effectively conveys the content of the paper. However, incorporating numerical indicators obtained in the study into the results section could strengthen the persuasiveness of the article.

 

Response 2: Thank you for pointing this out. We have included the quantitative results using the MI metric to concretely highlight the improvement achieved by the NGF-based co-registration. Specifically, this has been added to lines 12-14.

 

Point 3: The introduction section could benefit from a more comprehensive overview of the problems related to computer vision detection and references to relevant research (Optimization strategies of fruit detection to overcome the challenge of unstructured background in field orchard environment: a review; Precision Agriculture. Rachis detection and three-dimensional localization of cut off point for vision-based banana robot. Computers and Electronics in Agriculture.).

 

Response 3: Thank you for referring us to these works. These works discuss object detection strategies relevant to forest monitoring. However, our study focuses on the orthomosaicking process of forest imagery, and detection (of tree crowns) is used as an auxiliary task to demonstrate the benefit of our proposed workflow that generates geometrically-aligned orthomosaics. With this in mind (and considering the already lengthy introduction), we believe discussing such works may distract readers from our key focus in this work.  

 

Point 4: I observed that the registered sample images in this manuscript all show forest areas containing only trees. I suggest adding example images that contain large areas of land or water, or forest areas with abnormal trees, and comparing the thermal orthomosaicking results for such areas.

 

Response 4: The current study is one important component of a more comprehensive work currently underway. We intentionally consider only dense forest areas in this work. The orthomosaicking results for the other suggested sites should be as good if not better, considering that images of those regions are expected to have more salient features for effective SfM processing. To aid future work in these directions, we provide an open-source tool that could be tweaked or tested by other researchers working on agricultural or water-related topics.

 

Point 5: In Chapter 3.5, the sample images for crown detection all have similar lighting conditions. I suggest showing crown detection results under other lighting conditions, such as RGB images that lie entirely in shadow, or thermal images taken just after sunset, when the trees have not yet fully cooled.

 

Response 5: That is an interesting idea. However, in this work the purpose of reporting the tree crown detection results was purely to highlight the geometric alignment of the generated RGB and thermal orthomosaics. Therefore, we displayed the trees that were detected with the highest confidence. For lower-confidence detections (e.g., shorter trees in low lighting), the bounding boxes would still correspond to the same geographical regions in both orthomosaics (due to the alignment). Note that we used a pre-existing RGB detector without modification for this part of the study, so handling such situations is outside the scope of the manuscript.

Reviewer 2 Report

The paper describes a new workflow that generates two geometrically aligned orthomosaics from simultaneously acquired RGB and thermal drone images. The thermal images are used for texturing the RGB orthomosaic, which introduces fine details useful in field data analysis – with good applicability to forest monitoring.

I find the paper not easy to read; I had to re-read the text many times.

I found non-relevant paragraphs summarizing the content of the paper in the introduction (lines 161-168) or summarizing the content of the chapters (e.g., lines 170-175). Please consider deleting them or replacing them with more relevant content.

The paper is very technical, offering the full work-flow details.

I particularly liked the workflow charts in Figures 1 and 3 – but I suggest enlarging them so that they are easier to read and have more impact.

I also suggest enlarging Figure 6, as its text is very difficult to read.

Oblique photo acquisition is a plus for SfM 3D orthomosaics – why use only nadir photos? How does oblique photo acquisition affect thermal imaging? Is it a no-go? Please consider explaining this within the article text.

In my experience, this is a technique relevant also for archaeological surveys, offering good quality details of buried walls (warm), filled trenches (cold), etc.

English language is fine.

Author Response

We sincerely thank the reviewer for thoroughly reading the manuscript and for their generous feedback. 

 

Point 1: The paper describes a new workflow that generates two geometrically aligned orthomosaics from simultaneously acquired RGB and thermal drone images. The thermal images are used for texturing the RGB orthomosaic, which introduces fine details useful in field data analysis – with good applicability to forest monitoring.

I find the paper not easy to read; I had to re-read the text many times.

 

Response 1: We agree it may be difficult to read because of the complexity of the cross-disciplinary subject matter, and readers not familiar with the specific methods used may require a second reading, but we made our best effort to explain everything in detail and in (hopefully) clear language.

Point 2: I found non-relevant paragraphs summarizing the content of the paper in the introduction (lines 161-168) or summarizing the content of the chapters (e.g., lines 170-175). Please consider deleting them or replacing them with more relevant content.

Response 2: Thank you for pointing this out. For the mentioned outline paragraph in lines 161-168, we agree it is redundant and have removed it to help the introduction text be more focussed. As for the brief paragraph in lines 170-175, we believe it will be helpful to orient readers at the start of the Materials and Proposed Methods section, considering this one section details all of the following: our data, proposed method (in-depth), auxiliary detection task, open-source tool, and performance assessment.

 

Point 3: The paper is very technical, offering the full work-flow details.

Response 3: We agree with this point – this is intentional to ensure reproducibility.

Point 4: I particularly liked the workflow charts in Figures 1 and 3 – but I suggest enlarging them so that they are easier to read and have more impact. I also suggest enlarging Figure 6, as its text is very difficult to read.

Response 4: Thank you for the suggestion. We have increased the sizes of Figures 1 and 3 to the full extent of the page to improve readability. The size of Figure 6 has been increased as much as possible without disrupting the organization of the surrounding text and figures. We have also removed a redundant clause in the caption of Figure 6 to facilitate its larger size.

Point 5: Oblique photo acquisition is a plus for SfM 3D orthomosaics – why use only nadir photos? How does oblique photo acquisition affect thermal imaging? Is it a no-go? Please consider explaining this within the article text.

Response 5: We agree that oblique imagery helps the SfM process, but the instrument we used does not allow for ‘smart oblique’ capture, and acquiring oblique photos in a 2nd flight will likely bring more issues than help (for the purposes of orthomosaicking) due to changing wind and illumination conditions.

Point 6: In my experience, this is a technique relevant also for archaeological surveys, offering good quality details of buried walls (warm), filled trenches (cold), etc.

Response 6: This is an interesting suggestion, and we hope other researchers working in the archaeology field can corroborate its applicability. Our open-source tool should be beneficial for such work in the future.

Reviewer 3 Report

This study offers a method to improve the precision of the lower-resolution thermal UAV images of forests using the more precise simultaneously acquired RGB images.

The proposed thermal orthomosaicking workflow leverages simultaneously acquired RGB images to produce a surface mesh via structure from motion, while thermal images are only used to texture this mesh and yield a thermal orthomosaic. Prior to texturing, RGB-thermal image pairs are co-registered using an affine transformation derived from a machine learning technique. The results show that the thermal orthomosaic generated from the workflow is of better quality than those from other existing methods, is geometrically aligned with the RGB orthomosaic, and preserves radiometric information from the original thermal imagery.

The Introduction section of the manuscript provides the background of the study, highlighting that existing workflows typically generate orthomosaics for RGB and thermal data of forests separately. Thermal images typically lack sufficient contrast, and the orthomosaics generated with these standard workflows have gaps, leading to incomplete coverage of the area of interest. Techniques to improve the thermal orthomosaicking workflow include deriving the thermal image positions from RGB image alignment, using unregistered images, or stacking each pair of RGB and thermal images into a 4-channel image prior to orthomosaicking and separating them afterward. The new integrated RGB and thermal processing workflow overcomes the mentioned challenges of thermal orthomosaicking of forest scenes and is applicable to drone instruments that can simultaneously capture RGB and thermal images.

The Materials and Methods section states that the study area is northeast of Cynthia, Alberta (Canada), at an elevation of around 950 m above sea level, with the main vegetation type being temperate mixed forest. A Zenmuse H20T instrument mounted on a DJI Matrice 300 RTK quadcopter was used to take nadir wide-angle RGB images of 3040x4056 pixels, which cover the most terrain (83° field of view, FOV), and thermal images of 640x512 pixels with a FOV of 41°. The proposed integrated orthomosaicking workflow comprises RGB orthomosaic generation, thermal image conversion (from R-JPEG to grayscale TIFF), RGB-thermal image co-registration, and thermal orthomosaic generation. The section describes each of these steps in detail; the authors have also created an open-source tool for the workflow that offers both a command line interface and a graphical user interface (GUI).
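The two fields of view quoted above imply that each thermal frame covers only the central part of its paired RGB frame. A rough sketch of that footprint is given below. It assumes an ideal pinhole model, that both FOV angles are measured along the same image direction, and that the RGB frame is 4056 pixels wide by 3040 pixels high; it ignores lens distortion and the differing aspect ratios. The function name `central_crop_box` is hypothetical and is not taken from the authors' tool, which may choose its crop differently.

```python
import math

def central_crop_box(width, height, wide_fov_deg, narrow_fov_deg):
    """Return (left, top, right, bottom) of the centred region of the wide image
    that spans roughly the narrow camera's field of view (pinhole assumption)."""
    # Under a pinhole model the image extent scales with tan(FOV / 2).
    scale = math.tan(math.radians(narrow_fov_deg) / 2) / math.tan(math.radians(wide_fov_deg) / 2)
    crop_w = round(width * scale)
    crop_h = round(height * scale)
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h

# Assumed frame size and the FOV values quoted in the report.
box = central_crop_box(4056, 3040, 83.0, 41.0)
print(box)
```

Under these assumptions the thermal footprint spans a little over 40% of the RGB frame in each direction, which is consistent with the need to crop the RGB images before co-registration.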

The Experimental Results section uses the mutual information (MI) between an RGB and thermal image pair as a quantitative metric of the performance of each image co-registration technique; MI measures the similarity in the gray-level distributions of the two images using their histograms. The paper presents qualitative comparisons by displaying the co-registered images and orthomosaics of the unregistered baseline and of the proposed workflow. The workflow preserves the radiometric information present in the individual thermal images that are used to generate the orthomosaic. As a result of the good geometric alignment between the generated orthomosaics, the same trees appear at the same locations in each of the corresponding pairs.
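For readers unfamiliar with the metric, the following is a minimal sketch of histogram-based mutual information between two grayscale images of the same shape. It is an illustration only: the bin count, the use of natural logarithms, and the function name `mutual_information` are assumptions, and the manuscript's exact implementation may differ.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Mutual information (in nats) of two same-shape grayscale images,
    estimated from their joint gray-level histogram."""
    joint, _, _ = np.histogram2d(np.ravel(img_a), np.ravel(img_b), bins=bins)
    p_ab = joint / joint.sum()                 # joint probability
    p_a = p_ab.sum(axis=1, keepdims=True)      # marginal of img_a (column vector)
    p_b = p_ab.sum(axis=0, keepdims=True)      # marginal of img_b (row vector)
    nonzero = p_ab > 0                         # skip empty bins to avoid log(0)
    return float(np.sum(p_ab[nonzero] * np.log(p_ab[nonzero] / (p_a @ p_b)[nonzero])))

rng = np.random.default_rng(0)
a = rng.random((64, 64))
b = rng.random((64, 64))
print(mutual_information(a, a), mutual_information(a, b))
```

An image compared with itself gives a high value, while two unrelated images give a value near zero, so better-aligned RGB-thermal pairs are expected to score higher.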

The Discussion section shows that automated intensity-based co-registration through gradient descent of the NGF loss function on average outperformed the other co-registration techniques. Co-registering the individual RGB and thermal images makes it possible to reuse the intermediate outputs of the RGB orthomosaicking process and bypass the more problematic initial stages of thermal orthomosaic generation. The workflow first requires cropping the central region of the RGB images to prevent the barrel distortion present near the edges of such images from propagating to the generated orthomosaic.
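As an illustration of the kind of loss referred to here, the sketch below computes one common formulation of a normalized gradient fields (NGF) distance between two grayscale images. It is not necessarily the exact loss used in the manuscript: the smoothing constant `eps` and the function name `ngf_loss` are assumptions, and the gradient-descent optimization over affine parameters is omitted.

```python
import numpy as np

def ngf_loss(fixed, moving, eps=1e-3):
    """One common NGF distance: 0 where image gradients are parallel or
    anti-parallel, 1 where they are perpendicular (averaged over pixels)."""
    gy_f, gx_f = np.gradient(np.asarray(fixed, dtype=float))
    gy_m, gx_m = np.gradient(np.asarray(moving, dtype=float))
    dot = gx_f * gx_m + gy_f * gy_m            # per-pixel gradient dot product
    norm_f = gx_f**2 + gy_f**2 + eps**2        # squared gradient norms, regularized
    norm_m = gx_m**2 + gy_m**2 + eps**2
    return float(1.0 - np.mean(dot**2 / (norm_f * norm_m)))

ramp = np.tile(np.arange(32, dtype=float), (32, 1))   # intensity increases left to right
print(ngf_loss(ramp, ramp), ngf_loss(ramp, -ramp), ngf_loss(ramp, ramp.T))
```

Because the dot product is squared, an image and its contrast-inverted copy score as well aligned. This property makes gradient-based measures of this kind suitable for RGB-thermal pairs, where the same edge can appear with opposite contrast in the two modalities.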

 

The Conclusion section highlights that the proposed workflow leverages the intermediate outputs of RGB orthomosaic generation and only uses thermal images for texturing, thereby overcoming those issues. Using an automated intensity-based co-registration method on the individual images, good geometric alignment between the individual thermal and RGB images was achieved, which translates to the co-registration of the two generated orthomosaics. This geometric alignment of the thermal and RGB orthomosaics is advantageous for downstream forest monitoring tasks, as tree crowns detected in the RGB orthomosaic map directly onto the same tree crowns in the thermal orthomosaic generated from the workflow. The study showed that the proposed workflow preserves the radiometric information present in the individual thermal images. The authors developed a free, open-source, easy-to-use and flexible tool that implements the proposed workflow, allowing all the underlying algorithms' parameters to be conveniently tuned through a GUI for specific project applications as required.

 I recommend publication in present form.

Author Response

Thank you for the feedback. 

 
