Article
Peer-Review Record

Tradeoffs between UAS Spatial Resolution and Accuracy for Deep Learning Semantic Segmentation Applied to Wetland Vegetation Species Mapping

Remote Sens. 2022, 14(11), 2703; https://doi.org/10.3390/rs14112703
by Troy M. Saltiel 1,*,†, Philip E. Dennison 1, Michael J. Campbell 1, Tom R. Thompson 2 and Keith R. Hambrecht 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 25 April 2022 / Revised: 27 May 2022 / Accepted: 31 May 2022 / Published: 4 June 2022
(This article belongs to the Special Issue Machine Learning Techniques Applied to Geosciences and Remote Sensing)

Round 1

Reviewer 1 Report

Please see the attached file.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The submitted manuscript proposes an image classification of fine spatial resolution imagery obtained by means of unoccupied aircraft systems (UAS). The aim of the method is the detailed mapping of vegetation. The manuscript gives an accurate review of the literature and proposes a detailed description of the method, in particular the CNN. Moreover, in the section on Image Processing, the authors have properly described all the problems concerning the scene recording. The manuscript is really interesting.
There is only one question that the authors could discuss, if possible. It concerns the cost of the UAS, software, and analysis. Is the cost affordable? And also, in a general framework, who would be the interested parties for the proposed method? Government agencies or individuals?

Author Response

We would like to thank you for your positive comments on our manuscript. 

 

There is only one question that the authors could discuss, if possible. It concerns the cost of the UAS, software, and analysis. Is the cost affordable?

We have added a paragraph to the discussion that addresses this question and frames it within the research objectives of this study (L427-435): 

“Based on our results, coarsening spatial resolution from 7.6 cm to 22.8 cm in this study area could allow for increased spatial coverage or reduce image acquisition time while only moderately reducing accuracy. Acquisition time represents a significant cost and limits mapping of large study areas at fine spatial resolutions. Processing software provides an additional cost tradeoff that should be considered before undertaking similar projects. This research relied on commercial software (Pix4D and ENVI) requiring licensing that may be prohibitively expensive for small projects. Free and open source alternatives exist, such as OpenDroneMap [39] for image stitching and Tensorflow [40] or PyTorch [41] for modeling, but require additional experience and/or programming skills.”
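For readers weighing the open-source modeling route mentioned above, a minimal PyTorch sketch of a U-Net-style segmentation model is shown below. This is purely illustrative: the class name, channel widths, and number of classes are placeholders, and the published results were produced with ENVI's U-Net implementation, not this code.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Minimal U-Net-style encoder-decoder with one skip connection.
    Illustrative sketch of an open-source alternative; not the ENVI
    model used in the study. num_classes is a placeholder value."""
    def __init__(self, in_channels: int = 3, num_classes: int = 6):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(64, 64, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec1 = nn.Sequential(nn.Conv2d(64, 32, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(32, num_classes, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)                           # full-resolution features
        e2 = self.enc2(self.pool(e1))               # downsampled features
        d1 = self.up(e2)                            # upsample back to input size
        d1 = self.dec1(torch.cat([d1, e1], dim=1))  # skip connection
        return self.head(d1)                        # per-pixel class logits

logits = TinyUNet()(torch.randn(1, 3, 208, 208))    # (1, num_classes, 208, 208)
```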

 

And also, in a general framework, who would be the interested parties for the proposed method? Government agencies or individuals?

 

We do not quite follow this comment. We do not propose a new method, but rather provide novel insight into the relationship between image spatial resolution and CNN-based classification accuracy. The results may be useful to anyone using UAS for research, including researchers at government agencies, research institutions, non-profits, etc.

Reviewer 3 Report

Review

The authors examined how the accuracy of classifying vegetation with convolutional neural networks (CNNs) depends on the spatial resolution of images obtained from UAVs. The paper is interesting and can potentially be useful for optimizing precise UAS-based mapping projects.

However, there are a few things that can be added or improved:

  1. In the abstract, the authors write that they used a CNN for classification, and in Section 2.4 they write that they used a U-Net CNN. These are not the same.
  2. Figure 2 shows the plant classes. The same images visible from above are missing.
  3. There is no information about the basic values of the convolutional network parameters. The authors limited themselves to the statement that: “U-Net CNN model parameters include patch size, augmentation scale, augmentation rotation, number of epochs, number of patches per batch, number of patches per epoch, patch sampling rate, class weight, and loss weight” (line 215).
  4. Equation 1: where does this formula for the number of epochs come from?
  5. The confusion matrix is often used to show the results of the classification. This method allows for a better visualization of the results. Perhaps it is worth showing this matrix?
  6. The colors in the legend (Figs. 8 and 9) for Cattail and Bulrush are black, but in the chart they are blue.

Author Response

We thank the reviewer for their suggestions to provide clarification in parts of the manuscript.

 

In the abstract, the authors write that they used a CNN for classification, and in Section 2.4 they write that they used a U-Net CNN. These are not the same.

 

We now clarify that we used a U-Net CNN for our study instead of broadly stating CNN (L10-12): 

“We evaluated two methods for the simulation of coarser spatial resolution imagery, averaging before and after orthomosaic stitching, and then trained and applied a U-Net CNN model for each resolution and method.”
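For illustration, the kind of block averaging used to simulate coarser resolution can be sketched in a few lines of Python. The function and array names are hypothetical, and this is not necessarily the exact processing chain used in the study; it simply shows a factor-of-3 coarsening from 7.6 cm to 22.8 cm.

```python
import numpy as np

def block_average(image: np.ndarray, factor: int) -> np.ndarray:
    """Simulate coarser spatial resolution by averaging non-overlapping
    factor x factor blocks of pixels (e.g., factor=3: 7.6 cm -> 22.8 cm)."""
    h, w, bands = image.shape
    # Trim edges so the dimensions are divisible by the factor
    h_trim, w_trim = h - h % factor, w - w % factor
    trimmed = image[:h_trim, :w_trim, :]
    # Reshape into blocks and average within each block
    blocks = trimmed.reshape(h_trim // factor, factor, w_trim // factor, factor, bands)
    return blocks.mean(axis=(1, 3))

# Example: coarsen a placeholder 7.6 cm tile to 22.8 cm (factor of 3)
tile = np.random.rand(2100, 2100, 3).astype(np.float32)
coarse = block_average(tile, factor=3)  # shape (700, 700, 3)
```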

 

Figure 2 shows the plant classes. The same images visible from above are missing.

 

We have added a new figure (Figure 3) that shows the land cover classes from the orthomosaic imagery.

 

There is no information about the basic values of the convolutional network parameters. The authors limited themselves to the statement that: “U-Net CNN model parameters include patch size, augmentation scale, augmentation rotation, number of epochs, number of patches per batch, number of patches per epoch, patch sampling rate, class weight, and loss weight” (line 215).

 

Parameter values are described in the following paragraph (L229-236): 

“ENVI automatically sets the number of patches per batch and the number of patches per epoch because these parameters are impacted by the graphics card video random access memory (VRAM; Nvidia Tesla T4 with 16 GB of VRAM). The rest of the parameters were set to their default values and not varied; the augmentation scale and rotation were set to on, the patch size was 208 pixels, the number of patches per batch and number of patches per epoch were determined by ENVI, the patch sampling rate was 16, the class weight was 2, and the loss weight was 0.”

 

We have also added a model architecture diagram, Figure 6.
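For readers who want to reproduce a comparable setup outside ENVI, the reported settings can be collected into a small configuration object. The sketch below is an illustrative analog only; the field names are hypothetical and do not correspond to ENVI's actual API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UNetTrainingConfig:
    """Illustrative analog of the reported ENVI Deep Learning defaults;
    field names are hypothetical and not part of ENVI's API."""
    patch_size: int = 208            # square training patch, in pixels
    augment_scale: bool = True       # scaling augmentation enabled ("on")
    augment_rotation: bool = True    # rotation augmentation enabled ("on")
    patch_sampling_rate: int = 16
    class_weight: float = 2.0
    loss_weight: float = 0.0
    patches_per_batch: Optional[int] = None  # None: chosen by ENVI from available VRAM
    patches_per_epoch: Optional[int] = None  # None: chosen by ENVI from available VRAM

config = UNetTrainingConfig()  # defaults mirror the values quoted above
```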

 

Equation 1: where does this formula for the number of epochs come from?

 

Equation 1 was empirically determined. This is now explained in the manuscript (L229): 

“Equation 1 was determined by experimentation to ensure each model converged.”

 

The confusion matrix is often used to show the results of the classification. This method allows for a better visualization of the results. Perhaps it is worth showing this matrix?

 

We considered this, but decided not to include a confusion matrix because there are 33 total predictions, each with its own matrix. It would be misleading to include only one confusion matrix, since the results can vary considerably, and including all 33 matrices would be overwhelming and unnecessary. We find the use of precision, recall, and F1 score as per-class accuracy measures to be appropriate for summarizing the results.
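For reference, a minimal sketch of how per-class precision, recall, and F1 score can be computed for a single prediction with scikit-learn (the label arrays below are placeholders, not data from the study):

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# y_true and y_pred are flattened per-pixel class labels for one prediction
# (placeholder arrays here; in practice they come from the segmentation maps).
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 0])

precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, labels=[0, 1, 2], zero_division=0
)
for cls, p, r, f in zip([0, 1, 2], precision, recall, f1):
    print(f"class {cls}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```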

 

The colors in the legend (Figs. 8 and 9) for Cattail and Bulrush are black, but in the chart they are blue.

 

We added a sentence to the caption of each figure (now Figures 10 and 11) to more clearly state how data are represented with both color and shape.

Reviewer 4 Report

This paper presents semantic segmentation of aerial images of wetland vegetation and studies the performance at different UAS spatial resolutions. The original 7.6 cm resolution can be coarsened up to 22.8 cm without significant degradation in overall performance, which is also supported by the experimental results. Although there is no novelty in the segmentation algorithm used, the experiments and presentation seem good enough for consideration for publication. The following are my comments for improvement:

  1. Just one segmentation algorithm and two test sites are used in the experimentation. In order to generalize the finding that resolution can be coarsened to 22.8 cm, more experiments would be nice. At a minimum, if possible, I recommend adding two more test sites in a different scenario.
  2. In Figure 5, there is no original figure at 7.6 cm resolution. It would be nice for the reader if you added the original figure along with the figures at the other resolutions.
  3. In Tables 5 and 6, you can include the original results at 7.6 cm. Also, include the original measures for each resolution. For example, an accuracy at 22.8 cm resolution reported as 0.85 (5%↓) would indicate that the accuracy of 0.85 at 22.8 cm resolution is 5% lower than the original accuracy (at 7.6 cm resolution).

Author Response

We appreciate the reviewer’s constructive feedback.

 

Just one segmentation algorithm and two test sites are used in the experimentation. In order to generalize the finding that resolution can be coarsened to 22.8 cm, more experiments would be nice. At a minimum, if possible, I recommend adding two more test sites in a different scenario.

 

Under this collaboration with the Utah DNR, the DNR only collected data over this site for the purpose of monitoring invasive Phragmites. We plan to examine additional sites and scenarios when we are able to procure funding for sensor and drone equipment. In the meantime, we believe that our study provided enough experimentation to generalize for the classification of graminoid vegetation species, and that our novel approach using two different image simulation scenarios can support and inspire future research across several more applications.

 

In Figure 5, there is no original figure at 7.6 cm resolution. It would be nice for the reader if you added the original figure along with the figures at the other resolutions.

 

We redid this figure (now Figure 7) to include a map at the original resolution (7.6 cm).

 

In Tables 5 and 6, you can include the original results at 7.6 cm. Also, include the original measures for each resolution. For example, an accuracy at 22.8 cm resolution reported as 0.85 (5%↓) would indicate that the accuracy of 0.85 at 22.8 cm resolution is 5% lower than the original accuracy (at 7.6 cm resolution).

 

We added the results for the original resolution and now report the actual values in addition to the percent change from the original resolution.

Round 2

Reviewer 1 Report

I have no further comments.
