Bamboo Forest Mapping in China Using the Dense Landsat 8 Image Archive and Google Earth Engine
Round 1
Reviewer 1 Report
I have attached the comments for your reference.
Comments for author File: Comments.pdf
Author Response
Dear reviewer
We thank you very much for taking the time to review our manuscript, and thank you for your affirmation of our work and your professional opinions. We revised the manuscript according to your comments and proofread it carefully to minimize printing, grammar and bibliographic errors. We provide a reply letter as an attachment to reply to your modification comments.
Author Response File: Author Response.docx
Reviewer 2 Report
This paper proposes an algorithm to classify bamboo plantation areas in China based on Landsat 8 SR images (2014 to 2016) via Google Earth Engine. The authors collected a large number of sample points and calculated different features to build a classification model. The results show that this model has relatively high accuracy on a large scale and the potential to monitor long time series of bamboo plantations using available Landsat/Sentinel imagery, for example over the past three decades or in the future. In my opinion, the bamboo classification model presented in this paper does have a certain degree of innovation and practicality. However, there are several issues that need to be resolved before publication:
Introduction
- Figure 1: you should improve the legend and add the area of the blue line to the legend.
- I suggest you move Figure 1 to Section 2 (Materials and Methods) and add an additional section like this:
- Materials and Methods
2.1 Study area
2.2 Sample collection
…
- Lines 65-70 are confusing and unclear. You need to improve the language and logic and clearly explain issues such as the number of images available over a large area, or which types of vegetation have properties similar to bamboo.
- Line 83-86, briefly explain the deficiencies of the past research, the significance of your improvements to these deficiencies, and the meaning of the method you proposed. I suggest you rewrite this paragraph to make the content more substantial if possible.
Materials and Methods
- Introduce the optical imagery you use. It is better to add a satellite observation frequency figure or a table to display the number of images during the research period since the number of available and high-quality images is significant to the building of the model. About the observation frequency, you could refer to https://doi.org/10.1016/j.jag.2021.102376; https://doi.org/10.1016/j.rse.2020.111916
- Improve the title of figure 2.
- 1, the map from 1999-2003 is too old, and the limitations of visual interpretation cannot be ignored; check whether using the PFM method to obtain sample points is reasonable in some areas. In addition, what sampling method was used? Stratified sampling? Random sampling?
- L153 should be Table 2.
- Why did you choose the RFC? How do other supervised models such as SVM or ANN perform?
- 3 is not necessary
- L186-189: the precipitation data need to be introduced in the data section.
Result and Discussion
- L191-193 is useless
- Correct the detail of the title and the number of figures in this section.
- What is the source of the reference data in line 233? Introduce the reference data.
- Lines 261-273: you listed the elevation and precipitation differences of bamboo across different areas. You need to analyze in detail the cause of these differences among provinces.
- I suggest you analyze the accuracy and uncertainty of the model you proposed, including the feature combination or image quality, and discuss your improvements compared to previous research on bamboo classification, citing references where appropriate.
Conclusions
- The Conclusions should be concise; some parts read like discussion.
Author Response
Dear reviewer
We thank you very much for taking the time to review our manuscript, and thank you for your affirmation of our work and your professional opinions. We revised the manuscript according to your comments and proofread it carefully to minimize printing, grammar and bibliographic errors. We provide a reply letter as an attachment to reply to your modification comments.
Author Response File: Author Response.docx
Reviewer 3 Report
The paper by Qi et al. provides a methodology and results for mapping bamboo extent in China using a dense stack of Landsat-8 images. The authors clearly state the need and difficulty in producing such a map, and demonstrate a methodology for doing so. However, the authors fail to provide clear and unambiguous methods so that the work would be reproducible (see detailed comments). The authors also need to restructure some components of the paper to make it more accessible.
Abstract
18 - change Landsat to Landsat-8
19 - change archives to archive
23 - 31. Perhaps remove the numbering from these sentences (i.e. the (1), (2) etc).
28 - 30 - make sure these findings are past tense i.e. “The bamboo forest of China covered an area”, and “Bamboo forests in China were mainly”
28 - 30: Remove the superfluous ‘of/in China’ (as it is clear where the study was taking place).
Introduction
The introduction provides a good background and rationale for the project. However, the objectives in the final paragraph are poor. The authors should provide a list of numbered objectives.
35: taxonomic names should be italicised
46: reword caption as ‘most common’ does not make sense.
49: the reference to Madagascar lemurs is not relevant given the focus is on China bamboo. Please also ensure other references [7 - 14] are relevant to bamboo in China
63: Why is mapping over large areas a difficult task with remote sensing? Expand on the reason.
72: You need to define “multi-phase spectral vegetation index” - this is not a term I have come across in remote sensing before.
83: Landsat is generally regarded as “medium” resolution, not fine resolution
84: What is NNFRI analysis and why is it being referenced here in the objectives?
85: The objectives are poor - see comment at the start of this section. Random Forest should be introduced in the methods.
Materials and Methods
The methods are logically organised; however, it is usual to include a study area section as the first component. This could be combined as a “Study Area and Sample Collection” section. I am also concerned that some of the methods are not described in enough detail for a reader to be able to reproduce this research. For example, is the analysis carried out on a combined image or on individual images? It is not clear. See detailed comments below.
88 - 92: The flowchart is out of place here as the next section is
94: This first sentence is really a justification for using remote sensing which could be moved to the relevant part of the introduction.
Sample Collection
This section is very confusing. The authors should reorganise this and discuss the three different sampling sources in three separate paragraphs. The authors need to provide details on all the field trips (not just one as an example). The authors also need to provide more details on the type of field sampling, i.e. was an area taken into account, or was it just a single point location? Was a GPS used and what was the accuracy?
116: The table should have a total as the last row
Image Pre-Processing
128: It is not clear whether the authors used the QA data provided by Earth Engine or used your own - this needs to be made explicit, and the reason for not using the provided QA band.
129: “replace these pixels with clear observations” - the authors need to explain how this was done e.g. gap-filling undertaken, or masked out from further analysis
136 - 138: Give band numbers rather than wavelength ranges. The authors should consider a table listing the band numbers with the band centres and ranges that could be referred to here.
139: The term “auxiliary data” is confusing
141: The authors need to explain what “covariate elevation” is
143: Replace the SRTM URL reference with a citation of Farr et al. 2007
145: The authors need to explain what “vegetative covariates” are.
146 - 149: The authors need to explain how the quality composite is created, as it is not reproducible by a reader.
Classification and Validation
163: The reference to hyperspectral data is not relevant for multispectral Landsat data.
168: The authors need to explain how the number of trees were determined.
172: The k-fold method used here is unusual. Normally, the sample data is randomly split into training/test data (e.g. 70/30) and the accuracy assessment is completed using the test data. The authors will need to justify why the k-fold method was used, especially given the number of samples available.
177: cite Cohen 1960 for the kappa statistic
179 - 181: The matrix/formulas are confusing and aren’t necessary, as the confusion matrix, accuracy and kappa are all well-known concepts in RS classification. The authors could consider removing them.
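For readers who want the quantities named in comments 177 and 179 - 181 spelled out, the following is a minimal sketch of how overall accuracy and Cohen's kappa are conventionally derived from a confusion matrix. It is not the authors' implementation; the function name and the example counts are hypothetical and chosen only for illustration.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples whose reference class is i and
    whose mapped class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total

    # Chance agreement: product of row and column marginals, summed over classes.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    if expected == 1.0:
        raise ValueError("kappa is undefined when all samples fall in one class")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical bamboo / non-bamboo counts, NOT values from the manuscript.
example = [[45, 5],
           [10, 40]]
oa, kappa = accuracy_and_kappa(example)
print(f"overall accuracy = {oa:.2f}, kappa = {kappa:.2f}")
```

With a k-fold design, the same calculation would be repeated on each held-out fold; with a single 70/30 split it would be run once on the test portion.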
Results and Discussion
The results contain text and concepts that should have been described earlier in the methods. See comments below.
191 - 193: Remove this template text.
195: This section reads like methods and is the first time the authors have introduced ‘feature selection’. This should have been clearly stated as an objective (see Introduction notes), clearly described in the methods, and then the results of this presented here.
197: The authors should describe the “importance scores” more clearly in the methods, and how they were treated.
200: The term covariates is used here referring to the various VI images. What are they covariates with?
201: “which is related” should be “which may be related” as the authors did not test leaf moisture content and can only hypothesise here. This applies to other conclusions related to the other bands.
208: “mediocre” should be “poor”
209: the authors need to explain what is meant by “not sufficient”
214 - 215: The authors are now referring to a single train/test dataset rather than the k-fold accuracy assessment. This is confusing.
230: Figure 2 is a poor figure. This could be a simple table, with the addition of overall map accuracy for each province (which the authors should have calculated).
232: The comparison with area should have been included as a section in the methods.
246: The area table could have an additional % difference between the areas for each province.
249: The map of bamboo distribution is the most important result from this paper and it should be moved to be the first result reported in this section. The map should also be zoomed in on the provinces in which there is bamboo, as the scale is too small to see any detail of the distribution. The authors could also consider a hillshade layer on this map to show the relation between bamboo and elevation.
261 - 273: These findings are really interesting, but how do they compare with what was already described for the topographic/climate distribution for bamboo in China? Do these results reinforce or challenge them, or are they completely new?
Conclusions
The conclusions section is too long and a lot of this detail (e.g. field sampling, feature selection, area mapped in paragraphs 2,3,4) should have been in the results/discussion. The conclusion should just focus on the main objectives of the paper and future direction in a single concise paragraph.
Author Response
Dear reviewer
We thank you very much for taking the time to review our manuscript, and thank you for your affirmation of our work and your professional opinions. We revised the manuscript according to your comments and proofread it carefully to minimize printing, grammar and bibliographic errors. We provide a reply letter as an attachment to reply to your modification comments.
Author Response File: Author Response.docx
Reviewer 4 Report
Summary
The manuscript, “Fine resolution mapping of bamboo distribution in China using dense Landsat image archives and the Google Earth Engine” utilizes the efficiency and capabilities of cloud computing to map bamboo distribution across China using multiple years of remote sensing imagery and ancillary information. The application of random forests to a multi-year stack across a large geography furthers the classification study of vegetation. I found the manuscript to be well written with some formatting issues that complicated review. Overall, the key areas I identified as weaknesses dealt with limitations or omissions in the methodology and results. These issues challenge the interpretation and conclusions found in the discussion and conclusions of the manuscript. I find that there is promise in this study, but the current manuscript requires focus and clarifications.
General comments
The 500 species of bamboo found in China must represent a wide range of morphological, phenological, phenotypical, and other variations. You need to discuss the range of these differences and the implications more specifically in your study. Then justify how you can overcome this variability. In other words, how can you separate out bamboo from other vegetation? Is bamboo as a functional vegetation group (or sub-family) different enough from other vegetation across the variability in species and range to separate out by spectral differences? If you lump bamboo together, then is the variation found in here so large as to limit your ability to separate out bamboo from other vegetation? You do split out your classification by provinces, but you do not justify that these geopolitical boundaries actually are a suitable substitute for ecological boundaries that may do a better job parsing out your study into manageable chunks.
I am confused why you treated Hainan province so different than the rest of the provinces. There seems no reason why elevation data is not available for Hainan, in fact in figure 1 you actually display elevation data!? If there was a deliberate reason to treat this province differently than the other provinces, then this is not presented clearly. As presented, the additional vegetation indices and metrics used for those indices just serve to identify many metrics you did not test in the other provinces. Why not use EVI and GCVI and the quartile, amplitude, and phase metrics for mapping all provinces? Random forest classification can easily handle these additional variables. The Hainan results are also not presented as separate in the results.
Likewise, there seems to be some limitation in the input variables used beyond the vegetation indices. Since you have elevation (from SRTM), why not also include other useful covariates that can be derived from this (e.g., aspect, slope, etc.)? In addition, you have chosen a complicated approach to develop texture variables that is not very well described (flesh out this section). You cannot dismiss texture as a useful classification variable when you only partially test this metric. The “composite strategy” is not described clearly, so I cannot evaluate it. It also seems like you are using all the Landsat bands in the principal component analysis but have not justified that approach. You could just use three visible bands, make a grayscale image, and calculate the GLCM. Or many other options.
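The simpler texture alternative mentioned above (three visible bands, a grayscale image, then a GLCM) can be sketched as follows. This is an illustrative sketch only, not the procedure used in the manuscript: the equal-weight band average, the number of grey levels, the single horizontal offset, and all function names are assumptions made for the example.

```python
def to_grayscale(red, green, blue):
    """Average three visible bands (lists of rows) into one grayscale image.

    Equal weighting is an assumption; other weightings are possible.
    """
    return [[(r + g + b) / 3.0 for r, g, b in zip(r_row, g_row, b_row)]
            for r_row, g_row, b_row in zip(red, green, blue)]


def quantize(image, levels):
    """Rescale pixel values to integer grey levels 0 .. levels-1."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        return [[0 for _ in row] for row in image]
    scale = levels / (hi - lo)
    return [[min(levels - 1, int((v - lo) * scale)) for v in row] for row in image]


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized grey-level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows = len(image)
    for y in range(rows):
        for x in range(len(image[y])):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < len(image[ny]):
                a, b = image[y][x], image[ny][nx]
                counts[a][b] += 1
                counts[b][a] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """GLCM contrast: co-occurrence probabilities weighted by squared level difference."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Hypothetical 2 x 2 checkerboard, NOT data from the manuscript.
checker = quantize([[0.0, 1.0], [1.0, 0.0]], levels=2)
print(contrast(glcm(checker, levels=2)))
```

Other GLCM statistics (e.g. homogeneity or entropy) would be computed from the same normalized matrix.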
My main point here about the methods is that your selection of variables feels arbitrary. Through the objectives and justification of your methods you need to argue for why this specific selection is used. If you want to make the best map possible, why not at least test all of the variables that you present in your methods together?
Some additional points about the methods include a lack of detail about how to go from a 50 k-fold approach (and note that most RF approaches already subset the data) to the final model, details about the RF process (assumptions, trees, etc.), and a curious omission of temperature as a limiting factor examined for distribution of bamboo.
In the results you have overgeneralized any geographic differences that may have been found. If you split by province (which could just be a variable in the RF model) then did this lead to any differences in results? What were the importance scores for Hainan, where you use different input variables? If you want to conclude that canopy moisture content is different from other vegetation, then you could use some techniques like partial dependence plots to actually visualize where bamboo and non-bamboo fall along this predictor.
I feel that first you need to think about specific objectives beyond just mapping bamboo across China. Did you want to compare models? Do you want to identify geographic or provincial differences? After you tighten up these specifics then you can adjust methods and results to convey the findings you want to focus on rather than what is currently overwhelming and lacking focus. Then the discussion should get at what your findings mean for these objectives.
I cover some of these issues more specifically below (e.g., identify where they appear in the manuscript) and I have also identified some other specific areas that need addressing in this manuscript.
Specific comments
Line 16: Consider being more specific about why or what is “important”
Line 23: This oversimplifies your conclusions. See comments about the results and discussion and revise this. My reading of your importance scores is that it takes a combination of vegetation indices to separate out bamboo.
Line 24: Perhaps avoid significantly as it may be confusing in the context. How about “particularly?”
Line 29: provinces (needs to be plural)
Line 31: Since you say temperature is important, can you say something about that here too? What temp range is the most diagnostic (yearly mean, warm season mean, etc.).
Line 41: Half of what? Clarify this. (amount, area, number of species, number of genera)
Line 42: Minor point, but the Qinling-Huaihe line is unfamiliar to me so I want to know why it only extends partway into China? In other words, the bamboo extent is also limited by an east to west boundary as well as this line which is a north-south boundary.
Line 56: The opening two lines of this paragraph don’t really set up the rest of the paragraph (and are overly general and not referenced). Perhaps set up more specifically to bamboo. E.g. “Due to the wide distribution of bamboo in China, distribution mapping by field surveys would be laborious and time consuming. Remote sensing technology has inherent advantages in this context…. However, bamboo….”
Line 60: Perhaps avoid significantly as it may be confusing in the context.
Line 67: Were these prior studies also more limited in what species they identified? Plus line 68 seems to indicate that your study has been already completed by others with good accuracy. Were there limitations to these studies or how will you build on them? It is unclear currently.
Line 79: Maybe choose a different word than “dense” or define it more specifically. MODIS is daily, Sentinel is 8 days, both more “dense” than Landsat.
Line 92: Nice helpful figure. Please include definitions of acronyms in the figure legend.
Line 94: Reference data is CRITICAL to a remote sensing study or all you have is some nice colors on a map. Replace the first sentence with something that doesn’t make me think the reference data was too laborious for you to do well and therefore doubt your findings and accuracy numbers….
Line 101: “Conducted” may be better than “determined”
Line 104: Should there be a period after “plot”?
Line 106: While the imagery may be made available in GEE, please also give the type and date of the actual image source you used here.
Line 112: Same as 106
Line 139: Why not? Looks like SRTM covers your whole area
Line 153: You already had a Table 1?
Line 156: You mean Table 3?
Line 163: But you are doing multispectral image classification, not hyperspectral?
Line 168: How did you fine tune tree numbers? How many trees did you use? What RF package did you use? RF is resistant to overfitting, so it should only matter that you used enough trees that your importance scores stabilized. More info needed in the methods here.
Line 179: 4?
Line 185: You produced a final map, but which of the 50 k-folds is it? Need to clarify this here
Line 189: The intro talks about temperature being important, why not also analyze that here?
Line 191: Delete paragraph
Line 197: Table 1 and 2?
Line 198: So, this is the mean of the importance score for each province and each k-fold?
Line 209: You never really tested this, so this is too much of a leap to conclude here. One textural approach does not mean all are insufficient. At 30m you are mapping stand characteristics, so need to think about what would make a bamboo stand different than other vegetation. Revise your conclusion here.
Line 211: Do these differ between provinces? That would be really important information to include. Perhaps the SD of importance scores? In other words, is there a reason for future studies to also break out the classification approach by province?
Line 212: This is not figure 1, right? Hard to track when the figure and table numbers are off. Where are the values for the different inputs used for Hainan?
Line 227: This needs some explanation. Why are there such large spreads for provinces (it's not just PA, but the relationship between the three values that should be considered)? Why would plantation training data be less accurate? Time since collected? Natural vs. managed? Species diffs?
Line 230: I think this is figure 5 now?
Line 238: How do you know it is underestimated? (No reference here.)
Line 245: There is a lot going on in your study, least of all different accuracies. So how does the lower accuracy provinces line up here with the survey statistics? How do you know the survey statistics are off, if the larger discrepancies are in the provinces where you have lower kappa (and other metrics) scores?
Line 245: I think you need to pull together at this point some figure or table that connects the dots between training data, accuracy, and comparisons with forest inventory/survey results. Many pieces going on here that are hard to keep track of.
Line 282: Need a different word than “significant”
Line 291: I didn’t think this was a finding per se, so it needs a reference. Or change this sentence to talk about what could improve your results specifically (e.g., you needed more recent reference data).
Line 296: Is the water content of bamboo lower than all other vegetation? With all 500 species in China and the large range they are found? Can you really say this so generally without qualifying or specifying?
Line 297: Where was the “phenological variations expressed from a seasonal vegetation index” in your results? I think you mean Hainan province, but these results are never discussed or presented.
Line 310: Fragile? Consider a word choice that relates to remote sensing more clearly.
Line 311: Careful from here to the end. Your study did not actually look at these questions (these are points from the intro), so you need to reference the studies that did this work here.
Author Response
Dear reviewer
We thank you very much for taking the time to review our manuscript, and thank you for your affirmation of our work and your professional opinions. We revised the manuscript according to your comments and proofread it carefully to minimize printing, grammar and bibliographic errors. We provide a reply letter as an attachment to reply to your modification comments.
Author Response File: Author Response.docx
Round 2
Reviewer 2 Report
The authors have addressed my comments, and I recommend acceptance in the present form.
Author Response
Dear reviewer,
This is to express our gratitude for your time and expertise in reviewing our manuscript (manuscript ID: remotesensing-1514848). With the help of your professional comments and suggestions, our manuscript has become clearer and more professional. Here, we once again express our high respect and gratitude to you for your professionalism and dedication. Best wishes to you and your family!
Sincerely,
Shuhua, Qi
Corresponding author
Peng, Gong
Email: penggong@hku.hk
Reviewer 4 Report
Thanks for providing a new version and the opportunity to review. I feel that some of the responses to my questions also should have been included in the manuscript, and there are a few areas that still can be strengthened or clarified. Some specific instances that should be addressed:
Line 103 – I think you could make this goal and the objectives that follow more explicit. Reference your prior work (your experience) and then make some specific statements about mapping and using the resulting maps to identify country wide patterns. Also include why this work is useful.
Line 119 – Note that I am not asking you to map 500 species separately. I am asking you to justify and provide background on why bamboo (collectively) may separate out from other vegetation using remote sensing techniques. Secondly, provincial boundaries would not be expected to separate out bamboo varieties. These are geopolitical boundaries, not ecological boundaries. So in the manuscript you need a few sentences or so on 1) what spectral differences you expect between bamboo and other vegetation (like some of the phenological differences you mention) and 2) why provincial boundaries are appropriate. For #2 you need some justification for how bamboo varieties happen to be distributed by province boundaries, or just say this is not an ideal way to separate out the study area biologically, but is useful for management to have province by province maps which will happen to help somewhat with the bamboo variety challenge.
Line 255 – It still reads like you made an arbitrary decision to treat Hainan Province differently. Your answer to my point just raises more questions. If this method worked better (the different set of predictors including more indices) why not try it as the unified method? Is the work in this manuscript new, or a repeat of results already presented in https://doi.org/10.1080/01431161.2019.1633702? The manuscript makes it way too hard to understand the progression of work, and why there are two separate approaches.
Line 267 – You do not know this and have not provided a reference for why. Plenty of other provinces in Figure 1 seem to have gentle terrain. As above your decision to treat Hainan province differently comes across as arbitrary in the manuscript. You actually have not come up with a unified way to map bamboo across China since you have a significantly different approach for one province.
Line 269 – My suggestions are to either remove Hainan province from this paper completely (it is distracting and seems a minimal addition based on prior published work) or be much clearer why, a priori, you treat it differently (specific references and justification). You can even include Hainan province maps in the results using the prior work (reference back to that paper in Figure 4 and 5) and not cause confusion on the majority of this paper. Hainan province is barely mentioned in the results but adds a lot to the methods.
Line 405 – My point here about geographic differences is that you mapped 16 provinces separately. So you should have 16 sets of importance scores? For each province model, is this the same ranking (always looks like Figure 8) or are there some provinces where texture or NDVI or any of the variables had higher importance scores? You responded to me that the patterns were generally similar; now state that explicitly in the manuscript. And adding some bars to show the standard deviation to Figure 8 could help show that.
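The per-province comparison requested in the Line 405 comment could be summarized along the following lines. This is a minimal sketch under stated assumptions: the province labels, feature names, and score values are hypothetical placeholders rather than results from the manuscript, and the helper names are invented for the example.

```python
from statistics import mean, stdev


def rank_features(scores):
    """Return feature names ordered from highest to lowest importance."""
    return sorted(scores, key=scores.get, reverse=True)


def summarize_importance(scores_by_province):
    """Return {feature: (mean, sample SD)} across all province models."""
    features = sorted({name
                       for scores in scores_by_province.values()
                       for name in scores})
    summary = {}
    for name in features:
        values = [scores[name]
                  for scores in scores_by_province.values()
                  if name in scores]
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[name] = (mean(values), spread)
    return summary


# Hypothetical importance scores, NOT values from the manuscript.
hypothetical_scores = {
    "Province A": {"NDVI": 0.10, "EVI": 0.30, "texture": 0.02},
    "Province B": {"NDVI": 0.20, "EVI": 0.40, "texture": 0.04},
    "Province C": {"NDVI": 0.30, "EVI": 0.50, "texture": 0.06},
}

summary = summarize_importance(hypothetical_scores)
rankings = {province: rank_features(scores)
            for province, scores in hypothetical_scores.items()}
# True when every province model orders the features identically.
same_ranking = len({tuple(order) for order in rankings.values()}) == 1

for name, (avg, spread) in summary.items():
    print(f"{name}: mean = {avg:.3f}, SD = {spread:.3f}")
print("identical ranking in all provinces:", same_ranking)
```

The SD column is what error bars on a figure of mean importance scores would display, and the ranking check gives a quick answer to whether any province departs from the overall ordering.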
Some additional thoughts:
Line 29 – You only tested one way of calculating the GLCM texture features. You cannot make a broad conclusion about the efficacy of GLCM for bamboo classification from your approach. You also have not calculated the difference in accuracy with or without these measures. Perhaps change to “while our selected methodology for calculating GLCM texture features had limited importance scores for features.”
Line 30 – Consider, if there is room, a nice concluding sentence for the abstract.
Line 141 – Your preference here, but as a small point, I was taught to use “Reference data” instead of “Ground Truth.” Up to you (and yes there are all sorts of ways the literature handles these terms) but many of the ways we identify data for accuracy assessments may not actually lead to “truth.” But we need the reference data to be substantially more accurate than the model we create to test accuracy. See https://doi.org/10.1080/01431161.2019.1633702
Line 285 – Important methodological info. Thanks for adding.
Line 317 – Results are much improved. This reads clearer and more focused now. Thanks.
Line 421 – I understand you need to meet other reviewer comments as well, but this section needs some edits. You need to go through the results once again and move the “interpretation” of the results out and into this section. There are still some areas where you explain rather than present results (like in 4.2, 4.3, and 4.4). Just make sure it is clean whichever way you present them (combined or separate).
Author Response
Dear reviewer,
This is to express our gratitude for your time and expertise in reviewing our manuscript (manuscript ID: remotesensing-1514848). We revised our manuscript according to your professional comments and suggestions, and we have submitted a response letter; please see the attachment.
Here, we express our high respect and gratitude to you for your professionalism and dedication. Best wishes to you and your family!
Sincerely,
Shuhua, Qi
Corresponding author
Peng, Gong
Email: penggong@hku.hk
Author Response File: Author Response.docx
This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.