Article
Peer-Review Record

Tree Species Classification Based on Fusion Images by GF-5 and Sentinel-2A

Remote Sens. 2022, 14(20), 5088; https://doi.org/10.3390/rs14205088
by Weihua Chen 1, Jie Pan 1,* and Yulin Sun 2
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 20 August 2022 / Revised: 28 September 2022 / Accepted: 8 October 2022 / Published: 12 October 2022

Round 1

Reviewer 1 Report

Please see attached file.

Comments for author File: Comments.pdf

Author Response

Dear Ms. Claudia Dragan and reviewers:
Thank you for your letter and the reviewers’ comments on our manuscript entitled "Tree species classification based on fusion images by GF-5 and Sentinel-2A" (ID: remotesensing-1900498). Those comments are very helpful for revising and improving our paper and provide important guidance for other research. We have studied the comments carefully and made corrections that we hope meet with approval. The main corrections are in the manuscript, and the responses to the reviewers’ comments are as follows (the replies are highlighted in blue).


Reviewer #2:
The main remark:
I have carefully read the references you recommended and have added an experiment as Scheme 10: spectral features (10 bands) from the preprocessed Sentinel-2A image, i.e., using only the spectral bands of the Sentinel-2A image combined with RF for classification. Replies to each comment follow.
1. It is not entirely clear whether you did a resample to 10 m by SNAP or downloaded the 2A-level data?
Response: I downloaded the Level-2A data directly, but the spatial resolution of bands B5, B6, B7, B8A, B11, and B12 was 20 m, so these bands were resampled to 10 m in SNAP.
2. Please add the number and qualitative characteristics (collection time, location, etc.) of the ground samples from the survey data of forests.
Response: I have added the number and qualitative characteristics (collection time, location, etc.) of the ground samples. The reference data were collected from May to December 2018 on Purple Mountain, Nanjing, China. There were 777 sub-compartments, with 2535 samples measured, each with an area of 100 m².
3. Provide a more formalized description of the algorithms used (GS, HAF, IFZ, VCT, OIF), or provide links to such a description at the first mention of the algorithms.
Response:
GS: The GS method is an image fusion method based on the Gram-Schmidt transform; the number of input bands is not limited. It uses the GS transform to project the multispectral image into an orthogonal space, replaces the first component with the high-resolution image, and obtains the fused image by the inverse transform. The fused image improves the spatial resolution while preserving the spectral characteristics of the original image.
HAF: Harmonic analysis decomposes the image into a set of harmonic energy-spectrum feature components (harmonic remainder, amplitude, and phase) by analyzing the characteristics along the spectral dimension. For a single pixel, harmonic analysis expresses the spectral curve as the sum of a series of sine (cosine) waves defined by the harmonic remainder, amplitudes, and phases.
IFZ: The IFZ is used to determine whether each image pixel is forest, and the change of the IFZ index across images is used to determine whether a forest disturbance occurred in the year of that image in the time series.
VCT: VCT is a highly automated vegetation change detection algorithm developed to reconstruct recent forest disturbance history using Landsat time series. For the first presentation of this method, see doi:10.1016/j.rse.2008.06.016.
OIF: The OIF index unifies the standard deviations and correlation coefficients of the bands, providing a further basis for judging image quality. The principle is that the smaller the correlation between bands and the larger their standard deviations, the more informative the band combination.
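As an illustration of the harmonic analysis described above, the sketch below decomposes a single pixel's spectral curve into a harmonic remainder plus amplitude/phase pairs via a discrete Fourier transform, and rebuilds the truncated curve from them. This is a generic sketch of the idea, not the authors' implementation; the function names are illustrative.

```python
import numpy as np

def harmonic_decompose(spectrum, n_harmonics=3):
    """Decompose one pixel's spectral curve into harmonic components:
    the harmonic remainder (mean term) plus amplitude/phase pairs, so the
    curve is approximated by  A0 + sum_k A_k * cos(2*pi*k*t/N + phi_k)."""
    n = len(spectrum)
    coeffs = np.fft.rfft(spectrum)             # complex Fourier coefficients
    remainder = coeffs[0].real / n             # mean of the curve
    amplitudes = 2.0 * np.abs(coeffs[1:n_harmonics + 1]) / n
    phases = np.angle(coeffs[1:n_harmonics + 1])
    return remainder, amplitudes, phases

def harmonic_reconstruct(remainder, amplitudes, phases, n):
    """Rebuild the (truncated) spectral curve from its harmonic terms."""
    t = np.arange(n)
    curve = np.full(n, remainder, dtype=float)
    for k, (a, p) in enumerate(zip(amplitudes, phases), start=1):
        curve += a * np.cos(2 * np.pi * k * t / n + p)
    return curve
```

Keeping only the first few harmonics compresses each pixel's spectral curve into a handful of features, which is why the remainder, amplitudes, and phases can serve as fusion components.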
4. HAF doesn’t have higher values of the mean square error, it has the best value. Please reformulate it.
Response: The mean square errors of the GS image and the HAF image have been recalculated; the values are 0.0242 and 0.0245, respectively.
5. Did Scheme 9 get the best classification results, or Scheme 8 as stated above?
Response: Sorry, my carelessness misled the reader; Scheme 8 got the best results, not Scheme 9.
I have checked the manuscript carefully. Typos and punctuation have been corrected throughout the text, the spelling of satellite names and band designations has been made consistent, and sentences that were too long have been revised.
We tried our best to improve the manuscript and made some changes. These changes do not influence the content or framework of the paper; they are not listed here but are marked in red in the revised paper.
We earnestly appreciate the editors' and reviewers' work and hope that the corrections will meet with approval.
Once again, thank you very much for your comments and suggestions.

Author Response File: Author Response.pdf

Reviewer 2 Report

In the manuscript, the authors consider the solution to the topical problem of tree species classification using remote sensing data. Their suggestion is to use hyperspectral and multispectral images together to improve accuracy.

My main remark is that the work lacks a comparison of accuracy with classification using only Sentinel-2 images. There are many studies (for example, 10.3390/rs10111794, 10.1016/j.rse.2020.112103, 10.3390/rs11101197, 10.3390/rs10091419) where tree species classification with only Sentinel-2 images by Random Forests gives an overall accuracy of 80-90%, close to that obtained in your work. You should classify only the Sentinel-2 image bands and compare what adding hyperspectral data gives.

And some comments to improve the presentation of the work:

1. Line 137-139 - it is not entirely clear whether you did a resample to 10 m by SNAP or downloaded the 2A-level data?

2. Line 159 – please add number and qualitative characteristics (collection time, location, etc.) of the ground samples from the survey data of forests.

3. Provide a more formalized description of the algorithms used (GS, HAF, IFZ, VCT, OIF), or provide links to such a description at the first mention of the algorithms.

4. Line 242-244 – HAF doesn’t have higher values of the mean square error, it has the best value. Please reformulate it.

5. Line 291 – did Scheme 9 get the best classification results, or Scheme 8 as stated above?

Please check the manuscript carefully. There are many small typos – dots instead of commas and vice versa – the spelling of the satellite names and the designation of the bands differ, and some sentences are too long and overloaded.

Author Response

Dear Ms. Claudia Dragan and reviewers:

 

Thank you for your letter and the reviewers’ comments on our manuscript entitled "Tree species classification based on fusion images by GF-5 and Sentinel-2A" (ID: remotesensing-1900498). Those comments are very helpful for revising and improving our paper and provide important guidance for other research. We have studied the comments carefully and made corrections that we hope meet with approval. The main corrections are in the manuscript, and the responses to the reviewers’ comments are as follows (the replies are highlighted in blue).

 

Replies to the reviewers’ comments:

Reviewer #1:

General comments:

  1. In the Introduction, the authors should also mention LiDAR data and provide some related references on forest type mapping.

Response: I have already mentioned LiDAR research in tree species classification, citing several references related to the application of LiDAR in tree species classification (references 10.3390/s19061284, 10.1117/12.2067506, 10.3390/f11030303).

  2. Reference data collection needs to be clearly explained (i.e., how many sample points were recorded, what the size of the sample plot was, and so on).

Response: Detailed sample information has been provided and can be found in the response to specific comment 10.

  3. In the Results section there are some issues; for example, the authors state that the highest accuracy was obtained by Scheme 9 and that all following steps were done based on this scheme. However, Table 4 makes it obvious that Scheme 8 had the highest accuracy. It seems all steps must be redone with Scheme 8.

Response: I am very sorry for the weak grammar and wrong description. Scheme 8 got the highest result; due to carelessness when making the table, I wrote Scheme 9 instead of Scheme 8. I have rechecked and revised it.

  4. The Discussion section should be strengthened with additional references.

Response: For the discussion section, I have revisited some new references and added them individually according to the main points. Please see the discussion section of the revised draft for more details.

Specific comments:

  1. Please clarify: medium spectral or spatial resolution?

Response: Medium spectral resolution: this means a multispectral image. Multispectral imaging captures a small number of spectral bands, typically three to fifteen, through the use of varying filters and illumination.

Medium spatial resolution: there is no clear indicator for defining high, medium, and low resolution of remote sensing images; it is somewhat relative. Remote sensing images with a ground spatial resolution on the order of meters are usually defined as high resolution, around 10 m as medium resolution, and around 100 m as low resolution. Commonly used high-resolution images include aerial photography, QuickBird, IKONOS, OrbView-3, GeoEye-1, EROS-A, EROS-B, SPOT-5, and so on; medium-resolution images include ASTER, HJ-1A, HJ-1B, Hyperion, etc.; and low-resolution images include NOAA/AVHRR, MODIS, SPOT VEGETATION, etc.

  2. Please delete “and the fused image not only high spatial resolution and spectral information”.

Response: I have deleted “and the fused image not only high spatial resolution and spectral information”.

  3. Please delete “and” and replace it with a comma.

Response: I have deleted “and” and replaced it with a comma.

  4. Change “component replacement” to “CS method”.

Response: I have changed “component replacement” to “CS method”.

  5. Change “had” to “has”.

Response: I have changed “had” to “has”.

  6. Fig. 1: What are those points in the figure? I also suggest putting (a) and (b) on each sub-figure. Please also clarify figure 1(b): what is the background image?

Response: Those points in the figure are sample plots. I have put (a) and (b) on each sub-figure; the background of figure 1(b) is the GF-5 image (Purple Mountain). The revised figure can be seen on lines 134-137 of the revised draft.

  7. In general, remote sensing is able to generate information on “land cover”, not land use. These two terms are completely different.

Response: Land cover is a natural, objective attribute, including vegetation, rivers, etc.; land use is a man-made, active attribute, including urban land, industrial land, etc. I have replaced “land use” with “land cover”.

  8. Which approach was applied for atmospheric correction?

Response: The Sentinel-2A images were atmospherically corrected with the official European Space Agency (ESA) Sen2Cor plug-in, and the GF-5 images with the 6S atmospheric correction model.

  9. Please report the geometric accuracy for geometric correction.

Response: The accuracy of the GF-5 geometric correction, based on the Sentinel-2A image, is 0.15 pixels on average.

  10. Change “ground samples” to “reference data”. Please also add more information: how many samples were measured? What was the size of the sample plot?

Response: I have changed “ground samples” to “reference data”. There were 777 sub-compartments, with 2535 samples measured, each with an area of 100 m². The number of samples for each tree species is shown in Table 1.

Table 1. The number of sample points and area.

Tree species                          Numbers of pixels    Area (m²)
Pinus massoniana (PM)                 450                  4500
Pinus elliottii (PE)                  135                  1350
Quercus acutissima (QA)               450                  4500
Koelreuteria paniculata Laxm (KPL)    375                  3750
Celtis sinensis Pers. (CSP)           450                  4500
Liquidambar formosana Hance (LFH)     450                  4500
Phyllostachys edulis (BA)             225                  2250

  11. What does DBH stand for?

Response: DBH stands for diameter at breast height, i.e., the diameter of the tree at 1.3 m above the ground.

  12. Change “canopy density” to “canopy cover”.

Response: I have changed “canopy density” to “canopy cover”.

  13. In general, Sentinel-2 is not a high spatial resolution instrument.

Response: By the current spatial resolution standards for remote sensing images, the 10 m level is defined as medium resolution, so Sentinel-2 is not a high spatial resolution instrument; in this paper it is described as higher resolution only relative to the 30 m GF-5 images.

  14. What does VCT stand for?

Response: VCT stands for Vegetation Change Tracker, a highly automated vegetation change detection algorithm developed to reconstruct recent forest disturbance history using Landsat time series. The specific algorithmic process is described in the reference http://dx.doi.org/10.1016/j.rse.2009.08.017.

  15. Provide a formula to calculate the integrated forest z-score (IFZ).

Response: The formula has been modified; please see the cover letter.
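For readers without access to the cover letter: in the published VCT literature (Huang et al.), the IFZ z-scores each band against the spectral mean and standard deviation of known forest pixels and integrates over the NB bands, IFZ = sqrt((1/NB) Σ ((b_i − b̄_i)/SD_i)²). A minimal sketch, assuming that published definition (not necessarily the exact variant in the revised paper):

```python
import numpy as np

def integrated_forest_zscore(pixels, forest_mean, forest_sd):
    """IFZ per pixel: z-score each band against the mean/SD of known
    forest pixels, then integrate over the NB bands.
    pixels: (n_pixels, n_bands); forest_mean, forest_sd: (n_bands,)."""
    fz = (pixels - forest_mean) / forest_sd       # per-band forest z-score
    return np.sqrt(np.mean(fz ** 2, axis=1))      # small IFZ = forest-like
```

A pixel spectrally identical to the forest reference has IFZ = 0; pixels are labeled forest when the IFZ falls below the chosen threshold (2.0 in this study).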

  16. Please provide the relevant formula for OIF.

Response: The formula has been modified; please see the cover letter.
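The OIF commonly cited in the literature is the sum of the band standard deviations divided by the sum of the absolute pairwise correlation coefficients of the combination; the combination that maximizes OIF is selected. A sketch under that standard definition (not necessarily the exact variant used in the paper):

```python
import numpy as np
from itertools import combinations

def oif(bands):
    """OIF for one band combination: sum of band standard deviations
    divided by the sum of absolute pairwise correlation coefficients.
    bands: (k, n_pixels)."""
    stds = bands.std(axis=1)
    corr = np.corrcoef(bands)
    iu = np.triu_indices_from(corr, k=1)          # each pair counted once
    return stds.sum() / np.abs(corr[iu]).sum()

def best_combination(image, k=3):
    """Exhaustively search for the k-band combination with the largest OIF.
    image: (n_bands, n_pixels)."""
    return max(combinations(range(image.shape[0]), k),
               key=lambda idx: oif(image[list(idx)]))
```

Exhaustive search is feasible for band-triplet selection on a few dozen bands; for the 282 fused bands, a pre-screening step would typically be applied first.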

 

 

  17. It is not really clear which datasets were used for fusion, and how was the fusion process applied?

Response: The GF-5 and Sentinel-2A images were fused to produce 282 bands, using two methods.

Gram-Schmidt (GS): the GS method is an image fusion method based on the Gram-Schmidt transform; the number of input bands is not limited. It uses the GS transform to project the multispectral image into an orthogonal space, replaces the first component with the high-resolution image, and obtains the fused image by the inverse transform. Process: the GF-5 and Sentinel-2A images were pre-processed and then fused with the Gram-Schmidt Pan Sharpening module of the ENVI 5.3.1 software.

Harmonic analysis (HAF): harmonic analysis decomposes the image into a set of harmonic energy-spectrum feature components (harmonic remainder, amplitude, and phase) by analyzing the characteristics along the spectral dimension; for a single pixel, it expresses the spectral curve as the sum of a series of sine (cosine) waves defined by the harmonic remainder, amplitudes, and phases. Process: the GF-5 and Sentinel-2A images were pre-processed, the GF-5 image was cropped based on the Sentinel-2A extent, and the fusion was implemented with the spatial-spectral fusion module of PIE-Hyp 6.3.
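For illustration, a simplified component-substitution form of Gram-Schmidt pan-sharpening can be sketched as follows. This is a didactic approximation of what the ENVI module does, not its actual implementation; here the simulated low-resolution component is simply the band mean, and the function name is illustrative.

```python
import numpy as np

def gs_fuse(ms, pan):
    """Simplified Gram-Schmidt pan-sharpening (component-substitution form).
    ms:  (n_bands, H, W) lower-resolution bands, resampled to the pan grid.
    pan: (H, W) higher-resolution band.
    The band mean simulates the first GS component; the histogram-matched
    pan replaces it, and each band is adjusted by its regression gain."""
    sim_pan = ms.mean(axis=0)
    # Histogram-match the real pan to the simulated low-resolution pan
    pan_m = (pan - pan.mean()) * (sim_pan.std() / pan.std()) + sim_pan.mean()
    detail = pan_m - sim_pan
    fused = np.empty_like(ms, dtype=float)
    for i, band in enumerate(ms):
        g = np.cov(band.ravel(), sim_pan.ravel(), ddof=0)[0, 1] / sim_pan.var()
        fused[i] = band + g * detail              # inject spatial detail
    return fused
```

Because the injected detail is scaled by each band's regression gain against the simulated component, the fused bands gain spatial structure while their spectral means stay close to the originals.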

  18. It seems that the authors did not use a Landsat image.

Response: Yes, it is true that no Landsat image was used. According to the literature (http://dx.doi.org/10.1016/j.rse.2009.08.017), the best threshold to distinguish forest from non-forest on Landsat images is 3. This paper first applied 3 as the standard threshold to the fusion image, but the extraction result was not satisfactory, so candidate thresholds of 3.0, 2.5, and 2.0 were tested: forest and non-forest were extracted at each threshold, 600 samples each of forest and non-forest were selected, and the best threshold was determined from the accuracy evaluation results.

  19. This part is really confusing for readers, please re-write.

Response: Sorry for the unclear language; this paragraph has been rewritten.

For Landsat images, an IFZ of 3 can be used to distinguish forest from non-forest, but for the GF-5 image an IFZ of 3 could not distinguish them well. It was therefore necessary to determine the optimal IFZ threshold: three schemes were designed, 600 samples each of forest and non-forest were selected, and the accuracy of distinguishing forest from non-forest was calculated for IFZ = 3, 2.5, and 2. The results are shown in Table 4: the overall accuracy of forest extraction was highest at IFZ = 2. The extraction results for the three thresholds are shown in Figure 3. Table 4 and Figure 3 can be found in the revised version.

  20. It seems that the final threshold (i.e., 2.0) has been selected without any quantitative comparison. I suggest doing an accuracy assessment for the forest and non-forest classification map.

Response: The experiment has been redone. 600 samples each of forest and non-forest were selected on the original GF-5 image, and the accuracy under the three thresholds was calculated. The results are shown in Table 4; the overall accuracy of forest extraction is highest at IFZ = 2.

Table 4. Accuracy evaluation results for different thresholds.

IFZ    OA (%)    KC
3      67.00     0.3400
2.5    76.17     0.5233
2      88.33     0.7667
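The OA and KC columns can be reproduced from a confusion matrix; notably, with 600 reference samples per class, chance agreement is 0.5 and kappa reduces to 2·OA − 1 (e.g., 2 × 0.8833 − 1 = 0.7667, matching the IFZ = 2 row). The sketch below uses a hypothetical symmetric error split, since only the overall accuracy is reported:

```python
import numpy as np

def oa_kappa(cm):
    """Overall accuracy and Cohen's kappa from a confusion matrix
    (rows = classification, columns = reference)."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    oa = np.trace(cm) / n
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2  # chance agreement
    return oa, (oa - pe) / (1 - pe)

# Hypothetical symmetric split consistent with the IFZ = 2 row
# (600 forest + 600 non-forest reference samples, OA = 88.33%)
oa, kc = oa_kappa([[530, 70],
                   [70, 530]])
```

Any binary matrix with balanced reference columns yields chance agreement of 0.5, so the kappa values in the table follow directly from the OA values.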

  21. “The impact of different combinations of features was classified by RF”. Please re-write.

Response: I have rewritten the sentence: “Images of the different schemes were classified by RF.”

  22. It is more discussion rather than a result.

Response: Topography influences vegetation growth, the vegetation index reflects the cellular structure of vegetation, and texture plays an important role in feature classification. I agree that this is discussion rather than a result; I have explained this material in the Discussion section, as it is not appropriate to explain it here.

  23. Based on Table 4, Scheme 8 (with OA = 86.93 and KC = 0.85) gained the highest accuracy, not Scheme 9.

Response: Yes, due to my weak English grammar this was a misrepresentation of the experimental results, for which I apologise. Schemes 8 and 9 gave the two best results; I had described both as the highest, when in fact Scheme 8 was the highest and Scheme 9 the second highest. I now describe it as: “the results of Scheme 8 and Scheme 9 outperform the other schemes.”

  24. In Table 5, please clarify which part is ground truth and which part is the classification map (please also see comment no. 23).

Response: Table 5 shows the confusion matrix for Scheme 8. The first column is the classification map and the first row is the reference data. I have reworked the table headings. Table 5 in the revised version is Table 6.

  25. Please use “ground truth” or “reference data” instead of “forest survey data”.

Response: I have used “reference data” instead of “forest survey data”.

  26. Quercus acutissima (QA) is a broad-leaved tree. Why did the authors put this species in the conifer forest?

Response: I am sorry, that was a mistake: Quercus acutissima (QA) is a broad-leaved tree. What I meant is that Pinus massoniana (PM), Liquidambar formosana Hance (LFH), and Quercus acutissima (QA) grow together in the same area as a mixed coniferous and broad-leaved forest, not as three pure stands.

  27. Very confusing paragraph, please rewrite.

Response: Sorry, this paragraph has been rewritten. According to Table 5, in terms of PA, Celtis sinensis Pers. (CSP) and Phyllostachys edulis (BA) tied for the highest accuracy, followed by Pinus massoniana (PM) (94.67%), and the accuracies of all tree species were above 70%. In terms of UA, Liquidambar formosana Hance (LFH) was the highest, followed by Pinus elliottii (PE), and the other tree species were all above 85%.

  28. What does ratio mean? How did you compute this ratio?

Response: It is the ratio of the spatial resolutions of the hyperspectral and multispectral images during image fusion. For example, the spatial resolution of the GF-5 image is 30 m and that of Sentinel-2A is 10 m, so the ratio is 30:10 = 3:1.

  29. It is a repetition of the same sentence.

Response: Sorry, my checking was not careful enough; I have deleted the repeated sentence.

  30. “Conifers and broadleaf trees of the same species” – this sentence is fundamentally incorrect. How is it possible for the same tree to be both conifer and broad-leaved?

Response: Sorry for the unclear expression; coniferous and broad-leaved trees are indeed different species. The intended meaning was that textural characteristics are very similar among species of the same type, i.e., textures are very similar among coniferous species, and similar among broad-leaved species.

Other changes:

A new section has been added to the discussion section.

 

Special thanks to you for your good comments.

 

We tried our best to improve the manuscript and made some changes. These changes do not influence the content or framework of the paper; they are not listed here but are marked in red in the revised paper. We earnestly appreciate the editors' and reviewers' work and hope that the corrections will meet with approval.

Once again, thank you very much for your comments and suggestions.

 

 

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Thank you for your work implementing all comments and suggestions.

Reviewer 2 Report

Thank you for the excellent work with the paper and good luck in further research!
