Article
Peer-Review Record

Novel Land Cover Change Detection Deep Learning Framework with Very Small Initial Samples Using Heterogeneous Remote Sensing Images

Remote Sens. 2023, 15(18), 4609; https://doi.org/10.3390/rs15184609
by Yangpeng Zhu 1,*, Qianyu Li 1, Zhiyong Lv 2 and Nicola Falco 3
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Reviewer 4:
Submission received: 13 August 2023 / Revised: 9 September 2023 / Accepted: 18 September 2023 / Published: 19 September 2023

Round 1

Reviewer 1 Report (New Reviewer)

The research paper demonstrates that the proposed deep learning framework performs better for land cover change detection (LCCD) applied to heterogeneous remotely sensed imagery with few initial samples. These findings could make a substantial contribution to the field by providing insights into contemporary challenges associated with LCCD studies, which should be of great interest to Remote Sensing readers. The paper should, however, be properly restructured to meet the formatting requirements provided in the authors' guide. I recommend that the manuscript be considered for publication, but minor/major issues must be resolved to improve the paper's quality and completeness. Please consider the following points while revising the current version of your manuscript:

1. The methods section must be reworked in a more intuitive way. I suggest adding titles describing the study area, materials, and methods before the "Method" section, which should include more details about the datasets used and their sources. In addition, I recommend moving subheading 3.1, "Dataset description" (lines 272-276), before Section 2. Please give a reference for the equations in Subsection 2.3, "Non-parameter sample-enhanced algorithm," as well as for Table 1, which must be cited in the paper. These equations would also benefit from being numbered.

2. The results presented in Section 3.3, "Experimental Results," are well done, informative, and supported by statistical analysis and well-crafted figures. The heading, however, should be changed to "Results."

3. References for the methods in Tables 2–6 should be cited and provided in Section 2 rather than here, which reads the same as the last section of the manuscript.

4. The heading of Section 3.4, "Discussion and analysis," should be revised to "Discussion." This section could be extended to include more substantive discussion by engaging with similar studies, which would strengthen it.

5. The authors concluded with their findings' potential and limitations, but the contributions to the body of knowledge and beyond the study area must be added to this paper.

 

Regards,

 

Comments for author File: Comments.pdf

Author Response

Please see the attached PDF.

Author Response File: Author Response.pdf

Reviewer 2 Report (New Reviewer)

Main Challenges:

- The main challenge addressed in this paper is performing change detection with heterogeneous remote sensing images when only a small number of labeled training samples are available. Many deep learning methods require large labeled datasets, which are time-consuming and labor-intensive to create. The authors propose an iterative framework to augment a small initial training set and gradually improve detection accuracy.

- Heterogeneous images captured by different sensors at different times can vary significantly in resolution, spectral bands, etc. Directly comparing them for change detection is difficult. Extracting meaningful shared features from the heterogeneous images is a key challenge.

- With only a few initial training samples, the model risks overfitting or failing to generalize well. The sample augmentation method needs to effectively explore the surrounding context rather than just propagate errors.
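The iterative framework summarized in these challenges is, in essence, a self-training loop: fit on the small labeled set, pseudo-label only high-confidence samples, and repeat. A minimal generic sketch follows; the callables `train_model` and `predict_proba` and the threshold `conf_thresh` are hypothetical placeholders, not names from the paper under review.

```python
import numpy as np

def self_training_loop(x_unlabeled, x_init, y_init, train_model, predict_proba,
                       n_rounds=3, conf_thresh=0.95):
    """Generic self-training sketch: grow a small labeled set by adding
    high-confidence pseudo-labels each round (not the authors' exact method)."""
    x_train, y_train = x_init.copy(), y_init.copy()
    pool = x_unlabeled.copy()
    for _ in range(n_rounds):
        model = train_model(x_train, y_train)      # refit on the current labeled set
        if len(pool) == 0:
            break
        proba = predict_proba(model, pool)         # shape (n, 2): P(unchanged), P(changed)
        conf = proba.max(axis=1)
        keep = conf >= conf_thresh                 # accept only very confident pseudo-labels
        if not keep.any():
            break                                  # nothing confident left: stop early
        x_train = np.vstack([x_train, pool[keep]])
        y_train = np.concatenate([y_train, proba[keep].argmax(axis=1)])
        pool = pool[~keep]                         # remove newly labeled samples from the pool
    return x_train, y_train
```

The confidence threshold directly controls the risk of error propagation the third bullet warns about: a lower threshold grows the training set faster but admits more falsely labeled samples.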

Suggestions for Improvement:

- Provide more details on the sample augmentation algorithm. How exactly are the potential new samples identified and labeled? Some examples could help illustrate the process.

- Expand the discussion on how the initial training set affects final accuracy. Is there a minimum or optimal size? How does the ratio of changed/unchanged samples impact results?

- Perform ablation studies to validate the contributions of the multi-scale network architecture and sample augmentation method individually.

- Evaluate the approach on a greater variety of change types and image sensor pairings. How does it perform on more subtle change detection tasks?

- Compare to other state-of-the-art methods specialized for limited training data, like few-shot learning techniques.

- Provide code/model details to make the method easily reproducible.

 

Overall the approach is novel and addresses an important problem. With some additional experimental validation and details, the manuscript would be strengthened.

Author Response

Please see the attached PDF.

Author Response File: Author Response.pdf

Reviewer 3 Report (New Reviewer)

Dear authors, I enjoyed reading your study. I made a few corrections; you will see them in the PDF I attached.

Comments for author File: Comments.pdf


Author Response

Please see the attached PDF.

Author Response File: Author Response.pdf

Reviewer 4 Report (New Reviewer)

Thank you for considering me to review this manuscript. Today, accurate land cover maps are fundamental for environmental and climate research. Therefore, this paper, "Novel Land Cover Change Detection Deep Learning Framework with Very Small Initial Samples Using Heterogeneous Remote Sensing Images," can provide new insights into land cover change detection. However, the paper has a few problems. Kindly check the report.

Comments for author File: Comments.pdf

Moderate editing of English language required

Author Response

Please see the attached PDF.

Author Response File: Author Response.pdf

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

The authors should explain the limitations of the adopted methods and research. The authors should provide the code on GitHub to help readers understand the work. Detailed comments are attached in the PDF file; please see the attachment. Many improvements are needed in the paper.

Comments for author File: Comments.pdf

The authors should improve the text and remove the grammatical errors in the paper.

Author Response

Please see our response in the attached PDF. Thanks.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors of the article present a novel framework to achieve change detection with HRSIs under a small initial sample set. The study is well prepared - I did not find any significant shortcomings. I think that some elements of the text of the article should be refined.

- First of all, it seems to me that the beginning of the introduction should be expanded. The introduction should be understandable to both advanced readers and beginners. There are some misconceptions here.

- Information on the software used by the authors at various stages of the study should be added.

I put the rest of the comments in the attached file.

 

Comments for author File: Comments.pdf

Author Response

Please see our response in the attached PDF. Thanks.

Author Response File: Author Response.pdf

Reviewer 3 Report

The authors introduce a new method for detecting changes in heterogeneous remotely sensed images. The method is based on neural networks and aims to work with a low number of training samples by generating new training samples within the procedure through an iterative cycle.

In my opinion, the overall implementation framework lacks a sufficient description, and several theoretical issues affect the manuscript. For the authors' benefit, I list the most important ones:

- Eqs. (1)-(3): poor notation. Actually, (2) and (3) refer to a fraction of pixels. The authors should define M^L_k (which is different from M^L_{k,l}). Is it the changed/unchanged class of a pixel, left without notation? Or a value in a pixel, where <—> indicates the match between changed and unchanged?

- Is preprocessing needed? The procedure is not clear. Since the data are heterogeneous, they can have different scales and different spectral and spatial resolutions. First of all, how many channels of the sensor are chosen? Just one, as it seems from Figure 1, from the experiments, and implicitly from line 194? How? In addition, concatenating requires the same pixel width and height. How is this achieved for heterogeneous images that do not have the same pixel width or height? Apparently, the before and after images are not registered at all; how, then, are they compared pixel-wise? In particular, if the comparison is not made between pixels corresponding to the same location, how is it possible to recognize whether a "changed" class refers to an actual change in the same pixel or to pixels corresponding to different locations? In summary, preprocessing of the images seems necessary, but no mention of it is made in the manuscript.

- l. 177: It is mentioned that 20 epochs are used. How is this value chosen? Is the same value also used for the competitor methods? In that case, it would be better to select for the other methods their own suggested values.

- l. 187: How large are the blocks? Apparently, they refer to the same location; see my earlier comment on preprocessing.

- l. 198: t1 and t2 are not defined; are they the two times, pre and post?

- l. 198: The notation is unclear. Here P is not a PCC as in line 192. If I have understood the sense, it would be better to write:
L(S^{t1}_{ij}) = \begin{cases} 1, & \text{if } \dots \\ 0, & \text{if } \dots \end{cases}

- l. 198: I think the authors should consider the absolute value of the correlation coefficient; otherwise the inequality in l. 198 does not make sense. I hope this is only a misprint.

- l. 210-216: This way of generating training samples leaves me perplexed. Due to the natural variability of pixel intensity, and to the probably low number of pixels on which the correlation coefficient is estimated, unequal surrounding blocks can be marked as equal simply because their estimated correlation coefficient is smaller (in absolute value!) than that of the central block, and vice versa. This means a quite significant number of falsely labeled training samples. A safer solution would be to be stricter in the inequalities at l. 198, e.g., by adding/subtracting an extra margin. In addition, there can be cases where change occurs very locally, without involving the surrounding blocks.

- l. 214-216: See the previous comment on very local changes.

- Section 3.1: The two images of each dataset (pre and post) have the same number of pixels and the same spatial resolution, and have apparently already been registered (for dataset 3 no information is given). This surely means that preprocessing was needed, which is never mentioned in the methodology. The same applies to the number of channels, where 1 or 3 channels are available depending on the sensor. The method apparently uses one channel: how is it chosen?

- Section 3.2: What is the point of distinguishing between traditional and NN methods in the comparisons?

- Section 3.3: The description of the train/validation scheme is deficient. No information is given on whether and how the training and test sets (better: training, validation, and test sets) are disjoint (better: independent), which makes the estimated accuracy indicators unreliable (and also makes the choice of the number of epochs questionable). No indication is given of the actual number of selected training samples (at least the final number after the iterations). In addition, no information is given on how the labelling is done. By experts? How many?

- Section 3.3: Figure 9 shows accuracy vs. the number of initial training samples. It is questionable to deduce that accuracy initially decreases as the number of samples increases; the reason is counterintuitive and not clear, and in my opinion the effect could be due to statistical variability. In addition, it seems to me that a fair comparison with competitor methods should use the same number of training samples for all, i.e., the initial number used by the proposed method. This would significantly decrease the performance of the competitor methods, making the comparison almost unrealistic.

- Section 3.4(i): Choosing an optimal number of samples for the initialization is an issue for the method. First of all, Figure 8 also shows that the different behaviours could depend not only on the sensor but also on the type of image. This essentially means that the "trial-and-error" experiments cited by the authors at l. 357 correspond, de facto, to redoing the training for each image, which is nonsensical. Moreover, the choice of the number of initial samples is counterintuitive because, as Figure 9 shows and as seems reasonable, the larger the sample, the better the accuracy indicators.

- In what sense is the methodology nonparametric? What does this mean?

- Finally (l. 34): there is no link for the software.
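The stricter labeling rule suggested in the comment on l. 210-216 (comparing absolute correlation coefficients and requiring an extra margin before committing to a label) could be sketched as follows. This is illustrative only: the function names, the margin `delta`, and the 0/1 label convention are my own assumptions, not notation from the manuscript.

```python
import numpy as np

def label_surrounding_block(r_center, r_block, delta=0.1):
    """Compare absolute correlation coefficients and require an extra
    margin `delta` before pseudo-labeling; ambiguous blocks stay unlabeled
    (returned as None) instead of becoming falsely labeled training samples."""
    a_c, a_b = abs(r_center), abs(r_block)
    if a_b >= a_c + delta:
        return 0          # clearly higher |correlation|: label "unchanged"
    if a_b <= a_c - delta:
        return 1          # clearly lower |correlation|: label "changed"
    return None           # within the margin: too ambiguous to use for training

def pearson_abs(block_t1, block_t2):
    """Absolute Pearson correlation between two flattened image blocks."""
    return abs(np.corrcoef(np.ravel(block_t1), np.ravel(block_t2))[0, 1])
```

Using the absolute value also addresses the misprint noted for l. 198, and returning `None` inside the margin is one way to keep very local changes from contaminating the augmented sample set.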

Minor misprints:

- l. 26: maybe "selected" instead of "select"
- remove the dash in "non-parametric" everywhere (use "nonparametric")
- l. 233: "Details of each image pair are detailed"

Author Response

Please see our response in the attached PDF. Thanks.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

I have accepted the article for publication. In line 42, the authors should add one reference before publication. https://www.tandfonline.com/doi/abs/10.1080/10106049.2022.2086622

Reviewer 3 Report

See attached file

Comments for author File: Comments.pdf
