Article
Peer-Review Record

An Environmental Pattern Recognition Method for Traditional Chinese Settlements Using Deep Learning

Appl. Sci. 2023, 13(8), 4778; https://doi.org/10.3390/app13084778
by Yueping Kong 1,*, Peng Xue 1, Yuqian Xu 2 and Xiaolong Li 2
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4:
Reviewer 5:
Submission received: 20 February 2023 / Revised: 3 April 2023 / Accepted: 7 April 2023 / Published: 11 April 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

The authors describe a method to classify areas from satellite images. The number of classes is 5 (representative environmental patterns) and the method consists in:

- segmentation of the image (7 categories)

- use of a CNN (one among 3, namely AlexNet, ResNet and DenseNet) to get few features that represent/summarize the original image

- use of few-shot learning

However, the method is described three times (in the introduction, in Section 4, and in Section 5), each time in a confusing and only partial way.

For example, on reading Section 4.2 I thought that N was 5, equal to the number of classes; instead, it is set to 3 (line 354). In Section 4.2 the authors speak of "a" CNN, while in the Introduction they write that it is one of three (AlexNet, ResNet, or DenseNet). It would be more appropriate to speak of "a" CNN in the Introduction and to specify exactly which one it is in Sect. 4.1, where a more detailed description is given.
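For readers unfamiliar with the notation behind this comment: in few-shot learning, an N-way K-shot episode draws N classes and K labelled support images per class, so N can be smaller than the total number of classes (here 3 out of 5). The following is a minimal Python sketch of episode sampling, not the authors' implementation. The helper `sample_episode`, the parameter `n_query` (query images per class), and the toy label list are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way, k_shot, n_query, seed=None):
    """Return (support, query) lists of (index, label) pairs for one episode."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    # Only classes with enough images can take part in an episode.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query images")
    support, query = [], []
    for c in rng.sample(eligible, n_way):
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, c) for i in picked[:k_shot]]   # K labelled examples
        query += [(i, c) for i in picked[k_shot:]]     # held-out examples
    return support, query

# Toy data: 5 classes with 10 images each (illustrative numbers only).
labels = [c for c in range(5) for _ in range(10)]
support, query = sample_episode(labels, n_way=3, k_shot=2, n_query=4, seed=0)
```

With these toy settings each episode contains 3 x 2 support images and 3 x 4 query images drawn from 3 of the 5 classes, which is why N need not equal the number of classes.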

I did not understand whether the labelling of the images (one of the 5 classes) and the segmentation (to train the segmentation model) were performed manually by the authors; please specify.

The literature references to few-shot learning should clearly appear the first time few-shot learning is discussed (line 88).

The literature reference to DenseNet is [27] in line 303, but [27] does not talk about DenseNets.

Please specify in the caption of Fig. 5 the meaning of the acronyms CE, FC used in the figure.

In line 279  "NxK"  should be "K".

There are some (very few) typos: in Fig. 5, "socre" instead of "score" (I guess) and "Gradiebt" instead of "Gradient"; in Fig. 6, "Dncoder" instead of "Decoder"; and in line 356, "he" instead of "The".

Overall, I believe the paper is worth publishing, but the system should be described more clearly. The results in Section 5 are clear and very good, and the reader can finally appreciate the logic behind the system of Fig. 5.

Author Response

Thank you for your valuable comments and suggestions. We have revised the manuscript according to your suggestions, and the responses to the review comments are attached.

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper presents a reasonable method to solve a real application problem. It is well organized, clearly written, and shows some interesting results, so I encourage its acceptance after major revision. However, the questions below need to be answered:

1. Please explicitly indicate and clarify the challenges this study aims to address. What are the challenges and why? Why can previous studies not address these challenges well?

2. At the end of section 1 add a table that summarizes the advantages and disadvantages of existing methods facing the same problem. This way the reader would rapidly appreciate novelty of the paper.

3. More studies of environmental pattern recognition strategies should be cited and discussed.

4. Fig. 1 and Fig. 5 need more explanation.

5. Please enrich the captions of all figures and tables for clarification.

6. Why do you build on DenseNet121, ResNet50, and AlexNet? Many state-of-the-art deep learning models based on attention and transformer structures are more widely used and demonstrate better performance.

7. In the comparison to SOTA methods, more experimental results of other state-of-the-art methods should be given.

8. I also found some grammar problems in this paper. The authors need to check carefully for these basic mistakes, which matters a great deal to readers.

Author Response

Thank you for your valuable comments and suggestions. We have revised the manuscript according to your suggestions, and the responses to the review comments are attached.

Author Response File: Author Response.pdf

Reviewer 3 Report

Dear Authors,

   Thanks very much for your manuscript submission to the MDPI journal Applied Sciences. This research article proposes a deep-learning-based approach for automatic recognition of the environmental patterns of traditional Chinese settlements (TCS). The authors created a new TCS dataset and then implemented several CNNs to evaluate the causes of low classification results, followed by a semantic segmentation method for feature extraction. Experiments are conducted to verify the robustness and reliability of the work.

   This paper contains some useful results and a good idea, but there is still some room for improvement. I recommend acceptance with minor edits after revision. Suggestions for further edits are listed below:

   a) Abstract: In lines 10-13, please shorten the narration in the opening part. In lines 20-24, the concluding remarks lack key quantitative results; please update the abstract accordingly, keeping it to 180-200 words.

   b) Introduction: Most part of this section is well written. A few problems:

   Lines 87 and 90: avoid abrupt conjunctions such as "then" and "thus"; please use softer transition words to make the text easier to read.

   Lines 95-103: Good summary of the main contributions. I think these statements can be a bit more specific, especially the first one. Besides, I suggest presenting the summary as 3-4 bullet points for better presentation.

   Lines 107-108: Figure 1 is positioned a bit too far from the claims in Lines 72-74; I suggest that the authors move the figure closer to them.

   c) Related work: This short section looks a bit too generic. Please consider adding a few more state-of-the-art approaches and partitioning this part into 2-3 subsections, with a workflow where available. In short, be more specific here.

   d) Section 3 Dataset construction: Most of this section works fine.

   Lines 230-234: The description of how the dataset is split into training, validation, and test sets should be improved. For instance, the roles of Table 3 and Fig. 4 should be described separately.

   Lines 222-223, Lines 235-236: The spacing before and after each table (especially after a table) should be kept at 12 pt, as specified by the MDPI template. Each of the figures and tables should conform to this rule.

   e) Section 4, Proposed methods: This section gives a detailed architecture of the semantic segmentation pipeline, but I think the mathematical treatment is limited: it lacks a list of error functions and a summary of evaluation metrics, which leaves it open to doubt from reviewers.

   f) Experiments and Results: The quantitative study is rather limited.

   Lines 314-316: The legends in each subplot of Fig. 7 look a bit too small. Please increase the resolution of each image along with the text in it.

   Lines 342-344: Please fix the same problem in Fig. 8 as in Fig. 7.

   Regarding the deep learning models, I may suggest trying DenseNet-101 and DenseNet-152 for further performance evaluation, if applicable.

   Lines 358-359: Please adjust the line spacing of the title of Table 6 (too loose).

   g) Between Lines 372-373: I think a discussion section covering the limitations of your study, a sensitivity analysis (or ablation tests), and the pros and cons of your experimental design could be added to improve the overall quality of the research article. Please consider updating.

   h) Conclusions (and Future Work): Lines 374-387, this single paragraph has a lot of room for improvement. Consider splitting it into 2-3 paragraphs. The first should give the key concluding remarks (with quantitative results). The second should address limitations and open questions. The last, on future study, should briefly summarize potential research topics, challenging tasks, and directions for prospective work, and should be a little more specific. Thanks a lot!

   i) References: Starting from Line 404, the list of citations should comply with the following edits: (1) Apply the required abbreviated, italic format to the titles of conference proceedings and journals, e.g., "Proceedings" --> "Proc.". (2) Please supply any missing information (dates and location) for each conference proceedings entry and make the citation style consistent. (3) I advise the authors to review and cite conventional, newer, and the latest approaches across the past decade; in particular, papers published in the latest three years (2020-2023) that are similar or parallel to your study should be better represented in the revised version. These updates will make your citations stronger. (4) Comply with the current MDPI template for the other formats of reference-list entries from various sources.

   Regarding the minor aspects on your revision, here are the suggestions:

   1) Please apply uniform spacing, font size, and style to the text in each figure, and fix the remaining formatting issues in the proofreading process.

   2) Please avoid hyphenating words across line breaks on each page. A word is currently split across two adjacent lines in multiple places; whether you are using MS Word or LaTeX, the MDPI online template has options to fix that issue.

   3) The writing quality of this paper can be further improved. I would advise the authors to polish the language of this research article, including grammar checking and careful proofreading. Thanks so much!

   Again, thank you, and we look forward to seeing your updated research article move toward acceptance. Stay well and good luck!

With warm regards,

Yours faithfully,

Author Response

Thank you for your valuable comments and suggestions. We have revised the manuscript according to your suggestions, and the responses to the review comments are attached.

Author Response File: Author Response.pdf

Reviewer 4 Report

This paper reports on a new architecture that processes images of geographical areas containing mountains, water segments and so on. Because the conventional state-of-the-art image-processing Convolutional Neural Networks (CNNs) that the researchers tried did not perform satisfactorily, their goal was to create a new architecture that first segments the images and then uses the segmentation for the prediction in the second step. The authors state the problem of overfitting and low generalization through the use of the accuracy metric, then present the "combined" architecture they implemented, and in the end provide the improved results.

The paper contains enough significant content, since it is one of the first research works that deal with this kind of image. AI applications in agriculture and forest operations need methodologies like this in general. State-of-the-art AI models have been thoroughly tested in medical applications, digit recognition, and fashion, but not all types of images have the characteristics (e.g., an object in the centre) on which most of those models succeed.

Although the authors speak about land and city dynamics, a reference to current research work that describes the concrete application of AI in agriculture and forestry needs to be presented:

- Holzinger, A., Saranti, A., Angerschmid, A., Retzlaff, C. O., Gronauer, A., Pejakovic, V., ... & Stampfer, K. (2022). Digital transformation in smart farm and forest operations needs human-centered AI: challenges and future directions. Sensors, 22(8), 3043. doi: 10.3390/s22083043

Furthermore, from the Introduction it is clear that there is a major difference between the architectures tried; this should also be supported by an Explainable AI method to show the necessity of the new architecture. One example paper that can be used as a reference is the following:

- Schwalbe, G., & Finzel, B. (2023). A comprehensive taxonomy for explainable artificial intelligence: a systematic survey of surveys on methods and concepts. Data Mining and Knowledge Discovery, 1-59. doi: 10.1007/s10618-022-00867-8

As far as the safety component is concerned, the following reference can also be used:

- Hoenigsberger, F., Saranti, A., Angerschmid, A., Retzlaff, C. O., Gollob, C., Witzmann, S., ... & Stampfer, K. (2022, August). Machine Learning and Knowledge Extraction to Support Work Safety for Smart Forest Operations. In Machine Learning and Knowledge Extraction: 6th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference, CD-MAKE 2022, Vienna, Austria, August 23–26, 2022, Proceedings (pp. 362-375). Cham: Springer International Publishing. doi: 10.1007/978-3-031-14463-9_23

The methodology needs some improvements, beginning with a good description of what a digital elevation model is. This is not clear to reviewers who only have a background in computer and data science. The lack of clear rules, as referred to in line 129, raises the question of who sets them: is it the human? Will the human verify the rules? The use of an Explainable AI method could help with this goal. The data collection details, diversity and size in Section 3.3 are very good and highly appreciated, although the small size of the dataset is apparent. The clear division of the test set is adamant. It is not clear why some elements conflict with each other (lines 226-227).

In general, most of the details are presented, but some exact values (e.g., of N, K, and M) are missing; they need to be specified for the acceptance of this paper. Furthermore, it is advisable to provide a GitHub repository for the transparency and verification of the results. It is not clear whether the selected data augmentations make sense or are adequate for this type of task. The reviewers also advise running experiments with and without data augmentation. Resizing to 256x256 of course loses important information; it is therefore questionable whether this is the best strategy, as opposed to cutting the images into 256x256 patches and training with a larger dataset. Lastly, the use of accuracy for an imbalanced dataset is completely inadequate; please use balanced accuracy and Mutual Information as presented and explained in the following books:

- Géron, A. (2022). Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow. O'Reilly Media, Inc.

- MacKay, D. J. C. (2003). Information theory, inference and learning algorithms. Cambridge University Press. (Section 44.5)

- Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.
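To illustrate why plain accuracy can mislead on an imbalanced dataset, here is a minimal Python sketch of the two requested metrics. It assumes balanced accuracy is taken as the mean of per-class recall and mutual information is computed in bits between true and predicted labels. The function names, the class names "A" and "B", and the toy data are hypothetical and are not taken from the manuscript.

```python
import math
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / n for c, n in support.items()) / len(support)

def mutual_information(y_true, y_pred):
    """Mutual information (in bits) between true and predicted labels."""
    n = len(y_true)
    joint = Counter(zip(y_true, y_pred))
    n_true = Counter(y_true)
    n_pred = Counter(y_pred)
    # Sum over observed (true, predicted) pairs of p(t,p) * log2(p(t,p) / (p(t) p(p))).
    return sum(
        (c / n) * math.log2(c * n / (n_true[t] * n_pred[p]))
        for (t, p), c in joint.items()
    )

# Toy imbalanced test set: 90 images of class "A", 10 of class "B",
# and a degenerate classifier that always predicts "A".
y_true = ["A"] * 90 + ["B"] * 10
y_pred = ["A"] * 100
```

On this toy example the degenerate classifier reaches an accuracy of 0.9, yet its balanced accuracy is 0.5 and its mutual information is 0 bits, which exposes that the predictions carry no information about the true class.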

After repetition of the experiments and reporting of the new metrics as well as xAI methods that shed light on the feature characteristics, this research work can be accepted with minor changes in the written part. What is also important to add in the Conclusion section is the future work directions.

 

The paper is well-written, readable and comprehensive. It is sectioned very well, and the images are readable and have enough caption information. As mentioned above, please replace the word accuracy in lines 65-66 with the word performance.

Author Response

Thank you for your valuable comments and suggestions. We have revised the manuscript according to your suggestions, and the responses to the review comments are attached.

Author Response File: Author Response.pdf

Reviewer 5 Report

The authors have done a good job solving the traditional Chinese Settlements Environmental Patterns classification problem. While this is a meaningful attempt, there are a few things that need to be fixed:

1. The authors should redraw or re-export the figures; quite a few of them have serious problems.

2. Please explain how the 4-channel data used in Section 4.1 on page 8 was obtained.

3. The authors should present the test results after the meta-learning module to show that the meta-learning method has indeed reduced the overfitting problem.

4. The authors divide the meta-learning data into base classes and novel classes and then train with the meta-learning method. The test accuracy on the base classes and on the novel classes needs to be verified separately to show that meta-learning is necessary in this work.

5. The authors need to run ablation experiments to verify the effect of each module.

6. The authors need to compare against other methods for eliminating overfitting to demonstrate the advantage of choosing meta-learning.

7. Some deep learning methods are missing: [1] Graph-based few-shot learning with transformed feature propagation and optimal class allocation; [2] Deep-IRTarget: An automatic target detector in infrared imagery using dual-domain feature extraction and allocation.
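The separate base-class and novel-class evaluation requested in comment 4 could be reported along the following lines. This is a minimal Python sketch under stated assumptions: the function `accuracy_by_group`, the set `novel_classes`, and the toy labels are hypothetical, and which classes count as base or novel in the manuscript is not specified here.

```python
def accuracy_by_group(y_true, y_pred, novel_classes):
    """Return accuracy computed separately on base-class and novel-class samples."""
    counts = {"base": [0, 0], "novel": [0, 0]}  # [correct, total] per group
    for t, p in zip(y_true, y_pred):
        group = "novel" if t in novel_classes else "base"
        counts[group][0] += int(t == p)
        counts[group][1] += 1
    # NaN marks a group that has no test samples at all.
    return {g: (c / n if n else float("nan")) for g, (c, n) in counts.items()}

# Toy labels: classes 0 and 1 are treated as base, classes 2 and 3 as novel.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 0, 2, 3, 3, 2]
result = accuracy_by_group(y_true, y_pred, novel_classes={2, 3})
```

Reporting the two numbers side by side makes it visible whether a model that does well on base classes also generalizes to the novel ones.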

Author Response

Thank you for your valuable comments and suggestions. We have revised the manuscript according to your suggestions, and the responses to the review comments are attached.

Author Response File: Author Response.pdf

Round 2

Reviewer 5 Report

Accept
