Article
Peer-Review Record

Satellite Image Classification Using a Hierarchical Ensemble Learning and Correlation Coefficient-Based Gravitational Search Algorithm

Remote Sens. 2021, 13(21), 4351; https://doi.org/10.3390/rs13214351
by Kowsalya Thiagarajan 1, Mukunthan Manapakkam Anandan 2, Andrzej Stateczny 3,*, Parameshachari Bidare Divakarachari 4 and Hemalatha Kivudujogappa Lingappa 5
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 22 August 2021 / Revised: 11 October 2021 / Accepted: 27 October 2021 / Published: 29 October 2021

Round 1

Reviewer 1 Report

The manuscript is devoted to satellite image classification. The Authors use an interesting and sufficient approach to solve the stated problem. The manuscript is well written and scientifically sound. The structure of the submission is also typical for scientific papers and meets readers’ expectations. However, some issues and questions arose from reading the paper.
The Authors are advised to check carefully the sequence of their particular names. Are the forenames and surnames in the same order for all of them?
A check of the language is also strongly advised, since there are many grammatical errors in the manuscript. Let me give only one example from the abstract: “The HFEL uses the three different type of” should read “uses three different types of”.
The Authors are asked to explain the selection of particular algorithms for their work in a more convincing way. This is especially important for the selection of MSVM for classification. Why is this algorithm better than the many other existing methods?
In my opinion, Fig. 5 is superfluous. It only repeats the data given in Table 10 in graphic form and could be removed. Otherwise, if the Authors are strongly attached to it, it would be better to emphasize the results obtained by the proposed approach, e.g. the bars for HFEL-CCGSA could be shown in a different colour from those of the compared methods.
I would also like to ask a question: did the Authors implement all the methods from the literature selected for the comparison? If not, and the percentage results were simply taken from the references, is the experimental methodology consistent across all approaches?

Author Response

Reviewer 1: 

Comments and Suggestions for Authors

The manuscript is devoted to satellite image classification. The Authors use an interesting and sufficient approach to solve the stated problem. The manuscript is well written and scientifically sound. The structure of the submission is also typical for scientific papers and meets readers’ expectations. However, some issues and questions arose from reading the paper.

Response

Thank you for your appreciation.

-----------------------------------------------------------------------------------------------------------------------------------
The Authors are advised to check carefully the sequence of their particular names. Are the forenames and surnames in the same order for all of them?

Response

Thank you for the constructive comment. I have checked the details.

 

-----------------------------------------------------------------------------------------------------------------------------------
A check of the language is also strongly advised, since there are many grammatical errors in the manuscript. Let me give only one example from the abstract: “The HFEL uses the three different type of” should read “uses three different types of”.

Response

Thank you for the constructive comment.

As per the reviewer's comment, the sentence has been corrected to:

“The HFEL uses three different type of Convolutional Neural Networks (CNN) such as AlexNet, LeNet-5 and residual network (ResNet) for extracting the appropriate features from both the low level and high level images obtained from the hierarchical framework.”

 

-----------------------------------------------------------------------------------------------------------------------------------
The Authors are asked to explain the selection of particular algorithms for their work in a more convincing way. This is especially important for the selection of MSVM for classification. Why is this algorithm better than the many other existing methods?

Response

Thank you for the constructive comment.

Because complex networks were used to extract features from the input images, a simple model is required for the classification decision in order to reduce computation. The MSVM model handles high-dimensional data efficiently and is therefore suitable for this classification task.

 

-----------------------------------------------------------------------------------------------------------------------------------
In my opinion, Fig. 5 is superfluous. It only repeats the data given in Table 10 in graphic form and could be removed. Otherwise, if the Authors are strongly attached to it, it would be better to emphasize the results obtained by the proposed approach, e.g. the bars for HFEL-CCGSA could be shown in a different colour from those of the compared methods.

Response

Thank you for the constructive comment.

As per the reviewer's comment, Figure 5 has been removed.

 

-----------------------------------------------------------------------------------------------------------------------------------
I would also like to ask a question: did the Authors implement all the methods from the literature selected for the comparison? If not, and the percentage results were simply taken from the references, is the experimental methodology consistent across all approaches?

Response

Thank you for the constructive comment.

The results of the existing methods were taken directly from the existing research papers.

Reviewer 2 Report

The paper describes a framework for the classification of satellite images. The work is interesting and the results show the good performance of the approach. The paper is well organized, but (as I comment below) the authors should improve the text.

Section 3 is the major problem: I cannot see how the first two paragraphs are related to the solution paragraph. The idea is good, but the authors should clearly explain how the new approach solves or avoids the problems identified in the review of the state of the art in Section 2.

The authors should revise the paper, rephrasing sentences such as "Therefore, an effective CLASSIFICATION over the ... for improving the CLASSIFICATION accuracy" (Abstract), or "The SATELLITE IMAGE classification is ... for the interpretation of the SATELLITE IMAGES" (Introduction). On page 2 (lines 56-57), the authors comment that the noise in the satellite images is due to weather and problems with the acquisition system. I am not sure we can say that the noise is "created". The sentence on page 2 (lines 63-64) must also be revised (" labeled samples are generally requires high time"). In the Conclusion, the sentence "the MSVM is effectively predicted satellite images by using" must also be revised. There are other examples in the text.

Author Response

Reviewer 2:

Comments and Suggestions for Authors

The paper describes a framework for the classification of satellite images. The work is interesting and the results show the good performance of the approach. The paper is well organized, but (as I comment below) the authors should improve the text.

Response

Thank you for your appreciation.

-----------------------------------------------------------------------------------------------------------------------------------

Section 3 is the major problem: I cannot see how the first two paragraphs are related to the solution paragraph. The idea is good, but the authors should clearly explain how the new approach solves or avoids the problems identified in the review of the state of the art in Section 2.

Response

Thank you for the constructive comment.

To avoid confusion, the problem statement is now provided in Section 1.

 

-----------------------------------------------------------------------------------------------------------------------------------

The authors should revise the paper, rephrasing sentences such as "Therefore, an effective CLASSIFICATION over the ... for improving the CLASSIFICATION accuracy" (Abstract), or "The SATELLITE IMAGE classification is ... for the interpretation of the SATELLITE IMAGES" (Introduction). On page 2 (lines 56-57), the authors comment that the noise in the satellite images is due to weather and problems with the acquisition system. I am not sure we can say that the noise is "created". The sentence on page 2 (lines 63-64) must also be revised (" labeled samples are generally requires high time"). In the Conclusion, the sentence "the MSVM is effectively predicted satellite images by using" must also be revised. There are other examples in the text.

Response

Thank you for the constructive comment. We have proofread the paper.

 

Reviewer 3 Report

This paper presents, at face value, an interesting research design exploring a) CNNs for feature extraction, b) image preprocessing and augmentation at various levels, c) feature selection and d) SVM classification. There is certainly merit in studying the role of many of these factors in satellite image classification.

However, the research is presented so poorly that this merit is lost in this paper. The use of language is so extremely poor that it becomes nearly impossible for a clear logical coherence to emerge, and for a strong argument to be formed regarding the research problem, the proposed methods and impact of the proposed intervention. While it can, of course, be appreciated that the language can be improved, and that the study should be evaluated based on its scientific merit alone, I fear that even then the authors have failed to make a convincing argument that their proposed methodology has been correctly designed, applied and importantly, interpreted. I do feel that there is research of value here which could be reworked into something publishable, but as the paper stands I cannot recommend publication in this journal.

Apart from suggesting a wholesale re-editing of the paper in order to make it intelligible, I can offer the following comments and suggestions to the authors towards improving this paper.

Introduction (Sections 1-3)

  • I suggest collapsing the three-part introduction into a single “Introduction” section
  • A clear literature gap and research problem has not been identified. Some problems are stated, but it is not made clear how the proposed methodology will address these problems. This is a challenge later in the paper as well.

The HFEL-CCGSA method (Section 4)

  • There are several mentions of “high” and “low” level images, but this is not explained anywhere. This type of problem is systemic in the paper.
  • In Line 169 the authors claim that their proposed approach will “improve the classifcation performances of the HFEL-CCGSA method”. Since their proposed approach IS the HFEL-CCGSA method, this is highly confusing.
  • Figure 1 can be made more visually clear
  • In line 179 reference is made to a “NAIP” dataset, but this is not explained? The citation [24] also does not seem to relate to this dataset? This problem of inappropriate citations to support statements is also pervasive in the paper.
  • In general, the data used is not described in sufficient detail.
  • The “hierarchical” nature of the “hierarchical framework” needs to be explained – it is currently unclear
  • It is not necessary to explain the concept of contrast enhancement in a remote sensing paper.
  • There are some logical issues here. Line 221 states that “Therefore, the histogram equalization is an effective technique”. This has not been demonstrated, however, and the use of “therefore” is inappropriate. Logical gaps such as these, where conclusions are inappropriately drawn from limited evidence, are also common across the paper. This sheds doubt on much of the study and the interpretation of its results.
  • The citations to [27] – [30] for reference to AlexNet, LeNet-5 and ResNet seem again somewhat weak. I may be wrong, but it seems as if these papers are case studies, rather than fundamental sources that provide strong reference to these algorithms.
  • In Line 316, what are fa, fc and H?
  • In Line 318, it is stated that “appropriate features” have been obtained from the CNNs. However, it has not yet been established whether these features are, in fact, appropriate, since feature selection has not yet been done. Another logical leap and inappropriate conclusion.
  • It is never made clear why the selected feature vector is subdivided into separate feature sets.
  • In general it feels as if the methods are not described well enough to be reproducible. In places, extraneous detail is provided about the functioning of, for example, the neural nets, but the research design as applied in this study is not always well documented. For example, how many features were finally selected by the CCGSA process? Dimensionality is not addressed sufficiently in this paper.

Results and discussion (Section 5)

  • At this stage it becomes clear that some aspects of the methodology have not been clearly explained. According to Figure 1, there are two experiments: one where all the extracted features are used for classification (Experiment 1), and one where a selected set of features are used for classification (Experiment 2). It is impossible to do Experiment 1 without having done Experiment 2. So how can they be “combined” as shown in Table 1? What is this combined experiment? To my understanding, Experiment 2 is equivalent to “HFEL-CCGSA”. This would need to be made much clearer.
  • In this section, some conclusions are drawn regarding why the HFEL-CCGSA outperforms other classifiers which I feel are spurious. The research design did not include sufficient control for the authors to authoritatively state that outperformance is definitely due to e.g. “ensemble learning” or the “hierarchical framework” or the correlation-based feature selection or the “reduction of outliers”.
  • On this point, I am not convinced that what is proposed here can legitimately be called “ensemble learning”. At no stage is there a combination of classification models which perform a vote or combine to provide superior accuracies. The authors simply used three different CNNs to derive image features, and these were pooled together. A separate feature selection was done, and a single classifier (MSVM) was used. I would suggest steering clear of claiming that this methodology constitutes “ensemble learning”.
  • A very big issue in this section is that results are presented from methods which were never discussed. This includes Particle Swarm Optimisation and the Binary Dragonfly Algorithm, as well as everything presented in Section 5.4. This is simply unacceptable.

I hope these comments can aid in the improvement of the paper for future submission.


Author Response

Reviewer 3:

This paper presents, at face value, an interesting research design exploring a) CNNs for feature extraction, b) image preprocessing and augmentation at various levels, c) feature selection and d) SVM classification. There is certainly merit in studying the role of many of these factors in satellite image classification.

However, the research is presented so poorly that this merit is lost in this paper. The use of language is so extremely poor that it becomes nearly impossible for a clear logical coherence to emerge, and for a strong argument to be formed regarding the research problem, the proposed methods and impact of the proposed intervention. While it can, of course, be appreciated that the language can be improved, and that the study should be evaluated based on its scientific merit alone, I fear that even then the authors have failed to make a convincing argument that their proposed methodology has been correctly designed, applied and importantly, interpreted. I do feel that there is research of value here which could be reworked into something publishable, but as the paper stands I cannot recommend publication in this journal.

Apart from suggesting a wholesale re-editing of the paper in order to make it intelligible, I can offer the following comments and suggestions to the authors towards improving this paper.

Introduction (Sections 1-3)

  • I suggest collapsing the three-part introduction into a single “Introduction” section
  • A clear literature gap and research problem has not been identified. Some problems are stated, but it is not made clear how the proposed methodology will address these problems. This is a challenge later in the paper as well.

Response

Thank you for the constructive comment.

As per the reviewer's comment, Sections 1-3 have been merged into Section 1.

The problem statement is presented in Section 1 as:

“The better classification of the satellite images resulted in high classification accuracy. The hyper tuned deep learning architecture obtained less accuracy during the classification process due to vanishing gradient problem [16]. Moreover, the net dropout technique is used to avoid the overfitting issues affects the classification accuracy due to removal of relevant features [17]. The Dropsample developed for the CNN failed to consider the similarity of the classes which may affect the classification performances [18]. Moreover, the huge amount of freedom degrees is required for the GeoSystemNet model to maintain the classification performances [21].”

 

The proposed solution is given in Section 1 as:

“In this paper, classification accuracy over the satellite images is increased using the multiple ensemble features and optimal features selected from the CCGSA technique to selects the relevant features to avoid overfitting problem. The correlation coefficient considered in the feature selection process is used to avoid the irrelevant features from the feature set. The classification accuracy of the HFEL-CCGSA method also improved using both the low level and high level image data obtained from the hierarchical framework to maintain gradient in network.”

 

-------------------------------------------------------------------------------------------------------------------------------------

The HFEL-CCGSA method (Section 4)

  • There are several mentions of “high” and “low” level images, but this is not explained anywhere. This type of problem is systemic in the paper.

Response

Thank you for the constructive comment.

The terms ‘high’ and ‘low’ level images have been changed to ‘hierarchical images’.

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • In Line 169 the authors claim that their proposed approach will “improve the classifcation performances of the HFEL-CCGSA method”. Since their proposed approach IS the HFEL-CCGSA method, this is highly confusing.

Response

Thank you for the constructive comment.

Section 4 has been renumbered as Section 2 because Sections 1, 2 and 3 were merged into Section 1.

The sentence in question has been corrected in Section 2 to:

“Therefore, the utilization of both the low and high level images, feature extraction from the CNN and optimal feature selection are used to improve the classification performances.”

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • Figure 1 can be made more visually clear

Response

Thank you for the constructive comment.

A clearer version of Figure 1 is provided in Section 2.

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • In line 179 reference is made to a “NAIP” dataset, but this is not explained? The citation [24] also does not seem to relate to this dataset? This problem of inappropriate citations to support statements is also pervasive in the paper.

Response

Thank you for the constructive comment.

A proper citation for the dataset is now provided as citation [24].

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • In general, the data used is not described in sufficient detail.

Response

Thank you for the constructive comment.

The image size, classes, and details of the datasets are given in Section 2.1.

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • The “hierarchical” nature of the “hierarchical framework” needs to be explained – it is currently unclear

Response

Thank you for the constructive comment.

The ‘hierarchical’ nature is explained in Section 2.2 as:

“The hierarchy is arrangement of items to represent the data that is in various levels or same level. Here, images are arranged in raw images, pre-processed, and augumented images to extract features for better representation.”

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • It is not necessary to explain the concept of contrast enhancement in a remote sensing paper.

Response

Thank you for the constructive comment.

An overview of contrast enhancement is provided in Section 2.2.1.

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • There are some logical issues here. Line 221 states that “Therefore, the histogram equalization is an effective technique”. This has not been demonstrated, however, and the use of “therefore” is inappropriate. Logical gaps such as these, where conclusions are inappropriately drawn from limited evidence, are also common across the paper. This sheds doubt on much of the study and the interpretation of its results.

Response

Thank you for the constructive comment.

As per the reviewer's comment, the sentence has been corrected in Section 2 to:

“Histogram equalization is considered an effective technique to provide a better image without losing its information such as points, image patches and edges [26].”
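
For readers unfamiliar with the technique named in this response, the following is a minimal sketch of global histogram equalization on a grayscale image. It is not taken from the manuscript: the function name `equalize_histogram`, the list-of-rows image representation, and the 256-level assumption are all illustrative choices.

```python
def equalize_histogram(image, levels=256):
    """Equalize a grayscale image given as a list of rows of ints in [0, levels).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    pixels = [p for row in image for p in row]

    # Histogram of gray levels.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution function (CDF) of the histogram.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)  # CDF value at the darkest level present
    if n == cdf_min:
        # Constant image: there is nothing to equalize.
        return [row[:] for row in image]

    # Look-up table: map each gray level through the normalized CDF.
    lut = [
        round((c - cdf_min) / (n - cdf_min) * (levels - 1)) if c > 0 else 0
        for c in cdf
    ]
    return [[lut[p] for p in row] for row in image]


if __name__ == "__main__":
    low_contrast = [[50, 50], [100, 200]]
    print(equalize_histogram(low_contrast))
```

The mapping stretches the occupied gray levels over the full output range, which is why the technique is commonly used for contrast enhancement. Whether it is effective for a given dataset still has to be shown experimentally.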

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • The citations to [27] – [30] for reference to AlexNet, LeNet-5 and ResNet seem again somewhat weak. I may be wrong, but it seems as if these papers are case studies, rather than fundamental sources that provide strong reference to these algorithms.

Response

Thank you for the constructive comment.

Citation [27] has been changed to provide more details about the algorithm.

Citations [28–30] provide detailed information about the algorithms.

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • In Line 316, what are fa, fc and H?

Response

Thank you for the constructive comment.

The symbols fa, fc and H are defined in Section 2 as:

“The alexnet features , LeNet-5 features , and ResNet feature  are combined for feature selection process.”

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • In Line 318, it is stated that “appropriate features” have been obtained from the CNNs. However, it has not yet been established whether these features are, in fact, appropriate, since feature selection has not yet been done. Another logical leap and inappropriate conclusion.

Response

Thank you for the constructive comment.

Feature selection is applied in Section 2.4 using the Correlation Coefficient-based Gravitational Search Algorithm (CCGSA).

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • It is never made clear why the selected feature vector is subdivided into separate feature sets.

Response

Thank you for the constructive comment.

The selected feature vector is subdivided into separate feature sets in order to measure the correlation between the features.

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • In general it feels as if the methods are not described well enough to be reproducible. In places, extraneous detail is provided about the functioning of, for example, the neural nets, but the research design as applied in this study is not always well documented. For example, how many features were finally selected by the CCGSA process? Dimensionality is not addressed sufficiently in this paper.

Response

Thank you for the constructive comment.

The details of feature selection are given in Section 3.4 as:

“The features from CNN models are in size of  and feature selection model of CCGSA selects 0.8 correlated features from extracted features.”
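
The quoted sentence does not say how the value 0.8 is applied. As one possible reading only, the sketch below treats 0.8 as a threshold on the absolute Pearson correlation between feature columns and greedily drops any feature that is too strongly correlated with one already kept. This is a plain correlation filter, not the CCGSA itself (there is no gravitational search), and the names `pearson` and `drop_correlated` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant feature is treated as uncorrelated
    return cov / (std_x * std_y)


def drop_correlated(columns, threshold=0.8):
    """Return indices of features kept by a greedy correlation filter.

    `columns` is a list of feature columns (one list of values per feature).
    The threshold of 0.8 is an assumed interpretation of the response text.
    """
    kept = []
    for j, col in enumerate(columns):
        # Keep feature j only if it is not strongly correlated with any kept feature.
        if all(abs(pearson(col, columns[k])) <= threshold for k in kept):
            kept.append(j)
    return kept


if __name__ == "__main__":
    features = [
        [1, 2, 3, 4],    # feature 0
        [2, 4, 6, 8],    # feature 1: a rescaled copy of feature 0
        [1, -1, 1, -1],  # feature 2: weakly correlated with feature 0
    ]
    print(drop_correlated(features))
```

Reporting the number of retained features, as the reviewer requests, would amount to stating the length of the returned index list for each dataset.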

 

-------------------------------------------------------------------------------------------------------------------------------------

 

Results and discussion (Section 5)

  • At this stage it becomes clear that some aspects of the methodology have not been clearly explained. According to Figure 1, there are two experiments: one where all the extracted features are used for classification (Experiment 1), and one where a selected set of features are used for classification (Experiment 2). It is impossible to do Experiment 1 without having done Experiment 2. So how can they be “combined” as shown in Table 1? What is this combined experiment? To my understanding, Experiment 2 is equivalent to “HFEL-CCGSA”. This would need to be made much clearer.

Response

Thank you for the constructive comment.

Experiment 1 is denoted as ‘without CCGSA’ and Experiment 2 is denoted as ‘with CCGSA’. ‘HFEL-CCGSA’ is an ensemble method that selects the best-performing model.

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • In this section, some conclusions are drawn regarding why the HFEL-CCGSA outperforms other classifiers which I feel are spurious. The research design did not include sufficient control for the authors to authoritatively state that outperformance is definitely due to e.g. “ensemble learning” or the “hierarchical framework” or the correlation-based feature selection or the “reduction of outliers”.

Response

Thank you for the constructive comment.

The advantage of the proposed method is stated in Section 2 as:

“The Hierarchical framework provides the data in various manner for feature learning, CCGSA method selects the features based on correlation, and ensemble learning selects the features set based on MSVM model.”

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • On this point, I am not convinced that what is proposed here can legitimately be called “ensemble learning”. At no stage is there a combination of classification models which perform a vote or combine to provide superior accuracies. The authors simply used three different CNNs to derive image features, and these were pooled together. A separate feature selection was done, and a single classifier (MSVM) was used. I would suggest steering clear of claiming that this methodology constitutes “ensemble learning”.

Response

Thank you for the constructive comment.

Features from the three CNN models are combined and passed to the CCGSA method, which selects the features used for classification by the MSVM. The performance of the features is measured by the MSVM in the ‘without CCGSA’ and ‘with CCGSA’ settings.
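
The distinction the reviewer draws can be made concrete with a small sketch. Feature-level fusion, as described in this response, concatenates the outputs of several feature extractors and feeds them to one classifier. A decision-level ensemble in the usual sense combines the predictions of several classifiers, for example by majority vote. The function names and toy values below are illustrative assumptions and do not come from the manuscript.

```python
from collections import Counter


def pool_features(*feature_vectors):
    """Feature-level fusion: concatenate per-network feature vectors into one."""
    pooled = []
    for vec in feature_vectors:
        pooled.extend(vec)
    return pooled


def majority_vote(predictions):
    """Decision-level ensemble: return the label predicted by most classifiers.

    On a tie, the label that appears first in `predictions` is returned.
    """
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy stand-ins for features extracted by three different networks.
    net_a, net_b, net_c = [0.1, 0.9], [0.4], [0.7, 0.2]
    print(pool_features(net_a, net_b, net_c))  # one vector, one classifier downstream

    # Toy stand-ins for labels predicted by three separate classifiers.
    print(majority_vote(["forest", "water", "forest"]))
```

Under this reading, the pipeline described in the response corresponds to `pool_features` followed by a single classifier, which is the basis of the reviewer's objection to the term.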

 

-------------------------------------------------------------------------------------------------------------------------------------

 

  • A very big issue in this section is that results are presented from methods which were never discussed. This includes Particle Swarm Optimisation and the Binary Dragonfly Algorithm, as well as everything presented in Section 5.4. This is simply unacceptable.

Response

Thank you for the constructive comment.

Section 5 has been renumbered as Section 3.

Section 3.4 provides the comparative analysis with the existing methods that are discussed in Section 1.

In addition to this comparison with existing methods, the proposed method is compared with other commonly used feature selection methods, such as Particle Swarm Optimization and the Binary Dragonfly Algorithm.

 

-------------------------------------------------------------------------------------------------------------------------------------

 

I hope these comments can aid in the improvement of the paper for future submission.

Reviewer 4 Report

The authors propose a satellite image classification architecture based on hierarchical ensemble and gravitational search algorithm. The combination proposed seems new, and there are many experiments in the paper that show the benefits of the proposed approach.

However, there are several issues that must be corrected / clarified regarding the methodology and, also, the language/grammar.

The main concern is the training and testing procedure and the very high scores achieved (99.99%): could this be a case of overfitting? In the evaluation section, the authors state that they used 70% of the images for training and 30% for testing. Here is the potential problem. Normally, the dataset must be split into training data, validation data, and test data. The validation data is used during model training, and the test data is used only for the final model evaluation. In the paper it is not clear whether the 30% is actually the validation data. If so, that would explain the 99.99%. This part must be clarified, or redone if that is the case.

Also, it seems that different sets of images are used for each of the neural networks (x1, x2, x3). What is the reason behind this? x3 is the augmented set, so why not use this set for all the networks?

Regarding the grammar, there are many mistakes that make the paper harder to read and understand. Please correct the paper carefully. Some examples (not all):

"the developed DropSample was failed to consider the similarity" - "was" is not needed

"affected by a certain events in MGSS" - number agreement (a ... events)

"The GeoSystemNet model was solved the classification issue based on" - "was" is not needed; this appears multiple times in the following paragraphs

etc.

Author Response

Reviewer 4:

Comments and Suggestions for Authors

The authors propose a satellite image classification architecture based on hierarchical ensemble and gravitational search algorithm. The combination proposed seems new, and there are many experiments in the paper that show the benefits of the proposed approach.

Response

Thank you for your appreciation

-----------------------------------------------------------------------------------------------------------------------------------

However, there are several issues that must be corrected / clarified regarding the methodology and, also, the language/grammar.

Response

Thank you for the constructive comment. I have proofread the paper.

 

-----------------------------------------------------------------------------------------------------------------------------------

The main concern is the training and testing procedure and the very high scores achieved (99.99%), can it be an overfitting situation? In the evaluation section, the authors state that they used 70% of the images for training and 30% for testing. Here is the potential problem. Normally, for learning the dataset must be split into training  and validation data, and test data. The validation data is used during the model training, and the test data is used only for the final model evaluation. In the paper it is not clear is the 30% are actually the validation data? If so, then the 99.99% can be explained. This part must be clarified, or redone if it is the case.

Response

Thank you for the constructive comment.

The proposed method is first evaluated on the common, randomly selected 70%/30% train-test split, as in existing research papers. It is also evaluated with 5-fold cross-validation to test performance on a validation set. The method shows similar performance under both the random train-test split and 5-fold cross-validation.

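The two evaluation protocols discussed in this exchange can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the sample count of 100, and the fixed seed are assumptions made for the example, and only the 70/30 ratio and the 5 folds come from the response above.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Randomly split sample indices into a training set and a held-out test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists; each sample is validated exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hypothetical dataset of 100 images (assumed size, for illustration only).
train_idx, test_idx = train_test_split_indices(100)

# Cross-validate on the training portion only, so the test indices are never
# seen during model selection and are used once for the final evaluation.
cv_folds = list(k_fold_indices(train_idx, k=5))
```

Running cross-validation inside the training portion is one way to address the reviewer's concern: the validation folds guide training and tuning, while the 30% test portion is kept apart for the final score.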

-----------------------------------------------------------------------------------------------------------------------------------

Also, it seems that different sets of images are used for each of the neural networks (x1, x2, x3). What is the reason behind this? x3 is the augmented set, why not using this set for all the networks?

Response

Thank you for the constructive comment.

As this method proceeds in an iterative manner, the network chooses the image size. The network was tested on various image sizes, and the model shows similar performance.

-----------------------------------------------------------------------------------------------------------------------------------

Regarding the grammar, there are many mistakes that make the paper harder to read and understand. Please correct the paper carefully. Some examples (not all):

"the developed DropSample was failed to consider the similarity" - "was" is not needed

"affected by a certain events in MGSS" - number agreement (a...events)

"The GeoSystemNet model was solved the classification issue based on" - "was" is not needed; this appears multiple times in the next paragraphs

etc.

Response

Thank you for the constructive comment. We have proofread the paper.

Round 2

Reviewer 2 Report

The paper has been revised according to the suggestions from the reviewers.

Author Response

Dear reviewer, thank you very much for your constructive comments that helped to improve the article.

Reviewer 4 Report

The paper has improved, but one issue still remains, and the answer provided by the authors is not convincing.

My previous remark was "Also, it seems that different sets of images are used for each of the neural networks (x1, x2, x3). What is the reason behind this? x3 is the augmented set, why not using this set for all the networks?"

The authors' response does not clarify the issue. Please read my remark carefully, along with the paper text where you assign x1, x2, x3 to particular networks. Provide a better justification. The explanation that each network chooses the right image size is not clear. Sets x1, x2, x3 differ in image quality and augmentation; I don't see any explanation in the text about different sizes.

Author Response

Reviewer 4:

Comments and Suggestions for Authors

The paper has improved, but one issue still remains, and the answer provided by the authors is not convincing.

My previous remark was "Also, it seems that different sets of images are used for each of the neural networks (x1, x2, x3). What is the reason behind this? x3 is the augmented set, why not using this set for all the networks?"

The authors' response does not clarify the issue. Please read my remark carefully, along with the paper text where you assign x1, x2, x3 to particular networks. Provide a better justification. The explanation that each network chooses the right image size is not clear. Sets x1, x2, x3 differ in image quality and augmentation; I don't see any explanation in the text about different sizes.

Response

Thank you for the constructive comment.

The proposed method selects the image size on its own. It selects the number of input images adaptively, based on image parameters. Generally, the raw input data (x1) has unknown properties and is difficult to analyze. The developed model randomly selects the input images from the raw data. The developed method therefore performs effectively not only on a particular dataset but also on datasets with unknown properties. The k-fold cross-validation shows that the proposed method has higher performance in all folds. This also confirms the method's ability to handle images with unknown properties. In short, the proposed method adaptively selects the input images and is also suitable for handling images of various sizes. This study analyzes the effect of pre-processing (x2), such as normalization, and of augmentation (x3).

This is described in Section 2 of the manuscript as:

“Generally, input images has unknown characteristic and effective model is required to handle unknown characteristics of images. The developed method randomly selects the number of images in the model for classification. The proposed method analysis the effect of the pre-processing such as normalization and augmentation. The k-fold cross-validation is applied to analysis the performance of the developed method in remote sensing classification.”
