Peer-Review Record

3DeepM: An Ad Hoc Architecture Based on Deep Learning Methods for Multispectral Image Classification

Remote Sens. 2021, 13(4), 729; https://doi.org/10.3390/rs13040729
by Pedro J. Navarro 1,*, Leanne Miller 1, Alberto Gila-Navarro 2, María Victoria Díaz-Galián 2, Diego J. Aguila 3 and Marcos Egea-Cortines 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 17 January 2021 / Revised: 5 February 2021 / Accepted: 11 February 2021 / Published: 17 February 2021
(This article belongs to the Special Issue Feature Extraction and Data Classification in Hyperspectral Imaging)

Round 1

Reviewer 1 Report

The paper presents a very interesting study, the prepared imagery is impressive, and the multispectral classification problems are essential and should be investigated.

The paper provides successful experimental results, although some of the figures require minor corrections.

In particular, the confusion matrices presented in Tables 5 and 6 are quite... "confusing".
If you sum the results in each row, not all of them add up to 100%; e.g., in Table 6, row 3 - "F#1[10x10x10]F#2[5x5x5]" - [ITUM9] - the sum 0.51 + 1.02 + 91.28 is not 100%.
And shouldn't all columns also sum to 100% (as is usual in confusion matrices)?
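For reference, the row convention meant here can be sketched with hypothetical counts (not the paper's data): each row, normalised by its true-class total, should sum to 100%.

```python
import numpy as np

# Hypothetical classification counts (3 classes) -- NOT the paper's data.
counts = np.array([
    [46,  3,  1],   # true class A
    [ 2, 88, 10],   # true class B
    [ 5,  5, 40],   # true class C
], dtype=float)

# Normalise each row by its true-class total so every row sums to 100%.
row_pct = counts / counts.sum(axis=1, keepdims=True) * 100
print(row_pct)
```

Any row that does not reach 100% after this normalisation indicates a transcription or rounding error in the table.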

The results for the first and the third algorithm configurations reported in Table 6 show 100% accuracy for all classes. Since such a superb and incredibly effective classifier is "unique", the paper lacks a comment on these results.

Additionally, such high accuracy is rather hard to achieve on a demanding test set. Could the authors describe the test set in more detail?

Is it more likely that the examples are easy to distinguish, or is there a possible information leak between the training and test datasets? The method used to split the data between these two sets should be described in the paper.

The presented study is promising but the manuscript requires some corrections and additions.

Best

Author Response

Reviewer 1 Report Form

Comments and Suggestions for Authors

The paper presents a very interesting study, the prepared imagery is impressive, and the multispectral classification problems are essential and should be investigated.

The paper provides successful experimental results, although some of the figures require minor corrections.

- In particular, the confusion matrices presented in Tables 5 and 6 are quite... "confusing".
If you sum the results in each row, not all of them add up to 100%; e.g., in Table 6, row 3 - "F#1[10x10x10]F#2[5x5x5]" - [ITUM9] - the sum 0.51 + 1.02 + 91.28 is not 100%.
And shouldn't all columns also sum to 100% (as is usual in confusion matrices)?

We found mistakes in the table numbering and have corrected them in the new version of the manuscript. The revision process was carried out with the new numbering.

Following the reviewer's comments, we checked the results related to the MBO architecture and found minor mistakes in Table 5 and Table 7. For this reason, we repeated the computation process and included the corrections in the paper (lines 523 to 537 and line 548).

" The results shown in Table 5 confirm that the designs proposed for the SBO and MBO architectures based on two 3D filter blocks have managed to classify 100% of the grape berries correctly. With the validation and testing sets, a rating of 100% was obtained with all the proposed architectures except with the SBO architecture and kernel sizes: F#1[10x10x10] + F#2[5x5x5].

As for the relationship between the distribution of kernel sizes in 3D filter blocks and the number of epochs, it should be noted that the constant kernel sequence F#1[7x7x7]+F#2[7x7x7] achieved a maximum rating value with only 7 epochs out of a total of 20.

In addition, Figure 1 shows that the data augmentation process described in Section 2.2.2, by which a set of 10400 multispectral images was generated from a set of 1810 images captured in the dark chamber, has made it possible to obtain models that generalize and correctly classify all of the original images in the test set."
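As a side illustration of how two stacked 3D filter blocks such as F#1[10x10x10] + F#2[5x5x5] reduce the input volume, a quick shape calculation can be sketched. The 64x64 spatial size, the stride of 1 and the unpadded ("valid") convolutions below are assumptions for illustration, not figures taken from the paper.

```python
# Illustrative only: a 64x64 spatial image with 37 spectral channels,
# treated as a 64x64x37 volume (the exact tensor layout and input size
# are assumptions here), passed through the kernel sizes F#1[10x10x10]
# and then F#2[5x5x5], with stride 1 and no padding.
def conv3d_out(size, kernel, stride=1, pad=0):
    """Output length of one dimension after a 3D convolution."""
    return (size + 2 * pad - kernel) // stride + 1

shape = (64, 64, 37)
for kernel in [(10, 10, 10), (5, 5, 5)]:  # F#1, then F#2
    shape = tuple(conv3d_out(s, k) for s, k in zip(shape, kernel))
print(shape)  # each unpadded block shrinks every dimension by (kernel - 1)
```

With these assumed dimensions, the volume shrinks from 64x64x37 to 51x51x24 after the two blocks.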

- The results for the first and the third algorithm configurations reported in Table 6 show 100% accuracy for all classes. Since such a superb and incredibly effective classifier is "unique", the paper lacks a comment on these results.

We believe that having such a large number of bands, i.e., images at different wavelengths, creates an extremely informative dataset. This helps any ML or DL process obtain good results. We have added a comment about the high accuracy of the architecture in a new subsection 4.2 of the Discussion (lines 619 to 639).

"4.2. Remote sensing applications

..

3DeepM can be used for multichannel image classification in remote sensing applications as well as other types of application, in two ways: (1) redesign the architecture using the detailed design and implementation steps shown in section 2.3.2; or (2) use the transfer learning techniques described in the literature provided at the beginning of the section.

            ..."

- Additionally, such high accuracy is rather hard to achieve on a demanding test set. Could the authors describe the test set in more detail?

We have included a new figure (Figure 1), where we specifically explain the partitions and the differences between the training, validation and test datasets (line 117).

see file attached

Figure 1. Flow diagram for the acquisition and data augmentation processes

 

Is it more likely that the examples are easy to distinguish, or is there a possible information leak between the training and test datasets? The method used to split the data between these two sets should be described in the paper.

Of the different grapes used, the dark red and middle red varieties are relatively easy to distinguish when fully mature, but not when maturation is incomplete. However, the green varieties are really difficult to tell apart by colour or size alone.

The new Figure 1, together with Figure 10, lets the reader see exactly how the split process was carried out. To summarize: after MSI acquisition we obtained 1810 multispectral images (37 channels per image). These were split into two physically separated sets: training (75%, 1358 MSI) and validation (25%, 452 MSI). After that, we augmented both sets to 10400 images: training (8100) and validation (2400). The model was created using the training dataset and validated with the validation dataset, obtaining an accuracy of 100%. To verify this result, we used the initial set of 1810 MSI and obtained the same result (100%). This confirms that the model generalized correctly.
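The key point of the protocol above can be sketched as follows (a minimal illustration, not the authors' code; the augmentation step is only indicated by a comment): the split happens first, augmentation afterwards, so augmented copies of one image never end up on both sides of the split.

```python
import random

# Sketch of the split-then-augment protocol (illustrative, not the
# authors' code). Images are represented by integer IDs.
random.seed(0)
images = list(range(1810))            # 1810 original multispectral images
random.shuffle(images)

n_train = round(0.75 * len(images))   # 75% / 25% physical split
train, val = images[:n_train], images[n_train:]

# Augmentation would be applied here, separately to each set;
# finally, the model is re-checked on the 1810 original images.
assert not set(train) & set(val)      # no leakage between the sets
print(len(train), len(val))
```

A final evaluation on the original, unaugmented images that matches the validation accuracy is the check the response describes for ruling out leakage through augmented duplicates.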

The presented study is promising but the manuscript requires some corrections and additions.

Best

Author Response File: Author Response.pdf

Reviewer 2 Report

The Introduction is good considering the limited number of scientific publications in this field. It could be interesting to present some general approaches to deep learning image classification for agricultural crops.

The objectives of your approach need to be more visible in the Introduction section. They need to be well adapted to the scope and aims of the journal.

Methodology - it would be useful to provide a table with the basic datasets used and integrated in the analysis, together with their features, while the scheme of the entire approach in Figure 9 needs to be presented earlier, with data input, data processing and data outputs (it is recommended to employ more remote-sensing-specific terminology).

Figure 3 - it would be better to explain the acronym on the vertical scale of the diagrams. Spectral reflectance is common for this type of approach.

It is essential to identify the key basic concepts of a scientific approach (partly explained in Section 2.2.2). We understand the strong experimental part of the study, but the applications are developed on a scientific basis, and this needs to be emphasized too (using some more highly ranked references, including the spectrometric aspects).

Section 2.3 - these algorithms need some remote-sensing-related references and a short description of their application in the current approach. It is recommended to avoid general theoretical statements about the algorithms, which are already used in the remote sensing image classification literature.

Figure 9 needs to be carefully explained. A list of acronyms would help the reader understand the approach.

A remaining question is the application of the current methodology to typical aerial hyperspectral imagery, in order to connect it better with remote sensing approaches.

Of course, the training stage is essential in order to perform deep learning and apply the algorithms...

Results

A statistical presentation of results is good, but it may be necessary to explain their significance for remote sensing approaches.

Again, the deep learning strategy is correct, but it needs to be prepared for testing on high-resolution hyperspectral data (e.g., images from a drone or a low-altitude aircraft). This is a step towards a possible approach to precision crop management...

Discussion - the models are adequately calibrated, but the application is still limited to deep learning/classification training and less to remote sensing image classification.

Conclusion - an explanation concerning the potential of the approach for remote sensing image analysis and processing is needed.
Author Response

Reviewer 2 Report Form

 

We found mistakes in the figure numbering and have corrected them in the new version of the manuscript. The revision process was carried out with the new numbering.

 

We would like to state that our paper was submitted to the Special Issue "Feature Extraction and Data Classification in Hyperspectral Imaging", and within that Special Issue to the call "Deep learning approaches for data mining and data classification". We think that our approach is generally suitable for any type of hyperspectral image data. Nevertheless, we have added some references on this topic.

Introduction is well according to the limited number of scientific publications in this field. Probably it could be interesting to provide some general approaches in the field of deep learning image classification for agricultural crops.

- We have added a paragraph (lines 58-65) on general approaches in the field of remote sensing and deep learning image classification for agricultural crops. We have included 4 new references related to this field.

"Deep learning methods are being used not only for the close-range classification of crops; there is also growing interest in their use for crop identification and land use classification using multispectral satellite images [5]. Various CNN models for crop identification have been designed, trained and tested using hyperspectral remote sensing images [6]. In [7], an accuracy of 97.58% was obtained using the Indian Pines dataset with 16 different classes. In [8], six different CNN architectures for crop classification were trained using multispectral land cover maps containing 14 classes, with the best CNN model obtaining an overall accuracy of 78-85% on the test samples."

The objectives of your approach need to be more visible in the introduction section. They need to be well adapted to the scope and aim of the journal.

- We have added text (lines 29-32) to the abstract to adapt it to the scope of the journal.

Methodology - it can be useful to provide a table with the basic datasets used and integrated in the analysis together with their features, while the scheme of the entire approach from figure 9 need to be presented earlier, with data input, data processing and data outputs (it is recommended to employ more remote sensing specific terminology).

We have included a new figure (Figure 1) at line 117, where we specifically explain the partitions and the differences between the training, validation and test datasets.

see file attached

Figure 1. Flow diagram for the acquisition and data augmentation processes

 

Figure 3 - it is better to explain the acronym for the vertical scale of the diagrams. Spectral reflectance is common for this type of approaches.

- We have added the meaning of QE (Quantum Efficiency, QE%) to the Figure 4 caption and to the text (line 161).

It is essential to identify the key basic concepts for a scientific approach (partly explained in section 2.2.2). We understand the strong experimental part of the study but the applications are developed on a scientific basement and this need to be emphasized too (using some more high ranked references, including the spectrometric aspects).

- In the new draft of the manuscript we have included 14 new references related to aspects of remote sensing technologies and applications in the Abstract, Introduction, Discussion and Conclusion sections. All new modifications are marked in blue.

Section 2.3 - these algorithms need some remote sensing related references and short description of their application issues in the current approach. It is recommended to avoid general theory statements of algorithms. These are used in literature in remote sensing image classification approaches.

- We have added a new paragraph (lines 431 to 446) with references and a short description of applications of deep learning architectures in the scope of remote sensing. We have included 9 new references related to this topic.

"All of these architectures have been used in the field of remote sensing in numerous studies. In the case of LeNet-5, for example in [40], the authors used a slightly modified version of this CNN, among others, to track the eyes of typhoons with satellite IR imaging; and in [41], as part of their study, a pretrained LeNet-5 model was created using the UC Merced Dataset to classify cropland images. The AlexNet is used in [42] to classify the images of the UCM spaceflight dataset and in [43], a pretrained AlexNet model is fine-tuned to classify wetland images. VGG16 and 19, InceptionV3, Xception and ResNet50 among others are used in a comparative study [44] to classify complex multispectral optical imagery of wetlands. In [45], the authors designed an improved version of InceptionV3 to classify ship imagery from optical remote sensors. This architecture is also the base of the one used in [46], to classify images of buildings damaged by earthquakes using high resolution remote sensing images; and in [47], where the authors pretrained an ImageNet2015 InceptionV3 model together with VGG19 and ResNet50 to classify images of high spatial resolution. The Xception architecture was used in [48] to detect palm oil plantations, and ResNet50 in [49] to detect airports in large remote sensing images. These are just a few examples of the many existing studies in which CNN algorithms are used."

Figure 9 need to be carefully explained. A list of acronyms can help the reader to understand the approach.

- According to the new numbering, we have clarified the abbreviations of Figure 12 in line 320 and line 325.

The question is the application of the current methodology for typical aerial hyperspectral imagery in order to connect it better with remote sensing approaches.

- We have added a new subsection 4.2 at line 619 to address the reviewer's comments.

Of course, the training stage is essential in order to perform the deep learning and apply the algorithms...

Results

A statistical presentation of results in good but it might be necessary to prepare the explanation of their significance for remote sensing approaches.

Again, the deep learning strategy is correct but this need to be prepared to be tested on some high resolution hyperspectral data (ex. image from drone or from low altitude flight aircraft). This is a step to a possible approach in precision crop management...

Discussion - the models are adequately calibrated, but the application is still limited to deep learning/classification training and less to remote sensing image classification.

We have added a new subsection, "4.2. Remote sensing applications", to Section 4 (Discussion), where we incorporate and discuss the reviewer's view on the applications of the architecture in remote sensing (lines 619 to 639).

"4.2. Remote sensing applications

The architectures reviewed in this work that are used in remote sensing applications were specifically designed to be used with hyper- or multispectral images. In most cases [40–45,47–49], the authors apply the transfer learning technique, which means that they used a model that was not specifically trained for their case study and adapted it via fine-tuning.

In this work, an architecture for multispectral image classification has been designed and implemented from scratch. Section 2.3.2 describes in detail the design and implementation process carried out to obtain the 3DeepM architecture. The high classification performance (100%) obtained by 3DeepM is mainly due to two factors in the research process: (1) a specific multispectral vision system was developed and an exhaustive sampling process was carried out; and (2) a systematic architecture design process was performed until an optimal design was obtained.

3DeepM can be used for multichannel image classification in remote sensing applications as well as other types of application, in two ways: (1) redesign the architecture using the detailed design and implementation steps shown in section 2.3.2; or (2) use the transfer learning techniques described in the literature provided at the beginning of the section.

Finally, the exhaustive design process of 3DeepM has achieved an architecture with a very small number of parameters, which makes it suitable for online multispectral image classification applications on board autonomous robots or unmanned vehicles."

Conclusion - an explanation concerning the potential of the approach for remote sensing image analysis and processing  is needed.

- We have added a paragraph to the Conclusion section addressing the reviewer's considerations (line 653).

"The detailed design process described in this work for obtaining 3DeepM allows the architecture to be used in a multitude of applications involving multispectral images, such as remote sensing or medical diagnosis. In addition, the small number of parameters of 3DeepM makes it ideal for application in online classification systems aboard autonomous robots or unmanned vehicles."

 

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The manuscript is significantly improved, and I suggest that future work test the techniques on a typical remote sensing dataset.
