Peer-Review Record

AlexNet Convolutional Neural Network for Disease Detection and Classification of Tomato Leaf

Electronics 2022, 11(6), 951; https://doi.org/10.3390/electronics11060951

by Hsing-Chung Chen^1,2, Agung Mulyo Widodo^1,3, Andika Wisnujati^1,4, Mosiur Rahaman^1, Jerry Chun-Wei Lin^5, Liukui Chen^6 and Chien-Erh Weng^7,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 28 December 2021 / Revised: 25 February 2022 / Accepted: 15 March 2022 / Published: 18 March 2022

(This article belongs to the Special Issue Selected Papers from 14th International Conference on Signal Processing and Communication Systems)

Round 1

Reviewer 1 Report

Why is the Python code displayed in tables? Isn't it possible to show it as a figure or as an appendix after the end of the text?

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

I would like the authors to clarify their contribution because there are several papers in conferences and journals with proposals similar to this one. What does this study contribute compared to others?
It is also important to review the wording of the document so that the reader can follow it easily. Acronyms are not introduced in all cases, and there are abrupt leaps in the text that make it difficult to read and interpret. In addition, overly long text fragments are included that add no value to the document and that belong in an annex, if anywhere. Many pages are devoted to code when the document should focus on explaining the method and the rationale behind it. Finally, it is important to follow the usual format of the journal, including its citation style for references, to facilitate review.

Regarding the methodology, a lot of effort is devoted to explaining how the code has been developed. But the language used is only a tool. Therefore, the focus of what is really important is lost. What are the authors proposing in this work? I believe that an effort should be made to restructure the information, make the figures and tables more visual so that they add value and do not mislead the reader, and focus the discourse on the method and not on the tool.

The conclusions must be thoroughly revised to explain what the document contributes to the field, what its main achievements are, and what benefits are obtained. At present, the text is unambitious.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The study focuses on disease detection from plant leaves, with a particular focus on tomato leaves. The authors propose to use a convolutional neural network based around the AlexNet architecture. The network takes inputs in the form of matrices representing the RGB images of tomato leaves. The network is trained on labeled images of healthy leaves as well as leaves of plants infected by one of several types of diseases. The classifier model is trained using a training/cross-validation set and then evaluated on the open test set.

The topic is current and has the potential to help farmers automatically monitor plants and cure diseases or prevent their spread. That said, the manuscript in its current form does not meet the standards of academic writing because of its excessive number of grammatical errors and its vague and often incorrect or misleading statements. It appears as if the text was prepared in a rush and little to no time was put into proofreading. An extreme example supporting this impression is a paragraph in the Introduction that has nothing to do with the authors’ work yet apparently did not catch the authors’ attention.

About seven and a half pages, or nearly half of the manuscript, are taken up by command and script listings. While it is extremely helpful for readers to have access to supplementary materials like these, they can be made available in an online repository (e.g., the authors’ homepages or GitHub) and referenced in the text via a footnote link. The text of a journal paper should focus on a background overview, the motivation for the authors’ work, a summary of contributions, the detailed steps of the design and evaluation, and a discussion of the evaluation results. The specifics of which commands were used to download the experimental datasets and which Keras/TF functions were called in what sequence can be left to the online resource.

In its current state, the manuscript makes it hard to follow the authors’ steps in the study and to judge the relevance and contributions of the presented research effort. The authors are urged to proofread their manuscript and to organize the text in each section into a sequence of statements that connect logically, with later statements following from the preceding ones. Finally, the authors should consider working with a language expert to ensure that the language is crisp and does not put a barrier between the reader and the authors’ work.

The extent of the issues with the text prevents the reviewer from listing them in their entirety, so only a few representative examples are given below:

Introduction (Page 2): “The introduction should briefly place the study in a broad context and highlight why it is important. It should define the purpose of the work and its significance. The current state of the research field should be carefully reviewed and key publications cited. Please highlight controversial and diverging hypotheses when necessary. Finally, briefly mention the main aim of the work and highlight the principal conclusions. As far as possible, please keep the introduction comprehensible to scientists outside your particular field of research. References should be numbered in order of appearance and indicated by a numeral or numerals in square brackets—e.g., [1] or [2,3], or [4–6]. See the end of the document for further details on references.” – a template text?

Introduction: “Tomatoes grow in almost several fairly-drain soil, and nine out of ten farmers cultivate tomatoes in their agricultural lands. Most of planters in agricultural likewise breed tomatoes in their gardens to use the freshly harvested tomatoes in their cooking to get a better taste. However, sometimes farmers and gardeners are not able to manage the growth of the plants properly (Agarwal et al., 2020; Hassan et al., 2021). From conditional logic to Neural Networks, Artificial Intelligence (AI) includes the various automated decision-making techniques. Machine Learning (ML), a subtype of AI, includes decisions or predictions produced using data-driven methodologies. Deep Learning (DL) is the name given to a subsection of machine learning approaches which employ Deep Neural Networks (DNN). Enlightening the CNN to extract the cultured characteristic in interpretable form not only ensures its reliability, but also allows validation of the authenticity of the model and the training dataset through individual interference…” – The section “Tomatoes… to manage the growth of the plants properly” and the following sentences discussing machine learning are disconnected. At this stage, no connection between machine learning and the problem the authors plan to address has been established. The text jumps abruptly between topics.

Introduction: “Machine Learning (ML), a subtype of AI, includes decisions or predictions produced using data-driven methodologies.” - Machine learning is not a subtype of artificial intelligence. Machine learning provides tools that can be used for example for clustering, regression, or classification. These tasks themselves do not qualify as artificial intelligence. However, they can be used as building blocks of an AI system. ML provides tools for AI, but ML is not a subset of AI. ML is used in a variety of applications that have nothing to do with AI as well.

Introduction: “Enlightening the CNN to extract the cultured characteristic in interpretable form not only ensures its reliability, but also allows validation of the authenticity of the model and the training dataset through individual interference.” – It is unclear what the authors mean by ‘enlightening’. This certainly is not a technical term. CNNs can be trained/adapted/fine-tuned/optimized with respect to some criteria, but it is unclear how the process of enlightening would be conducted. ‘Cultured characteristic’ is not a commonly used term and it is unclear what it refers to; please rephrase. ‘allows validation of the authenticity of the model and the training dataset through individual interference’ – this is hard to read. What do you mean by the ‘authenticity’ of the model: its accuracy in representing the data/task? What does ‘authenticity of the training dataset’ refer to? Unless the task is something like spoofing attack detection, shouldn’t the training data be real/authentic? Finally, what do you mean by ‘individual interference’: interference between the data and the model? How would such interference be quantified? Given that this portion of the text was likely intended to provide a brief introduction to or overview of ML, it is surprisingly dense and difficult to follow. Please use standard ML terminology.

Introduction: “While some image approaches were used as is, others had to be improved to object a limited layer that completely captures the features to produce the appropriate results.” – This sentence follows immediately after the text quoted in my previous comment and again seems disconnected from what has been discussed so far. There was no mention of image processing in the context of ML up to this point, yet this sentence dives into a very particular detail of image processing without any introduction of the topic. ‘others had to be improved to object a limited layer that completely captures the features to produce the appropriate results’ – It is difficult to follow this statement. Seeing the term ‘limited layer’, perhaps you are referring to a bottleneck layer? What does ‘to object a limited layer’ mean? A network-based classifier itself will not be improved by introducing a bottleneck layer. The bottleneck layer can be used as a feature extractor implementing dimension reduction. Features extracted from this layer are typically routed to another model that is trained on this reduced and highly discriminative feature space. Your sentence is vague and does not imply any of this.

Introduction: “In addition, by figure out the prevented responsiveness maps, we identified several layers that did not contribute to inference and removed these layers within the network, reducing amount of limitations by 75% without affecting classification accuracy (Arya & Singh, 2019; Geetharamani & Pandian, 2019; Toda & Okura, 2019; Verma et al., 2019).” – The grammar here is really bumpy. Perhaps ‘by figuring out’? What do you mean by ‘prevented responsiveness maps’? What is ‘amount of limitations’? This quoted sentence follows right after the previously quoted text. Again, it does not connect with the previous text and abruptly presents some values without introducing the task on which this reduction was implemented or the topology of the network. The use of ‘we’ implies that this is original work by the authors of the manuscript, yet the sentence ends with several literature references, none of which is authored by the authors of this manuscript.

Introduction: “Aside from academic research, Machine Learning, Artificial Intelligence, and Deep Learning are quite common in sectors with a lot of data.” – This sentence immediately follows the previous quote and again does not seem to connect with the preceding text. Deep learning is an area of machine learning, yet here it is listed as a separate field. Academic research is not limited to small-data tasks; a number of academic groups routinely work on big-data tasks. Yet your statement somehow puts academic research and big-data research into two disjoint groups. Besides being conceptually inaccurate, the sentence does not make clear what it is trying to convey.

Section 2 – Related Works: “To overcome the above problems, researchers have come up with some respective solutions. In machine learning, different types of characters can be implemented to classify plant diseases (Hassan et al., 2021).” The previous section did not outline any problems and so it is unclear which problems are being overcome with respective solutions here. ‘In machine learning, different types of characters can be implemented’ – what do you mean by ‘characters’ in the context of plant disease detection?

Section 2 – Related Works: “5 closes cross-validation” – What does ‘closes’ stand for? Do you mean ‘5-fold cross-validation’ by any chance?
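For reference, if the phrase does mean 5-fold cross-validation, as the reviewer suggests, the procedure can be sketched as follows. This is a minimal illustration in plain Python, not code from the manuscript; the function name `k_fold_indices`, the sample count of 23, and the fixed seed are hypothetical choices made for the example.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration only: every sample is used for
    validation exactly once and for training in the other k - 1 folds.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    # Shuffle once so that folds are not tied to the original data order.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]

    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(n_samples=23, k=5))
for fold_no, (train_idx, val_idx) in enumerate(folds, start=1):
    print(f"fold {fold_no}: {len(train_idx)} training, {len(val_idx)} validation")
```

A model would be trained and scored once per fold, and the five validation scores averaged to estimate generalization performance.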

Section 2 – Related Works: “residual inetwork”?

Section 3: “The goal of EDA is to provide data in such a way that the data can be comprehended regarding the amount of data in the train and test data, as well as the relationship between variables.” The amount of data in the training and test sets is known as it is decided by the experimenter. How does EDA contribute to this knowledge? What do you mean by ‘the relationship between variables’? What kind of variables are you referring to – features?

Section 3.1: “According to Fig. 1, the yellow leaf curl virus was responsible for the majority of the total train data from leaf disease photos on tomato plants, with 1,961 images.” Based on Figure 1, this virus had more images available than any other disease, but given that there are 10 categories of diseases, each with over 1,700 images, the yellow leaf curl virus did not represent the majority of the data (it accounts for only slightly over 10% of the data).

Section 3.1: “At least 1,702 pictures of bacterial spots were used to compile the data.” – This sounds as if the whole training set had at least 1,702 images. But based on Fig. 1, this value likely refers to the minimum number of images available *per disease category*.

Section 3.1: “Furthermore, Fig. 2 depicts the entire valid data from 490 photographs of leaf disease in tomato plants, the majority of which is caused by the yellow leaf curl virus. At least 425 photos of bacterial spots were used to compile the data.” – Similar comments as above. The 490 likely refers to the yellow leaf curl virus samples only, and the minimum of 425 photos is meant per category.
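The reviewer's point about "majority" can be checked with simple arithmetic. The sketch below bounds the share of the largest class using only the figures quoted above: 10 categories, a largest class of 1,961 training and 490 validation images, and a smallest class of 1,702 and 425 images. The per-class counts in between are not given in this record, so the function returns a range rather than an exact value; the name `share_bounds` is hypothetical.

```python
def share_bounds(largest, smallest, n_classes):
    """Return (lower, upper) bounds on the largest class's share of a dataset.

    Assumes only the largest and smallest class sizes are known and that
    every other class size lies between them (illustrative helper).
    """
    # Largest possible total: every class is as big as the largest one.
    max_total = largest * n_classes
    # Smallest possible total: all other classes are as small as the smallest one.
    min_total = largest + smallest * (n_classes - 1)
    return largest / max_total, largest / min_total


train_lo, train_hi = share_bounds(largest=1961, smallest=1702, n_classes=10)
valid_lo, valid_hi = share_bounds(largest=490, smallest=425, n_classes=10)

print(f"training share of largest class:   {train_lo:.1%} to {train_hi:.1%}")
print(f"validation share of largest class: {valid_lo:.1%} to {valid_hi:.1%}")
```

Under these assumptions, the largest class accounts for between 10% and about 11.4% of either set, far from a majority, which agrees with the reviewer's estimate of slightly over 10%.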

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

The authors have addressed most of my comments and concerns in the revised version of their manuscript. I believe the manuscript now provides sufficient background information and details on the experimental design to benefit the audience. I believe the manuscript is, after perhaps one more language proofreading pass by the authors, ready for publication.
