Convolutional Neural Networks in Detection of Plant Leaf Diseases: A Review

Abstract: Rapid improvements in deep learning (DL) techniques have made it possible to detect and recognize objects from images. After being successfully employed in various fields, DL approaches have recently entered agricultural and farming applications. Automatic identification of plant diseases can help farmers manage their crops more effectively, resulting in higher yields. Detecting plant disease in crops using images is an intrinsically difficult task. In addition to detection, individual species identification is necessary for applying tailored control methods. This paper surveys research initiatives that use convolutional neural networks (CNN), a type of DL, to address various plant disease detection concerns. In this work, we have reviewed 100 of the most relevant CNN articles on detecting various plant leaf diseases over the last five years. In addition, we identified and summarized several problems and solutions corresponding to the CNN used in plant leaf disease detection. Moreover, deep convolutional neural networks (DCNN) trained on image data were the most effective method for early disease detection. We describe the benefits and drawbacks of utilizing CNN in agriculture, and we discuss the direction of future developments in plant disease detection.


Introduction
Plant diseases are one of the most critical elements impacting food production. They are responsible for a significant drop in the economic productivity of crops, as well as being an obstruction to this activity in some cases. According to [1], disease management and control procedures must be carried out effectively to reduce output losses and ensure agricultural sustainability, underlining the importance of continual crop monitoring paired with prompt and accurate disease detection. In addition, as the world's population continues to rise, a significant increase in food production is required (FAO) [2]. This must be combined with the preservation of natural ecosystems through the use of environmentally friendly farming methods. Food must keep a high nutritional value while still being secure worldwide [3]. This can be accomplished by using new scientific methodologies for leaf disease diagnosis and crop management, as well as applying these new technologies to large-scale ecosystem monitoring.
The main point for researchers is correctly identifying diseases affecting crops [4]. According to Miller et al. [5], manual practices in conventional farming operations cannot cover large areas of crops and provide early background information for decision-making processes. As a result, researchers have never stopped looking for ways to develop automated practical solutions and effective methods for detecting plant diseases. DL-based models, in particular, have found many applications in plant disease detection. They have overcome the problems associated with traditional classification methods and represent cutting-edge technology in this field. DL [6] is an advanced technique that has shown great promise and success in various fields where it has been used [7]. It is a group of machine learning methods that attempt to model high-level abstractions in data through architectures composed of multiple transformations.
The current review aims to describe the state-of-the-art identification and examination of plant disease detection problems using a specific class of DL called CNN, which extends classic Artificial Neural Networks (ANN) by adding more "depth" to the network, as well as the various convolutions that allow the data to be successfully applied in various problems related to images [8]. Therefore, the inquiry of this survey discusses significant contributions concerning CNN and various innovations which aimed to improve the performance of CNN and thus correctly identify diseases.
The motivation for conducting this survey comes from the fact that CNN has recently been primarily used in agriculture, with CNN's growing popularity and success in solving many problems related to agriculture, and the fact that multiple research efforts using CNN to discuss various agricultural problems exist today. As a result of its success, CNN is perhaps the most popular and commonly used approach in agricultural research today.
Regarding image analysis, the current survey focuses on a particular subset of DL models and techniques since there are very few surveys of this type in the agricultural field, especially about CNN utilization. Thus it would be beneficial to present and analyze relevant work to help the authors conduct a more comprehensive review. A discussion about innovative and high-potential techniques for solving numerous difficulties in agriculture related to images and DL will be presented. In addition to reviewing recent research in this area, significant practical features of image-based CNN are presented to further explain the technique's advantages and disadvantages.
The rest of the paper is structured as follows: Section 2 provides related work. Section 3 describes the methodologies used in this study. Section 4 presents CNN. Section 5 discusses the applications of CNN in agriculture, Section 6 provides the main problems and solutions associated with the CNN used in plant disease detection, Section 7 presents the discussion and Section 8 concludes the paper.

Related Work
Many studies have been conducted to find an ideal solution to the problem of crop disease detection by creating techniques that can assist in identifying crops in an agricultural environment. This section presents the most recent review studies on CNN's applicability in the broad field of agriculture; it includes peer-reviewed articles that use CNN methods and plant datasets.
Abade et al. [9] reviewed CNN algorithms for the detection of plant diseases. The authors studied 121 papers that were published between 2010 and 2019. PlantVillage was selected as the most widely used dataset, while TensorFlow was identified as the most frequently used framework in this review. Dhaka et al. [10] outlined the basic methods of CNN models used to identify plant diseases using leaf images. They also compared CNN models, pre-processing approaches, and frameworks. The study also looks at the datasets and performance measures used to assess model efficiency. Moreover, Nagaraju et al. [11] also provided a review to find the best datasets, pre-processing approaches, and DL techniques for various plants. They reviewed and analyzed 84 papers on DL's applicability in plant disease diagnosis. They observed that so many DL methods are limited in their ability to analyze original images and that effective model performance necessitates using a suitable pre-processing technique.
Kamilaris et al. [12] found that DL approaches were used to solve various agricultural challenges. According to the study, DL methods performed better than standard image processing techniques. Fernandez-Quintanilla et al. [13] evaluated weed-monitoring technologies in crops. They focused on weed monitoring devices in agricultural fields that were both remote sensed and ground-based. Weed monitoring is critical for weed control, according to them. They predicted that data acquired by various sensors would be saved in a public cloud and used in appropriate contexts at the optimal time. Lu et al. [14] introduced a review for plant disease classification using a CNN. They evaluated the significant problems and solutions of CNN used for plant disease classification and DL criteria in plant disease classification. They discovered that additional research with more complex datasets was required to obtain a more satisfactory result.
Golhani et al. [15] presented a review paper on hyperspectral data for plant leaf disease identification, highlighting existing problems and potential prospects. They also presented NN approaches for SDI development in a short time. They discovered that, as long as SDIs remain relevant for proper crop protection, they must be tested on various hyperspectral sensors at the plant leaf scale. Bangari et al. [16] presented a review on disease detection using CNN, focusing on potato leaf disease. They reviewed several papers and concluded that convolutional neural networks work better at detecting the disease. They also identified that CNN contributed significantly to the maximum possible accuracy for disease identification.

Methodology
In this work, we discussed the most recent research papers on applying DL in the agricultural field. Moreover, this work was accomplished through two essential stages: the first is the collection of 100 previous research works that discuss DL in its relationship to the agricultural field, and the second is a thorough examination and analysis of the collected work.
In the first stage, we looked for papers and articles published within the last five years using scientific databases such as Science Direct and Elsevier and web-based scientific indexing services. In addition, we conducted our searches for relevant papers using several keywords, the most prominent of which were agriculture, CNN and DL. Papers mentioning CNN but not applying it to the agricultural domain were thus removed.
In the second stage, the papers chosen in the first stage were analyzed one by one, taking into account the following research questions:
• The approach used.
Examining how CNN performs is an essential aspect of this study. As a result, we reviewed and analyzed several relevant studies. We also compared CNN to other current technologies and summarized the most important advantages and disadvantages that affect CNN's performance. It should be noted that the current paper focuses on comparing techniques used for the same data and on the same scale. We also investigated and discussed the most significant problems and limitations identified by previous research.

Convolutional Neural Networks (CNN)
ANNs consist of three kinds of layers: an input layer, one or more hidden layers, and an output layer. Each neuron in a hidden layer has associated weights and a bias value. The input values are multiplied by the weights, the bias is added, and the result is sent to an activation function. If the output value is greater than the specified threshold, that node carries the output value to the next layer of the network. Otherwise, no data is transmitted. A network in which data propagates from one layer to the next in this way is called a feed-forward network. The ultimate objective is to minimize the cost function for any input by tuning the model weights and biases. The process is depicted in Figure 1. CNNs, a form of multi-layer neural network, are designed to extract dependencies in grid-structured inputs such as images and text. The convolution operation applied in many intermediate layers is the most crucial property of CNNs. A convolution operation is a dot product of a set of grid-structured weights and another set of similarly structured inputs. The term CNN refers to a type of ANN widely used in image recognition and processing. Since the introduction of LeNet-5 in 1998, various innovations in CNN architectures have been presented [17]. Before DL was developed for computer vision, learning relied on extracting interesting variables called features, and these methods required a significant amount of experience in image processing. CNNs, introduced by [18], have revolutionized image processing and removed the need for manual feature extraction. CNNs operate directly on matrices, or on tensors in the case of three-channel RGB color images.
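The dot-product view of convolution can be made concrete with a minimal sketch in plain Python. The image values and the kernel below are hypothetical and hand-picked for illustration; in a trained CNN the kernel weights are learned. As in most DL frameworks, the kernel is not flipped, so strictly speaking the operation is a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel weights and the image patch under it.
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += kernel[di][dj] * image[i + di][j + dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 grayscale patch with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked 2x2 kernel that responds to a dark-to-bright transition.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, kernel)
```

The resulting 3x3 feature map is non-zero only in its middle column, which is where the kernel overlaps the edge.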
CNNs are now widely used for image classification, image segmentation, face recognition and object recognition. They have been successfully applied by many organizations in various domains such as health, web, mail services, etc. CNNs can receive many kinds of data input, including images, video, sound, speech, and natural language [19,20]. A CNN is simply a stack of several layers (see Figure 2): it begins with a convolution layer, progresses through pooling and ReLU correction layers, and ends with a fully connected layer [21]. As a result, each image received as input is filtered, reduced, and corrected several times to finally form a vector. The strength of the CNN is found in the convolution layer, where the network learns the most valuable filters for the task (such as detection). Another benefit is that several convolution layers can be stacked: the output of one convolution becomes the input of the next one. The pooling layer is another component of a CNN. It performs downsampling, which significantly reduces the computational load, memory usage, and the number of parameters. In fully connected layers, as the name implies, each layer has a complete connection with the layer that comes before it. A "sigmoid" or "softmax" function can be used with the last fully connected layer for class predictions. In summary, the convolutional layers extract features from the input images, which are then reduced in dimensionality by the pooling layers. Typically, the fully connected layers use the high-level features learned to classify input images into predefined classes at the final layer [22]. Moreover, the classification layer can extract features for classification and detection tasks [23]. Figure 2 provides an overview of the architecture of a typical CNN.
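The stages that follow a convolution (ReLU correction, pooling, flattening into a vector, and a fully connected layer with softmax) can be sketched in plain Python as follows. The feature map, the weights, and the biases are hypothetical values chosen only for illustration; they do not come from any of the reviewed models.

```python
import math


def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Downsample by keeping the maximum of each non-overlapping 2x2 window."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            window = (
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            )
            row.append(max(window))
        pooled.append(row)
    return pooled


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 4x4 feature map, as a convolution layer might produce.
feature_map = [
    [1.0, -2.0, 0.5, 3.0],
    [-1.0, 4.0, -0.5, 2.0],
    [0.0, -3.0, 1.5, -1.0],
    [2.0, 1.0, -2.5, 0.5],
]
pooled = max_pool_2x2(relu(feature_map))     # 4x4 -> 2x2
vector = [v for row in pooled for v in row]  # flatten into a vector

# Hypothetical fully connected layer: 2 classes, 4 inputs, hand-picked weights.
weights = [[0.1, 0.2, -0.1, 0.0], [-0.2, 0.1, 0.3, 0.1]]
biases = [0.0, 0.1]
scores = [sum(w * x for w, x in zip(ws, vector)) + b for ws, b in zip(weights, biases)]
probabilities = softmax(scores)
```

Pooling reduces the sixteen activations to four, which illustrates how downsampling cuts the number of values the fully connected layer has to process.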

Comparison of Popular CNN Frameworks
CNN frameworks play an important role in projects that use the CNN architecture. Many frameworks are now available that allow us to develop tools offering a higher level of expertise while simplifying complicated programming difficulties, with each framework created differently for different purposes. Numerous tools and platforms are therefore available to researchers for conducting DL experimental studies [24]; the most well-known are summarized in Table 1. Developments in DL provide various tools and platforms to implement CNN in different applications, including agriculture. Many applications and studies in agriculture have used the frameworks listed in Table 1, each one chosen depending on the author's study conditions, data type, size and complexity of the project. The following sections will provide a detailed description of CNN applications in agricultural fields.

Pre-Trained Network
A pre-trained model has already been trained to solve a similar problem. Using a pre-trained model with transfer learning is usually much quicker and more straightforward than training a network from scratch. Naik et al. [32] classified five diseases of the chili with 12 different deep learning models (AlexNet, DarkNet53, DenseNet201, EfficientNetb0, InceptionV3, MobileNetV2, NasNetLarge, ResNet101, ShuffleNet, SqueezeNet, VGG19, and XceptionNet). The VGG19 model produced the best accuracy value of 83.54% without data augmentation. On the other hand, DarkNet53 gave the best result with data augmentation. They reached the best accuracy values of 98.63% and 99.12% with Squeeze-and-Excitation-based Convolutional Neural Networks (SECNN) without and with augmentation, respectively. The study examined tomato maturity levels with three different CNN models (VGG, Inception and ResNet). The best accuracy value was calculated for VGG19 at 97.37% for batch size 32 and epoch 50.
Partel et al. [33] studied the performance of the YOLO-v3 [34], Faster R-CNN [35], ResNet-50, ResNet-101, and Darknet-53 [36] models to create an intelligent sprayer for real-time plant leaf control. In terms of recall and precision, the ResNet-50 model performed better than the others. Binguitcha-Fare and Sharma [37], on the other hand, used the ResNet-101 model. They showed that the size of the input image might impact the performance of ResNet-101. Sahu et al. [38] presented a method for classifying bean crop diseases. They found that fine-tuning pre-trained networks performed significantly better than training from scratch. The fine-tuning of hyperparameters raised GoogleNet's accuracy from 90.1% to 95.31% and VGG16's accuracy from 89.6% to 93.75%. The study attributes the effect of transfer learning to the network's ability to reuse and transfer features from one specific problem to another.
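The idea of reusing features and retraining only part of a network can be illustrated with a deliberately small sketch. Everything here is an assumption made for illustration: `frozen_features` is a hypothetical stand-in for a pre-trained convolutional base (a real study would load, for example, ResNet or VGG weights in a DL framework), the toy images are invented, and the new classification head is a simple logistic regression trained by gradient descent.

```python
import math


def frozen_features(image):
    """Stand-in for a pre-trained convolutional base; it is never updated."""
    pixels = [v for row in image for v in row]
    mean = sum(pixels) / len(pixels)
    contrast = max(pixels) - min(pixels)
    return [mean, contrast]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, labels, epochs=500, lr=0.5):
    """Fit only the new classification head (logistic regression)."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y  # gradient of the log loss w.r.t. the pre-activation
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(image, weights, bias):
    x = frozen_features(image)
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical toy data: 2x2 "leaf images" with values in [0, 1]; label 1 = diseased.
healthy = [[[0.8, 0.9], [0.9, 0.8]], [[0.7, 0.8], [0.8, 0.9]]]
diseased = [[[0.2, 0.9], [0.1, 0.8]], [[0.3, 0.1], [0.9, 0.2]]]
images = healthy + diseased
labels = [0, 0, 1, 1]

features = [frozen_features(img) for img in images]
weights, bias = train_head(features, labels)
predictions = [predict(img, weights, bias) for img in images]
```

Only the three parameters of the head are learned, which is why this style of training typically needs far less data than fitting every layer from scratch.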
Mukti et al. [39] presented a method for identifying plant diseases using a transfer learning model based on ResNet50. Their dataset contains 87,867 images, of which 80% were used for training and 20% for validation. Their highest accuracy was 99.80%. According to Arya et al. [40], CNN can perform well in detecting various plant diseases. They introduced a method based on a pre-trained model to compare different CNN architectures (i.e., AlexNet, shallow CNN) to identify diseases in potato and mango leaves. The approach was more effective using AlexNet (up to 98.33%) than using a shallow CNN, which obtained only 90.85% accuracy.
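An 80%/20% training and validation split of the kind described above can be sketched as follows. The image identifiers and the seed are hypothetical; real studies often also stratify the split so that every disease class keeps the same proportion in both parts, which this sketch does not do.

```python
import random


def train_validation_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` and split it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical image identifiers standing in for real file paths.
image_ids = [f"leaf_{i:04d}" for i in range(1000)]
train_ids, val_ids = train_validation_split(image_ids)
```

Fixing the seed keeps the split reproducible, so accuracy figures from different runs refer to the same validation images.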

Training from Scratch
Milioto et al. [41] developed a CNN model for blob-wise discrimination. They suggested a system that coupled vegetation detection with high-quality plant classification into important crops and weeds in the field. To train the model, they used multi-spectral data. Furthermore, they tested this system on images taken from various sugar beet fields, analyzing different combinations of convolutional layers and fully connected layers to find an efficient and problem-free model. Finally, they achieved a superior result by combining three convolutional layers with two fully connected layers. They noted that this strategy used no geometric priors, such as planting crops in rows.
Lu et al. [42] reported a novel rice disease detection approach based on DCNN. They trained the suggested model from scratch to identify ten common rice diseases using a dataset of 500 images of infected and healthy rice leaves, all of which were taken from a rice field. The suggested CNN-based model achieves an accuracy of 95.48% using a 10-fold cross-validation technique. The accuracy of this model is substantially higher than that of a traditional machine learning model. Zhang et al. [43] proposed a method for detecting Broad-leaf weeds using a CNN model with three fully connected classification layers and six convolutional layers. They also compared CNN to SVM and reported that the model succeeded in identifying weeds with an accuracy of 96.88%, while SVM could only achieve 89.4%. As a result, they showed that the CNN model outperformed the SVM model in detecting broad-leaf weeds in pastures. Liang et al. [44] also demonstrated that the CNN model performs better than both the LBP and HoG approaches in classification. They developed a low-cost weed identification system using a CNN architecture that includes three pooling, three convolutional, four Dropout layers, and a fully connected layer.
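The 10-fold cross-validation protocol mentioned above can be sketched in plain Python. The `dummy_evaluate` function is a hypothetical placeholder; in a real experiment it would train a model on the training indices and return its accuracy on the held-out fold. The sample indices are assumed to have been shuffled beforehand.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(n_samples, evaluate, k=10):
    """Average the score returned by `evaluate(train, test)` over k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


# 500 samples, matching the dataset size reported by Lu et al. [42].
folds = list(k_fold_indices(500, k=10))


# Hypothetical placeholder: a real `evaluate` would train a model on `train`
# and return its accuracy on `test`.
def dummy_evaluate(train, test):
    return len(test) / (len(train) + len(test))


mean_score = cross_validate(500, dummy_evaluate)
```

With 500 images, each fold holds out 50 images for testing and trains on the remaining 450, so every image is used for testing exactly once.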
Dyrmann et al. [45] demonstrated a method for detecting plant species in color pictures using a convolutional neural network. The network was created from the ground up and tested on 10,413 images featuring 22 early-stage crop species. Dyrmann et al. created a new system based on their needs, utilizing a combination of convolutional layers, max-pooling layers, fully connected layers, activation functions, batch normalization, and residual layers. The network was then capable of classifying 22 species with an accuracy of 86.2%. Chen et al. [46] suggested a method for detecting tea plant diseases from leaf images based on DCNN. They built and trained the model from scratch. Furthermore, to extract the properties of tea plant diseases from images, they employed a CNN model called LeafNet that was constructed with several feature extractors. They also employed SVM and MLP classifiers to classify diseases using DSIFT (dense scale-invariant feature transform). The three disease identification classifiers had various performance results, with an average accuracy of 90.16% for LeafNet, 60.62% for the SVM algorithm, and 70.77% for the MLP algorithm.
Nkemelu et al. [47] presented a study for plant seedling classification. They compared the CNN's performance to that of the K-Nearest Neighbor (KNN) algorithm, which scored 56.84%, and SVM, which scored 61.47%, and they showed that CNN was better at distinguishing crop plant diseases. Furthermore, to reach 92.6%, they used three fully connected layers and six convolutional layers in the CNN architecture. They also tested CNN's accuracy using both the original and pre-processed pictures. In another study, Pearlstein et al. [48] presented a study for plant disease detection. They utilized synthetic image data to train the CNN model and evaluated it on real data. They constructed a CNN model using two fully connected layers and five convolutional layers. The results demonstrated that CNN could accurately distinguish crop plants from natural images, even when many occlusions were present.
All of these studies are concerned with identifying and classifying plant diseases on images. This article reviewed both pre-trained and custom CNN-based models developed by the authors. CNN models given different names by the authors were taken as CNN. However, a comparison of various studies reveals that, rather than constructing a new CNN model from scratch, most researchers use transfer learning to meet their needs and complete their tasks quickly and easily. In transfer learning, the same CNN model is then reused to perform different tasks in different studies. Jiang et al. [49] employed VGG16 to diagnose diseases in rice and wheat plants. In experimental comparisons with other state-of-the-art methods, the overall accuracy on rice and wheat plants was 97.22% and 98.75%, respectively.

Applications of CNN in Agriculture
Sravan et al. [50] presented the fine-tuning of hyperparameters of current ResNet-50 for disease identification and classification. They reached a higher accuracy of 99.26% on the PlantVillage dataset, which contained 20,639 images. Shin et al. [51] presented a method for detecting disease on strawberry leaves. They compared six CNN models using various criteria, namely speed, accuracy, and hardware requirements. ResNet-50 had the highest classification accuracy of 98.11%, SqueezeNet-MOD2 required the least amount of memory, and AlexNet had the fastest processing time. Mohanty et al. [4] worked on multiple crops to identify 26 diseases using the PlantVillage dataset with GoogLeNet and AlexNet. The result recorded an accuracy of 99.35%. Ferentinos et al. [27] also worked on various crops for plant disease identification and diagnosis, and they had a reasonable success rate with AlexNetOWTB and VGG.
Brahimi et al. [52] conducted a study in which they used AlexNet and GoogLeNet to classify nine tomato leaf diseases, with an accuracy rate of 99.18%. Darwish et al. [53] proposed a method for detecting three maize diseases; in this study, they used VGG16 and VGG19 and reported a 98.2% average accuracy. Similar studies identified and recognized diseases in maize using CNN architectures [54,55]. CNN architectures have also been applied to other plants, for instance to detect apple disease [56,57]. In [31], the authors presented a study to detect Black Sigatoka and Black speckled disease in banana plants. The authors used Inception-v3 and 2756 images to detect cassava diseases. Yuwana et al. proposed a study to detect tea diseases [58], using 5632 images of tea, MCT, AlexNet, and GoogLeNet. Only two segmentation methods were used in the proposed study, each with a kernel size of five and a rotation of 40. Kawasaki et al. [59] proposed a method for detecting cucumber diseases such as melon yellow spot virus and zucchini yellow mosaic virus using CNN.
There are also several studies to detect olive plant disease, such as [60], where Cruz et al. use PlantVillage to detect symptoms of Olive Quick Decline Syndrome. However, their study task may be challenging for large numbers of samples or quarantine pests with restricted modifications. In [61,62], the authors have presented a method to distinguish Fusarium wilt disease in the radish plant. They achieved 93.3% and 90% accuracy, respectively. Another study using CNN to identify ten different diseases of the rice plant may be found in [63]. Table 2 summarizes and clarifies all relevant information to assist readers in selecting one or more criteria and comparing various DL models. Most researchers apply similar CNN architectures, as indicated in Table 2, and achieve similar experiment results. As a result, new requirements and experiments with more datasets and new architectures should be performed; otherwise, much work will be duplicated.
A comparison of CNN architectures reveals that the choice of appropriate CNN models is impacted by experimental settings, dataset, and data size. Figure 4 compares several architectures based on plant type and accuracy trained by various authors. When comparing the performance of these studies, it was noticed that CNN performed very well. The results appeared to be close because the methods used were applied using somewhat similar architectures.
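Because the reviewed studies are compared mainly through accuracy, and some also report precision and recall, the following sketch shows how these measures are computed from predictions. The labels are hypothetical and serve only to make the arithmetic visible.

```python
def classification_metrics(y_true, y_pred, positive):
    """Return accuracy plus precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels for eight leaf images.
y_true = ["healthy", "blight", "blight", "rust", "healthy", "blight", "rust", "healthy"]
y_pred = ["healthy", "blight", "rust", "rust", "healthy", "blight", "blight", "blight"]
metrics = classification_metrics(y_true, y_pred, positive="blight")
```

In this toy case the accuracy is 5/8, while precision for the "blight" class is 2/4 and recall is 2/3, which shows why a single accuracy figure can hide per-class weaknesses.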

Analysis and Data Extraction
In this section, we give discrete data derived from 100 studies to assist researchers in generating responses to their queries and to provide insight into the application of CNN in the agriculture field, including the datasets, plants, and model architectures used. In the paper collection phase, we chose 100 studies concerning CNN to conduct the data extraction technique, then excluded 25 similar papers. After reviewing the rest of the papers, we summarized 100 studies that met the criteria for our review objectives. Figure 5 depicts the percentage distribution and the number of crops most frequently used in the 100 reviewed studies that applied CNN to detect diseases. The largest share of authors (19.2%) used multiple crops (more than one type of crop) in their studies and thus made use of datasets with different plant phenotypes. We also observed that maize (11.5%) is the second most commonly used crop in the 100 summarized studies, while the least used were cassava, blueberry, strawberry, olive, and soybean at only 1.9%. Figure 6 illustrates the distribution and number of studies categorized by the algorithm used in the approach development process. Among the 100 summarized studies, newly developed architectures were the most widely used in plant disease detection with CNN. A total of 10 studies, representing 22.2% of the summarized studies, used a new architecture, and AlexNet is the second most commonly used CNN algorithm. In addition, Figure 7 presents the accuracy characteristics of 7 CNN architectures used in the 100 reviewed studies to detect plant leaf diseases. Almost all models that converged by the 100th epoch of training reported accuracies of more than 99%. Models such as AlexNet and VGG yielded the highest accuracy when compared to other models such as ResNet and MobileNet.
The diversity of the data used is also a critical factor for the performance of the CNN to detect disease in the plant. Therefore, with respect to the characteristics of the datasets most frequently used in the studies examined, we generated a distribution of studies by dataset by previously correlating the data for each dataset used in the summarized studies (see Figure 8a).

CNN's Agriculture Applications: Major Problems and Solutions

Limited Plant Leaf Datasets
In the last five years, CNN has become increasingly used to detect plant leaf disease. However, CNN faces numerous hurdles. Therefore, in this section, we explained and summarized the problems and solutions encountered in developing CNN-based plant disease detection.
Datasets are critical for CNN models. As a result, the main challenge to using CNN for plant disease detection and classification is the requirement for large datasets. Thus, insufficient dataset size significantly impacts the practical implementation, which means that the results will be inaccurate no matter how efficient the model is. However, there are few publicly released datasets for agricultural researchers to work with, and in many cases, researchers must create their datasets of images from scratch. Furthermore, external environmental factors such as climate can readily affect data collection, making it time-consuming, and several days of work may be required. To solve the problem of an insufficient and limited dataset, we provide the following solutions:
1. Data augmentation techniques: Data augmentation techniques increase the diversity of the data during training by artificially generating additional samples from the real dataset. In other words, image augmentation creates new data from existing data to help train a deep neural network model. The most recent augmentation techniques include Fast AutoAugment [128], AugMix [129], RandAugment [130], and population-based augmentation [131]. Liu et al. [56] used data augmentation techniques to increase the dataset size from 1053 to 13,689 images. Sladojevic et al. [132] used data augmentation to increase from 4483 to 33,469 images using perspective transformation and rotation methods. With the expansion of the dataset, the accuracy improved as well. In another study, Barbedo [133] used resizing and image segmentation methods to increase the size of the dataset from 1567 images to 46,409 images. The accuracy improved by 10.83% over the non-expanded dataset.
2. Transfer learning: Transfer learning is a machine learning technique in which we reuse a previously trained model as the base for a new model on a new task. For the new dataset, only a few layers of the pre-trained network are retrained, which helps to reduce the amount of data required [133]. Chen
3. Citizen science: In 1995, the concept of citizen science was proposed. In this technique, nonprofessional participants collect data as part of a scientific study. Farmers submit the collected images to a server for plant disease and pest classification, after which the images are correctly labeled and analyzed by an expert [133].
4. Data sharing: Data sharing is another way of increasing datasets. Several studies are now being conducted worldwide on accurate disease detection. The dataset will become more accurate if the different datasets are shared. This situation will encourage more significant and satisfying study results. PlantVillage [4] and Kaggle [53] are the most commonly employed public datasets in the literature on DL methods for plant disease classification and detection. Table 3 describes these datasets as the most widely applied public datasets.
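As an illustration of the first solution, the sketch below generates additional samples from one image using horizontal flips and right-angle rotations. This is a simplified stand-in chosen because it needs no external library; it is not the perspective transformation, resizing, or learned augmentation policies used in the cited studies, and the tiny image is hypothetical.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def augment(image):
    """Return the original plus its flipped and rotated variants (8 images in total)."""
    variants = []
    current = image
    for _ in range(4):  # 0, 90, 180 and 270 degrees
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate_90_clockwise(current)
    return variants


# Hypothetical 2x3 grayscale image.
image = [
    [1, 2, 3],
    [4, 5, 6],
]
augmented = augment(image)
```

Each source image yields eight training samples here, so the class label is preserved while the orientation seen by the network varies.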

Image Background
The section above showed that almost all of the reviewed studies trained their CNN models on large image datasets. However, one of the problems researchers face is the effect of the image background on detection. This effect is often obscured by several overlapping factors, most notably the interaction of plants with each other and the way the image collection is organized. When images are collected under real field conditions with a crowded background, some background features resemble the region of interest, so a leaf segmentation technique is required. Otherwise, the model learns background features during training, which leads to inaccurate classification results. On the other hand, some researchers collect their images under organized conditions. In this case, the background is usually preserved because it is relatively homogeneous; it does not impair detection and may even improve detection accuracy.
Mohanty et al. [4] employed three versions of the PlantVillage dataset (color, gray-scaled, and segmented) to identify plant diseases and to examine the influence of image background on the identification results. They concluded that the CNN model performed better with color images than with segmented images. We therefore believe that the problem of image background can be addressed in two ways: (i) organizing the image collection to obtain a homogeneous background, and (ii) applying image segmentation.
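As a rough illustration of option (ii), the sketch below masks background pixels with an excess-green index, (2G - R - B) / (R + G + B). This is a simple color-threshold technique chosen here for brevity, not the segmentation method used by Mohanty et al. [4] or by any other reviewed study; the threshold value, the function names, and the four-pixel image are assumptions.

```python
def excess_green(pixel):
    """Normalized excess-green index of an (R, G, B) pixel."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return 0.0  # pure black carries no color information
    # Dividing by the channel sum reduces sensitivity to overall brightness.
    return (2 * g - r - b) / total

def mask_background(image, threshold=0.1, fill=(0, 0, 0)):
    """Replace pixels whose excess-green index is below `threshold` with `fill`."""
    return [
        [px if excess_green(px) >= threshold else fill for px in row]
        for row in image
    ]

# Toy 2x2 image (assumed values): two green leaf pixels, two grayish background pixels.
image = [
    [(40, 160, 50), (120, 110, 100)],
    [(200, 200, 200), (60, 140, 40)],
]
segmented = mask_background(image)
print(segmented)
```

A green threshold of this kind can also remove brown or yellow lesions, which are often the symptoms of interest, so a practical pipeline would need a more careful leaf mask.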

Variability in Symptoms
Symptoms are the visible effects of a disease on the plant, appearing as significant changes in appearance, color, or function. The symptoms of plant diseases result from the interaction of diseases, environment, and plants [135]. In general, the symptoms of different diseases overlap considerably. Many natural factors, such as humidity, temperature, sunlight, and wind, can affect disease symptoms. The interplay of diseases, plants, and the environment can therefore produce varied symptom changes, making it difficult to capture and report data.
The main challenge in plant disease identification is that several diseases may occur together on the same plant leaves. The symptoms may also change rapidly, making it challenging to recognize disease types [135], and different diseases often produce similar symptoms, further complicating identification [136]. One way to address this problem is to keep increasing the diversity of the database in real applications [137]. Researchers are increasingly employing this strategy to improve data diversity. It is also considered more realistic because it allows researchers to collect the variations and information on a disease in a timely and efficient manner.

Discussion
In this work, we analyzed research on the application of CNN in agriculture, particularly CNN methods for detecting plant leaf diseases. In addition to the leaf diseases themselves, the study covered CNN architectures, CNN frameworks, the datasets applied and their sizes, the pros and cons of CNN, and the experimental results of the various models used to detect plant leaf diseases. The review is based on research completed in the last five years, a period in which research on CNN techniques for plant leaf disease detection has increased.
Furthermore, the analysis revealed that CNN performs well in detecting plant leaf disease. Most studies reported almost identical results because the architectures they used were similar. Consequently, experiments with larger datasets and new architectures are required; otherwise, much work will be duplicated.
According to the literature, CNN models have become essential in agriculture. These models are reported to be resilient under challenging conditions such as images with a complex background, non-uniform size, symptom variability, and insufficient datasets. Even so, creating a large dataset from various locations and under different conditions remains necessary. In addition, most of the datasets in the summarized studies have a class imbalance, which could lead to model overfitting. This problem must be addressed in future research, for example by employing appropriate data redistribution techniques or class-balancing classifiers [138].
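One simple way to counter class imbalance is to weight each class inversely to its frequency, so that rare disease classes contribute as much to the training loss as common ones. The sketch below computes such weights as N / (K · n_c), where N is the number of samples, K the number of classes, and n_c the number of samples in class c. It is an illustrative assumption rather than the specific technique of [138], and the class names and counts are invented.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return {class: N / (K * n_c)} so that rarer classes receive larger weights."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical imbalanced label set: 6 healthy, 3 rust, 1 blight.
labels = ["healthy"] * 6 + ["rust"] * 3 + ["blight"] * 1
class_weights = balanced_class_weights(labels)
for cls, weight in class_weights.items():
    print(cls, round(weight, 3))
```

With these weights, every class has the same total weight (N / K), so a weighted loss no longer favors the majority class. Oversampling the minority classes is an alternative that achieves a similar effect at the data level.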
Our evaluation of the reviewed studies shows that traditional architectures dominate almost all of them. The most prominent are AlexNet [139], GoogLeNet [8], ResNet [140], VGG [141], and LeNet [142]. They frequently achieve good results and high performance, which explains their widespread use. Another reason for their widespread use is that transfer learning and fine-tuning can be applied to improve the accuracy of their results. However, many researchers still choose to design their own architectures for detecting plant diseases; we found that 22.2% of the studies employed innovative architectures, while the remainder used traditional ones. This is due to several factors, the most important of which are the nature of the research, the data available, and the goals of the work. Nevertheless, the assessed studies do not recommend algorithms with substantial documentation, application examples, and usability, which could assist readers and inspire future approaches.
After reviewing the studies, it became evident that maize, potato, apple, rice, cucumber, wheat, radish, and banana are the most widely studied plants. We also observed that some researchers focus on the diseases of a single crop, while others study diseases in multiple crops using data that cover more than one plant. We found that 19.2% of the reviewed studies used multiple crops. The second largest area of interest was applying CNN to identify maize diseases. Moreover, the analysis of disease types reveals a relationship between diseases and their causal agents: the majority of the diseases identified and classified are caused by fungi, which can be explained by the fact that this pathogen group causes a large proportion of plant diseases. The number of diseases caused by viruses and bacteria depended on the crops present in the datasets. Plant disease identification is critical, especially when employing CNN approaches. However, there are several challenges, the most significant of which are disease overlap, the presence of many disease symptoms in a single image, and a lack of sufficient and organized data; all these challenges should be addressed in future work.

Conclusions and Future Directions
CNN methods are widely used in the detection of plant diseases and have addressed the problems of traditional object detection and classification methods. In this study, we presented a detailed review of CNN-based research on plant leaf disease detection in crops over the last five years. A total of 100 publications were reviewed with respect to detection methods and model performance evaluation, comparison of popular CNN frameworks, CNN applications in agricultural fields, dataset preparation, problems and solutions related to plant leaf disease detection, and publicly released datasets in the field. We examined closely related research articles to present a comparative analysis of various CNN models.
Most studies used CNN approaches and noted that pre-trained models, compared with models trained from scratch on plant leaf datasets, can quickly improve accuracy, especially when each class has enough data to train the models. We also found that CNN approaches face many problems and challenges, one of which is the lack of datasets, a severe challenge for researchers. An essential future contribution would therefore be to develop highly efficient detection approaches that employ large datasets covering different plant leaf diseases. Building such large, generalized datasets would also help address class imbalance.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

DL      Deep Learning
CNN     Convolutional Neural Networks
DCNN    Deep Convolutional Neural Networks
FAO     The Food and Agriculture Organization