Tomato Leaf Disease Recognition via Optimizing Deep Learning Methods Considering Global Pixel Value Distribution

Li, Zheng; Tao, Weijie; Liu, Jianlei; Zhu, Fenghua; Du, Guangyue; Ji, Guanggang

doi:10.3390/horticulturae9091034


¹ School of Rail Transportation, Shandong Jiaotong University, Jinan 250357, China
² Department of Cyberspace Security, Qufu Normal University, Qufu 273165, China
³ Institute of Automation, Chinese Academy of Sciences, Beijing 100049, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.

Horticulturae 2023, 9(9), 1034; https://doi.org/10.3390/horticulturae9091034

Submission received: 21 August 2023 / Revised: 9 September 2023 / Accepted: 12 September 2023 / Published: 14 September 2023

(This article belongs to the Special Issue Smart Horticulture, Plant Secondary Compounds and Their Applications)


Abstract

In image classification of tomato leaf diseases based on deep learning, models often focus on features such as edges, stems, backgrounds, and shadows of the experimental samples while ignoring the features of the disease area, resulting in weak generalization ability. In this study, a self-attention mechanism called GD-Attention is proposed, which considers global pixel value distribution information and guides the deep learning model to pay more attention to the leaf disease area. Based on data augmentation, the proposed method inputs both the image and its pixel value distribution information to the model. The GD-Attention mechanism guides the model to extract features related to pixel value distribution information, thereby increasing attention towards the disease area. The model is trained and tested on the Plant Village (PV) dataset, and analysis of the generated attention heatmaps shows that the disease area obtains greater weight. The model achieves an accuracy of 99.97% with a size of only 27 MB. Compared to classical and state-of-the-art models, our model shows competitive performance. As a next step, we are committed to further research and application, aiming to address real-world, complex scenarios.

Keywords:

plant leaf disease; image recognition; attention mechanism; smart agriculture

1. Introduction

Plant diseases cause approximately 30% of annual crop losses [1], posing a significant threat to agricultural production. Tomato, as a widely cultivated and distributed crop, can suffer from extensive yield reduction or even complete crop failure if diseases are not promptly addressed [2]. Rapid and early identification of plant diseases is helpful in intervening with preventive measures and preventing the wide spread of diseases. Traditional diagnostic methods such as manual visual analysis and chemical testing [3] are time-consuming, labor-intensive, and expensive. With the advancement of computer technology, image-based plant disease recognition offers advantages of being fast, resource-efficient, and low-cost. Many image recognition methods have been proposed and applied for plant disease identification by researchers, including artificial bee colony algorithm [4], image segmentation [5], SVM [6], and other machine learning algorithms [7]. In recent years, deep learning approaches have shown promising results in plant leaf disease recognition [8]. However, due to the random distribution of plant leaf disease images, diverse symptoms, and complex backgrounds [9], researchers have made a series of improvements to deep learning models.

Researchers propose a novel plant leaf disease identification model based on a deep convolutional neural network (Deep CNN), achieving better recognition results by setting an appropriate number of convolutional layers [10]. Due to the scarcity of datasets and the difficulty in data acquisition, researchers often heavily rely on data augmentation and transfer learning techniques. Building upon CNN, a deep learning architecture called EfficientNet [11] is designed on top of data augmentation, improving generalization through transfer learning methods. It is tested on the public dataset Plant Village and achieves higher accuracy compared to other models. The introduction of attention mechanisms allows deep learning to assign higher weights to favorable features during the training process. In [12], the authors incorporate attention mechanisms on top of a residual CNN network for plant leaf disease recognition, comparing it with regular CNN networks and residual CNN networks. By adjusting hyperparameters, designing deep learning architectures, enhancing preprocessing tasks, and incorporating attention mechanisms, researchers have explored deep learning methods suitable for plant leaf disease identification and improved recognition accuracy.

Meanwhile, researchers have found that highlighting the typical features of plant leaf disease regions during the training process of deep learning models can improve their performance. Therefore, guiding deep learning models to focus on the diseased areas of plant leaves has become an important concern for researchers. In the research article [13], the authors explored the integration of the Convolutional Block Attention Module (CBAM) into the ResNet-34 architecture to enhance feature extraction. In addition, they implemented the Faster-RCNN framework. A noteworthy aspect of this study is that it generated annotations for suspected images in advance, which helped specify the regions of interest (RoI). By incorporating this approach, the model acquired prior knowledge about disease regions and was able to capture more profound disease-related features. Another solution to address this issue is presented in [9], where researchers construct a large-scale plant leaf disease dataset called PDD271 and propose a new framework. This framework explores a multi-scale strategy and reweights both visual regions and the loss to emphasize discriminative diseased parts for plant disease recognition.

In traditional feature engineering, pixel value features are an important characteristic of plant diseases. In deep learning, appropriately regressing at the pixel level can guide the model towards the desired training direction. In the research article [14], by establishing a pixel-level correspondence relationship at both ends of the U-Net network, the model gains better control over geometric deformations. We draw inspiration from this idea and incorporate the pixel value distribution information of plant leaf diseases as global information into each self-attention module. This guides the self-attention module to assign greater weights to the pixel value range corresponding to the diseased region, further guiding the deep learning model to pay more attention to the diseased areas during the training process. We refer to this approach as GD-Attention. We utilize ResNet-18 as the backbone and experiment with applying GD-Attention to tomato leaf disease recognition, yielding the following contributions:

The training dataset images were enhanced using methods such as brightness adjustment, noise addition, rotation, scaling, and Gaussian filtering. Training was conducted using the augmented dataset, and compared to training with the original dataset, the accuracy of the test dataset increased by 1.2% with the same model.
The GD-Attention method was proposed and applied to plant leaf disease classification using ResNet-18 as the backbone. On the PV dataset, an accuracy of 99.97% was achieved.
An ablation experiment was performed, comparing ResNet-18, ResNet-18 with self-attention, and ResNet-18 with GD-Attention to validate the effectiveness of GD-Attention.
By analyzing the attention heatmaps, a comparison was made between self-attention and GD-attention regarding their focus on plant leaf regions. This validated that GD-Attention has guiding capabilities for disease areas in the model.
Comparative experiments were conducted to compare the presented work with other methods for plant leaf disease classification, including basic model methods and state-of-the-art (SOTA) methods, in order to demonstrate the efficacy of the presented work.

2. Materials and Methods

2.1. Experimental Data Preparation

Plant Village is a public plant disease image dataset that covers 14 plant species and 26 types of diseases, including common crops such as corn, tomato, apple, and chili. The images were manually collected, classified, and annotated by plant scientists and professional horticulturists. Some of the annotations were completed by multiple experts to ensure accuracy. In this study, we selected seven types of tomato leaf diseases and one normal state from the Plant Village dataset, resulting in a total of 15,755 images. Image labels were taken automatically from the original dataset. We then partitioned the dataset into training and testing sets with a ratio of 0.8:0.2. Table 1 lists the data used in this research for each class of tomato disease.
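The 0.8:0.2 partition can be illustrated with a minimal sketch. The paper does not state how the split was drawn, so the seeded random shuffle and the function name `split_dataset` are assumptions made here for illustration; only the image count and the ratio come from the text.

```python
import random


def split_dataset(items, train_ratio=0.8, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# 15,755 is the image count reported in the text; integer IDs stand in for file paths.
train, test = split_dataset(range(15755))
print(len(train), len(test))
```

A stratified split per disease class would keep class proportions equal in both sets; whether the authors did this is not stated.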

2.2. Model Establishment

As shown in Figure 1, the framework of the method used in this study consists of four components: input data augmentation (step 1), incorporation of global pixel value distribution information (step 2), image feature extraction with the GD-Attention mechanism (step 3), and disease classification (step 4). In step 1, we perform data augmentation on the original images, including brightness adjustment, noise introduction, rotation and scaling, and Gaussian filtering, to expand the training dataset and enhance the model’s generalization capability. In step 2, we take the global pixel value distribution information into account and provide it as an additional input to the model. Specifically, we calculate the statistics of the pixel value distribution for each image and input them into the GD-Attention module in step 3. In step 3, we input the augmented image along with the global pixel value distribution information into a residual network composed of GD-Attention blocks and Res blocks. Within each GD-Attention block, the global pixel value distribution information guides the attention mechanism to focus on important regions and features of the image. In step 4, we first reduce the dimensionality of the output from the last block using a pooling operation. The dimensionally reduced feature maps are then fed into a fully connected layer for feature abstraction and non-linear transformation. Next, we introduce an output layer with the same number of neurons as the number of disease categories. By applying the softmax function to the outputs of this layer, we calculate the probability score for each category and thus determine the disease type to which the image belongs. In this study, we use ResNet-18 as the residual network in step 3 for experimentation.
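Step 4 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' PaddlePaddle code: the pooling is taken to be global average pooling, the fully connected layer and the output layer are merged into a single linear layer for brevity, and the toy weights and three classes are hypothetical.

```python
import math


def global_average_pool(feature_maps):
    """Reduce each (H x W) channel to its mean, giving one value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps]


def linear(x, weights, bias):
    """Linear layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy input: 2 channels of 2x2 features and 3 hypothetical classes.
features = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [2.0, 2.0]]]
weights = [[0.5, 0.0], [0.0, 1.0], [0.25, 0.25]]
bias = [0.0, 0.0, 0.0]

pooled = global_average_pool(features)
probs = softmax(linear(pooled, weights, bias))
predicted_class = probs.index(max(probs))  # index of the most probable class
```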

2.3. Data Augmentation

Data augmentation is a technique in the field of image recognition that enhances a model’s performance by introducing additional data. Common image data augmentation techniques include rotation, cropping, flipping, and other operations. To enhance the model’s generalization ability, we performed the following data augmentation on the original images: brightness adjustment, noise introduction, rotation and scaling, and Gaussian filtering.

By randomly adjusting the brightness of the images, we simulated image variations under different lighting conditions, which makes the model more robust to lighting changes. By introducing a certain level of noise into the images, we simulated background interference and other image corruption, which makes the model more robust to noise. By randomly rotating and scaling the images, we simulated variations in angle and ratio, which helps the model learn features that are invariant to rotation and scale. By applying Gaussian filtering to the images, we blurred them and reduced the noise within them, which reduces the interference of fine details and improves the model’s ability to learn overall image features. Each of these operations is intended to improve the model’s robustness and generalization ability.

Through these data augmentation operations, we can generate more diverse and challenging training samples, enabling the model to accurately predict and classify images under different lighting, noise, and transformation conditions. This enhances the model’s robustness and generalization ability, allowing it to better adapt to various situations in the real world.
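Three of the four augmentations can be sketched on a single-channel image stored as nested lists. The parameter values, the 3×3 kernel, and the function names are assumptions for illustration; the paper does not report its augmentation settings, and rotation and scaling are left out because they need interpolation that an image library would normally provide.

```python
import random


def adjust_brightness(img, factor):
    """Scale every pixel by `factor` and clip to the 0-255 range."""
    return [[min(255, max(0, int(round(p * factor)))) for p in row] for row in img]


def salt_and_pepper(img, amount, rng):
    """Set roughly a fraction `amount` of pixels to 0 (pepper) or 255 (salt)."""
    out = [row[:] for row in img]
    for r in range(len(out)):
        for c in range(len(out[0])):
            if rng.random() < amount:
                out[r][c] = rng.choice((0, 255))
    return out


def gaussian_blur3(img):
    """3x3 Gaussian blur (1-2-1 kernel, sum 16) with edge replication."""
    kernel = ((1, 2, 1), (2, 4, 2), (1, 2, 1))
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(h - 1, max(0, r + dr))  # clamp to the image border
                    cc = min(w - 1, max(0, c + dc))
                    acc += kernel[dr + 1][dc + 1] * img[rr][cc]
            out[r][c] = acc // 16
    return out


rng = random.Random(0)
image = [[rng.randrange(256) for _ in range(8)] for _ in range(8)]
augmented = gaussian_blur3(salt_and_pepper(adjust_brightness(image, 1.2), 0.05, rng))
```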

2.4. Introducing Global Pixel Value Distribution Information

The combination of a simple attention mechanism and a CNN has limited ability to increase the attention weight on plant leaf disease regions, especially under brightness changes and noise. We therefore ask whether an additional input, representing a traditional feature representation of the disease regions on plant leaves, can be added to the attention mechanism to help it better identify the disease areas.

In image processing, we often adjust the pixel value distribution of an image to enhance contrast, improve visual effects, and adapt to different lighting conditions. Methods such as histogram equalization [15], gamma correction [16], and local adaptive techniques [17] are commonly used. In plant leaf disease classification studies based on traditional methods, researchers analyze pixel values to obtain pixel features of the disease regions on plant leaves [18,19], and utilize these features for disease classification.

In this study, we use the traditional feature representation of the pixel value distribution of the image as global information. It is fed into each GD-Attention block through skip connections [20] and takes part in the matrix operations within the attention mechanism.

As shown in Figure 2a, we have an image of tomato leaf disease, with a size of (3, 224, 224), and its pixel values range from 0 to 255. In Figure 2b, we divide the pixel values from 0 to 255 into 16 intervals and count the number of pixels in each interval. We then normalize these pixel counts to the range of [0, 1]. This way, we obtain the pixel value distribution density for each RGB channel of the image. The distribution is represented by:

$$p(R_k) = \frac{n_{rk}}{HW} \tag{1}$$

$$p(G_k) = \frac{n_{gk}}{HW} \tag{2}$$

$$p(B_k) = \frac{n_{bk}}{HW} \tag{3}$$

where $R_k$/$G_k$/$B_k$ is the k-th interval of the R/G/B channel, $n_{rk}$/$n_{gk}$/$n_{bk}$ is the number of pixels in the k-th interval, and $H \times W$ is the total number of pixels in the image. These three distributions form a matrix with dimensions of (3, 16, 1), as shown in Figure 2c.
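Equations (1)–(3) amount to a 16-bin normalized histogram per channel. The sketch below assumes 8-bit pixel values and equal-width intervals of 16 values each, which follows from dividing 0–255 into 16 intervals; the function names are hypothetical.

```python
def channel_distribution(channel, bins=16, levels=256):
    """Return p_k = n_k / (H * W) for one channel given as rows of ints in [0, levels)."""
    width = levels // bins  # 16 pixel values per interval for 8-bit images
    counts = [0] * bins
    total = 0
    for row in channel:
        for value in row:
            counts[value // width] += 1
            total += 1
    return [n / total for n in counts]


def global_pixel_distribution(image):
    """image: [R, G, B] channels -> 3 x 16 list of densities, one row per channel."""
    return [channel_distribution(ch) for ch in image]


# Toy 2x2 RGB image; a real input would be 224 x 224 per channel.
toy_image = [
    [[0, 15], [16, 255]],      # R: two pixels in interval 0, one in 1, one in 15
    [[0, 0], [0, 0]],          # G: all pixels in interval 0
    [[255, 255], [255, 255]],  # B: all pixels in interval 15
]
distribution = global_pixel_distribution(toy_image)
```

Each row of `distribution` sums to 1, matching the normalization to [0, 1] described in the text.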

After obtaining the pixel value distribution information, we treat it as global information and pass it to each GD-Attention block. As shown in steps 2 and 3 of Figure 1, the feature extraction module consists of a series of alternating GD-Attention blocks and Res-blocks, forming a residual network. Combining blocks with different structures enhances the model’s feature extraction capability by leveraging their respective advantages [21]. Through skip connections, the global pixel value distribution information enters the GD-Attention module and is introduced into the feature extraction process at different depths, guiding the attention mechanism to allocate weights for different pixel value ranges.

2.5. GD-Attention

2.5.1. Self-Attention

Self-attention is a mechanism widely used in neural networks and commonly applied in natural language processing and computer vision tasks. Compared to RNN and LSTM, it has advantages such as context awareness, parallel computation, and handling long-term dependencies. Self-attention works by mapping an input sequence or matrix into three representations: query, key, and value. It then calculates the similarity between the query and other positions, applies weighted aggregation based on this similarity, and finally obtains the output representation.

2.5.2. GD-Attention

As shown in Figure 3, we denote a feature map matrix of size $H \times W$ as X. X is transformed into three matrices Q, K, and V through the transformations $\omega^{q}$, $\omega^{k}$, and $\omega^{v}$, respectively, as shown in the following equations:

$$Q = \omega^{q} \cdot X \tag{4}$$

$$K = \omega^{k} \cdot X \tag{5}$$

$$V = \omega^{v} \cdot X \tag{6}$$

where the transformations $\omega^{q}$, $\omega^{k}$, and $\omega^{v}$ are combined with convolution.

Before introducing the self-attention mechanism, we perform the transformation $\omega^{g}$ on the global pixel value distribution matrix Y to obtain the matrix G, as shown in the following equation:

$$G = \omega^{g} \cdot Y \tag{7}$$

where $\omega^{g}$ has the same dimension as the feature map in the GD-Attention convolutional layer.

Then, we use the values in matrix G as the new distribution density to map the elements in matrix Q. Assuming the value of $Q_{i,j}$ falls within the k-th interval of G, the mapped value $Q'_{i,j}$ is calculated as follows:

$$Q'_{i,j} = G_k \times \mathrm{MAX}(Q) \tag{8}$$

where $Q_{i,j}$ is the value in the i-th row and j-th column of matrix Q, $G_k$ is the value of the k-th interval in matrix G, and $\mathrm{MAX}(Q)$ is the maximum value among all elements in matrix Q.
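One possible reading of Equation (8) is sketched below. The text does not say how the interval index k is found for a given element of Q, so this sketch assumes that the value range of Q, from its minimum to its maximum, is divided into as many equal-width intervals as G has entries. It also treats G as a single vector rather than a per-channel matrix. Both choices, and the name `map_query`, are assumptions.

```python
def map_query(Q, G):
    """Replace each Q[i][j] by G[k] * max(Q), where k is the interval Q[i][j] falls in."""
    flat = [v for row in Q for v in row]
    q_min, q_max = min(flat), max(flat)
    bins = len(G)
    span = (q_max - q_min) or 1.0  # avoid division by zero when Q is constant
    mapped = []
    for row in Q:
        new_row = []
        for v in row:
            # Interval index of v; the maximum value is placed in the last interval.
            k = min(bins - 1, int((v - q_min) / span * bins))
            new_row.append(G[k] * q_max)
        mapped.append(new_row)
    return mapped


# Toy example with 4 intervals instead of 16 to keep it readable.
Q = [[0.0, 1.0], [2.0, 4.0]]
G = [0.1, 0.2, 0.3, 0.4]
Q_mapped = map_query(Q, G)
```

Under this reading, every element of Q in the same interval receives the same mapped value, so intervals with a large weight in G dominate the subsequent similarity computation.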

Afterwards, as in self-attention, matrix multiplication is used to calculate the correlation between the $Q'$ matrix and the K matrix. The result is divided by $\sqrt{d_k}$, where $d_k$ is the dimension of the self-attention head. After the softmax operation is applied, the result is multiplied by V to obtain the module output. The calculation process is as follows:

$$\mathrm{Attention}(Q', K, V) = \mathrm{softmax}\left(\frac{Q' K^{T}}{\sqrt{d_k}}\right) V \tag{9}$$
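Equation (9) is ordinary scaled dot-product attention with the mapped query matrix in place of Q. A minimal plain-Python sketch follows; matrices are lists of rows, $d_k$ is taken as the number of columns of K, and the softmax is applied row-wise, all of which are assumptions of this illustration.

```python
import math


def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def softmax_rows(S):
    """Apply a numerically stable softmax to each row of S."""
    out = []
    for row in S:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attention(Q_mapped, K, V):
    """Compute softmax(Q' K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    scores = matmul(Q_mapped, transpose(K))
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    return matmul(softmax_rows(scaled), V)


# With an all-zero Q', every key gets equal weight, so each output row is the mean of V's rows.
output = attention([[0.0, 0.0], [0.0, 0.0]], [[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [3.0, 4.0]])
```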

2.5.3. The Structure of GD-Attention Block

In this study, we conducted experiments using ResNet-18. We introduced the GD-Attention mechanism into the basic blocks of ResNet-18; the resulting block is called the GD-Attention block.

As shown in Figure 4a, the ResNet BasicBlock consists of two convolutional layers and a residual connection. Given an input matrix X, the output matrix F is computed using the following expression:

$$F = \omega_2 * \sigma(\omega_1 * X) \tag{10}$$

where $\sigma$ represents the ReLU activation function, and $\omega_1$ and $\omega_2$ represent the weights of the first and second convolutional layers, respectively.

Then, the final output matrix Y is obtained by adding F to the input matrix X:

$$Y = F + X \tag{11}$$

In this way, the residual block can learn the incremental changes in the input information and add them to the original input, thereby alleviating the problem of gradient vanishing and improving the training effectiveness of the network.

As shown in Figure 4b, we added the GD-Attention module between the two convolutional layers. The final output matrix Y of the block can be represented by the following equation:

$$Y = \omega_2 * \mathrm{GD\text{-}Attention}(\sigma(\omega_1 * X), G) + X \tag{12}$$
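The difference between Equations (10)–(11) and Equation (12) is only where the attention step sits in the composition. The sketch below makes that explicit by passing the layers in as callables. The stand-in layers (`double`, `identity_attention`) are hypothetical placeholders so the example runs; they are not real convolutions or the real GD-Attention module.

```python
def relu(x):
    return [max(0.0, v) for v in x]


def basic_block(x, conv1, conv2):
    """Eqs. (10)-(11): Y = conv2(relu(conv1(X))) + X."""
    f = conv2(relu(conv1(x)))
    return [a + b for a, b in zip(f, x)]


def gd_attention_block(x, g, conv1, conv2, gd_attention):
    """Eq. (12): Y = conv2(GD-Attention(relu(conv1(X)), G)) + X."""
    hidden = gd_attention(relu(conv1(x)), g)
    f = conv2(hidden)
    return [a + b for a, b in zip(f, x)]


def double(v):
    """Hypothetical stand-in for a convolution: scales every element by 2."""
    return [2.0 * e for e in v]


def identity_attention(hidden, g):
    """Hypothetical stand-in for GD-Attention that passes features through unchanged."""
    return hidden


x = [1.0, -1.0]
y_basic = basic_block(x, double, double)
y_gd = gd_attention_block(x, None, double, double, identity_attention)
```

In both blocks the input X is added back at the end, so the attention step only changes the residual branch.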

3. Results and Discussion

The experiment was conducted using Python 3.7 and the PaddlePaddle 2.4.0 deep learning framework. The execution took place in a Linux environment, specifically running on a system equipped with a Tesla V100 GPU, 32 GB of video memory, a 4-core CPU, and 32 GB of RAM.

3.1. Data Augmentation

Image enhancement is used to improve the quality and quantity of the original images, with the aim of making models more robust and better able to generalize across lighting conditions, noise, and other variations. Common image enhancement methods include adjusting brightness, contrast, and color balance; cropping, rotating, and scaling images; and adding random perturbations and mixing, among others. By applying image enhancement, we obtain more training data and improve the performance and stability of the model. We apply data augmentation to the Plant Village dataset used for training and testing. Figure 5a displays the original images from the PV dataset. Figure 5b–e show the effects of random brightness adjustment, salt-and-pepper noise addition, rotation and scaling, and Gaussian filtering, respectively. Through data augmentation, the dataset is expanded and certain fine details are enhanced, resulting in increased diversity and improved generalization capability.

3.2. Training Parameter Settings and Model Details

In the process of model training, we utilize the cross-entropy loss function to measure the difference between the predicted results and the true labels. We employ the Adam optimizer to optimize the model parameters and minimize the loss function. The batch size is set to 64, and we train the model for 100 epochs.
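The cross-entropy loss for one batch can be written out as below. This is a plain-Python illustration of the loss definition only; it assumes the model outputs raw scores (logits) and integer class labels. It does not reproduce the Adam update or the authors' PaddlePaddle training loop.

```python
import math


def cross_entropy(logits, target):
    """Negative log-probability of the true class under softmax(logits)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def batch_loss(batch_logits, targets):
    """Mean cross-entropy over a batch, e.g. the 64 samples per batch used here."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)


# An untrained model that scores all 8 classes equally has a loss of ln(8), about 2.08.
uniform_loss = cross_entropy([0.0] * 8, 3)
```

The reported final loss of about 0.002 corresponds to the model assigning a probability of roughly 0.998 to the correct class on average.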

Table 2 provides details of the parameter settings for the layers in our model structure.

3.3. Model Identification Results

As shown in Figure 6, after 100 epochs of training, our model classifies the different types of plant leaf diseases with high accuracy. The model achieved an accuracy of 99.97% on the test set, with a loss of approximately 0.002. The model reached an accuracy of over 95% around the 50th epoch, demonstrating good convergence and stability.

As shown in Figure 7, we utilized the t-SNE method to project the original data and the feature data extracted by our model onto a two-dimensional space. Through t-SNE visualization analysis, the scattered distribution of the original signals was significantly improved after feature extraction by our model, resulting in a better clustering effect.

As shown in Figure 8, we plotted the confusion matrix for the classification results. According to the confusion matrix, our model classified every category with 100% accuracy except the “early blight” and “mosaic virus” classes, where some errors occurred. The average precision, recall, F1 score, and accuracy of the proposed model were calculated as 99.94%, 99.94%, 99.94%, and 99.97%, respectively.
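The averaged metrics can be derived from a confusion matrix as sketched below. The paper does not state the averaging scheme, so macro-averaging (an unweighted mean over classes) is an assumption here, and the 2×2 matrix is toy data rather than the paper's results.

```python
def macro_metrics(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    precisions, recalls, f1s = [], [], []
    for c in range(n):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(n))  # column sum
        actual = sum(cm[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "accuracy": sum(cm[c][c] for c in range(n)) / total,
    }


# Toy two-class matrix: one sample of class 0 is misclassified as class 1.
metrics = macro_metrics([[9, 1], [0, 10]])
```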

3.4. Ablation Experiment

To verify the effectiveness of each component in our model, we conducted ablation experiments on its modules. The configurations included ResNet-18 without data augmentation, ResNet-18 with data augmentation, and self-attention without global information. The experimental results are shown in Table 3.

Based on the results of the ablation experiments, we found that data augmentation made a significant contribution to accuracy, increasing it by 1.2%. The self-attention mechanism also had a positive effect on accuracy. Moreover, incorporating global information together with the attention mechanism led to an additional improvement of 0.95% in accuracy.

We plotted Precision-Recall (PR) and ROC curves to analyze the model’s performance, as shown in Figure 9 and Figure 10. The results indicate that our proposed model maintains high precision while also exhibiting good recall. The ROC curves look better than the PR curves, which reflects the data imbalance among disease categories. A more balanced and higher-quality dataset is expected to further enhance model performance.

3.5. Attention Analysis of Leaf Disease Areas

We generate heatmaps for the feature extraction process to analyze the GD-Attention mechanism. Figure 11a shows the global pixel value distribution information in the GD-Attention mechanism. Figure 11b shows the global pixel value distribution information after it enters the GD-Attention block and is transformed by the trained weights, i.e., the output of Equation (7). Figure 11c is the output of Equation (8): the feature map generated by mapping the Q matrix in GD-Attention using the information from Figure 11b.

From Figure 11c, it can be observed that the feature map reflects information from various pixel value distribution ranges, indicating that the feature extraction process is guided by the global pixel value distribution information. Next, the matrix in Figure 11c is operated with the K matrix and V matrix according to Equation (9), resulting in the output features of the GD-Attention block.

We map the output feature maps of the Attention block back to the original images for observation. We compare the results of Self-attention and GD-attention, as shown in Figure 12. In the image, the result of GD-attention is more focused on the disease area, while the result of self-attention not only enhances the disease area but also emphasizes the edge areas and even background features. This also indicates that GD-attention is less affected by the background and contributes to its generalization capability.

3.6. Comparison

We selected classical and state-of-the-art models in the field of plant leaf disease classification based on the Plant Village dataset for comparison. The classical models include VGG16, ResNet-50, CNN, and MobileNet. The state-of-the-art models include customized CNNs, Faster-RCNN, U-Net considering the regions of interest, and ResNet with a multi-scale strategy that reweights both visual regions and the loss.

The comparison results are shown in Table 4. Through our comparison, we have observed that classical models have achieved high recognition accuracy through techniques such as transfer learning, data augmentation, and squeeze and excitation, thanks to the continuous research conducted by scholars in recent years. State-of-the-art (SOTA) models further improve disease recognition accuracy by modifying and adjusting the structures of classical models. Notably, the papers [5,13,22] have demonstrated accuracy rates surpassing 99.9%. These papers commonly adopt ROI, multi-scale, or other algorithms to highlight plant disease regions. The distinguishing factor is that ROI marks the areas of focus prior to training, aiming to achieve specific training objectives, while multi-scale and other algorithms emphasize feature extraction of plant disease regions through structural changes in the model.

In this study, we propose the GD-Attention mechanism, which introduces pixel value distribution information to guide the model’s focus on plant disease regions during training. The results demonstrate the high accuracy of our proposed method. Furthermore, our proposed model has a size of approximately 27 MB and achieves high accuracy with a relatively small number of parameters. Small-scale accurate models are easy to load into automated systems, contributing to rapid automated early plant disease diagnosis. In applications, we can combine classification models with agricultural robotics systems or monitoring systems to provide automated solutions for rapid diagnosis of plant diseases [30,31].

4. Conclusions

This study focuses on the problem of excessive emphasis on edges, background, and other parts in feature extraction for plant leaf disease identification. We propose the GD-Attention mechanism, which introduces global pixel value distribution information to guide the model’s attention towards the diseased regions of plants. We use ResNet-18 as the backbone and incorporate the GD-Attention mechanism into the ResNet basic block. In addition, we design skip connection structures that incorporate global pixel value distribution information. The proposed deep learning model is trained and tested on the tomato leaf disease dataset from Plant Village.

To validate the proposed model in this study, we conducted a series of experiments, including ablation experiments, attention visualization analysis, and comparative experiments with traditional and state-of-the-art models. The experimental results indicate that the model’s data augmentation module enhances its robustness against blurring, noise, rotation, and scaling. Additionally, the introduction of global pixel value distribution information guides the attention mechanism’s focus. We incorporated global pixel value information at various locations in the model’s feature extraction module using a skip-connection structure. This GD-Attention mechanism reinforces the model’s feature extraction for plant disease regions, leading to improved accuracy. Remarkably, our proposed model achieves the highest accuracy while having only 27 M parameters, which is expected to facilitate further research in real-world applications.

In summary, this study combines traditional image processing methods with deep learning approaches. By guiding the deep learning training process with global pixel value distribution information, the resulting model pays more attention to the diseased areas of plant leaves. This allows the model to achieve higher accuracy with a smaller number of parameters. Combined with agricultural robots and monitoring systems, this method can support the rapid detection of plant diseases. In future work, we plan to collect data from more complex scenarios and investigate the performance of this method on them, with the goal of further optimizing the model.

Author Contributions

Conceptualization, J.L. and Z.L.; methodology, J.L., Z.L. and W.T.; software, Z.L. and W.T.; validation, Z.L. and W.T.; investigation, G.D. and G.J.; data curation, G.D. and F.Z.; writing—original draft preparation, Z.L. and W.T.; writing—review and editing, Z.L. and F.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is available on request due to privacy.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Savary, S.; Ficke, A.; Aubertot, J.N.; Hollier, C. Crop Losses Due to Diseases and Their Implications for Global Food Production Losses and Food Security. Food Secur. 2012, 4, 519–537.
2. Panno, S.; Davino, S.; Caruso, A.G.; Bertacca, S.; Crnogorac, A.; Mandić, A.; Noris, E.; Matić, S. A Review of the Most Common and Economically Important Diseases That Undermine the Cultivation of Tomato Crop in the Mediterranean Basin. Agronomy 2021, 11, 2188.
3. Wakeham, A.J.; Keane, G.; Kennedy, R. Field Evaluation of a Competitive Lateral-Flow Assay for Detection of Alternaria brassicae in Vegetable Brassica Crops. Plant Dis. 2016, 100, 1831–1839.
4. Pravin Kumar, S.K.; Sumithra, M.G.; Saranya, N. Artificial Bee Colony-Based Fuzzy c Means (ABC-FCM) Segmentation Algorithm and Dimensionality Reduction for Leaf Disease Detection in Bioinformatics. J. Supercomput. 2019, 75, 8293–8311.
5. Shoaib, M.; Hussain, T.; Shah, B.; Ullah, I.; Shah, S.M.; Ali, F.; Park, S.H. Deep Learning-Based Segmentation and Classification of Leaf Images for Detection of Tomato Plant Disease. Front. Plant Sci. 2022, 13, 1031748.
6. Rumpf, T.; Mahlein, A.K.; Steiner, U.; Oerke, E.C.; Dehne, H.W.; Plümer, L. Early Detection and Classification of Plant Diseases with Support Vector Machines Based on Hyperspectral Reflectance. Comput. Electron. Agric. 2010, 74, 91–99.
7. Sujatha, R.; Chatterjee, J.M.; Jhanjhi, N.; Brohi, S.N. Performance of Deep Learning vs Machine Learning in Plant Leaf Disease Detection. Microprocess. Microsyst. 2021, 80, 103615.
8. Sladojevic, S.; Arsenovic, M.; Anderla, A.; Culibrk, D.; Stefanovic, D. Deep Neural Networks Based Recognition of Plant Diseases by Leaf Image Classification. Comput. Intell. Neurosci. 2016, 2016, 3289801.
9. Liu, X.; Min, W.; Mei, S.; Wang, L.; Jiang, S. Plant Disease Recognition: A Large-Scale Benchmark Dataset and a Visual Region and Loss Reweighting Approach. IEEE Trans. Image Process. 2021, 30, 2003–2015.
10. Geetharamani, G.; Arun Pandian, J. Identification of Plant Leaf Diseases Using a Nine-Layer Deep Convolutional Neural Network. Comput. Electr. Eng. 2019, 76, 323–338.
11. Atila, Ü.; Uçar, M.; Akyol, K.; Uçar, E. Plant Leaf Disease Classification Using EfficientNet Deep Learning Model. Ecol. Inform. 2021, 61, 101182.
12. Karthik, R.; Hariharan, M.; Anand, S.; Mathikshara, P.; Johnson, A.; Menaka, R. Attention Embedded Residual CNN for Disease Detection in Tomato Leaves. Appl. Soft Comput. 2020, 86, 105933.
13. Nawaz, M.; Nazir, T.; Javed, A.; Masood, M.; Rashid, J.; Kim, J.; Hussain, A. A Robust Deep Learning Approach for Tomato Plant Leaf Disease Localization and Classification. Sci. Rep. 2022, 12, 18568.
14. Yao, W.; Zeng, Z.; Lian, C.; Tang, H. Pixel-Wise Regression Using U-Net and Its Application on Pansharpening. Neurocomputing 2018, 312, 364–371.
15. Mungra, D.; Agrawal, A.; Sharma, P.; Tanwar, S.; Obaidat, M.S. PRATIT: A CNN-based Emotion Recognition System Using Histogram Equalization and Data Augmentation. Multimed. Tools Appl. 2020, 79, 2285–2307.
16. Ju, M.; Ding, C.; Guo, Y.J.; Zhang, D. IDGCP: Image Dehazing Based on Gamma Correction Prior. IEEE Trans. Image Process. 2020, 29, 3104–3118.
17. Zhang, W.; Zhuang, P.; Sun, H.H.; Li, G.; Kwong, S.; Li, C. Underwater Image Enhancement via Minimal Color Loss and Locally Adaptive Contrast Enhancement. IEEE Trans. Image Process. 2022, 31, 3997–4010.
18. Zhang, S.; Wang, H.; Huang, W.; You, Z. Plant Diseased Leaf Segmentation and Recognition by Fusion of Superpixel, K-means and PHOG. Optik 2018, 157, 866–872.
19. Zhang, S.; You, Z.; Wu, X. Plant Disease Leaf Image Segmentation Based on Superpixel Clustering and EM Algorithm. Neural Comput. Appl. 2019, 31, 1225–1232.
20. Wen, X.; Li, T.; Han, Z.; Liu, Y.S. Point Cloud Completion by Skip-Attention Network With Hierarchical Folding. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; IEEE: Seattle, WA, USA, 2020; pp. 1936–1945.
21. Al-gaashani, M.S.A.M.; Shang, F.; Muthanna, M.S.A.; Khayyat, M.; Abd El-Latif, A.A. Tomato Leaf Disease Classification by Exploiting Transfer Learning and Feature Concatenation. IET Image Process. 2022, 16, 913–925.
22. Attallah, O. Tomato Leaf Disease Classification via Compact Convolutional Neural Networks with Transfer Learning and Feature Selection. Horticulturae 2023, 9, 149.
23. Wspanialy, P.; Moussa, M. A Detection and Severity Estimation System for Generic Diseases of Tomato Greenhouse Plants. Comput. Electron. Agric. 2020, 178, 105701.
24. Wagle, S.A.; Harikrishnan, R. A Deep Learning-Based Approach in Classification and Validation of Tomato Leaf Disease. Trait. Signal 2021, 38, 699–709.
25. Thangaraj, R.; Anandamurugan, S.; Kaliappan, V.K. Automated Tomato Leaf Disease Classification Using Transfer Learning-Based Deep Convolution Neural Network. J. Plant Dis. Prot. 2021, 128, 73–86.
26. Chen, J.; Zhang, D.; Suzauddola, M.; Nanehkaran, Y.A.; Sun, Y. Identification of Plant Disease Images via a Squeeze-and-excitation MobileNet Model and Twice Transfer Learning. IET Image Process. 2021, 15, 1115–1127.
27. Islam, M.S.; Sultana, S.; Farid, F.A.; Islam, M.N.; Rashid, M.; Bari, B.S.; Hashim, N.; Husen, M.N. Multimodal Hybrid Deep Learning Approach to Detect Tomato Leaf Disease Using Attention Based Dilated Convolution Feature Extractor with Logistic Regression Classification. Sensors 2022, 22, 6079.
28. Ahmed, S.; Hasan, M.B.; Ahmed, T.; Sony, M.R.K.; Kabir, M.H. Less Is More: Lighter and Faster Deep Neural Architecture for Tomato Leaf Disease Classification. IEEE Access 2022, 10, 68868–68884.
29. Bhujel, A.; Kim, N.E.; Arulmozhi, E.; Basak, J.K.; Kim, H.T. A Lightweight Attention-Based Convolutional Neural Networks for Tomato Leaf Disease Classification. Agriculture 2022, 12, 228.
30. ElBeheiry, N.; Balog, R.S. Technologies Driving the Shift to Smart Farming: A Review. IEEE Sens. J. 2023, 23, 1752–1769.
31. Grieve, B.D.; Duckett, T.; Collison, M.; Boyd, L.; West, J.; Yin, H.; Arvin, F.; Pearson, S. The Challenges Posed by Global Broadacre Crops in Delivering Smart Agri-Robotic Solutions: A Fundamental Rethink Is Required. Glob. Food Secur. 2019, 23, 116–124.

Figure 1. Overview of the proposed method.

Figure 2. Global pixel value distribution information. (a) Original image. (b) Pixel value distribution density. (c) Heatmap.

Figure 3. GD-Attention.

Figure 4. Block Structure. (a) ResNet BasicBlock. (b) GD-Attention Block.

Figure 5. Data Augmentation. (a) Original Image. (b) Brightness Adjustment. (c) Salt-and-pepper Noise Addition. (d) Rotation and Scaling. (e) Gaussian Filtering.

Figure 6. Training Loss and Test Accuracy.

Figure 7. t-SNE for feature visualization. (a) Original data. (b) Results of feature extraction by our model.

Figure 8. Confusion matrix for the results.

Figure 9. The Precision-Recall curve. (a) ResNet-18. (b) ResNet-18 + Data Augmentation. (c) ResNet-18 + Data Augmentation + Self-attention. (d) ResNet-18 + Data Augmentation + GD-attention.

Figure 10. The ROC curve. (a) ResNet-18. (b) ResNet-18 + Data Augmentation. (c) ResNet-18 + Data Augmentation + Self-attention. (d) ResNet-18 + Data Augmentation + GD-attention.

Figure 11. The feature extraction process of the GD-attention. (a) Global pixel value distribution. (b) The trained pixel value distribution. (c) Feature map $Q'$.

Figure 12. Attention heatmaps. (a) Original images. (b) Self-attention heatmaps. (c) GD-attention heatmaps.

Table 1. Dataset Preparation Used for Classification.

Class Label	Disease Type	Number of Samples
0	bacterial spot	2127
1	early blight	1000
2	late blight	1909
3	leaf mold	1000
4	mosaic virus	1000
5	Septoria leaf spot	1771
6	yellow leaf curl virus	5357
7	healthy	1591

Table 2. Model structure details.

Layers or Blocks	Input	GD-Attention Information	Output
Input Layer	(64, 3, 224, 224)	(64, 3, 1, 16)	-
Convolutional Layer	(64, 3, 224, 224)	-	(64, 64, 112, 112)
Max Pooling Layer	(64, 64, 112, 112)	-	(64, 64, 56, 56)
GD-Convolutional Layer1	-	(64, 3, 1, 16)	(64, 64, 1, 16)
GD-Attention Block1	(64, 64, 56, 56), (64, 64, 1, 16)	-	(64, 64, 56, 56)
ResNet Basic Block1	(64, 64, 56, 56)	-	(64, 64, 56, 56)
GD-Convolutional Layer2	-	(64, 3, 1, 16)	(64, 64, 1, 16)
GD-Attention Block2	(64, 64, 56, 56), (64, 64, 1, 16)	-	(64, 64, 56, 56)
ResNet Basic Block2	(64, 64, 56, 56)	-	(64, 128, 28, 28)
GD-Convolutional Layer3	-	(64, 3, 1, 16)	(64, 128, 1, 16)
GD-Attention Block3	(64, 128, 28, 28), (64, 128, 1, 16)	-	(64, 128, 28, 28)
ResNet Basic Block3	(64, 128, 28, 28)	-	(64, 256, 14, 14)
GD-Convolutional Layer4	-	(64, 3, 1, 16)	(64, 256, 1, 16)
GD-Attention Block4	(64, 256, 14, 14), (64, 256, 1, 16)	-	(64, 256, 14, 14)
ResNet Basic Block4	(64, 256, 14, 14)	-	(64, 512, 7, 7)
Avg Pooling Layer	(64, 512, 7, 7)	-	(64, 512, 1, 1)
Flatten and Fully Connected Layers	(64, 512, 1, 1)	-	(64, 8)
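
The shapes in Table 2 can be sanity-checked with simple arithmetic. The following Python sketch is illustrative only: it assumes the standard ResNet-18 hyperparameters (a 7 × 7 stride-2 stem convolution, a 3 × 3 stride-2 max pooling layer, and a stride-2 first convolution in Basic Blocks 2 to 4), which Table 2 does not list, and it assumes that the trailing dimension of 16 in the GD-Attention information corresponds to a 16-bin per-channel histogram of 8-bit pixel values. The function names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_resnet18(size=224):
    """Return (layer name, channels, spatial size) for each stage.

    Kernel, stride, and padding values are the usual ResNet-18
    defaults; they are assumptions, not values taken from Table 2.
    """
    shapes = []
    size = conv_out(size, kernel=7, stride=2, padding=3)
    shapes.append(("Convolutional Layer", 64, size))
    size = conv_out(size, kernel=3, stride=2, padding=1)
    shapes.append(("Max Pooling Layer", 64, size))
    stages = [(1, 64, 1), (2, 128, 2), (3, 256, 2), (4, 512, 2)]
    for index, channels, stride in stages:
        # Only the first 3x3 convolution of a stage changes the size.
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        shapes.append((f"ResNet Basic Block{index}", channels, size))
    return shapes


def channel_histogram(values, bins=16, levels=256):
    """Normalized histogram of the 8-bit pixel values of one channel.

    The choice of 16 equal-width bins is an assumption inferred from
    the (64, 3, 1, 16) shape in Table 2.
    """
    if not values:
        raise ValueError("channel must contain at least one pixel")
    width = levels // bins
    counts = [0] * bins
    for value in values:
        counts[min(value // width, bins - 1)] += 1
    return [count / len(values) for count in counts]


for name, channels, size in trace_resnet18():
    print(f"{name}: (64, {channels}, {size}, {size})")

# Toy channel with four pixels, purely for illustration.
print(channel_histogram([0, 15, 16, 255]))
```

Under these assumptions, the traced spatial sizes (112, 56, 56, 28, 14, 7) agree with the Output column of Table 2, and stacking one 16-bin histogram per colour channel gives the (3, 1, 16) per-image shape listed as GD-Attention information.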

Table 3. Ablation Experiment.

No.	Model	Accuracy	Precision	Recall	F1-Score
0	ResNet-18	96.58%	95.66%	96.47%	96.03%
1	ResNet-18 + Data Augmentation	97.78%	96.84%	97.68%	97.21%
2	ResNet-18 + Data Augmentation + Self-attention	99.02%	99.03%	99.04%	99.02%
3	ResNet-18 + Data Augmentation + GD-attention	99.97%	99.94%	99.94%	99.94%
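
The F1-scores in Table 3 are close to, but not exactly, the harmonic mean of the reported precision and recall (for ResNet-18, the harmonic mean of 95.66% and 96.47% is about 96.06%, against a reported 96.03%). One reading, offered here as an assumption because the averaging scheme is not stated alongside the table, is that each metric is computed per class and then averaged (macro averaging). The sketch below shows that computation on a small hypothetical confusion matrix; the matrix values and the function name are illustrative and are not taken from the experiments.

```python
def macro_metrics(confusion):
    """Accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples of true class i that
    were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls, f1_scores = [], [], []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Hypothetical 3-class confusion matrix, for illustration only.
toy = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
metrics = macro_metrics(toy)
harmonic = (2 * metrics["precision"] * metrics["recall"]
            / (metrics["precision"] + metrics["recall"]))
print(metrics)
print("harmonic mean of macro precision and recall:", harmonic)
```

On this toy matrix the macro F1-score is slightly lower than the harmonic mean of the macro precision and macro recall, the same kind of small gap seen in rows 0 and 1 of Table 3.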

Table 4. Performance comparison of methods for tomato leaf disease classification on PlantVillage.

Model	Method	Accuracy
ResNet-50 [23]	ResNet-50	97.00%
VGG 16 [24]	VGG 16, compared with AlexNet, GoogLeNet, and MobileNetv2	99.17%
CNN [25]	CNN with transfer learning	99.55%
MobileNet [26]	MobileNet with squeeze-and-excitation	99.78%
ADCLR [27]	Attention-based dilated CNN with logistic regression	96.60%
Customized CNN [28]	Lightweight transfer-learning approach on a pretrained CNN	99.30%
Customized CNN [29]	Lightweight attention-based CNN	99.69%
Customized ResNet [9]	Multi-scale strategy that reweights both visual regions and the loss to emphasize discriminative diseased parts	99.78%
Customized CNN [22]	10 KNN fully connected layer (MobileNet + ShuffleNet + ResNet-18) + hybrid FS	99.92%
Customized U-net [5]	Customized U-net considering the regions of interest	99.95%
Customized Faster-RCNN [13]	ROI, Faster-RCNN, and CBAM	99.97%
Proposed	ResNet-18 with GD-Attention	99.97%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

