Article

Deep Learning-Based Citrus Canker and Huanglongbing Disease Detection Using Leaf Images

by Maryjose Devora-Guadarrama 1, Benjamín Luna-Benoso 1,*, Antonio Alarcón-Paredes 2, Jose Cruz Martínez-Perales 1 and Úrsula Samantha Morales-Rodríguez 1

1 Escuela Superior de Cómputo, Instituto Politécnico Nacional, Mexico City 07738, Mexico
2 Centro de Investigación en Computación, Instituto Politécnico Nacional, Mexico City 07738, Mexico
* Author to whom correspondence should be addressed.
Computers 2025, 14(11), 500; https://doi.org/10.3390/computers14110500
Submission received: 19 September 2025 / Revised: 9 November 2025 / Accepted: 12 November 2025 / Published: 17 November 2025
(This article belongs to the Section AI-Driven Innovations)

Abstract

Early detection of plant diseases is key to ensuring food production, reducing economic losses, minimizing the use of agrochemicals, and maintaining the sustainability of the agricultural sector. Citrus plants, an important source of vitamin C, fiber, and antioxidants, are among the world’s most significant fruit crops but face threats such as canker and Huanglongbing (HLB), incurable diseases that require management strategies to mitigate their impact. Manual diagnosis, although common, is imprecise, slow, and costly; therefore, efficient alternatives based on Artificial Intelligence techniques are emerging to identify diseases from early stages. In this study, we evaluated four convolutional neural network models (DenseNet121, ResNet50, EfficientNetB0, and MobileNetV2) to detect canker and HLB in citrus leaf images. We applied preprocessing and data-augmentation techniques; transfer learning via selective fine-tuning; stratified k-fold cross-validation; regularization methods such as dropout and weight decay; and hyperparameter-optimization techniques. The models were evaluated by the loss value and by metrics derived from the confusion matrix, including accuracy, recall, and F1-score. The best-performing model was EfficientNetB0, which achieved an average accuracy of 99.88% and the lowest loss value of 0.0058 using cross-entropy as the loss function. Since EfficientNetB0 is a lightweight model, the results show that lightweight models can achieve favorable performance compared to more robust models and can be useful for disease detection in the agricultural sector using portable devices or drones for field monitoring. The high accuracy obtained is mainly because only two diseases were considered; consequently, these results may not hold on a database that includes a larger number of diseases.

1. Introduction

Agriculture is a fundamental activity of great importance for areas such as nutrition, the economy, sustainability, and food security [1]. It underpins the cultivation of products essential to human diets, including citrus plants. These plants are of paramount global importance due to their nutritional value, as they are a natural source of vitamin C, antioxidants, minerals, and fiber; their economic relevance, as they generate jobs and income through market sales; their industrial applications, since they are used in the pharmaceutical industry for the production of medicines and cosmetics; and their contribution to environmental conservation, helping prevent soil erosion and forming part of sustainable agricultural systems when appropriately managed [2,3]. However, citrus plants are exposed to various diseases that affect their yield and quality, notably citrus canker and Huanglongbing (HLB, citrus greening).
Citrus canker is a disease caused by the bacterium Xanthomonas axonopodis pv. citri that affects the leaves of citrus plants. It is considered the most feared disease, since it can affect all types of citrus crops. Its significance lies in the severe economic, phytosanitary, and commercial impacts it generates on the global agricultural sector. Transmission occurs primarily through rain and wind, as well as via contaminated tools, people, and machinery in citrus orchards; to a lesser extent, it can also be spread by insects. This disease causes defoliation, aggressive dieback, and premature drop of leaves and fruit; eventually, trees cease production. Citrus canker is highly contagious, and there is currently no cure; only management and control strategies are applied to reduce its spread and minimize crop damage [4,5].
HLB is considered one of the most threatening diseases for citrus production. It is caused by bacteria of the genus Candidatus Liberibacter and is transmitted primarily by the vector insect Diaphorina citri and by contaminated grafts. Symptoms include uniform yellow spots on leaves, a bitter taste in the fruit, branch dieback, and eventually the death of the entire tree [6]. As with canker, there is no cure for HLB. At present, manual diagnosis is performed through human observation, which depends heavily on the practitioner’s experience and skills, and error rates exceeding 30% have been reported [7].
Although manual diagnosis of both citrus canker and HLB is common practice, it is imprecise, slow, and costly. In this context, approaches based on Artificial Intelligence techniques have emerged as efficient alternatives for identifying diseases at early stages. Some studies have incorporated machine learning techniques for plant disease detection [8,9], whereas others have employed deep learning techniques [10,11].
This work aims to address the detection of citrus canker and HLB in images of citrus leaves using deep learning. To this end, transfer learning via selective fine-tuning was applied to four Convolutional Neural Network (CNN) models: DenseNet121, ResNet50, EfficientNetB0, and MobileNetV2. We employed preprocessing and data-augmentation techniques, cross-validation training schemes, and regularization and optimization methods to tune the best hyperparameters. The models were compared using the loss value and metrics derived from the confusion matrix, including accuracy, precision, recall, and F1-score. We also analyzed the behavior of training and validation loss and accuracy curves for each model, as well as model interpretability using Grad-CAM visualizations. Finally, the best-performing model was compared with other state-of-the-art models in terms of accuracy. Although the study proposed in this work focuses on plant disease detection, there are methodological parallels with human health research. Studies such as [12,13] apply machine learning to analyze and predict epidemiological phenomena from large volumes of data, which are conceptually analogous to automated plant disease detection using leaf images. These references reinforce the validity and versatility of the deep learning approaches employed in this work.

2. Related Work

Citrus fruits are among the most consumed worldwide, making timely detection of any disease-induced damage essential. Recent research has shown that artificial intelligence techniques can identify diseases such as canker and HLB in citrus plants.
Some research applies Machine Learning (ML) techniques to detect diseases in citrus fruits, as in [14,15], reporting accuracies of 97% and 97.8%, respectively. Other works also use ML techniques but focus only on detecting canker in citrus, as in [16,17,18], reporting accuracies between 94% and 96.2%, or only on detecting HLB, as in [19,20,21], reporting accuracies between 87.5% and 97.46%.
Although ML techniques have shown good results, moderate levels of accuracy may be acceptable in less critical tasks, such as categorizing image preferences [22]. Nevertheless, in high-stakes contexts, such as distinguishing healthy from diseased fruit, it is crucial to minimize errors, which makes deep learning techniques such as those used in [23] more suitable. The authors evaluated the performance of four DL models (EfficientNetB0, ResNet50, DenseNet121, and InceptionV3) for the image-based classification of citrus diseases, including HLB and canker. InceptionV3 and DenseNet121 both achieved an accuracy of 99.12%, whereas ResNet50 and EfficientNetB0 achieved accuracies of 84.58% and 80.18%, respectively. In [24], four DL models (EfficientNetB3, ResNet50, MobileNetV2, and InceptionV3) were also evaluated using transfer learning for the identification and categorization of various citrus diseases, including HLB and canker. The best results were obtained with EfficientNetB3, achieving an accuracy of 99.58%. Similarly, Ref. [25] evaluated three CNN models trained on leaves (L-CNN), on fruit (F-CNN), and on multiple components (MC-CNN). The authors introduced the use of combined visual features from citrus leaves and fruit to achieve more accurate classification. The best results were obtained with MC-CNN, reaching an accuracy of 97.75%, followed by L-CNN with 95.50%. They also concluded that diseases such as canker and HLB are challenging to classify. In [26], the authors incorporated data augmentation and transfer learning with the pretrained DenseNet201 and AlexNet models, achieving an accuracy of 99.6%. In [27], the authors propose a two-stage CNN model based on Faster R-CNN: in the first stage, candidate affected regions are identified via a region proposal network; in the second stage, the most likely target region is classified into the corresponding disease class. The model was applied to black spot, citrus bacterial canker, and HLB, achieving an accuracy of 94.37%. Separately, works such as [28] propose localizing citrus fruit in complex orchard backgrounds using an optimized YOLOv4 detector and an EfficientNet classifier, achieving an accuracy of 89%. In [29], citrus diseases are also detected and classified in complex orchard backgrounds; to this end, the study develops a Duck Optimization with Enhanced Capsule Network Based Citrus Disease Detection for Sustainable Crop Management (DOECN-CDDCM) technique, achieving an accuracy of 98.40%. By contrast, some works fuse different models or techniques, as in [30], where CNN and LSTM deep learning models are combined with edge computing. The models incorporate an improved feature-extraction mechanism, a down-sampling approach, and a subsequent feature-fusion subsystem. The proposed CNN-LSTM model achieved an accuracy of 97.18% using magnitude-based pruning, and 98.25% when magnitude-based pruning was combined with post-training quantization.
Research Gap: Both ML and DL techniques have proven effective for detecting citrus diseases. Among the studies reviewed, the best-performing model achieved 99.6% accuracy using DL with data augmentation and transfer learning applied to DenseNet201 and AlexNet [26]. Several of these works compared models designed for high-capacity environments against lightweight models for edge computing. For citrus disease detection, it is of particular interest to have lightweight, optimized models that offer outstanding performance while remaining viable for deployment in edge-computing environments, such as portable agricultural devices or drones for field monitoring. Therefore, this study performs a comprehensive analysis of different DL models, including the lightweight EfficientNetB0 and MobileNetV2 architectures alongside the more robust DenseNet121 and ResNet50, incorporating techniques such as data augmentation, transfer learning, cross-validation training, hyperparameter tuning, and optimization. It shows that lightweight models are capable of achieving favorable performance compared to robust models.

3. Materials and Methods

3.1. Dataset

The dataset used in this study was compiled from two publicly available collections of citrus disease images hosted on Kaggle [31,32]. These datasets include samples of canker and HLB, both of which exhibit visible manifestations on citrus leaves. The first dataset [31] consists of images with a uniform background, whereas the second [32] contains images captured under natural conditions with varied backgrounds. All images used in this study are in PNG format, with resolutions including 256 × 256, 225 × 354, and 236 × 225 pixels, among others. Figure 1 shows representative examples of both diseases: Figure 1a,b correspond to canker and Huanglongbing cases from the first dataset, while Figure 1c,d correspond to the same diseases from the second dataset. Table 1 summarizes the dataset information.

3.2. Stratified K-Fold Cross-Validation

Stratified k-fold cross-validation is a variant of k-fold cross-validation. In k-fold cross-validation, the dataset is divided into k folds and k iterations are performed; in each iteration, one fold is used as the test set and the remaining k-1 folds as the training set. Finally, the accuracies obtained across the k iterations are averaged. Stratified k-fold cross-validation ensures that each fold preserves approximately the same class proportions as the original dataset, which is especially useful for imbalanced datasets [33].
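As a concrete illustration, the sketch below shows how such a stratified split can be produced with scikit-learn’s StratifiedKFold. It is a minimal sketch, not the study’s actual data loader: the file names are placeholders, and the class counts simply mirror the per-class totals reported later in Table 1.

```python
# Minimal stratified 5-fold sketch with scikit-learn; placeholder data only.
import numpy as np
from sklearn.model_selection import StratifiedKFold

image_paths = np.array([f"leaf_{i}.png" for i in range(1072)])  # hypothetical file names
labels = np.array([0] * 530 + [1] * 542)                        # 0 = Canker, 1 = HLB (totals from Table 1)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(skf.split(image_paths, labels)):
    # Each fold keeps approximately the same Canker/HLB proportions as the full set.
    print(f"Fold {fold}: {len(train_idx)} train, {len(test_idx)} test, "
          f"{labels[test_idx].mean():.3f} positive rate in test")
    # ... train on image_paths[train_idx], evaluate on image_paths[test_idx] ...
```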

3.3. Data Augmentation

Data augmentation is a method that applies transformations to the original images while preserving their key characteristics, with the aim of reducing overfitting, improving model performance, and addressing class imbalance in classification problems [24].

3.4. Transfer Learning

Transfer learning is a deep learning technique in which knowledge acquired by a model previously trained on one task is transferred to a related task to improve the performance of the classification model. This approach typically takes a pretrained model on a general source task, retains its initial layers, and adapts its final layers to the new, specific task via fine-tuning. The main advantages of using transfer learning include reduced training time and improved model accuracy, especially when working with small datasets [34].

3.5. Proposed Approach

Figure 2 shows the general flow of the proposed approach. A loop is executed for each fold: the data are first prepared using stratified k-fold cross-validation, with 15% held out as the test set and the remaining 85% used for training and validation, the latter divided in an 80–20% ratio. Next, preprocessing and data augmentation are applied to mitigate overfitting and improve model performance. The augmentation techniques include geometric transformations such as rotations within a predefined angle range, horizontal and vertical flips, translations of up to 10% of the image dimensions, and cropping operations. Adjustments are also made in the color space by varying brightness, contrast, saturation, and hue. Data augmentation is applied dynamically during training to avoid duplication or information leakage between folds. Figure 3 illustrates the result of applying some of these transformations to an image exhibiting citrus canker.
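As an illustrative sketch (assuming a PyTorch/torchvision implementation, which the paper does not name), the pipeline below applies the transformation types listed above; the rotation range, crop size, and color-jitter strengths are assumed values, since only the 10% translation bound is specified.

```python
# Sketch of the described augmentation pipeline; magnitudes are assumptions.
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomRotation(degrees=20),                     # rotation within a range (assumed +/-20 degrees)
    transforms.RandomHorizontalFlip(p=0.5),                    # horizontal flip
    transforms.RandomVerticalFlip(p=0.5),                      # vertical flip
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),  # translations up to 10% of each dimension
    transforms.RandomCrop(224),                                # cropping to the 224 x 224 input size
    transforms.ColorJitter(brightness=0.2, contrast=0.2,
                           saturation=0.2, hue=0.05),          # brightness/contrast/saturation/hue adjustments
    transforms.ToTensor(),                                     # also scales pixel values to [0, 1]
])

# Validation/test images receive only deterministic resizing and scaling, so
# augmentation happens dynamically at training time and is never baked into
# the stored dataset (avoiding leakage between folds).
eval_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
```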
Subsequently, transfer learning is applied to each of the proposed CNN models: MobileNetV2, DenseNet121, ResNet50, and EfficientNetB0. To reduce training time and lower the risk of overfitting, selective fine-tuning is employed by freezing the convolutional base of the pretrained models and training only the final classification layers. Next, the cross-entropy loss function is used; regularization techniques such as dropout and weight decay are applied; and, to optimize each model, key hyperparameters, including learning rate, batch size, number of training epochs, and optimizer type, are tuned. Table 2 presents the ranges and values of the hyperparameters used.
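A minimal PyTorch sketch of this selective fine-tuning scheme is shown below, assuming torchvision’s pretrained EfficientNetB0; the dropout rate and weight-decay coefficient are illustrative assumptions, not the tuned values reported later.

```python
# Selective fine-tuning sketch: freeze the pretrained base, train a new head.
import torch
import torch.nn as nn
from torchvision import models

model = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.IMAGENET1K_V1)

for param in model.features.parameters():        # freeze the convolutional base
    param.requires_grad = False

model.classifier = nn.Sequential(                # new two-class head (Canker vs. HLB)
    nn.Dropout(p=0.3),                           # dropout regularization (assumed rate)
    nn.Linear(model.classifier[1].in_features, 2),
)

criterion = nn.CrossEntropyLoss()                # cross-entropy loss, as in the text
optimizer = torch.optim.Adam(                    # only the head parameters are updated;
    model.classifier.parameters(),               # weight decay adds the L2 penalty
    lr=1e-3, weight_decay=1e-4,
)
```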
Each proposed approach is evaluated using the metrics Accuracy, Recall and F1-Score, whose formulas are presented in Equations (1)–(3). In these formulas, TP denotes true positives, TN denotes true negatives, FP denotes false positives, and FN denotes false negatives.
Accuracy = (TP + TN) / (TP + TN + FP + FN), (1)
Recall = TP / (TP + FN), (2)
F1-score = (2 × Precision × Recall) / (Precision + Recall), (3)
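For reference, these quantities follow directly from the confusion-matrix counts; the sketch below checks Equations (1)–(3) against scikit-learn’s helpers on toy label vectors (placeholders, not real predictions).

```python
# Metrics from a binary confusion matrix; toy labels for illustration.
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, recall_score

y_true = [0, 0, 1, 1, 1, 0, 1, 0]   # 0 = Canker, 1 = HLB (toy ground truth)
y_pred = [0, 0, 1, 1, 0, 0, 1, 0]   # toy predictions with one false negative

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)            # Equation (1)
recall = tp / (tp + fn)                               # Equation (2)
precision = tp / (tp + fp)
f1 = (2 * precision * recall) / (precision + recall)  # Equation (3)

# The library helpers agree with the hand-computed values.
assert accuracy == accuracy_score(y_true, y_pred)
assert recall == recall_score(y_true, y_pred)
assert f1 == f1_score(y_true, y_pred)
print(accuracy, recall, f1)  # 0.875 0.75 0.857...
```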
To halt training when the validation accuracy stops improving, we use early stopping. Under this criterion, for each model and fold we retain the epoch with the best validation performance and then compute the mean of the metrics across folds.
Each step of the proposed methodology was designed to improve specific aspects of model performance. The preprocessing stage improves the visual quality and consistency of the input data. The data augmentation stage expands the diversity of the training set through transformations such as rotations, flips, translations, and cropping operations, which favors generalization and prevents overfitting. Feature extraction using the selected CNNs (MobileNetV2, DenseNet121, ResNet50, and EfficientNetB0) allows the models to identify distinctive patterns and textures associated with citrus canker and HLB in leaves. The optimization stage, fine-tuning, and hyperparameter tuning improve the convergence and stability of training, yielding better generalization and accuracy, while early stopping halts training to avoid overfitting. Finally, the classification layer translates the learned representations into class probabilities.

4. Experiments and Results

Initially, the data were prepared using stratified k-fold cross-validation with k = 5, using 15% as the test set and the remaining 85% for training and validation, with the latter further split 80/20. During cross-validation, each fold maintained a strict separation of the original images, and data augmentation was applied dynamically during training to avoid duplication or information leakage between folds. Each image was resized to 224 × 224 pixels, and pixel values were normalized to the [0, 1] range. Data augmentation was then applied to the training set. Subsequently, transfer learning was applied to the four selected CNN models (MobileNetV2, DenseNet121, ResNet50, and EfficientNetB0) using selective fine-tuning to adjust only the final layers of each pretrained model. Cross-entropy was used as the loss function.
Hyperparameter tuning was performed to ensure robust generalization and convergence stability for all evaluated CNN architectures. The optimization process focused on the learning rate, batch size, number of epochs, and optimizer type, as these parameters exert the greatest influence on model performance [35]. A two-stage strategy combining random search and grid search was adopted. In the first stage, a random search explored broad ranges of the learning rate (1 × 10−6–1 × 10−1), batch sizes {16, 32, 64}, and optimizers (SGD, Adam, and AdamW), allowing rapid identification of promising configurations with stable convergence. In the second stage, a refined grid search was applied within narrowed ranges around the top-performing candidates to fine-tune the models and minimize validation loss. This hybrid approach balanced efficiency and precision in exploring the hyperparameter space.
The learning rate was dynamically adjusted using the ReduceLROnPlateau scheduler, initialized at 0.001. When the validation loss failed to decrease for a predefined number of epochs, the learning rate was reduced by a factor of 0.1. This adaptive mechanism enabled finer convergence during late training stages, helping the models escape suboptimal local minima and improving overall generalization. To mitigate overfitting, dropout and weight decay regularization were employed in all architectures: weight decay penalized large weight magnitudes, while dropout stochastically deactivated neurons during training to promote robustness. Among the optimizers evaluated, Adam achieved the best trade-off between convergence speed and stability, outperforming SGD and AdamW in both validation loss and accuracy across folds.
Training was governed by an early stopping criterion to prevent overfitting and unnecessary computation. The process was halted if the validation accuracy did not improve for ten consecutive epochs (patience = 10) or upon reaching the maximum of 100 epochs. The model parameters from the epoch with the highest validation performance were retained for final evaluation. For each model, the mean validation metrics across folds were used to select the final configuration, prioritizing the combination that achieved the lowest mean validation loss and the highest mean F1-score.
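Putting these pieces together, the following condensed sketch shows one way the training schedule described above could be implemented. PyTorch is assumed here (ReduceLROnPlateau also exists in other frameworks); the toy tensors stand in for the real leaf-image loaders, the stand-in model is randomly initialized, and the scheduler patience of 5 epochs is an assumed value not reported in the text.

```python
# Condensed training-schedule sketch: Adam (lr = 0.001), ReduceLROnPlateau
# (factor 0.1), early stopping (patience = 10, max 100 epochs). Toy data only.
import copy
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

model = models.efficientnet_b0(num_classes=2)       # stands in for the fine-tuned model
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5)  # scheduler patience is an assumed value
criterion = torch.nn.CrossEntropyLoss()

toy = lambda n: TensorDataset(torch.rand(n, 3, 224, 224), torch.randint(0, 2, (n,)))
train_loader = DataLoader(toy(16), batch_size=8)    # placeholders for the real loaders
val_loader = DataLoader(toy(8), batch_size=8)

def run_epoch(loader, train):
    model.train(train)
    total_loss, correct, n = 0.0, 0, 0
    with torch.set_grad_enabled(train):
        for x, y in loader:
            logits = model(x)
            loss = criterion(logits, y)
            if train:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * len(y)
            correct += (logits.argmax(1) == y).sum().item()
            n += len(y)
    return total_loss / n, correct / n              # mean loss, accuracy

best_acc, best_state, stale = -1.0, None, 0
for epoch in range(100):                            # maximum of 100 epochs
    run_epoch(train_loader, train=True)
    val_loss, val_acc = run_epoch(val_loader, train=False)
    scheduler.step(val_loss)                        # reduce LR when validation loss plateaus
    if val_acc > best_acc:                          # retain the best validation epoch
        best_acc, best_state, stale = val_acc, copy.deepcopy(model.state_dict()), 0
    else:
        stale += 1
        if stale >= 10:                             # early stopping, patience = 10
            break
model.load_state_dict(best_state)                   # restore the retained weights
```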
Figure 4 shows the convergence curves obtained during training of the EfficientNetB0 model under different batch-size settings {16, 32, 64}, with all other parameters fixed: the Adam optimizer with a learning rate of 0.001, the ReduceLROnPlateau scheduler, and early stopping (patience = 10). These plots illustrate the evolution of validation accuracy and loss across epochs, as well as the comparison between training and validation accuracy for each configuration. In Figure 4a, the validation accuracy increases rapidly during the first few epochs in all configurations, exceeding 97% before stabilizing around 99%. Figure 4b shows a progressive and sustained decrease in the validation loss, indicating an effective optimization process and adequate convergence of the model. Finally, in Figure 4c, the training and validation curves reveal the degree of generalization achieved: batch sizes of 32 and 64 maintain smoother trajectories with lower divergence, whereas a batch size of 16 shows slightly larger oscillations after roughly 10 epochs, associated with noisier gradient estimates. Overall, the convergence curves confirm that all configurations achieved stable training with high accuracy; however, batch size = 64 offered the best trade-off between accuracy, stability, and generalization, so it was selected as the optimal setting for the final experiments.
Figure 5 shows the confusion matrix obtained for each of the proposed models. The confusion matrices show that all models achieved adequate classification of canker and greening (HLB). MobileNetV2 and DenseNet121 present slight confusion between the two classes, mainly when identifying greening samples, while ResNet50 and EfficientNetB0 show a more balanced and accurate classification with a minimal number of errors. Overall, the results reflect good performance, with ResNet50 and EfficientNetB0 standing out for their higher consistency in correctly identifying both diseases. Table 3 reports the results of the selected metrics.
Among the evaluated models, EfficientNetB0 achieved the highest accuracy at 99.88% and the lowest loss at 0.0058. This result underscores the effectiveness of its compound scaling strategy, which balances depth, width, and resolution in a computationally efficient manner. Its superior performance indicates strong potential for real-time citrus disease diagnosis and for field-deployable systems. By comparison, ResNet50 obtained a slightly lower accuracy of 98.67% with a moderate loss of 0.0391. Although ResNet50 is known for robust residual learning capabilities, the higher loss suggests it may not generalize as effectively as EfficientNetB0 for this specific task. In contrast, MobileNetV2 achieved an accuracy of 96.52% with a loss of 0.0230, demonstrating that lightweight architectures can attain high classification accuracy while maintaining computational efficiency. This makes them particularly well suited for deployment on mobile devices and drones aimed at citrus disease detection. Finally, DenseNet121 exhibited the lowest performance among the four models, with an accuracy of 84.22% and a loss of 0.0771. This suggests that DenseNet121 may be more sensitive to the characteristics of the dataset or less suitable for this specific classification task, possibly due to its more complex connectivity pattern, which may require a larger or more diverse dataset to reach its full potential.
Figure 6 shows the training and validation loss and accuracy curves for the architectures (a) MobileNetV2, (b) ResNet50, (c) DenseNet121, and (d) EfficientNetB0, using a stratified k-fold cross-validation scheme with k = 5. MobileNetV2 exhibited rapid convergence, reaching accuracy values above 95% within a few epochs. However, its validation curves exhibited more pronounced fluctuations, indicating greater sensitivity to variation across folds and mild overfitting in some cases. This behavior reflects the lightweight and efficient nature of MobileNetV2, albeit with lower stability compared to deeper architectures. In the case of ResNet50, a more unstable start was observed, with very high initial loss values that drop sharply in the first few epochs. Validation accuracy varied across folds, showing noisier convergence trajectories. Although final accuracy values exceed 90%, the dispersion suggests that the model is more fold-dependent and less consistent in its generalization than the other evaluated architectures. In contrast, EfficientNetB0 demonstrated the best overall performance, with rapid and stable convergence across all folds. Training and validation losses remained close to zero, and the accuracy curves were tightly clustered, consistently exceeding 98%. Only one fold exhibited atypical behavior. These results reflect EfficientNetB0’s capacity for efficient optimization and robust generalization, consolidating it as the most suitable architecture for citrus disease detection. Finally, DenseNet121 exhibited high and stable performance, with accuracy above 95% across all folds. Nevertheless, its validation curves showed greater oscillation than those of EfficientNetB0, suggesting behavior intermediate between the stability of EfficientNetB0 and the variability of ResNet50 and MobileNetV2.
In summary, the model that achieved the best performance, considering the loss and accuracy curves shown in Figure 6 and the metric values reported in Table 3 under the selected tuned hyperparameters, was EfficientNetB0.
To enhance the interpretability of each selected model’s behavior when identifying regions in citrus leaf images, we used Gradient-weighted Class Activation Mapping (Grad-CAM), which generates heat maps highlighting the areas of an image that most influence a convolutional model’s prediction. Figure 7 shows the Grad-CAM visualization for each selected model: Figure 7a EfficientNetB0, Figure 7b DenseNet121, Figure 7c ResNet50, and Figure 7d MobileNetV2. The results in Figure 7 show that EfficientNetB0 focuses mainly on the shape and boundaries of the leaf, behaving consistently when classifying lesioned leaves; however, this suggests that the model may have been influenced by image illumination and by the position and orientation of the leaf rather than by disease-specific symptoms, which can cause misclassifications. DenseNet121, for its part, shows scattered activations in several regions containing dark spots; that is, the model recognizes localized, decision-relevant lesions, but not all of them correspond to the main lesioned areas, and the spot distribution it attends to may not correspond to the diseases, causing classification errors; this suggests that its dense feature connections could introduce noise from irrelevant textures. The activation in ResNet50 is focused but more diffuse than in DenseNet121; the model attends to certain lesions but also to harmless areas, showing only partial attention to the relevant symptoms and confusing background patterns with discriminative ones; it still lacks precise localization capability, leading to erroneous classifications. MobileNetV2, on the other hand, shows strong activation in the central region of the leaf, matching the lesioned zone, but ignores peripheral lesions; this could explain its lower sensitivity to the subtle texture and color variations associated with the presence of disease.
Overall, these results indicate that misclassifications usually occur when models focus on non-discriminative regions such as the background, leaf edges, or uniform textures rather than on disease symptoms. This analysis highlights the importance of improving the segmentation and preprocessing stages to remove the background and highlight the foliar region, applying data augmentation techniques that introduce variations in illumination and texture, and expanding the dataset with more varied field examples. The models could also be strengthened by adding spatial generalization and multiscale analysis so that they focus on lesion regions, reducing the influence of the background or irrelevant parts, and analyze images at different levels of detail, from small local textures such as individual spots to global structures such as color patterns.
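For reproducibility, the sketch below shows one common way to compute such Grad-CAM heat maps in PyTorch, via forward and backward hooks on the last convolutional block; the randomly initialized stand-in model, the target-layer choice, and the random input are illustrative assumptions rather than the exact setup used here.

```python
# Minimal Grad-CAM sketch with forward/backward hooks; stand-in model and input.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.efficientnet_b0(num_classes=2)       # stands in for the trained model
model.eval()

activations, gradients = {}, {}

def save_activation(module, inputs, output):
    activations["value"] = output.detach()

def save_gradient(module, grad_input, grad_output):
    gradients["value"] = grad_output[0].detach()

target_layer = model.features[-1]                   # last convolutional block (assumed target)
target_layer.register_forward_hook(save_activation)
target_layer.register_full_backward_hook(save_gradient)

image = torch.rand(3, 224, 224)                     # stands in for a preprocessed leaf image
x = image.unsqueeze(0).requires_grad_(True)         # lets gradients flow even through frozen layers
logits = model(x)
logits[0, logits.argmax()].backward()               # gradient of the predicted class score

weights = gradients["value"].mean(dim=(2, 3), keepdim=True)  # global-average-pooled gradients
cam = F.relu((weights * activations["value"]).sum(dim=1))    # weighted sum of activation maps
cam = F.interpolate(cam.unsqueeze(1), size=(224, 224),
                    mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)     # heat map normalized to [0, 1]
```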
Finally, Table 4 reports a comparison of the accuracy obtained by the best-performing method (EfficientNetB0) against other state-of-the-art approaches. The proposed method, with the applied adjustments, achieves higher accuracy than the methods compared.

5. Limitations and Future Work

Although satisfactory results were obtained for the automatic detection of citrus leaf diseases, this study has some limitations to consider. First, the dataset used was limited to images captured under controlled lighting and background conditions, which could affect the performance of the model in real field scenarios, where environmental variations are more pronounced. In addition, the number of samples per class, although sufficient for initial training, could be expanded to improve the generalization ability of the model and reduce possible biases toward certain diseases.
As future work, the expansion of the dataset by incorporating images taken under different environmental and geographic conditions is proposed, as well as the use of more advanced data augmentation techniques, such as the employment of Generative Adversarial Networks (GANs), CutMix, MixUp or AutoAugment, among others. Similarly, it would be relevant to integrate the developed system into a real-time monitoring platform that allows producers to accurately detect diseases in their crops early on, thus contributing to more sustainable and efficient phytosanitary management.

6. Conclusions

In this work, a detailed analysis of four deep learning models (EfficientNetB0, DenseNet121, ResNet50, and MobileNetV2) is presented, addressing the detection of two major citrus diseases: citrus canker and Huanglongbing (HLB). For the analysis, stratified k-fold cross-validation with k = 5 was used, splitting the dataset into a test set (15%) and the remaining 85% for training and validation, with the latter further divided 80/20. Preprocessing and data augmentation were applied to adapt the images to the models and to expand the training set, respectively. Transfer learning with selective fine-tuning was applied to each model. Cross-entropy was used as the loss function; regularization techniques such as dropout and weight decay were employed to prevent overfitting; and hyperparameters including learning rate, batch size, number of epochs, and optimizer type were optimized. A ReduceLROnPlateau scheduler with an initial learning rate of 0.001 was used, together with early stopping with a maximum of 100 training epochs to halt the process if no improvements were observed. The models were evaluated using accuracy, loss, recall, and F1-score. Training and validation loss and accuracy curves were also generated and compared for each architecture. Subsequently, Grad-CAM was employed to interpret the behavior of each selected model. EfficientNetB0 presented the best overall performance, with an accuracy of 99.88%, recall of 0.9989, F1-score of 0.9988, and a minimum loss of 0.0058. Its main advantage lies in its excellent balance between accuracy and computational efficiency, although it requires careful hyperparameter optimization. ResNet50 achieved a high accuracy of 98.67% and showed strong generalization ability thanks to its residual connections; however, its structural complexity increases training time and resource usage. MobileNetV2, with an accuracy of 96.52%, stood out for its lightweight and fast architecture, which represents a significant advantage for deployments on mobile devices or in resource-constrained environments, although its performance was slightly inferior to that of the more complex models. Finally, DenseNet121 obtained an accuracy of 84.22%; its main strength is efficient feature reuse and good gradient propagation, although it presents higher memory consumption and longer training time. Collectively, this comparison demonstrates that EfficientNetB0 offers the best trade-off between performance and efficiency. The analysis of the training and validation curves further supported EfficientNetB0 as the most suitable model for citrus disease detection. In the Grad-CAM analysis, EfficientNetB0 demonstrated consistent behavior for classifying diseases in citrus leaves. Finally, EfficientNetB0 with all selected adjustments was compared by accuracy against other state-of-the-art models, positioning it as the best-performing method. Although the high accuracy of 99.88% obtained with EfficientNetB0 could suggest risks of overfitting due to the limited size of the dataset, this performance is consistent with values reported in the literature for similar studies of disease detection in citrus leaves, which range between 87.5% and 99.72%. The methodology employed stratified k-fold cross-validation, data augmentation, and transfer learning with selective fine-tuning, which contribute to minimizing these risks and support robust internal validation.
To expand the generalization of the results, future studies could incorporate external validation using independent datasets or evaluations between datasets obtained under different acquisition conditions.

Author Contributions

Conceptualization and writing—original draft preparation, B.L.-B.; methodology and validation, M.D.-G.; investigation, J.C.M.-P.; supervision, A.A.-P.; visualization, Ú.S.M.-R.; writing—review and editing, M.D.-G., B.L.-B., A.A.-P., J.C.M.-P. and Ú.S.M.-R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

The datasets used are publicly available Kaggle datasets and were developed for research purposes.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: https://www.kaggle.com/datasets/dtrilsbeek/citrus-leaves-prepared; https://www.kaggle.com/datasets/kaku321/citrus-plant-disease (accessed on 11 November 2025).

Acknowledgments

The authors would like to thank the Instituto Politécnico Nacional (Secretaría Académica, COFAA, EDD, EDI, SIP, ESCOM and CIC) and SECIHTI for their financial support to develop this work.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wu, H.; Li, Z.; Deng, X.; Zhao, Z. Enhancing agricultural sustainability: Optimizing crop planting structures and spatial layouts within the water-land-energy-economy-environment-food nexus. Geogr. Sustain. 2025, 6, 100258. [Google Scholar] [CrossRef]
  2. Lo Vetere, M.; Iobbi, V.; Lanteri, A.P.; Minuto, A.; Minuto, G.; De Tommasi, N.; Bisio, A. The biological activities of Citrus species in crop protection. J. Agric. Food Res. 2025, 22, 102139. [Google Scholar] [CrossRef]
  3. Kato-Noguchi, H.; Kato, M. Pesticidal Activity of Citrus Fruits for the Development of Sustainable Fruit-Processing Waste Management and Agricultural Production. Plants 2025, 14, 754. [Google Scholar] [CrossRef] [PubMed]
  4. Zhang, M.; Meng, Q. Automatic citrus canker detection from leaf images captured in field. Pattern Recognit. Lett. 2011, 32, 2036–2046. [Google Scholar] [CrossRef]
  5. Matshwene, M.; Loyiso, M. Identification of Citrus Canker on Citrus Leaves and Fruit Surfaces in the Grove Using Deep Learning Neural Networks. J. Agric. Sci. Technol. 2020, 10, 49–53. [Google Scholar] [CrossRef]
  6. Deng, X.; Zhu, Z.; Yang, J.; Zheng, Z.; Xianbo, H.; Shujin Wei, Y.; Lan, Y. Detection of Citrus Huanglongbing Based on Multi-Input Neural Network Model of UAV Hyperspectral Remote Sensing. Remote Sens. 2020, 12, 2678. [Google Scholar] [CrossRef]
  7. Gómez-Flores, W.; Garza-Saldaña, J.J.; Varela-Fuentes, S.E. Detection of Huanglongbing disease based on intensity-invariant texture analysis of images in the visible spectrum. Comput. Electron. Agric. 2019, 162, 825–835. [Google Scholar] [CrossRef]
  8. Omaye, J.D.; Ogbuju, E.; Ataguba, G.; Jaiyeoba, O.; Aneke, J.; Oladipo, F. Cross-comparative review of Machine learning for plant disease detection: Apple, cassava, cotton and potato plants. Artif. Intell. Agric. 2024, 12, 127–151. [Google Scholar] [CrossRef]
  9. Harakannanavar, S.; Rudagi, J.; Puranikmath, V.I.; Siddiqua, A.; Pramodhini, R. Plant leaf disease detection using computer vision and machine learning algorithms. Glob. Transit. Proc. 2022, 3, 305–310. [Google Scholar] [CrossRef]
  10. Vinay, K.; Vempalli, S.; Thushar, S.; Tripty, S.; Apurvanand, S. A Deep Learning Framework for Early Detection and Diagnosis of Plant Diseases. Procedia Comput. Sci. 2025, 258, 1435–1445. [Google Scholar] [CrossRef]
  11. Andrew, J.; Eunice, J.; Elena Popescu, D.; Kalpana Chowdary, M.; Hemanth, J. Deep Learning-Based Leaf Disease Detection in Crops Using Images for Agricultural Applications. Agronomy 2022, 12, 2395. [Google Scholar] [CrossRef]
  12. Konstantinos, D.; Dimitrios, T.; Dimitrios, T.; Lykourgos, M.; Lazaros, I.; Panayotis, K. Pandemic Analytics by Advanced Machine Learning for Improved Decision Making of COVID-19 Crisis. Processes 2021, 9, 1267. [Google Scholar] [CrossRef]
  13. Ching-Nam, H.; Yi-Zhen, T.; Pei-Duo, Y.; Jiasi, C.; Chee-Wei, T. Privacy-Enhancing Digital Contact Tracing with Machine Learning for Pandemic Response: A Comprehensive Review. Big Data Cogn. Comput. 2023, 7, 108. [Google Scholar] [CrossRef]
  14. Sharif, M.; Attique Khan, M.; Iqbal, Z.; Faisal Azam, M.; Ikram, M.; Lali, U.; Younus Javed, M. Detection and classification of citrus diseases in agriculture based on optimized weighted segmentation and feature selection. Comput. Electron. Agric. 2018, 150, 220–234. [Google Scholar] [CrossRef]
  15. Wetterich, C.B.; Oliveira Neves, R.F.; Belasque, J.; Marcassa, L.G. Detection of citrus canker and Huanglongbing using fluorescence imaging spectroscopy and support vector machine technique. Appl. Opt. 2016, 55, 400–407. [Google Scholar] [CrossRef] [PubMed]
  16. Abdulridha, J.; Batuman, O.; Ampatzidis, Y. UAV-Based Remote Sensing Technique to Detect Citrus Canker Disease Utilizing Hyperspectral Imaging and Machine Learning. Remote Sens. 2019, 11, 1373. [Google Scholar] [CrossRef]
  17. Qin, J.; Burks, T.F.; Ritenour, M.A.; Bonn, W.G. Detection of citrus canker using hyperspectral reflectance imaging with spectral information divergence. J. Food Eng. 2009, 93, 183–191. [Google Scholar] [CrossRef]
  18. Narayanan, K.B.; Krishna Sai, D.; Akhil Chowdary, K.; Reddy, S. Applied Deep Learning approaches on canker effected leaves to enhance the detection of the disease using Image Embedding and Machine Learning Techniques. EAI Endorsed Trans. Internet Things 2024, 10. [Google Scholar] [CrossRef]
  19. Yan, K.; Fang, X.; Yang, W.; Xu, X.; Lin, S.; Zhang, Y.; Lan, Y. Multiple light sources excited fluorescence image-based non-destructive method for citrus Huanglongbing disease detection. Comput. Electron. Agric. 2025, 237, 110549. [Google Scholar] [CrossRef]
  20. Yan, K.; Song, X.; Yang, J.; Xiao, J.; Xu, X.; Guo, J.; Zhu, H.; Lan, Y.; Zhang, Y. Citrus huanglongbing detection: A hyperspectral data-driven model integrating feature band selection with machine learning algorithms. Crop Prot. 2025, 188, 107008. [Google Scholar] [CrossRef]
  21. Kong, L.; Liu, T.; Qiu, H.; Yu, X.; Wang, X.; Huang, Z.; Huang, M. Early diagnosis of citrus Huanglongbing by Raman spectroscopy and machine learning. Laser Phys. Lett. 2024, 21, 015701. [Google Scholar] [CrossRef]
  22. Kent, M.G.; Schiavon, S. Predicting Window View Preferences Using the Environmental Information Criteria. LEUKOS 2022, 19, 190–209. [Google Scholar] [CrossRef]
  23. Goyal, A.; Lakhwani, K. Integrating advanced deep learning techniques for enhanced detection and classification of citrus leaf and fruit diseases. Sci. Rep. 2025, 15, 12659. [Google Scholar] [CrossRef]
  24. Faisal, S.; Javed, K.; Ali, S.; Alasiry, A.; Marzougui, M.; Attique Khan, M.; Cha, J.H. Deep Transfer Learning Based Detection and Classification of Citrus Plant Diseases. Comput. Mater. Contin. 2023, 76, 895–914. [Google Scholar] [CrossRef]
  25. Sharma, P.; Abrol, P. Multi-component image analysis for citrus disease detection using convolutional neural networks. Crop Prot. 2025, 193, 107181. [Google Scholar] [CrossRef]
  26. Butt, N.; Munwar Iqbal, M.; Ramzan, S.; Raza, A.; Abualigah, L.; Latif Fitriyani, M.; Gu, Y.; Syafrudin, M. Citrus diseases detection using innovative deep learning approach and Hybrid Meta-Heuristic. PLoS ONE 2025, 20, e0316081. [Google Scholar] [CrossRef]
  27. Syed-Ab-Rahman, S.F.; Hesam Hesamian, M.; Prasad, M. Citrus disease detection and classification using end-to-end anchor-based deep learning model. Appl. Intell. 2022, 52, 927–938. [Google Scholar] [CrossRef]
  28. Zhang, X.; Xun, Y.; Chen, Y. Automated identification of citrus diseases in orchards using deep learning. Biosyst. Eng. 2022, 223, 249–258. [Google Scholar] [CrossRef]
  29. Arthi, A.; Sharmili, N.; Althubiti, S.A.; Laxmi Lydia, E.; Alharbi, M.; Alkhayyat, A.; Gupta, D. Duck optimization with enhanced capsule network based citrus disease detection for sustainable crop management. Sustain. Energy Technol. Assess. 2023, 58, 103355. [Google Scholar] [CrossRef]
  30. Dhiman, P.; Kaur, A.; Hamid, Y.; Alabdulkreem, E.; Elmannai, H.; Ababneh, N. Smart Disease Detection System for Citrus Fruits Using Deep Learning with Edge Computing. Sustainability 2023, 15, 4576. [Google Scholar] [CrossRef]
  31. Dtrilsbeek. Citrus Leaves Prepared [Dataset]. Kaggle. Available online: https://www.kaggle.com/datasets/dtrilsbeek/citrus-leaves-prepared (accessed on 30 October 2025).
  32. Kaku321. Citrus Plant Disease [Dataset]. Kaggle. Available online: https://www.kaggle.com/datasets/kaku321/citrus-plant-disease (accessed on 30 October 2025).
  33. Mahesh, T.R.; Kumar, V.; Kumar, D.; Geman, O.; Margala, M.; Guduri, M. The stratified K-folds cross-validation and class-balancing methods with high-performance ensemble classifiers for breast cancer classification. Healthc. Anal. 2023, 4, 100247. [Google Scholar] [CrossRef]
  34. Huang, D.; Xiao, K.; Luo, H.; Yang, B.; Lan, S.; Jiang, Y.; Li, Y.; Ye, D.; Sun, D.; Weng, H. Implementing transfer learning for citrus Huanglongbing disease detection across different datasets using neural network. Comput. Electron. Agric. 2025, 238, 110886. [Google Scholar] [CrossRef]
  35. Li, S.; Zhao, P.; Zhang, H.; Sun, X.; Wu, H.; Jiao, D.; Wang, W.; Liu, C.; Fang, Z.; Xue, J. Surge phenomenon in optimal learning rate and batch size scaling. In Proceedings of the 38th International Conference on Neural Information Processing Systems (NIPS ‘24), Vancouver, BC, Canada, 10–15 December 2024; Curran Associates Inc.: Red Hook, NY, USA, 2025; Volume 37, pp. 132722–132746. [Google Scholar]
  36. Kobir Siam, F.; Bishshash, P.; Asraful Sharker Nirob, M.D.; Bin Mamun, S.; Assaduzzaman, M.; Rashed Haider Noori, S. A comprehensive image dataset for the identification of lemon leaf diseases and computer vision applications. Data Brief 2025, 58, 111244. [Google Scholar] [CrossRef] [PubMed]
  37. Zhang, F.; Jin, X.; Lin, G.; Jiang, J.; Wang, M.; Junhua Hu, S.; Lyu, Q. Hybrid attention network for citrus disease identification. Comput. Electron. Agric. 2024, 220, 108907. [Google Scholar] [CrossRef]
  38. Lawal Rukuna, A.; Zambuk, F.U.; Gital, A.Y.; Muhammad Bello, U. Citrus diseases detection and classification based on efficientnet-B5. Syst. Soft Comput. 2025, 7, 200199. [Google Scholar] [CrossRef]
  39. Sujatha, R.; Moy Chatterjee, J.; Jhanjhi, N.Z.; Nawaz Brohi, S. Performance of deep learning vs machine learning in plant leaf disease detection. Microprocess. Microsyst. 2021, 80, 103615. [Google Scholar] [CrossRef]
  40. Qin, J.; Burks, T.F.; Zhao, X.; Niphadkar, N.; Ritenour, M.A. Development of a two-band spectral imaging system for real-time citrus canker detection. J. Food Eng. 2012, 108, 87–93. [Google Scholar] [CrossRef]
  41. Dinesh, A.; Balakannan, S.P.; Maragatharaja, M. A novel method for predicting plant leaf disease based on machine learning and deep learning techniques. Eng. Appl. Artif. Intell. 2025, 155, 111071. [Google Scholar] [CrossRef]
  42. Yang, B.; Yang, Z.; Xu, Y.; Cheng, W.; Zhong, F.; Ye, D.; Weng, H. A 1D-CNN model for the early detection of citrus Huanglongbing disease in the sieve plate of phloem tissue using micro-FTIR. Chemom. Intell. Lab. Syst. 2024, 252, 105202. [Google Scholar] [CrossRef]
  43. Cao, L.; Xiao, W.; Hu, Z.; Li, X.; Wu, Z. Detection of Citrus Huanglongbing in Natural Field Conditions Using an Enhanced YOLO11 Framework. Mathematics 2025, 13, 2223. [Google Scholar] [CrossRef]
  44. Qiu, R.-Z.; Chen, S.-P.; Chi, M.-X.; Wang, R.-B.; Huang, T.; Fan, G.-C.; Zhao, J.; Weng, Q.-Y. An automatic identification system for citrus greening disease (Huanglongbing) using a YOLO convolutional neural network. Front. Plant Sci. 2022, 13, 1002606. [Google Scholar] [CrossRef]
Figure 1. Example images infected with: (a) and (c) Canker; (b) and (d) Huanglongbing.
Figure 2. Pipeline of the proposed approach.
Figure 3. Data augmentation applied to an image showing citrus canker.
Figure 4. Convergence curves for the EfficientNetB0 model under different batch size settings (16, 32, and 64). In (a) Validation Accuracy by Batch Size; (b) Validation Loss by Batch Size; (c) Train vs. Accuracy (All BS).
Figure 5. Confusion matrix results. In (a) MobileNetV2; in (b) DenseNet121; in (c) ResNet50, and (d) EfficientNetB0.
Figure 6. Training and validation loss and accuracy curves for the selected models: (a) MobileNetV2; (b) ResNet50; (c) EfficientNetB0, and (d) DenseNet121.
Figure 7. Grad-CAM visualizations highlighting disease-affected regions in citrus images for each selected model: (a) EfficientNetB0; (b) DenseNet121; (c) ResNet50, and (d) MobileNetV2.
Table 1. Summary of the dataset information.
Disease Type | Training (n) | Test (n)
Canker | 451 | 79
HLB | 461 | 81
Total | 912 | 160
Table 2. Range of Selected Hyperparameters.
Hyperparameter | Range
Learning rate | 1 × 10−6–0.1
Batch size | 16, 32, 64
No. epochs | 10–100
Optimizer type | SGD, Adam, AdamW
Table 3. Performance comparison of CNN models using Stratified K-Fold Cross-Validation with K = 5.
Model | Accuracy | Loss | Recall | F1-Score
MobileNetV2 | 0.9652 | 0.0230 | 0.9629 | 0.9690
DenseNet121 | 0.8422 | 0.0771 | 0.7852 | 0.8952
ResNet50 | 0.9867 | 0.0391 | 0.9863 | 0.9863
EfficientNetB0 | 0.9988 | 0.0058 | 0.9989 | 0.9988
Table 4. Comparison with other methods.
Method | Accuracy (%)
Current work: Data augmentation + Transfer learning + selective fine-tuning + Optimization + EfficientNetB0 | 99.88
GA + multi-feature fusion of vegetation index + SAE [6] | 99.72
Intensity-invariant texture analysis + Ranklet transform + Random Forest [7] | 95
Optimized weighted segmentation method + hybrid feature selection method + M-SVM [14] | 97.00
FIS + SVM [15] | 97.8
SID [17] | 96.2
EEMs + FCR1-FCR4 + Random Forest [19] | 87.5
SPA-STD-SVM [20] | 97.46
PCA-SVM [21] | 95.56
Data augmentation + InceptionV3 [23] | 99.12
Transfer learning + EfficientNetB3 [24] | 99.58
RPN + Faster RCNN [27] | 97.2
YOLOv4 + EfficientNet [28] | 89.00
DOECN-CDDCM [29] | 98.40
CNN-LSTM [30] | 98.25
DenseNet121 [36] | 98.56 (with augmentation); 96.19 (without augmentation)
FdaNet + HaNet50 [37] | 98.83
SMOTE + EfficientNet-B5 [38] | 99.22
VGG-16 [39] | 89.5
Two-band ratio approach (R830/R730) [40] | 95.3
M-D-C-A-S-ASVM [41] | 99
1D-CNN + PLSR + LS-SVR [42] | 98.65
DCH-YOLO11 [43] | 91.6
YOLOv5l-HLB2 [44] | 85.19
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
