Article

Comparative Analysis of Deep Learning-Based Feature Extraction and Traditional Classification Approaches for Tomato Disease Detection

by Hakan Terzioğlu 1,*, Adem Gölcük 2, Adnan Mohammad Anwer Shakarji 3 and Mateen Yilmaz Al-Bayati 4

1 Department of Electrical-Electronics Engineering, Selcuk University, Konya 42031, Turkey
2 Department of Computer Engineering, Selcuk University, Konya 42031, Turkey
3 Institute of Sciences, Selcuk University, Konya 42031, Turkey
4 College of Agriculture, University of Kirkuk, Kirkuk 36001, Iraq
* Author to whom correspondence should be addressed.
Agronomy 2025, 15(7), 1509; https://doi.org/10.3390/agronomy15071509
Submission received: 9 May 2025 / Revised: 30 May 2025 / Accepted: 5 June 2025 / Published: 21 June 2025
(This article belongs to the Section Pest and Disease Management)

Abstract
In recent years, significant advancements in artificial intelligence, particularly in the field of deep learning, have increasingly been integrated into agricultural applications, including critical processes such as disease detection. Tomato, being one of the most widely consumed agricultural products globally and highly susceptible to a variety of fungal, bacterial, and viral pathogens, remains a prominent focus in disease detection research. In this study, we propose a deep learning-based approach for the detection of tomato diseases. We constructed an original dataset comprising 6414 images captured under real production conditions, categorized into three image types: leaves, green tomatoes, and red tomatoes. The dataset includes five classes: healthy samples, late blight, early blight, gray mold, and bacterial canker. Twenty-one deep learning models were evaluated, and the top five performers (EfficientNet-b0, NasNet-Large, ResNet-50, DenseNet-201, and Places365-GoogLeNet) were selected for feature extraction. From each model, 1000 deep features were extracted, and feature selection was conducted using MRMR, Chi-Square (Chi2), and ReliefF methods. The top 100 features from each selection technique were then used for reclassification with traditional machine learning classifiers under five-fold cross-validation. The highest test accuracy of 92.0% was achieved with EfficientNet-b0 features, Chi2 selection, and the Fine KNN classifier. EfficientNet-b0 consistently outperformed other models, while the combination of NasNet-Large and Wide Neural Network yielded the lowest performance. These results demonstrate the effectiveness of combining deep learning-based feature extraction with traditional classifiers and feature selection techniques for robust detection of tomato diseases in real-world agricultural environments.

1. Introduction

Tomatoes are among the most consumed vegetables worldwide. While the global average per capita consumption is approximately 5 kg per year, certain countries like Turkey and Italy report significantly higher rates, with per capita consumption reaching up to 138 kg and 109 kg per year, respectively [1]. One of the most significant factors negatively affecting yield and quality in tomato production is diseases caused by harmful organisms. Due to climate and environmental factors, tomatoes are susceptible to various diseases during the sowing and growing processes. Detecting these diseases and taking necessary precautions is crucial for producers.
Improving agricultural productivity and reducing crop losses are of great importance for global food security. Especially tomatoes (Solanum lycopersicum), due to their economic value and widespread consumption, are among the most widely cultivated vegetables worldwide. However, tomato production faces serious threats from various plant diseases. These diseases, caused by fungal, bacterial, and viral agents, can result in yield losses ranging from 20% to 80% if not diagnosed in a timely and accurate manner [2]. Early detection of tomato diseases not only increases yield but also helps reduce pesticide use. Unconscious and excessive use of chemical treatments poses significant risks to environmental pollution, soil health, and human health. In this context, fast and reliable detection of diseases is of critical importance for sustainable agricultural practices [3].
Traditional visual diagnostic methods require expertise, are time-consuming, and prone to errors. Therefore, the use of artificial intelligence-based methods, such as image processing and machine learning in agriculture, has significantly increased in recent years. Systems employing deep learning techniques have particularly attracted attention for their high accuracy in the automatic recognition and classification of diseases [4].
Many studies on detecting tomato diseases using image processing and machine learning techniques have shown that early diagnosis can improve productivity. For instance, common tomato diseases such as Alternaria alternata and Phytophthora infestans were classified with 97% accuracy using the ResNet50 architecture. This study demonstrated the high performance of deep learning-based models [5]. Similarly, the Inception-V3 architecture achieved 87.2% accuracy in detecting Tuta absoluta (tomato leafminer) pests [6].
Studies using the YOLOv4 architecture have also produced successful results, with a 96.29% accuracy rate in detecting tomato leaf diseases [7]. Using the Transformer-based TomFormer model, accuracy (mAP) scores of 87%, 81%, and 83% were achieved on the KUTomaDATA, PlantDoc, and PlantVillage datasets, respectively [8]. The MobileNetV2 architecture, developed for mobile and embedded systems, achieved a 99.30% accuracy rate, showing that even systems with low computational power can reach high accuracy [9]. In a comparative analysis of the DenseNet, ResNet50, and MobileNet architectures, the highest accuracy (99%) was achieved with DenseNet, highlighting the superior performance of densely connected networks on this classification task [10]. Kotwal et al. (2024) classified tomato diseases with 98.7% accuracy using the EfficientNet-B4 architecture [11].
Molecular biotechnology-based studies have also provided important findings in identifying tomato diseases. A real-time PCR method developed using an LNA probe was shown to detect Dickeya chrysanthemi with a sensitivity of 2 cells [12]. Using a multiplex RT-PCR method, the tomato viral diseases ToMV and PVY were detected with a 20.1% infection rate [13]. Another study combining morphological and molecular analyses reported a 27.23% occurrence rate of Alternaria alternata [14]. In tomato fields in Elazığ, phytoplasma diseases were detected using qPCR and Nested qPCR methods [15].
Another study focused on creating and analyzing an image-based dataset for classifying diseases in Agaricus bisporus (J.E. Lange) Imbach cultures. The dataset included images of healthy mushrooms and those affected by various disease classes. The study aimed to create a dataset useful for identifying and classifying mushroom diseases using deep learning or other machine learning techniques. During dataset creation, a portable mushroom imaging system developed for the study was used during visits to mushroom farms, resulting in approximately 7250 diseased mushroom images and 1800 healthy images (about 3000 images for each lighting condition). Four different disease classes commonly found in cultivated mushrooms were observed, and each mushroom was imaged under three different lighting conditions [16].
Similar deep learning approaches have been used in different agricultural fields such as wheat variety identification [17,18], but the focus in this work is on tomato diseases. Deep learning techniques have been widely adopted in plant disease detection, especially for economically important vegetables like tomatoes. Convolutional Neural Networks (CNNs) have demonstrated significant success in classifying tomato leaf diseases such as early blight, late blight, and leaf mold. For instance, Mohanty et al. (2016) used a deep CNN trained on the PlantVillage dataset to classify 14 crop species and 26 diseases, achieving over 99% accuracy in tomato disease recognition [19]. Similarly, Fuentes et al. (2017) proposed a real-time system combining Faster R-CNN and VGG16 for detecting multiple diseases in tomato plants under natural field conditions [20]. More recent approaches integrate transfer learning with models such as ResNet, EfficientNet, and MobileNet, enabling high accuracy even with limited datasets. These studies highlight the increasing role of deep learning in building reliable, fast, and scalable systems for automated diagnosis of tomato diseases, which aligns closely with the objective of the present study [19,20].
In this study, a five-class classification problem was addressed to detect tomato diseases using a dataset consisting of original tomato images collected under real-world conditions. The five classes are late blight, early blight, gray mold, bacterial canker and spot, and healthy tomatoes. These categories were selected due to their prevalence and economic importance in tomato production. Initially, the dataset consisted of 3207 images collected from tomato plants at various growth stages and from different plant parts, including leaves, green tomatoes, and red tomatoes. To improve the generalization ability of the model and address class imbalance, data augmentation techniques such as rotation, translation, scaling, and contrast adjustment were applied, resulting in a balanced and enriched dataset of 6414 images. This augmented dataset was then split into training and testing subsets in the ratio of 80:20. A total of 21 different deep learning models, including state-of-the-art convolutional neural networks (CNNs) such as ResNet, EfficientNet, DenseNet, and NasNet, were trained and evaluated to determine their classification performance. Based on accuracy and other evaluation metrics, five architectures with a success rate above 85% were selected for further analysis. To better understand the learned representations and potentially increase the classification accuracy, deep feature extraction was performed on these five models. From each model, 1000 deep features representing the high-level visual patterns learned during training were extracted. These features were then subjected to feature selection using five well-established statistical and information-theoretic methods: minimum redundancy maximum relevance (MRMR), Chi-Square (Chi2), ReliefF, ANOVA, and Kruskal–Wallis. The aim was to identify the top 100 most discriminative features from each feature set, thereby reducing dimensionality and improving the performance of traditional machine learning classifiers.
Finally, the selected features were used as input to various machine learning algorithms, and five-fold cross-validation was performed on the same 80:20 split to ensure robust evaluation. This hybrid approach, combining deep learning-based feature extraction with classical machine learning classifiers, enabled a comprehensive performance analysis. The resulting metrics were compared across models, feature selection techniques, and classifiers to determine the optimal configuration for tomato disease detection.

2. Materials and Methods

In this article, diseases that can be detected using visual analysis methods were selected based on a literature review. Studies focusing on visually perceptible plant diseases were examined. Five categories were created from original data consisting of visually identifiable diseases commonly observed in tomatoes: late blight, early blight, gray mold, bacterial canker, and healthy tomatoes.
These diseases can be encountered at any stage of plant cultivation and typically appear as lesions on the leaves, stems, and fruits, and may cause root rot or stem collar rot in seedlings. Early blight (Alternaria solani) [21,22] causes infections in stems, fruits, and leaves. Gray mold (Botrytis cinerea) [23] leads to epidermal cracking and water loss in the host, and to lesions on stems and fruit stalks that cause fruit drop. Late blight (Phytophthora infestans) [22,24] initially appears as small, pale green or yellowish spots on leaves, which turn brown to black as the disease progresses, spreading to petioles, branches, and stems; in advanced stages, lesions may tear, dry out, or cause rotting. Bacterial spot of tomato (Xanthomonas vesicatoria) [22,25] starts with lesions on leaves resembling oil droplets, surrounded by yellow halos, which turn brownish-black and merge as the disease progresses, leading to yellowing and drying of leaves. Tomato bacterial canker and wilt (Clavibacter michiganensis subsp. michiganensis) [22,26] shows early symptoms as inward curling, browning, and wilting of leaflets in a localized area of the plant; infections occurring during the seedling stage can result in stunted growth or rapid wilting and death, and in later stages, brown discoloration of vascular tissues may lead to cracks called "cankers" in the stem and branches. Bacterial speck of tomato (Pseudomonas syringae pv. tomato) [22,27] manifests as lesions on all above-ground organs, beginning during the seedling stage with numerous brown-black spots on leaves and stems, which can ultimately lead to complete drying of the seedling.
As shown in Figure 1, a five-class classification was carried out using original data of common tomato diseases—late blight, early blight, gray mold, bacterial canker—and healthy tomatoes. The images were collected in real field conditions at different times of the day (morning, noon, afternoon) to capture variability in natural lighting and shading. Multiple angles and distances were used to reflect realistic scenarios for tomato disease detection. Initially, a total of 3207 data samples were expanded to 6414 using data augmentation techniques. These 6414 data samples were then used to train models using 21 different deep learning algorithms, with 80% used for training and 20% for testing. From the results, the top 5 deep learning algorithms were identified. Feature extraction was applied using these 5 algorithms, resulting in the extraction of 1000 features. Then, 100 features were selected using MRMR, Chi2, ReliefF, ANOVA, and Kruskal–Wallis methods. These selected features were split into 80% for training and 20% for testing, and reclassification was performed using machine learning algorithms with 5-fold cross-validation. The results obtained from this process were evaluated.

2.1. Five-Fold Cross-Validation

To evaluate the performance and generalizability of the classification models, five-fold cross-validation was employed. In this method, the dataset is randomly partitioned into five equal-sized subsets (folds). For each iteration, four folds are used for training, and the remaining one is used for testing. This process is repeated five times, with each fold used exactly once as the testing set. The average of the five performance metrics is then calculated to obtain a more robust estimate of the model’s accuracy. This technique helps reduce the risk of overfitting and provides a more reliable assessment compared to a single train-test split [28,29].
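As a minimal sketch of this procedure using scikit-learn (the feature matrix and labels below are random stand-ins for the real deep features; the classifier choice is arbitrary):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Random stand-ins for the real deep-feature matrix and five class labels.
rng = np.random.default_rng(42)
features = rng.random((500, 100))
labels = rng.integers(0, 5, size=500)

# Five folds; stratification keeps the class proportions in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(KNeighborsClassifier(), features, labels, cv=cv)

# The mean over the five folds is the robust accuracy estimate described above.
print("Fold accuracies:", np.round(scores, 3))
print(f"Mean CV accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")
```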

2.2. Model Performance Metrics

Training and test accuracy are two basic metrics commonly used to evaluate model performance. Training accuracy shows how well the model performed on the data it used in the learning process, while test accuracy reflects the model’s ability to generalize on independent data that it has not seen before. A high training accuracy alone indicates that the model may have memorized the data and may fail on new data (overfitting). In contrast, test accuracy reveals the model’s predictive power on real-world data. If training accuracy is high but test accuracy is low, the model has over-learned; if both accuracies are low, the model has not learned enough (underfitting). Therefore, the success of a model should be evaluated not only on the training data but also on the test data [29].
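As an illustration of this diagnostic, a short scikit-learn sketch on synthetic stand-in data, mirroring the 80:20 split used in this study (the data and classifier are hypothetical placeholders):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data with five classes, split 80:20 as in the study.
X, y = make_classification(n_samples=1000, n_features=100, n_informative=20,
                           n_classes=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))

# A large train-test gap signals overfitting; low values on both, underfitting.
print(f"Training accuracy: {train_acc:.3f}, Test accuracy: {test_acc:.3f}")
```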
In this study, a dataset composed of unstructured images of red and green tomatoes and leaves, as shown in Figure 2, was used. This study aims to fill the gap in the literature using this dataset and contribute to the development of more effective pest management strategies for agriculture.
The tomato disease data were collected from the tomato greenhouse established in Kirkuk province, as shown in Figure 3. Tomato planting was carried out, and their development was monitored daily.
The images of the cultivated tomato plants were captured using the Redmi Note 9 Pro smartphone (Xiaomi Corporation, Beijing, China), which features an AI-powered quad-camera system. This system includes a 64-megapixel main camera with an f/1.89 aperture, an 8-megapixel ultra-wide-angle camera with an f/2.2 aperture, a 2-megapixel depth camera with an f/2.4 aperture, and a 5-megapixel macro camera, also with an f/2.4 aperture.
As shown in Table 1, the dataset consists of 3207 original tomato images, including leaves, red tomatoes, and green tomatoes, affected by diseases such as late blight, early blight, gray mold, bacterial canker, and bacterial spot.
The data augmentation techniques used in this work are implemented through the ImageDataGenerator class of the TensorFlow Keras library. These techniques are designed to improve the ability of our model to recognize different image variations by increasing the diversity of our image-based datasets. The main data augmentation techniques used were horizontal flip, vertical flip, and zoom.
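A minimal sketch of such a configuration is shown below; the zoom ratio, image size, and batch values are illustrative assumptions, since the exact settings are not reported here:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# The three main techniques named above; zoom_range=0.2 is an illustrative
# value, not the exact ratio used in the study.
datagen = ImageDataGenerator(
    horizontal_flip=True,
    vertical_flip=True,
    zoom_range=0.2,
)

# A random stand-in batch of images (batch, height, width, channels) with
# one-hot labels for the five classes.
images = np.random.rand(8, 224, 224, 3).astype("float32")
labels = np.eye(5)[np.random.randint(0, 5, size=8)]

# Each draw from the generator yields a randomly transformed variant of the
# batch; repeated draws produce the several variants per image noted below.
augmented_batch, _ = next(datagen.flow(images, labels, batch_size=8))
print(augmented_batch.shape)  # (8, 224, 224, 3)
```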
In the actual training process, the ratios determined for each transformation technique were carefully selected to maximize the model's adaptability to scenarios it may encounter in real-world data. Approximately two to four augmented images were produced for each original image, which significantly increased the diversity of the dataset and allowed the model to learn from a wider range of data during training. These data augmentation techniques thus improved the overall performance of the model and allowed it to work more effectively on complex image recognition tasks. As a result of the augmentation, as shown in Table 1, the study was carried out with a total of 6414 images: late blight (1730), early blight (968), gray mold (1006), bacterial canker and spot (1046), and healthy tomato (1664).
The computer used for training the deep learning algorithms was equipped with a 12th Gen Intel® Core™ i9-12900HX processor operating at 2.30 GHz, 32.0 GB of RAM, and a 64-bit operating system.
In the classification of tomato diseases, a total of 21 deep learning algorithms were employed: CNN, Inception-ResNet-v2, GoogLeNet, Places365-GoogLeNet, SqueezeNet, DarkNet-53, ResNet-50, AlexNet, MobileNet-v2, EfficientNet-b0, DenseNet-201, NasNet-Mobile, ResNet-18, DarkNet-19, VGG-19, ShuffleNet, Xception, Inception-v3, ResNet-101, VGG-16, and NasNet-Large. These models were trained on randomly selected data split into 80% for training and 20% for testing. The training parameters of the top five algorithms, those that achieved the highest classification accuracy, are presented in Table 2. Training result graphs and performance values of the NasNet-Large algorithm in MATLAB (R2024b, MathWorks, Natick, MA, USA) are given in Figure 4.

3. Results

This study aims to detect and classify four common diseases affecting tomato plants. The augmented dataset presented in Table 3 includes the training and testing accuracy rates, as well as the execution times for 21 different deep learning algorithms. Upon examining Table 3, it is observed that the NasNet-Large algorithm achieved the highest accuracy, with a training accuracy of 88.07% and a testing accuracy of 87.23%. However, this performance came at a significant computational cost, with a training time of 2729 min and 39 s, indicating a disproportionate time-to-performance ratio.
Considering both time and performance metrics, the ResNet-50 algorithm demonstrated a more balanced profile, achieving a training accuracy of 88.07% and a testing accuracy of 86.85%, with a training duration of 625 min and 25 s. This indicates that ResNet-50 offers a favorable trade-off between computational efficiency and classification performance.
Moreover, the EfficientNet-b0 algorithm yielded 86.53% training accuracy and 85.76% testing accuracy, while completing its training in just 140 min and 51 s. This highlights its rapid execution time combined with high classification accuracy, making it a promising candidate for time-sensitive applications.
Based on the evaluations and as illustrated in Table 3, the CNN algorithm, with a training accuracy of 66.17% and a testing accuracy of 60.23%, and the DarkNet-19 algorithm, with a training accuracy of 62.67% and a testing accuracy of 57.16%, were identified as the two least effective models in terms of classification performance.
Figure 5 presents the confusion matrices and classification accuracy rates for the four most successful deep learning algorithms—NasNet-Large, ResNet-50, DenseNet201, and EfficientNet-b0—in the context of tomato disease classification. These matrices provide a comparative overview of each model’s ability to distinguish between disease classes and demonstrate their overall predictive effectiveness.
In the confusion matrix:
Green cells indicate high correct classification rates or true positives for each class (diagonal cells), representing accurate model predictions. The darker the green, the higher the percentage of correctly classified instances.
Yellow to orange cells represent misclassifications, with varying shades indicating the severity (i.e., proportion) of the error. Brighter orange signifies higher misclassification counts, while light yellow suggests relatively minor misclassifications.
Bottom row and rightmost column summarize precision and recall values respectively, using similar color scaling for easy visual interpretation of per-class performance.
Table 4 shows the accuracy rates of the 21 deep learning methods in disease classification, computed from the confusion matrices on the test data.
As shown in Table 4, late blight (Mildiyö) was classified with over 80% accuracy by 18 models, early blight by 8 models, gray mold by 9 models, and the healthy tomato class by 20 models. In contrast, the bacterial canker and spot class was generally classified with relatively lower accuracy across most models. Late blight achieved its highest classification accuracy of 92.2% with the Inception-ResNet-v2 algorithm, while the lowest accuracy of 34.68% was observed with VGG-16. Early blight was best classified, with accuracies between 87% and 89%, by Places365-GoogLeNet, ResNet-50, and EfficientNet-b0, whereas CNN and DarkNet-19 performed poorly, with accuracies of 41.75% and 44.33%, respectively. Gray mold was best classified by VGG-16 (91.54%) and GoogLeNet (90.05%). Bacterial canker and spot reached its highest accuracy of 77.99% with DenseNet-201 and EfficientNet-b0, and no model exceeded 80% for this class. For the healthy tomato class, very high accuracy levels exceeding 98% were achieved by Inception-v3, Inception-ResNet-v2, DenseNet-201, Xception, and NasNet-Mobile.
Table 5 compares the classification performance of the five deep learning models (NasNet-Large, DenseNet201, ResNet-50, EfficientNet-b0, and Places365-GoogLeNet) across the five classes in terms of Precision, Recall (TPR), F1-Score, and ROC AUC. The highest performance was generally observed for NasNet-Large and DenseNet201. NasNet-Large attained the highest average F1-Score and ROC AUC values, performing particularly well on Class 5 (Precision 0.953, Recall 0.973, F1-Score 0.963, ROC AUC 0.978); its comparatively low F1-Score of 0.867 on Class 3 is still an acceptable result. DenseNet201 performed nearly as strongly: Class 5 again stands out with very high scores (F1-Score 0.962, ROC AUC 0.986), while for Class 2 precision is low (0.832) and recall is high (0.901), meaning the model tends not to miss positive samples. ResNet-50 showed low precision on Classes 1 and 3 (0.801 and 0.847) but high recall values; its overall performance trails DenseNet201 and NasNet-Large. EfficientNet-b0 struggled with Classes 3 and 4 (F1-Scores of 0.815 and 0.810) and was moderately successful on the other classes. Places365-GoogLeNet was the weakest model overall (Precision 0.815 and 0.834 on Classes 1 and 3; F1-Scores of 0.828 and 0.835); although it reached a high F1-Score of 0.963 on Class 5, its overall average remained lower than those of the other models.
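For reference, per-class metrics of this kind can be computed along the following lines with scikit-learn (the label and score arrays below are random placeholders, not the study's outputs):

```python
import numpy as np
from sklearn.metrics import classification_report, roc_auc_score

# Random placeholders for ground-truth labels and per-class model scores.
rng = np.random.default_rng(1)
y_true = rng.integers(0, 5, size=200)
y_score = rng.random((200, 5))
y_score /= y_score.sum(axis=1, keepdims=True)  # rows sum to 1, like softmax
y_pred = y_score.argmax(axis=1)

# Per-class precision, recall (TPR), and F1-score.
print(classification_report(y_true, y_pred, digits=3))

# Macro-averaged one-vs-rest ROC AUC over the five classes.
print(roc_auc_score(y_true, y_score, multi_class="ovr", average="macro"))
```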
Considering the five most successful algorithms, the best class-wise results were as follows: late blight was classified with 89.3% accuracy by ResNet-50; early blight with 89.2% by ResNet-50 and 89.7% by Places365-GoogLeNet; gray mold with 89.5% by NasNet-Large; bacterial canker and spot with 77.99% by DenseNet-201; and healthy tomatoes with 98.8% by DenseNet-201.
Considering the training and test success rates from Figure 5, the NasNet-Large, ResNet-50, DenseNet-201, EfficientNet-b0, and Places365-GoogLeNet algorithms were identified as the top five performers. Therefore, Figure 6 and Table 6 present further analysis and evaluation results specific to these selected models, enabling a more detailed comparison of their diagnostic capabilities.
Based on the results presented in Table 4 and the heatmap in Figure 6, the five deep learning algorithms with the highest test accuracy are identified as NasNet-Large, ResNet-50, DenseNet-201, EfficientNet-b0, and Places365-GoogLeNet. After training these top-performing models, feature extraction was carried out by removing their fully connected layers and using the output of the last convolutional layer. This process generated 1000-dimensional feature vectors for each image, representing high-level learned features.
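A minimal sketch of this kind of feature extraction, assuming a Keras EfficientNet-B0 backbone (the study performed this step in MATLAB, so the framework, layer choice, and resulting dimensionality here are illustrative, not the exact pipeline):

```python
import numpy as np
from tensorflow.keras.applications import EfficientNetB0

# ImageNet-pretrained backbone without its classification head; global average
# pooling collapses the last convolutional maps into one vector per image
# (1280-d for B0 -- the study's MATLAB pipeline yields 1000 features instead).
backbone = EfficientNetB0(include_top=False, weights="imagenet", pooling="avg")

# Random stand-in batch of images; Keras EfficientNet rescales inputs itself.
images = np.random.rand(8, 224, 224, 3).astype("float32") * 255.0
deep_features = backbone.predict(images)
print(deep_features.shape)  # (8, 1280)
```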
To reduce computational complexity and enhance classification performance, five feature selection methods—MRMR, Chi2, ReliefF, ANOVA, and Kruskal–Wallis—were applied, reducing the feature set from 1000 to 100 dimensions. These methods select the most informative features based on statistical relevance and redundancy criteria. The aim of this reduction was to improve efficiency while preserving classification accuracy.
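Of these methods, Chi2 is available directly in scikit-learn; a minimal sketch of reducing 1000 features to the top 100 is shown below (MRMR and ReliefF require third-party packages and are omitted; the feature matrix is a random placeholder):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import MinMaxScaler

# Hypothetical deep-feature matrix: 1000 features per image, five classes.
rng = np.random.default_rng(0)
X = rng.random((500, 1000))
y = rng.integers(0, 5, size=500)

# Chi2 requires non-negative inputs, so features are first scaled to [0, 1].
X_scaled = MinMaxScaler().fit_transform(X)

# Keep the 100 features with the highest Chi2 score against the class labels.
selector = SelectKBest(score_func=chi2, k=100)
X_selected = selector.fit_transform(X_scaled, y)
print(X_selected.shape)  # (500, 100)
```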
It was observed that the ANOVA and Kruskal–Wallis methods resulted in lower classification performance than the others; their results were therefore omitted to keep the focus on the more successful techniques. The newly generated datasets (with 100 selected features) were then split into 80% training and 20% testing sets, and classification was performed using various machine learning algorithms with five-fold cross-validation. Table 6 shows the results of the three machine learning algorithms that produced the best classification results with the 100 selected features.
Table 6 shows that the Subspace KNN classifier achieved the highest test accuracies on almost all 100-feature datasets; the Cubic SVM classifier was generally good but did not reach the results of Subspace KNN, and the Wide Neural Network classifier generally gave the lowest test accuracies. Subspace KNN performed particularly well on the features extracted with the DenseNet-201 and ResNet-50 deep learning algorithms.
Table 7 compares training and test accuracy rates based on 100 features selected from a set of 1000 deep features using various feature selection methods, including MRMR, Chi2, and ReliefF, applied to different deep feature extraction techniques. According to this table, the highest test accuracy of 92% was achieved when features extracted from the EfficientNet-b0 model were selected using the Chi2 method and classified using the Fine K-Nearest Neighbors (Fine KNN) machine learning algorithm. To determine the most consistent classification performance, the mean, standard deviation, and variance of the obtained results were calculated. Based on these calculations, the ReliefF method demonstrated the most stable performance, with an average test accuracy of 89.32%.
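As an illustrative sketch, the winning configuration could be approximated as follows, assuming MATLAB's Fine KNN roughly corresponds to a 1-nearest-neighbor classifier (an assumption, not stated in the source; the feature matrix is again a random placeholder):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-ins for the 1000-d EfficientNet-b0 deep features.
rng = np.random.default_rng(7)
X = rng.random((800, 1000))
y = rng.integers(0, 5, size=800)

# Chi2 selection of 100 features followed by a Fine-KNN-style classifier,
# evaluated with five-fold cross-validation as in the study.
pipeline = make_pipeline(
    MinMaxScaler(),                       # Chi2 needs non-negative features
    SelectKBest(score_func=chi2, k=100),
    KNeighborsClassifier(n_neighbors=1),  # assumed Fine KNN equivalent
)
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"Mean five-fold accuracy: {scores.mean():.3f}")
```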
Table 8 and Table 9 show the effects of the feature selection methods (MRMR, Chi2, and ReliefF) on the different deep learning models (NasNet-Large, DenseNet201, ResNet-50, EfficientNet-b0, and Places365-GoogLeNet), with Precision, Recall, F1-Score, and ROC AUC metrics given on a per-class basis for each combination.
When Table 8 is examined, the highest F1-Score and ROC AUC values for the NasNet-Large model are generally obtained with the MRMR and Chi2 methods; the MRMR values for Class 5 (F1 = 0.968, ROC AUC = 0.991) indicate successful classification. For the DenseNet201 model, the Chi2 and MRMR methods stand out, with the MRMR values for Class 5 (F1 = 0.978, ROC AUC = 0.996) again indicating successful classification. For the ResNet-50 model, the Chi2 method performs well on Class 1 (F1 = 0.910) and Class 5 (F1 = 0.976), and the MRMR method also works well with this model.
Table 9 shows that the EfficientNet-b0 model is the most balanced model across all feature selection methods; the highest classification performance (F1 > 0.95) was achieved for Class 5 with the MRMR and Chi2 feature selection methods. Table 9 also shows that the Places365-GoogLeNet model performs worse than the other four models examined.
Among the deep feature extraction models examined, features derived from EfficientNet-b0 consistently yielded the highest accuracy rates across all feature selection methods. Features extracted from DenseNet-201 and ResNet-50 also showed strong classification performance with high accuracy values. In contrast, features derived from the NasNet-Large model resulted in relatively lower classification accuracy, around 86%, compared to other models.
When considering standard deviation and variance analyses, the ReliefF method exhibited lower standard deviation values in both training and test accuracies, indicating more stable performance. In contrast, the Chi2 method showed greater variability in accuracy rates.
In summary, the most stable 100-feature configuration was obtained by selecting the features extracted from the EfficientNet-b0 model with the ReliefF method and classifying them with machine learning algorithms.
Since the best result was obtained using the EfficientNet-b0 algorithm, the training and test confusion matrices based on the Chi2 feature selection method applied to this algorithm are presented in Figure 7. As seen in Figure 7, when the number of features was reduced using the Chi2 method with the EfficientNet-b0 algorithm, a training accuracy of 88.4% and a test accuracy of 92% were achieved.
In this study, a five-class classification was performed, covering the most common tomato diseases (late blight, early blight, gray mold, and bacterial canker) together with healthy tomatoes. The dataset used in the study is entirely original, consisting of 6414 images collected from three groups (leaves, red tomatoes, and green tomatoes) captured in the production field. These data were processed using 21 deep learning algorithms, and their results were evaluated. Among these, the best-performing models were identified as NasNet-Large, ResNet-50, DenseNet201, EfficientNet-b0, and Places365-GoogLeNet.
From each of the selected models, 1000 features were extracted. Feature selection algorithms (MRMR, Chi2, ReliefF, ANOVA, and Kruskal–Wallis) were then applied to select 100 features from each set. The newly constructed datasets were split into 80% training and 20% test sets, and models were trained using five-fold cross-validation with various machine learning algorithms. As a result, a total of 51 combinations were evaluated, comprising 21 deep learning algorithms and 5 feature selection methods.
Among these combinations, the highest performance was achieved with the EfficientNet-b0 algorithm, where the Chi2 feature selection method yielded a training accuracy of 88.4% and a test accuracy of 92%. Furthermore, it can be concluded that the most stable results in classifications using 100 features were obtained when features extracted from EfficientNet-b0 were selected using the ReliefF method and classified with machine learning algorithms.
In conclusion, this study demonstrates the potential and applicability of deep learning-based disease diagnosis systems in the agricultural sector. More effective integration of technology in agricultural practices can enhance both production processes and product quality, thus contributing to the widespread adoption of sustainable agricultural practices. In this context, the study can be considered an important step that may inspire future research.
In this study, it was observed that MRMR and Chi2 reliably increased performance. Since the obtained results may differ depending on the model and class, each class was evaluated separately.

4. Discussion

The tomato plant is one of the most widely cultivated agricultural products globally, and various diseases affecting its leaves, stems, or fruits can significantly reduce yield. Early detection of these diseases is of critical importance, not only to prevent economic losses but also to enable the implementation of environmentally sustainable pest management strategies. Artificial intelligence-based approaches, particularly those utilizing deep learning-supported image processing techniques, have demonstrated high success in this regard.
In this study, a performance analysis of classification models was conducted by combining deep features extracted from the five best-performing of the 21 evaluated deep learning architectures (NasNet-Large, ResNet-50, DenseNet-201, EfficientNet-b0, and Places365-GoogLeNet) with five different feature selection methods, namely MRMR, Chi2, ReliefF, ANOVA, and Kruskal–Wallis.
Table 10 shows the accuracy rates of the classification performed with deep learning alone, after feature extraction, and after feature selection. As the table shows, the success rates of the classification generally increase after the feature extraction and feature selection steps.
Table 11 presents a summary of previous studies related to the classification of tomato diseases, including the methods employed and the reported accuracy rates.
Unlike the academic studies summarized in Table 11, which have relied on leaf-only tomato data and limited numbers of images, this study used red tomato, green tomato, and leaf data captured in the natural environment. Previous studies applied deep learning methods and identified the algorithms with the best results. In this study, the NasNet-Large algorithm gave the best results without feature extraction, but after feature extraction the EfficientNet-b0 algorithm achieved higher accuracy rates. This study considered 21 deep learning algorithms, feature extraction from the top five of these algorithms, the application of five different feature selection methods to the extracted features, and classification of each feature set with machine learning methods (KNN, SVM, and ANN). The originality of this study and the comparisons made will therefore guide future work, not only on tomato diseases but also on the classification of other types of diseases.
Although some studies report higher classification accuracies (e.g., 99.30% with MobileNetV2 [9] or 98.7% with EfficientNet-B4 [11]), these results are usually achieved using tomato leaf images from idealized datasets such as PlantVillage. Our dataset, collected under field conditions, includes changes in illumination, occlusions, and background complexity, and it contains red and green tomato images along with tomato leaves. Moreover, our model is designed with computational efficiency in mind for potential deployment in mobile or embedded systems. Considering these constraints, 92.0% accuracy is a promising result, indicating a good balance between performance and practicality in real-world scenarios.

5. Conclusions

In this study, a comprehensive performance analysis was performed using deep learning and machine learning techniques for the detection of tomato diseases. While other studies on this subject rely on ready-made datasets and tomato leaf images, this study used a more diverse and realistic dataset consisting of red tomatoes, green tomatoes, and leaves collected under natural conditions.
A total of 1000 deep features were extracted from each of the five best-performing of the 21 deep learning models evaluated. These features were reduced to 100 with five different feature selection algorithms, which significantly increased classification performance. While the best result without feature extraction was obtained with the NasNet-Large algorithm, the highest classification accuracy after feature extraction and selection was obtained with the EfficientNet-b0 algorithm. A hybrid approach, in which the 100 most effective features selected from the deep features are classified with machine learning algorithms such as KNN, SVM, and ANN, thus provides an effective and computationally efficient solution.
Although some studies report higher accuracy rates with ready-made datasets consisting of tomato leaves, the 92% accuracy rate obtained in this study is quite promising when the naturalness and difficulty level of the dataset used are taken into account. These results demonstrate the robustness and practicality of the developed method; they show that it is suitable for real-time applications, especially in environments with limited resources such as mobile or embedded systems. This study provides original contributions that will guide future studies not only in the diagnosis of tomato diseases but also in the classification of different plant diseases.

Author Contributions

A.M.A.S. collected the images that constitute the dataset and conducted literature research. A.G. acted as the thesis advisor and managed and organized the study. H.T. performed all deep-learning and machine-learning training in the study. M.Y.A.-B. performed labeling operations on the collected dataset. A.G. and H.T. conducted material and method research and prepared the article. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Selcuk University, the Scientific Research Projects Coordination Office grant number 25601023.

Data Availability Statement

Data is contained within the article.

Acknowledgments

This study was produced from the unpublished Ph.D. thesis of Adnan Mohammad Anwer Shakarji.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. IndexBox. Global Tomato Market Report: Market Trends and per Capita Consumption. Available online: https://www.indexbox.io/blog/tomato-world-market-overview-2023 (accessed on 28 May 2025).
2. Takale, D.G.; Mahalle, P.N.; Deshpande, V.; Banchhor, C.B.; Gawali, P.P.; Deshmukh, G.; Khan, V.; Maral, V.B. Image Processing and Machine Learning for Plant Disease Detection. In Proceedings of the International Conference on Artificial-Business Analytics, Quantum and Machine Learning, Faridabad, India, 14–15 July 2023; Springer: Berlin/Heidelberg, Germany, 2023.
3. Ferentinos, K.P. Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 2018, 145, 311–318.
4. Too, E.C.; Li, Y.; Njuki, S.; Liu, Y. A comparative study of fine-tuning deep learning models for plant disease identification. Comput. Electron. Agric. 2019, 161, 272–279.
5. Demirci, D.; Saraçbaşı, E.; Emrah, E.; Uzun, I.; Genç, Y.; Özkan, K. Domates hastalığı tahmini için gerçek zamanlı uygulama. Eskişeh. Osman. Üniv. Mühendis. Mimar. Fak. Derg. 2022, 30, 90–95.
6. Rubanga, D.P.; Loyani, L.K.; Richard, M.; Shimada, S. A deep learning approach for determining effects of Tuta absoluta in tomato plants. arXiv 2020, arXiv:2004.04023.
7. Roy, A.M.; Bhaduri, J. A deep learning enabled multi-class plant disease detection model based on computer vision. AI 2021, 2, 413–428.
8. Khan, A.; Nawaz, U.; Kshetrimayum, L.; Seneviratne, L.; Hussain, I. Early and accurate detection of tomato leaf diseases using TomFormer. In Proceedings of the 2023 21st International Conference on Advanced Robotics (ICAR), Abu Dhabi, United Arab Emirates, 5–8 December 2023; IEEE: Piscataway, NJ, USA, 2023.
9. Wagle, S.A. A Deep Learning-Based Approach in Classification and Validation of Tomato Leaf Disease. Trait. Signal 2021, 38, 699–709.
10. Kılıçarslan, S.; Pacal, I. Domates Yapraklarında Hastalık Tespiti İçin Transfer Öğrenme Metotlarının Kullanılması. Mühendis. Bilim. Araştırmaları Derg. 2023, 5, 215–222.
11. Kotwal, J.G.; Kashyap, R.; Shafi, P.M. Artificial driving based EfficientNet for automatic plant leaf disease classification. Multimed. Tools Appl. 2024, 83, 38209–38240.
12. Baki, D. Domates (Solanum lycopersicum L.) bakteriyel öz nekrozu hastalık etmenleri Dickeya chrysanthemi, Pectobacterium carotovorum subsp. carotovorum, Pseudomonas cichorii, Pseudomonas corrugata, Pseudomonas fluorescens, Pseudomonas mediterranea ve Pseudomonas viridiflava'nın LNA probe kullanılarak real-time PCR tanısı ve hastalıklı bitki dokularından tespiti. Master's Thesis, University of Akdeniz, Antalya, Turkey, 2014.
13. Fidan, H.; Sarı, N. Domateste Tomato spotted wilt virüs'üne karşı dayanıklılığı kıran izolatının fenotipik karakterizasyonu. Mediterr. Agric. Sci. 2019, 32, 307–314.
14. Mutlu, G.; Üstüner, T. Elazığ ili domates alanlarında fungal hastalıkların yaygınlığı ve şiddetinin saptanması. Türk Tarım Doğa Bilim. Derg. 2017, 4, 416–425.
15. Çaplık, D.; Kara, S.; Çiftçi, O.; Yılmaz, F. Elazığ ili domates ve biber alanlarında fitoplazma hastalıklarının tespiti ve karakterizasyonu. Mustafa Kemal Üniv. Tarım Bilim. Derg. 2022, 28, 269–278.
16. Albayrak, Ü.; Gölcük, A.; Aktaş, S. Agaricus bisporus'ta görüntü tabanlı hastalık sınıflandırması için kapsamlı veri seti. Mantar Derg. 2024, 15, 29–42.
17. Yasar, A.; Golcuk, A.; Sari, O.F. Classification of bread wheat varieties with a combination of deep learning approach. Eur. Food Res. Technol. 2024, 250, 181–189.
18. Golcuk, A.; Yasar, A. Classification of bread wheat genotypes by machine learning algorithms. J. Food Compos. Anal. 2023, 119, 105253.
19. Mohanty, S.P.; Hughes, D.P.; Salathé, M. Using deep learning for image-based plant disease detection. Front. Plant Sci. 2016, 7, 1419.
20. Fuentes, A.; Yoon, S.; Kim, S.C.; Park, D.S. A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition. Sensors 2017, 17, 2022.
21. Chaerani, R.; Voorrips, R.E. Tomato early blight (Alternaria solani): The pathogen, genetics, and breeding for resistance. J. Gen. Plant Pathol. 2006, 72, 335–347.
22. Shakarji, A.; Gölcük, A. Classification of Tomato Diseases Using Deep Learning Method. J. Intell. Syst. Internet Things 2025, 14, 213.
23. Soylu, E.M.; Kurt, Ş.; Soylu, S. In vitro and in vivo antifungal activities of the essential oils of various plants against tomato grey mould disease agent Botrytis cinerea. Int. J. Food Microbiol. 2010, 143, 183–189.
24. Nowicki, M.; Foolad, M.R.; Nowakowska, M.; Kozik, E.U. Potato and tomato late blight caused by Phytophthora infestans: An overview of pathology and resistance breeding. Plant Dis. 2012, 96, 4–17.
25. Çelik, İ.; Özalp, R.; Çelik, N.; Polat, İ.; Sülü, G. Domates lekeli solgunluk virüsü (TSWV)'ne dayanıklı sivri biber hatlarının geliştirilmesi. Derim 2018, 35, 27–36.
26. Felipe, V.; Romero, A.M.; Montecchia, M.S.; Vojnov, A.A.; Bianco, M.I.; Yaryura, P.M. Xanthomonas vesicatoria virulence factors involved in early stages of bacterial spot development in tomato. Plant Pathol. 2018, 67, 1936–1943.
27. Horuz, S.; Karut, Ş.T.; Aysan, Y. Domates bakteriyel kanser ve solgunluk hastalığı etmeni Clavibacter michiganensis subsp. michiganensis'in tohumda aranması ve tohum uygulamalarının patojen gelişimine etkisinin belirlenmesi. Tekirdağ Ziraat Fak. Derg. 2019, 16, 284–296.
28. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer: New York, NY, USA, 2013; Volume 112.
29. Kohavi, R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, QC, Canada, 20–25 August 1995; Volume 2, pp. 1137–1143.
30. Bakr, M.; Abdel-Gaber, S.; Nasr, M.; Hazman, M. Tomato disease detection model based on DenseNet and transfer learning. Appl. Comput. Sci. 2022, 18, 56–70.
31. Zayani, H.M.; Ammar, I.; Ghodhbani, R.; Maqbool, A.; Saidani, T.; Ben Slimane, J.; Kachoukh, A.; Kouki, M.; Kallel, M.; Alsuwaylimi, A.A.; et al. Deep learning for tomato disease detection with YOLOv8. Eng. Technol. Appl. Sci. Res. 2024, 14, 13584–13591.
32. Abbas, A.; Jain, S.; Gour, M.; Vankudothu, S. Tomato plant disease detection using transfer learning with C-GAN synthetic images. Comput. Electron. Agric. 2021, 187, 106279.
33. Trivedi, N.K.; Gautam, V.; Anand, A.; Aljahdali, H.M.; Villar, S.G.; Anand, D.; Goyal, N.; Kadry, S. Early detection and classification of tomato leaf disease using high-performance deep neural network. Sensors 2021, 21, 7987.
34. Khasawneh, N.; Faouri, E.; Fraiwan, M. Automatic detection of tomato diseases using deep transfer learning. Appl. Sci. 2022, 12, 8467.
35. Albahli, S.; Nawaz, M. DCNet: DenseNet-77-based CornerNet model for the tomato plant leaf disease detection and classification. Front. Plant Sci. 2022, 13, 957961.
36. Ahmed, S.; Hasan, B.; Ahmed, T.; Sony, R.K.; Kabir, H. Less is more: Lighter and faster deep neural architecture for tomato leaf disease classification. IEEE Access 2022, 10, 68868–68884.
37. Kabir Oni, M.; Tanzin Prama, T. Optimized Custom CNN for Real-Time Tomato Leaf Disease Detection. arXiv 2025, arXiv:2502.18521.
Figure 1. Tomato disease classification block diagram.
Figure 2. Sample images of red and green tomatoes and leaves.
Figure 3. Images from the greenhouse where the data were collected.
Figure 4. Training result graphs and performance metrics of the NasNet-Large algorithm in MATLAB.
Figure 5. Confusion matrices and classification accuracy rates of the five best-performing deep learning models for tomato disease classification. ((a) NasNet-Large, (b) ResNet-50, (c) DenseNet201, (d) EfficientNet-b0, (e) Places365-GoogLeNet).
Figure 6. Heat map showing the class-based success rates (%) of the five deep learning models with the highest overall test accuracy. (The rows represent the model names, while the columns indicate the disease classes and Healthy class. The color intensity corresponds to the classification success rate).
Figure 7. EfficientNet-b0 algorithm Chi2 method training and test confusion matrices. (a) Training confusion matrix and disease classification accuracy rates. (b) Test confusion matrix and disease classification accuracy rates.
Table 1. Original and augmented data counts.

| Disease | Red (Orig.) | Leaf (Orig.) | Green (Orig.) | Total (Orig.) | Red (Aug.) | Leaf (Aug.) | Green (Aug.) | Total (Aug.) |
| Late Blight | 374 | 369 | 122 | 865 | 748 | 738 | 244 | 1730 |
| Early Leaf Blight | 110 | 277 | 97 | 484 | 220 | 554 | 194 | 968 |
| Gray Mold | 157 | 223 | 123 | 503 | 314 | 446 | 246 | 1006 |
| Bacterial Canker and Spot | 215 | 183 | 125 | 523 | 430 | 366 | 250 | 1046 |
| Healthy Tomato | 458 | 264 | 110 | 832 | 916 | 528 | 220 | 1664 |
| Total | 1314 | 1316 | 577 | 3207 | 2628 | 2632 | 1154 | 6414 |
Table 2. Training parameters for the top 5 algorithms.

| Parameter | NasNet-Large | ResNet-50 | DenseNet-201 | EfficientNet-b0 | Places365-GoogLeNet |
| Initial Learning Rate | 0.0001 | 0.00005 | 0.0001 | 0.0001 | 0.00005 |
| Validation Frequency (iterations) | 30 | 30 | 30 | 30 | 30 |
| Maximum Epochs | 10 | 20 | 10 | 10 | 20 |
| Mini-Batch Size (samples per batch) | 16 | 32 | 16 | 32 | 32 |
| Shuffling | Each epoch | Each epoch | Each epoch | Each epoch | Each epoch |
Table 3. Training and test accuracy rates and run times of deep learning algorithms.

| No | Algorithm | Training Accuracy (%) | Test Accuracy (%) | Time |
| 1 | NasNet-Large | 88.07 | 87.23 | 2729 min 39 s |
| 2 | ResNet-50 | 88.07 | 86.85 | 625 min 25 s |
| 3 | DenseNet-201 | 86.59 | 85.82 | 604 min 50 s |
| 4 | EfficientNet-b0 | 86.53 | 85.76 | 140 min 51 s |
| 5 | Places365-GoogLeNet | 86.36 | 85.76 | 273 min 55 s |
| 6 | Inception-v3 | 85.58 | 84.31 | 359 min 54 s |
| 7 | Xception | 84.8 | 83.15 | 733 min 17 s |
| 8 | NasNet-Mobile | 84.18 | 82.53 | 214 min 7 s |
| 9 | ResNet-101 | 84.1 | 82.18 | 1371 min 33 s |
| 10 | Inception-ResNet-v2 | 84.02 | 80.94 | 594 min 8 s |
| 11 | GoogLeNet | 82.93 | 82.17 | 140 min 39 s |
| 12 | ShuffleNet | 82.7 | 80.75 | 60 min 2 s |
| 13 | DarkNet-53 | 80.75 | 77.6 | 566 min 44 s |
| 14 | MobileNet-v2 | 79.58 | 77.14 | 319 min 33 s |
| 15 | AlexNet | 79.27 | 76.37 | 95 min 34 s |
| 16 | SqueezeNet | 77.71 | 74.31 | 110 min 39 s |
| 17 | ResNet-18 | 77.63 | 74.34 | 187 min 36 s |
| 18 | VGG-19 | 75.14 | 72.49 | 368 min 19 s |
| 19 | VGG-16 | 68.28 | 69.60 | 382 min 26 s |
| 20 | CNN | 66.17 | 60.23 | 403 min 13 s |
| 21 | DarkNet-19 | 62.67 | 57.16 | 262 min 26 s |
Table 4. Accuracy rates of deep learning methods for disease classification with test data.

| No | Algorithm | Late Blight (%) | Early Leaf Blight (%) | Gray Mold (%) | Bacterial Canker and Spot (%) | Healthy Tomato (%) | Test Accuracy (%) |
| 1 | NasNet-Large | 86.13 | 86.60 | 89.55 | 76.56 | 97.30 | 87.23 |
| 2 | ResNet-50 | 89.31 | 89.18 | 85.57 | 73.21 | 97.00 | 86.85 |
| 3 | DenseNet-201 | 81.21 | 84.02 | 87.06 | 77.99 | 98.80 | 85.82 |
| 4 | EfficientNet-b0 | 84.68 | 87.11 | 81.09 | 77.99 | 97.90 | 85.76 |
| 5 | Places365-GoogLeNet | 81.5 | 89.69 | 83.08 | 77.51 | 97 | 85.76 |
| 6 | Inception-v3 | 82.66 | 81.44 | 82.09 | 76.56 | 98.80 | 84.31 |
| 7 | Xception | 83.82 | 80.41 | 77.11 | 75.60 | 98.80 | 83.15 |
| 8 | NasNet-Mobile | 82.37 | 73.71 | 84.08 | 73.68 | 98.80 | 82.53 |
| 9 | ResNet-101 | 85.55 | 80.41 | 79.10 | 67.94 | 97.90 | 82.18 |
| 10 | Inception-ResNet-v2 | 92.2 | 78.87 | 65.17 | 69.38 | 99.1 | 80.94 |
| 11 | GoogLeNet | 75.43 | 78.87 | 90.05 | 68.9 | 97.6 | 82.17 |
| 12 | ShuffleNet | 83.82 | 79.90 | 78.61 | 64.11 | 97.30 | 80.75 |
| 13 | DarkNet-53 | 90.46 | 70.10 | 74.63 | 57.89 | 94.89 | 77.6 |
| 14 | MobileNet-v2 | 81.79 | 74.23 | 74.63 | 58.37 | 96.70 | 77.14 |
| 15 | AlexNet | 83.53 | 65.46 | 74.63 | 61.24 | 97.00 | 76.37 |
| 16 | SqueezeNet | 89.31 | 72.16 | 67.16 | 50.72 | 92.19 | 74.31 |
| 17 | ResNet-18 | 86.42 | 72.16 | 67.16 | 51.67 | 94.29 | 74.34 |
| 18 | VGG-19 | 75.72 | 60.31 | 78.61 | 52.63 | 95.20 | 72.49 |
| 19 | VGG-16 | 34.68 | 65.98 | 91.54 | 60.29 | 95.50 | 69.60 |
| 20 | CNN | 85.84 | 41.75 | 48.26 | 34.93 | 90.39 | 60.23 |
| 21 | DarkNet-19 | 88.73 | 44.33 | 62.69 | 11.96 | 78.08 | 57.16 |
Table 5. Performance comparison of the top five deep learning-based classification models.

| Method | Class | Precision | Recall (TPR) | F1-Score | ROC AUC |
| NasNet-Large | Class 1 | 0.849 | 0.861 | 0.855 | 0.903 |
| | Class 2 | 0.866 | 0.866 | 0.866 | 0.921 |
| | Class 3 | 0.841 | 0.896 | 0.867 | 0.932 |
| | Class 4 | 0.870 | 0.766 | 0.814 | 0.872 |
| | Class 5 | 0.953 | 0.973 | 0.963 | 0.978 |
| DenseNet-201 | Class 1 | 0.854 | 0.812 | 0.833 | 0.880 |
| | Class 2 | 0.853 | 0.840 | 0.847 | 0.907 |
| | Class 3 | 0.866 | 0.871 | 0.868 | 0.923 |
| | Class 4 | 0.751 | 0.780 | 0.765 | 0.865 |
| | Class 5 | 0.972 | 0.988 | 0.972 | 0.986 |
| ResNet-50 | Class 1 | 0.801 | 0.893 | 0.844 | 0.905 |
| | Class 2 | 0.915 | 0.892 | 0.903 | 0.939 |
| | Class 3 | 0.847 | 0.856 | 0.851 | 0.914 |
| | Class 4 | 0.874 | 0.732 | 0.797 | 0.856 |
| | Class 5 | 0.974 | 0.970 | 0.974 | 0.981 |
| EfficientNet-b0 | Class 1 | 0.809 | 0.847 | 0.828 | 0.887 |
| | Class 2 | 0.889 | 0.871 | 0.880 | 0.926 |
| | Class 3 | 0.815 | 0.811 | 0.813 | 0.888 |
| | Class 4 | 0.827 | 0.780 | 0.803 | 0.874 |
| | Class 5 | 0.978 | 0.979 | 0.978 | 0.985 |
| Places365-GoogLeNet | Class 1 | 0.815 | 0.815 | 0.815 | 0.873 |
| | Class 2 | 0.883 | 0.897 | 0.890 | 0.938 |
| | Class 3 | 0.843 | 0.831 | 0.837 | 0.901 |
| | Class 4 | 0.794 | 0.775 | 0.785 | 0.868 |
| | Class 5 | 0.963 | 0.970 | 0.963 | 0.977 |
Table 6. Performance comparison of classification results obtained using the 100 features selected by each feature selection method.

| Deep Feature Extractor | MRMR Classifier | Train (%) | Test (%) | Chi2 Classifier | Train (%) | Test (%) | ReliefF Classifier | Train (%) | Test (%) |
| NasNet-Large | Fine KNN | 83.53 | 86.70 | Fine KNN | 82.40 | 87.40 | Fine KNN | 84.30 | 86.60 |
| | Fine Gaussian SVM | 77.70 | 80.80 | Cubic SVM | 79.50 | 81.90 | Cubic SVM | 78.40 | 83.20 |
| | Wide Neural Network | 72.80 | 75.60 | Wide Neural Network | 73.10 | 75.80 | Wide Neural Network | 72.60 | 75.40 |
| ResNet-50 | Subspace KNN | 88.3 | 88.8 | Subspace KNN | 86.7 | 89.4 | Subspace KNN | 87.8 | 90.7 |
| | Cubic SVM | 83.4 | 84.4 | Cubic SVM | 82 | 83.7 | Cubic SVM | 83 | 86.2 |
| | Wide Neural Network | 77.8 | 79.3 | Wide Neural Network | 77.6 | 78.6 | Wide Neural Network | 76.8 | 79.1 |
| DenseNet-201 | Subspace KNN | 86.6 | 90.8 | Subspace KNN | 86.3 | 90.4 | Subspace KNN | 87.5 | 90.2 |
| | Cubic SVM | 82.6 | 84.6 | Cubic SVM | 81.9 | 86.2 | Cubic SVM | 82.9 | 84.9 |
| | Wide Neural Network | 76 | 78.9 | Wide Neural Network | 76 | 78.4 | Wide Neural Network | 77 | 79.8 |
| EfficientNet-b0 | Subspace KNN | 89.3 | 91.7 | Fine KNN | 88.4 | 92 | Subspace KNN | 89.6 | 91.2 |
| | Cubic SVM | 84.7 | 85.4 | Cubic SVM | 83.6 | 86.6 | Cubic SVM | 85.4 | 87.1 |
| | Wide Neural Network | 79.3 | 77.7 | Wide Neural Network | 78 | 79.4 | Wide Neural Network | 78.9 | 81 |
| Places365-GoogLeNet | Subspace KNN | 84.4 | 86.7 | Fine KNN | 82.8 | 85.4 | Subspace KNN | 85.6 | 87.9 |
| | Cubic SVM | 79.4 | 83.1 | Cubic SVM | 79.6 | 82.4 | Cubic SVM | 80.5 | 81.2 |
| | Wide Neural Network | 73.3 | 75 | Wide Neural Network | 73.8 | 76.2 | Wide Neural Network | 73.7 | 74.4 |
Table 7. Training and test accuracy results obtained with 100 features.

| Deep Feature Extractor | MRMR Classifier | Train (%) | Test (%) | Chi2 Classifier | Train (%) | Test (%) | ReliefF Classifier | Train (%) | Test (%) |
| NasNet-Large | Fine KNN | 83.53 | 86.70 | Fine KNN | 82.40 | 87.40 | Fine KNN | 84.30 | 86.60 |
| ResNet-50 | Subspace KNN | 88.3 | 88.8 | Subspace KNN | 86.7 | 89.4 | Subspace KNN | 87.8 | 90.7 |
| DenseNet-201 | Subspace KNN | 86.6 | 90.8 | Subspace KNN | 86.3 | 90.4 | Subspace KNN | 87.5 | 90.2 |
| EfficientNet-b0 | Subspace KNN | 89.3 | 91.7 | Fine KNN | 88.4 | 92 | Subspace KNN | 89.6 | 91.2 |
| Places365-GoogLeNet | Subspace KNN | 84.4 | 86.7 | Fine KNN | 82.8 | 85.4 | Subspace KNN | 85.6 | 87.9 |
| Mean | | 86.43 | 88.94 | | 85.32 | 88.92 | | 86.96 | 89.32 |
| Standard Deviation | | 2.20 | 2.06 | | 2.33 | 2.31 | | 1.84 | 1.77 |
| Variance | | 4.86 | 4.23 | | 5.45 | 5.32 | | 3.38 | 3.13 |
Table 8. The impact of feature selection methods on deep learning models: class-based performance comparison (NasNet-Large, DenseNet 201, and ResNet-50). Values are given as Precision / Recall / F1-Score / ROC AUC.

| Feature Selection | Class | NasNet-Large | DenseNet 201 | ResNet-50 |
| MRMR | 1 | 0.88 / 0.79 / 0.83 / 0.87 | 0.91 / 0.85 / 0.88 / 0.91 | 0.86 / 0.87 / 0.87 / 0.91 |
| | 2 | 0.84 / 0.85 / 0.84 / 0.91 | 0.92 / 0.95 / 0.93 / 0.97 | 0.92 / 0.86 / 0.89 / 0.92 |
| | 3 | 0.86 / 0.88 / 0.87 / 0.93 | 0.92 / 0.89 / 0.90 / 0.94 | 0.88 / 0.89 / 0.88 / 0.93 |
| | 4 | 0.80 / 0.81 / 0.81 / 0.89 | 0.85 / 0.85 / 0.85 / 0.91 | 0.84 / 0.80 / 0.82 / 0.89 |
| | 5 | 0.95 / 0.98 / 0.95 / 0.98 | 0.96 / 0.99 / 0.96 / 0.98 | 0.95 / 0.98 / 0.95 / 0.98 |
| Chi2 | 1 | 0.85 / 0.79 / 0.82 / 0.87 | 0.89 / 0.86 / 0.87 / 0.91 | 0.85 / 0.88 / 0.87 / 0.91 |
| | 2 | 0.92 / 0.87 / 0.89 / 0.93 | 0.91 / 0.91 / 0.91 / 0.95 | 0.89 / 0.89 / 0.89 / 0.94 |
| | 3 | 0.88 / 0.87 / 0.87 / 0.92 | 0.91 / 0.94 / 0.92 / 0.96 | 0.94 / 0.87 / 0.90 / 0.93 |
| | 4 | 0.77 / 0.84 / 0.80 / 0.90 | 0.85 / 0.81 / 0.83 / 0.89 | 0.84 / 0.81 / 0.83 / 0.89 |
| | 5 | 0.97 / 0.99 / 0.97 / 0.98 | 0.97 / 0.99 / 0.97 / 0.99 | 0.96 / 0.98 / 0.96 / 0.98 |
| ReliefF | 1 | 0.86 / 0.79 / 0.83 / 0.87 | 0.87 / 0.84 / 0.86 / 0.90 | 0.88 / 0.88 / 0.88 / 0.92 |
| | 2 | 0.87 / 0.89 / 0.88 / 0.93 | 0.92 / 0.94 / 0.93 / 0.96 | 0.92 / 0.88 / 0.90 / 0.93 |
| | 3 | 0.88 / 0.85 / 0.86 / 0.91 | 0.88 / 0.89 / 0.88 / 0.93 | 0.91 / 0.89 / 0.90 / 0.94 |
| | 4 | 0.77 / 0.81 / 0.79 / 0.88 | 0.87 / 0.85 / 0.86 / 0.91 | 0.85 / 0.86 / 0.85 / 0.91 |
| | 5 | 0.95 / 0.98 / 0.95 / 0.97 | 0.96 / 0.98 / 0.96 / 0.98 | 0.97 / 1.00 / 0.97 / 0.99 |
Table 9. The impact of feature selection methods on deep learning models: class-based performance comparison (EfficientNet-b0, Places365-GoogLeNet). Values are given as Precision / Recall / F1-Score / ROC AUC.

| Feature Selection | Class | EfficientNet-b0 | Places365-GoogLeNet |
| MRMR | 1 | 0.92 / 0.87 / 0.89 / 0.92 | 0.87 / 0.80 / 0.83 / 0.88 |
| | 2 | 0.91 / 0.92 / 0.91 / 0.95 | 0.83 / 0.89 / 0.86 / 0.93 |
| | 3 | 0.90 / 0.91 / 0.90 / 0.94 | 0.89 / 0.84 / 0.86 / 0.91 |
| | 4 | 0.88 / 0.88 / 0.88 / 0.93 | 0.82 / 0.78 / 0.80 / 0.87 |
| | 5 | 0.97 / 0.99 / 0.97 / 0.99 | 0.94 / 0.99 / 0.94 / 0.98 |
| Chi2 | 1 | 0.91 / 0.88 / 0.90 / 0.93 | 0.84 / 0.81 / 0.82 / 0.87 |
| | 2 | 0.94 / 0.91 / 0.93 / 0.95 | 0.84 / 0.87 / 0.85 / 0.92 |
| | 3 | 0.95 / 0.93 / 0.94 / 0.96 | 0.85 / 0.85 / 0.85 / 0.91 |
| | 4 | 0.86 / 0.87 / 0.87 / 0.92 | 0.79 / 0.75 / 0.77 / 0.86 |
| | 5 | 0.96 / 0.99 / 0.96 / 0.98 | 0.94 / 0.97 / 0.94 / 0.97 |
| ReliefF | 1 | 0.89 / 0.88 / 0.89 / 0.92 | 0.86 / 0.83 / 0.85 / 0.89 |
| | 2 | 0.90 / 0.95 / 0.93 / 0.97 | 0.84 / 0.87 / 0.85 / 0.92 |
| | 3 | 0.89 / 0.88 / 0.88 / 0.93 | 0.87 / 0.88 / 0.87 / 0.93 |
| | 4 | 0.88 / 0.84 / 0.86 / 0.91 | 0.86 / 0.79 / 0.82 / 0.88 |
| | 5 | 0.98 / 0.99 / 0.98 / 0.99 | 0.96 / 0.99 / 0.96 / 0.98 |
Table 10. Progressive performance analysis of different deep learning models.

| Stage | NasNet-Large (%) | ResNet-50 (%) | DenseNet 201 (%) | EfficientNet-b0 (%) | Places365-GoogLeNet (%) |
| Data Augmentation | 87.23 | 86.85 | 85.82 | 85.76 | 85.76 |
| Feature Extraction | 85.43 | 87.17 | 87.59 | 90.45 | 86.72 |
| Feature Selection | 87.4 | 90.7 | 90.8 | 92 | 87.9 |
Table 11. Similar studies on tomato diseases.

| Reference | Method | Accuracy Rate (%) |
| Bakr, Abdel-Gaber et al., 2022 [30] | DenseNet201, transfer learning | Training: 99.84; Validation: 99.30 |
| Zayani, Ammar et al., 2024 [31] | YOLOv8 | 66.67 |
| Abbas, Jain et al., 2021 [32] | C-GAN + DenseNet121 | 5 classes: 99.51; 7 classes: 98.65; 10 classes: 97.11 |
| Trivedi, Gautam et al., 2021 [33] | Proposed deep neural network | 98.49 |
| Khasawneh, Faouri et al., 2022 [34] | DenseNet201, DarkNet-53 | 99.2 |
| Albahli and Nawaz, 2022 [35] | DenseNet-77 + CornerNet | 98.4 |
| Ahmed, Hasan et al., 2022 [36] | MobileNetV2 + custom classifier | 99.30 |
| Kabir Oni and Tanzin Prama, 2025 [37] | Custom CNN | 95.2 |
| This study | EfficientNet-b0 + Chi2 + Fine KNN | 92 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
