1. Introduction
Agricultural productivity is constantly challenged by various environmental stressors, among which pest infestations are a leading cause of crop yield reduction. Invertebrate pests alone are responsible for up to 20% of major grain crop losses worldwide, a percentage expected to rise due to ongoing climate change [
1]. Maize (
Zea mays), a staple crop essential to global food security, is particularly vulnerable to pests. Among these,
Zyginidia pullula, a leafhopper species, significantly impacts maize productivity by feeding on the mesophyll cells of maize leaves, causing chlorophyll depletion, necrotic lesions, and impaired photosynthetic activity. Severe infestations can lead to premature leaf senescence, ultimately reducing overall crop yield and economic viability.
Traditional pest monitoring methods typically rely on manual field inspections, which are not only time-consuming and labor-intensive but also prone to human error [
2]. These limitations underscore the urgent need for automated, scalable, and reliable pest damage detection systems. In recent years, advancements in computer vision and artificial intelligence (AI) have revolutionized various agricultural applications, particularly in the detection and classification of plant diseases and pest-induced damage. Machine learning (ML)-based approaches, especially deep learning models, offer a promising alternative by enabling real-time, high-accuracy assessment of crop health through image analysis [
3].
Despite significant progress in ML-driven plant disease and pest detection, most existing studies focus on identifying the presence or absence of pests or diseases, often through binary classification [
4]. While machine learning and deep learning techniques have been widely used for crop damage assessment, limited attention has been given to classifying pest-induced damage severity, especially in maize crops. In particular, there is a lack of studies addressing small-scale pests such as
Zyginidia pullula, whose symptoms are often subtle and hard to detect. This gap highlights the necessity for a robust and automated framework capable of quantifying varying severity levels of pest damage.
This study proposes a novel hybrid machine learning approach to detect and assess pest-induced damage in maize leaves using a self-compiled dataset. The methodology integrates both traditional and deep learning-based feature extraction techniques, leveraging Gabor filters, Gray Level Co-occurrence Matrix (GLCM), Hue-Saturation-Value (HSV) color space, and Convolutional Neural Networks (CNNs), specifically ResNet-50, DenseNet-201, and EfficientNet-B2. Image preprocessing and segmentation are performed using Contrast Limited Adaptive Histogram Equalization (CLAHE) and U2Net, respectively, followed by Principal Component Analysis (PCA) for dimensionality reduction. Classification is conducted using Support Vector Machines (SVM), Random Forest (RF), and Artificial Neural Networks (ANN).
The primary contributions of this study are:
The development of an automated framework for pest-induced damage detection and severity assessment in maize crops, addressing a critical gap in the literature.
The integration of both traditional hand-crafted and deep learning-based features to improve classification performance.
A comparative evaluation of different machine learning classifiers to determine the most effective approach for maize pest damage classification.
To illustrate the dataset used in this study,
Figure 1 presents an example of a maize leaf affected by
Zyginidia pullula. The visible damage includes necrotic spots and chlorophyll depletion, which are characteristic symptoms of pest infestation. If left undetected, such damage can severely affect plant health, leading to reduced photosynthetic activity and eventual yield loss.
The proposed approach aims to enhance precision agriculture by facilitating early detection and targeted intervention strategies, ultimately minimizing crop losses and improving agricultural sustainability.
2. Related Works
The detection and classification of plant diseases play a crucial role in improving crop yield and maintaining agricultural sustainability. Recent advances in computer vision and deep learning have significantly contributed to automating this process, with convolutional neural networks (CNNs) emerging as the dominant approach. Several studies have proposed various deep learning models for disease detection in different crops, including tomato, maize, chili, bean, cucumber, and grape plants [
5,
6,
7].
For instance, ref. [
5] introduced a Modified InceptionResNet-V2 (MIR-V2) architecture, incorporating a transfer learning approach to classify seven distinct types of tomato leaf diseases, achieving an accuracy of 98.92%. Similarly, ref. [
8] developed a predictive model integrating feature extraction techniques such as shape, texture, and color, coupled with classifiers including Random Forest, SVM, and ANN. Their results demonstrated the effectiveness of machine learning models in plant disease classification.
Other studies, such as [
9], leveraged the PlantVillage dataset for maize disease classification, while [
10] utilized deep learning architectures, including Mask R-CNN, UNet, and PSPNet, to detect lesions in coffee plant leaves. Moreover, refs. [
11,
12] employed deep transfer learning techniques for the classification of corn disease types, achieving high accuracy and usability in real-world applications. These studies highlight the potential of CNN-based models for plant disease detection.
Although deep learning has significantly advanced plant disease classification, research focusing specifically on pest-induced damage assessment remains scarce. Most existing approaches primarily classify diseases based on visual symptoms such as discoloration, necrotic lesions, and fungal infections [
2,
13,
14,
15,
16]. However, pest damage differs significantly from plant diseases, as it manifests through progressive degradation of leaf structure, including chlorophyll depletion and mesophyll cell destruction [
17,
18]. This structural complexity makes automated detection of pest damage more challenging than disease classification.
Few studies have attempted to address pest-induced plant damage using machine learning models. For instance, ref. [
15] explored the identification of chili pest and disease using deep learning feature extraction combined with SVM, RF, and ANN classifiers, reporting superior performance over conventional methods. Similarly, ref. [
6] proposed meta-architectures integrating VGG-Net and ResNet to classify tomato diseases and pests. However, most of these studies focused on pest detection rather than damage assessment and severity quantification.
Despite the significant economic impact of insect pests on global crop production, there remains a notable gap in automated pest damage quantification. This study aims to bridge this research gap by developing a hybrid machine learning approach for detecting and assessing Zyginidia pullula-induced damage in maize leaves. By integrating CNN-based feature extraction with traditional hand-crafted features and machine learning classifiers, this research presents a novel framework for the quantification and classification of pest damage severity, setting it apart from prior studies that primarily focus on disease identification.
3. Materials and Methods
This section outlines the data acquisition process, preprocessing steps, feature extraction techniques, feature fusion strategy, and the classification methods employed in this study. A detailed description of the dataset is first provided, followed by the methodology for model training and evaluation.
3.1. Dataset
The dataset used in this study was obtained from maize crops cultivated by the authors in an agricultural field at Ege University, Izmir, Turkey. The approximate coordinates of the field are 38.4567° N latitude and 27.2211° E longitude. Images were captured periodically throughout the plant growth cycle, specifically in June 2023 and September 2023, to ensure variability in lighting, environmental conditions, and pest damage progression. This real-world dataset enhances the model’s generalization capability by incorporating naturally occurring variations found in agricultural settings.
Unlike conventional datasets, where plant specimens are manually examined under controlled laboratory conditions, this dataset was constructed to reflect real-world agricultural scenarios, ensuring a more practical and scalable approach for automated image-based analysis. The images were acquired using a Samsung Galaxy S23 smartphone, equipped with a 50-megapixel sensor. The primary camera features an f/1.8 aperture, multi-frame image processing, and an advanced AI-driven image enhancement system, ensuring high-quality images under natural lighting conditions. These specifications allow for the accurate capture of fine-grained texture details, which are crucial for distinguishing different levels of pest-induced damage.
The dataset primarily consists of maize leaves affected by
Zyginidia pullula, a pest known to cause damage by feeding on plant tissues, leading to a reduction in photosynthetic capacity and potential yield loss [
19]. Since publicly available datasets focusing specifically on pest-induced damage in maize crops are scarce, this study introduces a dedicated dataset to support machine learning-based classification models.
The primary objective of this dataset is to facilitate the development of automated pest damage detection and severity assessment models. All images were collected in an open field under natural daylight, specifically on clear, sunny days to ensure consistent lighting. Although the acquisition was conducted under controlled weather conditions, the images still reflect natural variations in perspective, leaf orientation, and background elements that typically occur in agricultural fields. This approach enhances the generalizability and robustness of the model by simulating real-world conditions where automated detection systems would operate.
One of the main challenges in constructing such datasets is the limited availability of labeled data for pest-induced damage assessment. To address this, a dataset tailored for feature extraction using both deep learning-based and handcrafted techniques was developed. By leveraging real-time data acquisition, this study aims to build a scalable and adaptable solution applicable to precision agriculture, supporting automated decision-making for early pest damage detection.
A total of 2350 images were collected and categorized into four distinct classes: healthy and three levels of infection severity (low, medium, and high).
Figure 2 illustrates sample maize leaves classified across different infection severity levels.
Table 1 summarizes the dataset composition.
Labelling Process
A two-stage labeling approach was employed to ensure accurate classification. In the first stage, the images were classified into healthy and infected categories based on visual inspection.
Figure 2 illustrates sample maize leaves classified as healthy (a) and infected (b, c, d). Images with no visible signs of pest damage were labeled as healthy, while those exhibiting structural damage were categorized as infected.
In the second stage, as illustrated in
Figure 2, infected samples were further categorized based on severity: (b) low infection, (c) medium infection, and (d) high infection, representing the increasing impact of pest infestation. The specific criteria used to define these severity levels are summarized in
Table 2, including visual symptoms and the estimated percentage of leaf area affected. Specifically, low infection refers to images displaying minor discoloration and localized damage. Medium infection includes images showing more extensive chlorophyll depletion and pest-induced tissue destruction. High infection corresponds to images with severe structural degradation, indicating significant crop damage.
The labeling process was conducted manually by domain experts, ensuring a high level of precision. No dedicated annotation software was used; instead, each image was visually inspected and categorized based on expert judgment, then organized into class-specific folders. Although visual inspection may involve a degree of subjectivity, symptom-based criteria were used to ensure consistency, including thresholds based on discoloration, necrosis, and the estimated percentage of affected leaf area. While no standardized reference exists specifically for
Zyginidia pullula, the labeling approach followed common practices used in field-based pest severity evaluation studies [
20]. After labeling, all images were converted to PNG format and resized to 244 × 244 pixels for efficient model training.
This dataset serves as a critical resource for advancing precision agriculture. Unlike existing plant disease datasets, which primarily focus on fungal or bacterial infections, this dataset specifically addresses pest-induced damage, making it unique and essential for the development of pest management systems.
To ensure that differences observed in severity classification were due to pest damage and not external factors, all images were collected under consistent field conditions (clear, sunny days in the same location) and processed using the same image preparation pipeline. The dataset was designed to include four severity levels (healthy, low, medium, high), which were determined based on expert-defined visual criteria. All preprocessing and labeling steps were applied consistently across all samples, following a structured and controlled image-based classification procedure.
3.2. Methodology
The methodology of this study consists of five key steps: image acquisition, image preprocessing and segmentation, feature extraction, feature fusion, and classification. The goal is to develop an automated system that accurately detects and assesses pest-induced damage in maize plants using a hybrid machine learning approach.
The overall workflow of the proposed methodology is illustrated in
Figure 3, highlighting the sequential steps from image acquisition to final classification.
3.3. Image Segmentation Using U2Net
In this study, the U2Net architecture [
21] was employed for segmenting maize leaves from complex field backgrounds. Originally developed for salient object detection, U2Net features a nested U-structure with residual blocks that enable multi-scale feature extraction and fine boundary preservation with low computational cost. These capabilities make it particularly suitable for agricultural images captured under natural lighting, where leaf edges often blend with soil, weeds, and surrounding vegetation. U2Net was chosen for its proven performance in isolating foreground objects with high precision—an essential requirement for ensuring accurate downstream feature extraction from pest-damaged leaf regions.
The segmentation framework is illustrated in
Figure 4, where U2Net refines segmentation masks by integrating high-level semantic and low-level spatial features.
Figure 5 presents examples of original and segmented maize leaf images, demonstrating U2Net’s effectiveness in accurately extracting the leaf region.
By leveraging U2Net’s segmentation capability, only relevant leaf regions were retained for further analysis, enhancing feature extraction accuracy in later stages.
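For illustration, a minimal segmentation sketch is given below. It assumes the `U2NET` class and pretrained weights from the publicly released U2Net implementation; the input resolution, normalization constants, and the 0.5 threshold are illustrative choices rather than the exact settings used in this study.

```python
import cv2
import numpy as np
import torch
from torchvision import transforms

# Assumption: U2NET class and "u2net.pth" weights from the public U-2-Net repository.
from model import U2NET


def segment_leaf(image_bgr, weights_path="u2net.pth", device="cpu"):
    """Suppress non-leaf background using a U2Net saliency mask (illustrative sketch)."""
    net = U2NET(3, 1)
    net.load_state_dict(torch.load(weights_path, map_location=device))
    net.to(device).eval()

    # U2Net expects a normalized RGB tensor; 320x320 matches the original repository.
    rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    tf = transforms.Compose([
        transforms.ToTensor(),
        transforms.Resize((320, 320)),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    inp = tf(rgb).unsqueeze(0).to(device)

    with torch.no_grad():
        d0, *_ = net(inp)  # d0 is the fused saliency prediction

    mask = d0.squeeze().cpu().numpy()
    mask = (mask - mask.min()) / (mask.max() - mask.min() + 1e-8)
    mask = cv2.resize(mask, (image_bgr.shape[1], image_bgr.shape[0]))

    # Keep only the salient (leaf) region for downstream feature extraction.
    return image_bgr * (mask[..., None] > 0.5)
```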
3.4. Image Pre-Processing
To enhance the quality of maize leaf images before feature extraction and classification, a series of image preprocessing techniques were applied. These techniques were designed to improve contrast, reduce noise, and ensure consistency across the dataset. The primary preprocessing steps included contrast enhancement, image resizing, and normalization.
3.4.1. Contrast Enhancement Using CLAHE
Contrast Limited Adaptive Histogram Equalization (CLAHE) [
22] was applied to improve the visibility of pest-induced damage on maize leaves. CLAHE is an advanced version of Adaptive Histogram Equalization (AHE) that enhances local contrast while preventing over-amplification of noise. Unlike standard histogram equalization, which applies a uniform contrast enhancement across the entire image, CLAHE operates on small, localized regions (tiles), redistributing lightness values to enhance details without causing artificial distortions. It was chosen in this study due to its suitability for natural, outdoor images where illumination may vary across regions of the leaf. Based on its demonstrated effectiveness in prior studies, CLAHE was considered a robust choice for highlighting subtle pest-related symptoms without introducing noise artifacts.
Figure 6 illustrates an example of a maize leaf before and after applying CLAHE, demonstrating the improved contrast that facilitates more accurate segmentation and feature extraction.
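A minimal sketch of this step is shown below, using OpenCV's CLAHE on the lightness channel of the LAB color space so that only local contrast is enhanced while color information is preserved. The clip limit and tile size are illustrative defaults, not necessarily the settings used in this study.

```python
import cv2


def apply_clahe(image_bgr, clip_limit=2.0, tile_grid_size=(8, 8)):
    """Enhance local contrast on the lightness channel only (illustrative parameters)."""
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid_size)
    l_eq = clahe.apply(l)
    return cv2.cvtColor(cv2.merge((l_eq, a, b)), cv2.COLOR_LAB2BGR)
```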
3.4.2. Image Resizing and Normalization
To standardize the dataset and optimize model performance, all images were resized to 244 × 244 pixels. This resolution was chosen to balance computational efficiency and feature preservation. Resizing ensures that images maintain a uniform scale, preventing size discrepancies from affecting the feature extraction process.
Additionally, pixel intensity values were normalized to the range [0, 1] to improve the stability of deep learning models. Normalization is essential for CNN-based models, as it allows faster convergence during training and reduces sensitivity to illumination variations.
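A short sketch of these two operations, assuming OpenCV and the image size stated above, is:

```python
import cv2
import numpy as np


def prepare_image(image_bgr, size=(244, 244)):
    """Resize to the uniform input size used in this study and scale pixels to [0, 1]."""
    resized = cv2.resize(image_bgr, size, interpolation=cv2.INTER_AREA)
    return resized.astype(np.float32) / 255.0
```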
These preprocessing steps collectively improve the clarity and consistency of the dataset, facilitating more accurate feature extraction and classification.
3.5. Feature Extraction Using CNNs
Deep feature extraction was performed using three pre-trained Convolutional Neural Networks (CNNs): DenseNet201, EfficientNetB2, and ResNet50. These models were selected to provide a balance of depth, computational efficiency, and representational diversity.
DenseNet201 [
23], with its densely connected layers, improves feature reuse and captures fine-grained texture and structural patterns. EfficientNetB2 [
24] offers a good trade-off between accuracy and computational cost, making it suitable for real-time agricultural applications. ResNet50 [
25] uses residual connections to extract deep hierarchical features and is widely adopted for general-purpose image classification.
All models were initialized with ImageNet weights and used as fixed feature extractors. Preprocessed maize leaf images were fed into each CNN, and high-level feature vectors were obtained from the final convolutional layers.
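As a minimal sketch of this step, the snippet below uses an ImageNet-pretrained ResNet-50 from torchvision as a frozen feature extractor; DenseNet-201 and EfficientNet-B2 would be handled analogously. The exact layer cut and preprocessing shown here are assumptions, not the paper's verified configuration.

```python
import numpy as np
import torch
import torch.nn as nn
from torchvision import models, transforms

# Frozen ImageNet-pretrained backbone with the final classification layer removed.
resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
extractor = nn.Sequential(*list(resnet.children())[:-1])  # keep global-pooled conv features
extractor.eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((244, 244)),  # input size used in this study
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])


def extract_deep_features(image_rgb: np.ndarray) -> np.ndarray:
    """Return the pooled deep feature vector for one leaf image (length depends on backbone)."""
    with torch.no_grad():
        feats = extractor(preprocess(image_rgb).unsqueeze(0))
    return feats.flatten().numpy()
```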
These deep features were then fused with hand-crafted features to form a comprehensive representation, enhancing classification robustness and accuracy. The fusion process is detailed in
Section 3.7.
3.6. Hand-Crafted Feature Extraction
To enhance robustness and interpretability, hand-crafted features were used alongside CNN-based deep features. These methods capture low-level image characteristics that may be overlooked by deep models, particularly fine-grained textural and color variations caused by pest infestation.
Three techniques were employed: Gabor filters for texture, Gray Level Co-occurrence Matrix (GLCM) for spatial relationships, and Hue-Saturation-Value (HSV) color space for color variation—each contributing critical cues for identifying pest-induced damage.
3.6.1. Gabor Filters
Gabor filters are widely used for texture analysis due to their ability to capture spatial frequency, orientation, and edge information [
26]. The response of a Gabor filter is mathematically expressed as:
$$g(x, y; \lambda, \theta, \sigma, \gamma) = \exp\!\left( -\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2} \right) \cos\!\left( 2\pi \frac{x'}{\lambda} \right)$$
where $x' = x\cos\theta + y\sin\theta$, and $y' = -x\sin\theta + y\cos\theta$. Here, $\lambda$ is the wavelength, $\theta$ is the orientation, $\sigma$ is the scale, and $\gamma$ represents the aspect ratio.
In this study, Gabor filters were utilized to highlight the texture patterns associated with Zyginidia pullula damage on maize leaves. The characteristic feeding patterns of this pest create distinctive textures on the leaf surface, which are efficiently captured by Gabor filter responses at multiple orientations and scales.
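The sketch below shows one way to build such a multi-orientation, multi-scale Gabor bank with OpenCV and summarize each response by simple statistics; the kernel size, wavelengths, and orientations are illustrative values, not the exact bank used in this study.

```python
import cv2
import numpy as np


def gabor_features(gray, wavelengths=(4, 8), orientations=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Filter a grayscale leaf image with a small Gabor bank and return mean/std of each response."""
    feats = []
    for lam in wavelengths:
        for theta in orientations:
            kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=0.5 * lam, theta=theta,
                                        lambd=lam, gamma=0.5, psi=0)
            response = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kernel)
            feats.extend([response.mean(), response.std()])
    return np.array(feats)
```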
3.6.2. Gray Level Co-Occurrence Matrix (GLCM)
The Gray Level Co-occurrence Matrix (GLCM) [
27] is a second-order statistical method that quantifies the spatial relationships between pixel intensities in an image. It provides useful texture descriptors such as:
Contrast: Measures the intensity variation between adjacent pixels.
Correlation: Quantifies the linear dependency of gray levels.
Energy: Represents image uniformity.
Homogeneity: Evaluates the closeness of distributed gray levels.
These GLCM-derived texture descriptors were computed for maize leaf images to capture subtle structural changes caused by pest infestation. By analyzing these statistical dependencies, GLCM aids in distinguishing healthy and infected leaves, as well as identifying different severity levels of pest damage.
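A compact sketch of this computation using scikit-image is shown below; the pixel-pair distances and angles are illustrative choices.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops


def glcm_features(gray_uint8, distances=(1,), angles=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Return the four GLCM descriptors used here (contrast, correlation, energy, homogeneity),
    averaged over the chosen angles. Input must be an 8-bit grayscale image."""
    glcm = graycomatrix(gray_uint8, distances=distances, angles=angles,
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "correlation", "energy", "homogeneity"]
    return np.array([graycoprops(glcm, p).mean() for p in props])
```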
3.6.3. Hue-Saturation-Value (HSV)
Color-based analysis is essential for detecting discoloration and chlorophyll depletion, which are key symptoms of pest damage. The Hue-Saturation-Value (HSV) color space was chosen instead of the RGB model due to its perceptual relevance and robustness to lighting variations [
28]. The transformation from RGB to HSV is defined as:
$$V = \max(R, G, B), \qquad S = \begin{cases} \dfrac{V - \min(R, G, B)}{V}, & V \neq 0 \\ 0, & V = 0 \end{cases}$$
$$H = \begin{cases} 60^{\circ} \times \dfrac{G - B}{V - \min(R, G, B)}, & V = R \\ 60^{\circ} \times \left( 2 + \dfrac{B - R}{V - \min(R, G, B)} \right), & V = G \\ 60^{\circ} \times \left( 4 + \dfrac{R - G}{V - \min(R, G, B)} \right), & V = B \end{cases}$$
where $H$ denotes hue, and $S$ represents saturation, while $V$ denotes brightness.
After converting the maize plant images to HSV, the hue component was used to detect discoloration patterns, the saturation component was analyzed to assess chlorophyll depletion, and brightness variations were examined to highlight infected areas. These color-based features complement the Gabor- and GLCM-based texture features, ensuring a comprehensive feature representation.
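A minimal sketch of such color descriptors, assuming OpenCV and per-channel histograms (the bin count is an illustrative choice), is:

```python
import cv2
import numpy as np


def hsv_features(image_bgr, bins=16):
    """Normalized per-channel HSV histograms: hue captures discoloration, saturation tracks
    chlorophyll loss, and value highlights bright lesion areas."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    feats = []
    for ch, max_val in zip(cv2.split(hsv), (180, 256, 256)):  # OpenCV hue range is [0, 180)
        hist = cv2.calcHist([ch], [0], None, [bins], [0, max_val]).ravel()
        feats.append(hist / (hist.sum() + 1e-8))
    return np.concatenate(feats)
```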
3.6.4. Integration with Deep Features
To improve classification performance, hand-crafted features (Gabor, GLCM, HSV) were fused with deep CNN-based features. This integration combines the discriminative power of deep models with the interpretability of handcrafted descriptors. The fusion process is explained in
Section 3.7.
This multi-feature approach enhances the system’s robustness by leveraging spatial, textural, and color-based information—particularly useful for distinguishing different levels of pest-induced damage in maize leaves.
3.7. Feature Fusion and Classification
The fusion and classification stages were designed to optimize maize damage detection accuracy. As shown in
Figure 3, the process begins with image enhancement via CLAHE and segmentation via U2Net, followed by deep and hand-crafted feature extraction from the contrast-enhanced and segmented images.
3.7.1. Feature Fusion Strategy
To leverage the advantages of both deep CNN features and hand-crafted features, a feature fusion strategy was implemented. The deep and hand-crafted features extracted from each method were concatenated into a single comprehensive feature vector and then reduced in dimensionality using PCA before classification. The fused feature vector is defined as:
$$F_{\text{fused}} = \left[ F_{\text{CNN}},\; F_{\text{Gabor}},\; F_{\text{GLCM}},\; F_{\text{HSV}} \right]$$
where:
- $F_{\text{CNN}}$ represents the deep features extracted from pre-trained CNN models (DenseNet201, EfficientNetB2, and ResNet50),
- $F_{\text{Gabor}}$ denotes the textural features obtained from Gabor filters,
- $F_{\text{GLCM}}$ consists of statistical texture descriptors from the Gray Level Co-occurrence Matrix,
- $F_{\text{HSV}}$ captures the color-based characteristics of the maize leaf images.
Since the feature vectors from different methods have varying dimensions, Principal Component Analysis (PCA) was applied to reduce dimensionality while preserving the most significant features. The PCA transformation is defined as:
$$Z = W^{\top} \left( F_{\text{fused}} - \mu \right)$$
where $W$ is the matrix of eigenvectors, and $\mu$ is the mean of the feature vectors. PCA helps mitigate redundancy, improves computational efficiency, and enhances model performance.
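The fusion and reduction steps can be sketched as follows with scikit-learn, where the 95% retained variance follows Section 4.1; the added standardization step is an assumption for numerical stability, not a documented part of the pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def fuse_and_reduce(deep, gabor, glcm, hsv):
    """Concatenate per-image deep and hand-crafted feature matrices, then project with PCA."""
    fused = np.hstack([deep, gabor, glcm, hsv])   # shape: (n_images, total_feature_dim)
    fused = StandardScaler().fit_transform(fused)
    pca = PCA(n_components=0.95)                  # keep components explaining 95% of variance
    return pca.fit_transform(fused), pca
```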
3.7.2. Classification Algorithms
After feature extraction and dimensionality reduction, the fused features were fed into three different machine learning classifiers: Support Vector Machines (SVM), Random Forest (RF), and Artificial Neural Networks (ANN). These classifiers were chosen based on their effectiveness in handling high-dimensional feature spaces and their proven performance in agricultural image classification tasks.
SVM is a widely used supervised learning algorithm that finds an optimal hyperplane that separates data points from different classes [
29]. Given a training dataset
$D = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ represents feature vectors and $y_i$ denotes class labels, the decision function of an SVM is given by:
$$f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b \right)$$
where $K(x_i, x)$ is the kernel function and $\alpha_i$ are the Lagrange multipliers. The Radial Basis Function (RBF) kernel was used in this study due to its ability to handle non-linearly separable data.
RF is an ensemble learning method that constructs multiple decision trees and aggregates their predictions to improve classification accuracy [
30]. The classification decision is made based on majority voting:
$$\hat{y} = \operatorname{mode}\{\, h_1(x), h_2(x), \ldots, h_T(x) \,\}$$
where $h_t(x)$ represents the prediction of the $t$-th decision tree. RF was selected for its robustness against overfitting and ability to capture complex patterns in high-dimensional feature spaces.
ANN consists of multiple layers of neurons, each applying a nonlinear activation function to process input features. The output of a neuron in a hidden layer is computed as:
$$h_j = f\!\left( \sum_{i} w_{ij} x_i + b_j \right)$$
where $w_{ij}$ are the weights, $x_i$ are the inputs, and $b_j$ is the bias term. The Rectified Linear Unit (ReLU) activation function was used in hidden layers, defined as:
$$f(z) = \max(0, z)$$
A feedforward ANN with multiple hidden layers was trained using the Adam optimization algorithm and categorical cross-entropy loss function. ANN was chosen for its ability to model complex feature interactions and learn hierarchical representations.
After training, the final classification decision for each maize leaf image was made based on the highest probability score assigned by the classifiers.
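A minimal sketch of the three classifiers with scikit-learn is given below. The tree count (100) and ReLU activation follow the text; the hidden-layer sizes, SVM kernel, and remaining settings are illustrative assumptions (the text mentions both RBF and linear kernels for SVM).

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Illustrative configurations; not the paper's exact hyperparameters (see Table 3).
classifiers = {
    "SVM": SVC(kernel="rbf", probability=True),
    "RF": RandomForestClassifier(n_estimators=100, random_state=42),
    "ANN": MLPClassifier(hidden_layer_sizes=(256, 128, 64), activation="relu",
                         solver="adam", max_iter=500),  # Adam adapts step sizes per parameter
}


def predict_labels(X_train, y_train, X_test):
    """Train each classifier on the fused, PCA-reduced features and return its predictions."""
    return {name: clf.fit(X_train, y_train).predict(X_test)
            for name, clf in classifiers.items()}
```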
3.8. Performance Metrics
Four widely used evaluation metrics—accuracy, precision, recall, and F1-score—were employed to assess the classification performance of the proposed model. These metrics provide a comprehensive evaluation of the model’s effectiveness in distinguishing between healthy and damaged maize plants, as well as in determining the severity levels of pest-induced damage.
Accuracy measures the proportion of correctly classified instances relative to the total number of samples:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where:
- $TP$ (True Positives) represents correctly classified infected samples,
- $TN$ (True Negatives) denotes correctly classified healthy samples,
- $FP$ (False Positives) corresponds to healthy samples incorrectly classified as infected,
- $FN$ (False Negatives) denotes infected samples misclassified as healthy.
Accuracy provides a general measure of model performance; however, in imbalanced datasets, it may not always be the most reliable indicator.
Precision evaluates the fraction of true positive predictions among all instances classified as positive:
$$\text{Precision} = \frac{TP}{TP + FP}$$
A high precision value indicates fewer false positives, making it particularly important in scenarios where misclassifying healthy leaves as infected could lead to unnecessary interventions.
Also known as sensitivity, recall measures the proportion of actual positive cases that were correctly classified:
$$\text{Recall} = \frac{TP}{TP + FN}$$
A high recall value is essential in agricultural pest detection, where missing an infected plant could result in widespread crop damage if left untreated.
F1-score provides a balanced measure of precision and recall, making it useful in cases where there is an uneven class distribution:
$$F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
This metric is particularly beneficial for datasets where both false positives and false negatives have significant consequences.
To enhance the robustness and generalizability of the model, five-fold stratified cross-validation was applied during performance assessment. Unlike standard k-fold cross-validation, stratified cross-validation ensures that each fold maintains the same class distribution as the original dataset, preventing any class imbalance from skewing the results.
Additionally, since the dataset was imbalanced, Precision-Recall (PR) curves were also examined to provide deeper insight into the trade-off between precision and recall across different decision thresholds.
By incorporating these comprehensive evaluation metrics, this study ensures that the proposed model effectively detects and classifies pest-induced damage in maize plants while minimizing misclassification errors.
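The evaluation protocol can be sketched as below with scikit-learn, combining stratified five-fold cross-validation with the four metrics; macro averaging over classes is an assumption, as the averaging scheme is not stated in the text.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import StratifiedKFold


def cross_validate(clf, X, y, n_splits=5):
    """Stratified k-fold evaluation returning mean accuracy, precision, recall, and F1-score.
    X and y are NumPy arrays of fused features and integer class labels."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        clf.fit(X[train_idx], y[train_idx])
        y_pred = clf.predict(X[test_idx])
        acc = accuracy_score(y[test_idx], y_pred)
        prec, rec, f1, _ = precision_recall_fscore_support(
            y[test_idx], y_pred, average="macro", zero_division=0)
        scores.append((acc, prec, rec, f1))
    return np.mean(scores, axis=0)
```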
4. Results
This section presents the details of the experimental setup, including implementation specifics, feature extraction and classification strategies, and model performance evaluation. The proposed framework was rigorously tested using various classifiers and feature extraction methods to assess its effectiveness in distinguishing pest-induced damage on maize leaves. The evaluation was conducted in a controlled computing environment, ensuring the reproducibility of results.
4.1. Implementation Details
To ensure robustness and generalization, a five-fold stratified cross-validation approach was adopted, given the imbalanced nature of the dataset. This technique divides the dataset into five equal folds while preserving the original class distribution. In each iteration, four folds were used for training, while the remaining fold was reserved for testing. This process was repeated five times, ensuring that every sample contributed to both training and testing, effectively minimizing bias and variance.
Feature extraction was performed using three CNN-based models (DenseNet201, EfficientNetB2, and ResNet50), while classification was carried out using three machine learning classifiers: Support Vector Machines (SVM), Random Forest (RF), and Artificial Neural Networks (ANN). To enhance classification performance and reduce dimensionality, Principal Component Analysis (PCA) was applied, selecting only the components that retained 95% of the total variance [
31]. Each CNN model extracted 1024 features per image, which were subsequently fused with hand-crafted descriptors.
The dataset was partitioned into 80% training and 20% testing subsets to ensure reliable performance evaluation. The classification performance was assessed using standard metrics, including accuracy, precision, recall, and F1-score, providing a comprehensive evaluation of the proposed system.
Hyperparameters in
Table 3 were selected to optimize model accuracy and prevent overfitting. The linear kernel was chosen for SVM due to its superior performance in high-dimensional feature spaces. Random Forest was configured with 100 decision trees, ensuring a balance between computational efficiency and model accuracy. The ANN model was designed with three hidden layers, incorporating ReLU activation and adaptive learning rates for enhanced convergence.
By employing this experimental setup, the study aimed to develop a robust and scalable pest damage classification system, ensuring high accuracy in real-world agricultural applications.
4.2. Classification Performance Analysis
The classification performance of the proposed method was evaluated using various CNN-based and handcrafted feature extraction techniques, in combination with different machine learning classifiers. The accuracy results obtained from different feature extraction and classification models are presented in
Table 4.
Among the single feature descriptors, ResNet50 achieved the highest accuracy, obtaining 91.58% with ANN and 90.69% with SVM. However, when hybrid feature combinations were considered, ResNet50 + HSV + GLCM + Gabor yielded the highest accuracy across all classifiers, reaching 92.55% with ANN, 87.22% with RF, and 91.05% with SVM.
These results demonstrate that combining deep learning-based features (ResNet50) with handcrafted features (HSV, GLCM, and Gabor) significantly improves classification performance. While individual CNN features provided strong results, integrating handcrafted descriptors enhanced the ability to distinguish infected leaves.
To further assess the model performance, a confusion matrix was generated for the best-performing feature combination (ResNet50 + HSV + GLCM + Gabor), as presented in
Table 5.
From the confusion matrix, the model correctly identified 2119 positive cases (infected maize leaves), but 128 samples were misclassified as negative. Similarly, 1686 negative samples (healthy leaves) were correctly identified, while 192 were falsely classified as positive. This indicates a high recall rate for infected leaves, but slight misclassifications between mild infections and healthy samples.
To further evaluate classification performance, Precision, Recall, and F1-score were analyzed for different feature combinations.
Figure 7,
Figure 8 and
Figure 9 illustrate the respective scores.
The highest precision was obtained using the ResNet50 + HSV + GLCM + Gabor combination, reaching 93.61% with the ANN classifier (
Figure 7). This suggests that hybrid features are effective at minimizing false positives, ensuring that classified infected samples truly belong to the infected category.
RF showed lower recall values compared to ANN and SVM, meaning it struggled to correctly identify all infected leaves. ANN and SVM performed consistently well across different feature sets, with the highest recall being observed in ResNet50 + HSV + GLCM, reaching 90.3% with the ANN classifier (
Figure 8).
Since F1-score balances precision and recall, it provides a comprehensive evaluation of classification performance. The ResNet50 + HSV + GLCM + Gabor combination once again achieved the highest score across all classifiers, reaching 91.65% with the ANN classifier, confirming that multi-feature fusion improves robustness in distinguishing infected maize leaves (
Figure 9).
These findings validate that hybrid feature combinations outperform single descriptors, ensuring more reliable and accurate pest damage classification in maize crops.
4.3. Infected Density Study Results
In this study, the severity of damage caused by
Zyginidia pullula on maize plants was analyzed using various feature extraction techniques and machine learning classifiers, ANN, RF and SVM. The classification performance for different feature extractors and classifiers is presented in
Table 6.
Among single-descriptor models, ResNet50 with ANN achieved the highest accuracy of 82.65%, outperforming EfficientNetB2 and DenseNet201. Combining ResNet50 with Gabor or HSV features further improved performance, reaching 83.03% and 82.46% accuracy with ANN, respectively. The best-performing combination was ResNet50 + Gabor + HSV, achieving 83.78% accuracy with ANN and 79.96% with SVM. RF consistently performed lower than ANN and SVM across all feature sets, reaching a maximum accuracy of 71.75%.
To better understand the model’s classification behavior at the class level, the confusion matrix for the ResNet50 + Gabor + HSV combination using the ANN classifier is presented in
Table 7. This table provides detailed insights into the true and misclassified instances across low, medium, and high severity categories.
The highest precision was achieved with the ResNet50 + Gabor + HSV combination, reaching 83.24% when using the ANN classifier (
Figure 10). In contrast, the RF classifier yielded the lowest precision values across all feature extraction methods. Similarly, the highest recall was obtained with the ResNet50 + Gabor + HSV combination, achieving 83.78% with ANN, while RF again exhibited lower recall values (
Figure 11). Overall, ANN-based models consistently outperformed RF and SVM, with the best F1-score recorded at 83.43% using the ResNet50 + Gabor + HSV feature combination (
Figure 12).
The integration of CNN-based deep features with handcrafted descriptors significantly enhanced the classification performance, demonstrating the effectiveness of combining multiple feature extraction techniques. Among the classifiers, ANN consistently outperformed RF and SVM, particularly when hybrid feature combinations were employed. The best-performing feature set for predicting infection severity was ResNet50 + Gabor + HSV, achieving an accuracy of 83.78% with ANN, highlighting its superior ability to distinguish different levels of infection.
5. Discussion
This study introduces a novel approach for detecting and assessing the damage caused by Zyginidia pullula in maize plants by integrating deep learning-based and handcrafted feature extraction techniques. The experimental results indicate that hybrid feature extraction methods significantly enhance classification accuracy compared to individual feature sets. Among the evaluated models, the combination of ResNet50, HSV, GLCM, and Gabor features, when used with an Artificial Neural Network (ANN), achieved the highest classification accuracy of 92.55%. Furthermore, for severity classification, the best-performing model reached an accuracy of 83.78%, demonstrating the effectiveness of multi-feature fusion in pest damage detection.
While recent studies have effectively used deep learning models for pest and disease detection in maize, most focus either on general pest classification or disease type identification, without addressing the severity of damage. For instance, approaches such as [
32] or [
33] targeting diseases like Maize Lethal Necrosis (MLN) and Maize Streak Virus (MSV) emphasize classifying visual symptoms but overlook multi-level damage assessment. Some studies [
34] further employ adaptive thresholding on CNN class activation maps to estimate severity without expert input; however, these rely on pixel-level estimations and lack structured severity categories. In contrast, our study introduces a hybrid feature extraction approach that combines deep and hand-crafted features to classify the severity of damage caused specifically by
Zyginidia pullula. By integrating expert-defined visual criteria with interpretable severity levels, the proposed method addresses a critical gap in pest damage quantification for maize, offering both precision and practicality for real-world applications.
Unlike conventional studies that primarily focus on disease classification in plants, this research specifically addresses pest-induced damage assessment, a relatively unexplored area in agricultural image analysis. While previous studies have successfully employed Convolutional Neural Networks (CNNs) for plant disease detection, a significant gap remains in the quantification of pest-induced damage, particularly for small-scale pests such as Zyginidia pullula, where the damage is often subtle and not visually distinct. The findings of this study show that hybrid feature extraction offers superior performance over single-feature-based approaches, aligning with previous research in plant disease classification but extending its applicability to pest impact assessment.
The primary contribution of this research lies in the development of an automated framework capable of accurately detecting and quantifying pest-induced damage levels in maize leaves. By integrating handcrafted and deep learning-based features, a more comprehensive representation of infected regions was achieved, leading to improved classification accuracy. The results also underscore the importance of feature fusion, revealing that combining multiple feature extraction techniques enhances model robustness and reliability. These insights contribute to precision agriculture by facilitating early pest detection and enabling more effective pest management strategies. In future studies, a comparative evaluation of different segmentation networks such as UNet, DeepLab, and PSPNet could be conducted to further investigate the impact of segmentation choice on overall model performance.
Although the dataset used in this study is relatively limited in size, it addresses a critical gap in the literature, as no publicly available dataset focusing on Zyginidia pullula-induced damage in maize was found. Therefore, a custom dataset was created under real field conditions and labeled by expert-defined visual criteria. To ensure the reliability of the results despite the dataset size, several precautions were taken, including stratified five-fold cross-validation and dimensionality reduction through PCA. The proposed dataset and methodology together offer a valuable foundation for future studies aiming to develop robust pest severity classification systems.
This study has some limitations. First, the dataset was collected under controlled lighting conditions, which may limit its performance in more diverse environments. Second, only one segmentation network (U2Net) was used without comparison to other architectures. Additionally, the labeling process, although based on expert-defined visual criteria, may still involve some degree of subjectivity due to the absence of a universally accepted standard for Zyginidia pullula severity. Future work will address these aspects by incorporating more diverse data, performing comparative model evaluations, and exploring objective, quantifiable labeling methods.
6. Conclusions
This study presents a novel approach for detecting and assessing Zyginidia pullula-induced damage in maize crops by leveraging both deep learning and traditional feature extraction techniques. By integrating CNN-based models (ResNet50, DenseNet-201, and EfficientNetB2) with handcrafted features (Gabor filters, GLCM, and HSV), a robust feature representation was achieved, enabling the effective classification of infected maize leaves. The experimental results demonstrate that hybrid feature combinations significantly enhance classification accuracy compared to individual feature sets. Notably, the combination of ResNet50, HSV, GLCM, and Gabor, when paired with an ANN classifier, achieved the highest accuracy of 92.55%, underscoring the importance of multi-feature integration in pest damage detection.
Furthermore, a detailed analysis of infection severity levels was conducted, where the combination of ResNet50, Gabor, and HSV achieved an accuracy of 83.78% with ANN, highlighting the effectiveness of hybrid feature fusion in classifying infection intensity. These findings emphasize the advantages of integrating both deep and traditional feature extraction methods for automated pest damage detection in agricultural applications.
Unlike prior studies that primarily focus on plant disease classification, this research introduces an innovative approach for quantifying and classifying the damage caused by a specific pest, Zyginidia pullula, on maize leaves. To the best of our knowledge, no prior research systematically evaluates the severity of pest-induced damage using a combination of deep learning-based and handcrafted features. The proposed methodology bridges this gap by offering an automated and accurate framework for both binary classification (healthy vs. infected) and severity assessment of infection levels. The results strongly support the potential of hybrid feature extraction in enhancing classification accuracy, demonstrating that integrating multiple feature representations leads to improved pest damage detection.
Future research may focus on extending this approach to other pest species, incorporating larger and more diverse datasets, and optimizing computational efficiency for real-time agricultural applications. Additionally, to enhance scalability in field conditions, future work may explore UAV-based automated image collection pipelines.